Connect Google Cloud Vision to AI agents

Connect Google Cloud Vision to Claude, Codex, Cursor, or other AI agents for your entire team. Metorial provides security, governance, and observability, and gives your team a unified Magic MCP URL to connect.

Supported Tools

detect_landmarks

Detect Landmarks

Identifies well-known natural and human-made landmarks in an image. Returns the landmark name, confidence score, bounding box, and geographical coordinates (latitude/longitude) when available. Useful for travel, geography, and location-based applications.

detect_faces

Detect Faces

Detects faces in an image and returns detailed attributes including emotional expression likelihoods (joy, sorrow, anger, surprise), facial orientation angles, detection confidence, and whether headwear or blur is present. Includes bounding box coordinates for each detected face.

get_crop_hints

Get Crop Hints

Suggests optimal crop regions for an image based on its content. You can specify target aspect ratios (width/height) and the API returns bounding boxes with confidence scores. Useful for automated image cropping, thumbnail generation, and responsive image preparation.
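As a rough illustration of the aspect-ratio input, the sketch below builds a request body in the shape the underlying Vision REST API (`images:annotate`) accepts for crop hints. The Metorial tool's own input schema is not shown on this page, so the helper name `build_crop_hints_request` and the example image URI are assumptions made for illustration only.

```python
import json


def build_crop_hints_request(image_uri, aspect_ratios):
    """Build an images:annotate request body that asks for crop hints.

    Each aspect ratio is width divided by height, e.g. 16 / 9 for widescreen.
    (Hypothetical helper; not part of any Metorial or Google SDK.)
    """
    if not aspect_ratios:
        raise ValueError("provide at least one aspect ratio")
    if any(ratio <= 0 for ratio in aspect_ratios):
        raise ValueError("aspect ratios must be positive")
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": "CROP_HINTS"}],
                "imageContext": {
                    "cropHintsParams": {"aspectRatios": list(aspect_ratios)}
                },
            }
        ]
    }


# Example URI is a placeholder, not a real bucket.
body = build_crop_hints_request("gs://example-bucket/photo.jpg", [16 / 9, 1.0])
print(json.dumps(body, indent=2))
```

Passing several ratios in one call asks for one suggested region per ratio, which suits generating a widescreen banner and a square thumbnail from the same source image.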

detect_logos

Detect Logos

Recognizes logos of popular brands within an image. Returns the brand name, confidence score, and bounding box for each detected logo. Useful for brand monitoring, market analysis, and detecting brand presence in social media images.

detect_labels

Detect Labels

Identifies general objects, locations, activities, animal species, products, and more within an image. Returns descriptive labels with confidence scores. Useful for image categorization, content tagging, and understanding image contents at a high level.

analyze_image

Analyze Image

Performs multiple Vision API detection features on a single image in one request. Select any combination of features (labels, objects, faces, landmarks, logos, text, safe search, image properties, crop hints, web detection) to analyze an image comprehensively. More efficient than making separate calls for each feature.
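To make the "one request, many features" idea concrete, here is a minimal sketch that combines several detection features into a single request body for the Vision REST `images:annotate` method. The short feature names on the left and the `build_analyze_request` helper are assumptions chosen to mirror the list above; the tool's actual parameter names may differ.

```python
# Short names (assumed for illustration) mapped to Vision API feature types.
FEATURE_TYPES = {
    "labels": "LABEL_DETECTION",
    "objects": "OBJECT_LOCALIZATION",
    "faces": "FACE_DETECTION",
    "landmarks": "LANDMARK_DETECTION",
    "logos": "LOGO_DETECTION",
    "text": "TEXT_DETECTION",
    "safe_search": "SAFE_SEARCH_DETECTION",
    "image_properties": "IMAGE_PROPERTIES",
    "crop_hints": "CROP_HINTS",
    "web_detection": "WEB_DETECTION",
}


def build_analyze_request(image_uri, features):
    """Return one annotate request that covers every selected feature."""
    unknown = [name for name in features if name not in FEATURE_TYPES]
    if unknown:
        raise ValueError(f"unknown features: {unknown}")
    # dict.fromkeys drops duplicates while keeping the caller's order.
    selected = list(dict.fromkeys(features))
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": FEATURE_TYPES[name]} for name in selected],
            }
        ]
    }


# Placeholder URL; duplicates in the feature list are collapsed.
request = build_analyze_request(
    "https://example.com/cat.jpg", ["labels", "text", "labels"]
)
print([f["type"] for f in request["requests"][0]["features"]])
```

Because all features travel in a single `features` array, the image is sent and decoded once rather than once per detection type.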

detect_image_properties

Detect Image Properties

Analyzes an image to determine its dominant colors, returning RGB values, coverage fraction, and relevance scores. Useful for color palette extraction, design workflows, and image categorization by color.
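For palette extraction, a caller typically ranks the returned colors by coverage and converts them to hex. The sketch below assumes each entry follows the Vision API's dominant-color shape (`color` with `red`/`green`/`blue`, plus `score` and `pixelFraction`); the sample values and the `top_hex_colors` helper are made up for illustration and are not real API output.

```python
def top_hex_colors(colors, limit=3):
    """Return hex strings for the colors covering the most pixels."""
    ranked = sorted(
        colors, key=lambda entry: entry.get("pixelFraction", 0.0), reverse=True
    )
    result = []
    for entry in ranked[:limit]:
        rgb = entry["color"]
        # A missing channel is treated as 0 in this sketch.
        red, green, blue = (
            int(round(rgb.get(channel, 0))) for channel in ("red", "green", "blue")
        )
        result.append("#{:02x}{:02x}{:02x}".format(red, green, blue))
    return result


# Hypothetical sample data.
sample_colors = [
    {"color": {"red": 255, "green": 128, "blue": 0}, "score": 0.4, "pixelFraction": 0.2},
    {"color": {"red": 10, "green": 20, "blue": 30}, "score": 0.3, "pixelFraction": 0.5},
]
print(top_hex_colors(sample_colors))  # ['#0a141e', '#ff8000']
```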

detect_web

Detect Web Entities

Searches the web for information related to an image. Returns matching web entities, pages containing the image, visually similar images, and best-guess labels describing the image content. Useful for reverse image search, finding image sources, and understanding web presence.

detect_objects

Detect Objects

Detects and localizes multiple objects in an image, returning each object's name, confidence score, and bounding box coordinates. Useful for understanding object positions and spatial relationships within an image. Object names are returned in English only.

detect_safe_search

Detect Safe Search

Analyzes an image for explicit or inappropriate content across five categories: adult, spoof, medical, violence, and racy. Returns a likelihood rating for each category. Useful for content moderation and filtering.
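Likelihood ratings are ordered categories rather than numeric scores, so a moderation rule usually compares them against a threshold. The sketch below assumes the ratings use the Vision API's likelihood names (`VERY_UNLIKELY` through `VERY_LIKELY`) and that the result is a flat mapping of category to rating; the `flagged_categories` helper and the sample values are hypothetical.

```python
# Vision API likelihood values, from least to most likely.
LIKELIHOOD_ORDER = [
    "UNKNOWN",
    "VERY_UNLIKELY",
    "UNLIKELY",
    "POSSIBLE",
    "LIKELY",
    "VERY_LIKELY",
]
CATEGORIES = ("adult", "spoof", "medical", "violence", "racy")


def flagged_categories(annotation, threshold="LIKELY"):
    """Return the categories rated at or above the threshold, sorted by name."""
    limit = LIKELIHOOD_ORDER.index(threshold)
    flagged = []
    for category in CATEGORIES:
        rating = annotation.get(category, "UNKNOWN")
        if rating in LIKELIHOOD_ORDER and LIKELIHOOD_ORDER.index(rating) >= limit:
            flagged.append(category)
    return sorted(flagged)


# Hypothetical ratings for a single image.
sample = {
    "adult": "VERY_UNLIKELY",
    "spoof": "POSSIBLE",
    "medical": "UNLIKELY",
    "violence": "LIKELY",
    "racy": "VERY_LIKELY",
}
print(flagged_categories(sample))  # ['racy', 'violence']
```

Lowering the threshold to `"POSSIBLE"` makes the filter stricter, which trades more false positives for fewer misses.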

detect_text

Detect Text (OCR)

Extracts text from images using optical character recognition (OCR). Supports two modes: standard text detection for photos and general scenes, and document text detection optimized for dense text, documents, and handwriting. Returns the full extracted text along with individual text blocks and their positions.

More integrations teams use with Google Cloud Vision

GitHub

Manage repositories, issues, and pull requests. Create and configure branches, star repositories, review code, and merge changes. Automate CI/CD workflows with GitHub Actions, manage workflow runs, secrets, and artifacts. Track issues with labels, milestones, and assignees. Search across code, repositories, issues, and users. Manage organizations, teams, and memberships. Create and manage projects, gists, packages, deployments, and environments. Access security alerts including code scanning, secret scanning, and Dependabot alerts. Read and write file contents in repositories. Manage webhooks, notifications, and codespaces.

SharePoint

Manage SharePoint sites, document libraries, lists, and files. Create, read, update, and delete lists and list items with custom columns. Upload, download, move, copy, and version files in document libraries. Search across sites, files, folders, lists, and list items using Microsoft Search. Manage permissions at site, list, and item levels with granular access control. Define and manage content types and site columns. Subscribe to webhooks for list and library change notifications. Retrieve site properties and search for sites across Microsoft 365.

Salesforce

Manage CRM data including Accounts, Contacts, Leads, Opportunities, Cases, and custom objects. Create, read, update, and delete records. Query data using SOQL and search across objects using SOSL. Perform bulk data operations for large-scale imports, exports, and migrations. Execute composite requests to batch multiple operations in a single API call. Access analytics, reports, and dashboards. Manage files and attachments associated with records. Interact with Chatter feeds, posts, and groups for social collaboration. Subscribe to real-time change events via Change Data Capture and Platform Events. Manage org metadata including custom objects, fields, layouts, and workflows. Query data using GraphQL for precise data retrieval across related objects.

Airtable

Create, read, update, and delete records in Airtable bases and tables. Manage base schemas including creating tables and fields. Filter records using formulas, sort by fields, and scope queries to specific views. Upsert records to find, create, or update in a single call. Upload attachments to records, read and write record comments, list accessible bases, and receive real-time base change events through webhooks.

Confluence

Create, read, update, and delete pages, blog posts, comments, and attachments in Confluence spaces. Manage spaces, permissions, labels, and content restrictions. Search content using Confluence Query Language (CQL). Upload and download file attachments with versioning. Manage users, groups, and group memberships. Create and manage whiteboards, databases, folders, and templates. View and update inline tasks. Access audit logs. Listen for webhooks on page, blog, comment, attachment, space, label, and user events.

Bitbucket

Manage Git repositories, pull requests, and CI/CD pipelines on Bitbucket Cloud. Create, fork, and configure repositories within workspaces and projects. Create, review, approve, merge, and decline pull requests with inline code comments. Browse source code, list commits, and manage branches and tags. Track issues with the built-in issue tracker. Trigger, monitor, and manage Bitbucket Pipelines. List workspace members, configure repository default reviewers and branch restrictions, create and manage repository webhooks, and search code across repositories.

Technical notes for Google Cloud Vision

Analyze images using pre-trained machine learning models. Detect and label objects, faces, landmarks, logos, and text (OCR) in images. Extract text from photos, documents, PDFs, and handwritten content. Detect explicit or unsafe content with SafeSearch. Identify dominant colors and image properties. Localize multiple objects with bounding regions. Get crop hints for optimal image framing. Detect web entities and find visually similar images online. Supports base64-encoded images, Cloud Storage URIs, and public URLs. Run asynchronous batch annotation on up to 2000 images.
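The three input styles map onto different fields of the Vision REST `image` object: inline bytes go in `content` as base64, while Cloud Storage and public URLs go under `source`. The following sketch shows one way to pick the field from the input; the `image_payload` helper is an assumption for illustration, and the Metorial tool may accept these inputs under different parameter names.

```python
import base64


def image_payload(source):
    """Build the `image` object for an annotate request.

    Accepts raw bytes, a gs:// Cloud Storage URI, or an http(s) URL.
    """
    if isinstance(source, (bytes, bytearray)):
        # Inline image data is sent as a base64-encoded string.
        return {"content": base64.b64encode(bytes(source)).decode("ascii")}
    if isinstance(source, str) and source.startswith("gs://"):
        return {"source": {"gcsImageUri": source}}
    if isinstance(source, str) and source.startswith(("http://", "https://")):
        return {"source": {"imageUri": source}}
    raise ValueError("expected image bytes, a gs:// URI, or an http(s) URL")


# Placeholder inputs; none of these point at real resources.
print(image_payload(b"abc"))
print(image_payload("gs://example-bucket/scan.png"))
print(image_payload("https://example.com/photo.jpg"))
```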

Connect Google Cloud Vision to production AI agents

See how Metorial gives Google Cloud Vision access the governance, tracing, and security controls teams need.

Frequently asked questions

Common questions about connecting Google Cloud Vision to AI agents with Metorial.

  1. Can Metorial connect Google Cloud Vision to AI agents?
    Yes. Metorial connects AI agents to Google Cloud Vision through a governed integration layer, so teams can use the provider while keeping access controlled and observable.
  2. Metorial is MCP compatible and lets teams expose approved provider tools to MCP-capable agents and clients through a controlled access layer.
  3. Metorial applies policies across users, groups, providers, agents, and individual tools, then records the context around every agent interaction.
  4. Yes. Metorial records provider activity so teams can inspect tool calls, troubleshoot integrations, and give security teams the visibility they need.