detect_landmarks
Identifies well-known natural and human-made landmarks in an image. Returns the landmark name, confidence score, bounding box, and geographical coordinates (latitude/longitude) when available. Useful for travel, geography, and location-based applications.
detect_faces
Detects faces in an image and returns detailed attributes including emotional expression likelihoods (joy, sorrow, anger, surprise), facial orientation angles, detection confidence, and whether headwear or blur is present. Includes bounding box coordinates for each detected face.
get_crop_hints
Suggests optimal crop regions for an image based on its content. You can specify target aspect ratios (each expressed as width divided by height), and the API returns bounding boxes with confidence scores. Useful for automated image cropping, thumbnail generation, and responsive image preparation.
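Aspect ratios are easy to invert by accident, so the following is a minimal sketch of converting familiar "W:H" strings into width/height values. The helper name `aspect_ratio` is hypothetical, and the sketch assumes the tool accepts each ratio as a single floating-point number, as the underlying Vision API does; check the tool's input schema before relying on that.

```python
def aspect_ratio(spec: str) -> float:
    """Convert a "W:H" string such as "16:9" into width / height."""
    width, height = (int(part) for part in spec.split(":"))
    if width <= 0 or height <= 0:
        raise ValueError(f"aspect ratio parts must be positive: {spec!r}")
    return width / height


# Rounded only for display; pass the unrounded value to the API.
ratios = [round(aspect_ratio(s), 4) for s in ("16:9", "4:3", "1:1")]
print(ratios)  # [1.7778, 1.3333, 1.0]
```

A value above 1.0 describes a landscape crop and a value below 1.0 a portrait crop.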
detect_logos
Recognizes logos of popular brands within an image. Returns the brand name, confidence score, and bounding box for each detected logo. Useful for brand monitoring, market analysis, and detecting brand presence in social media images.
detect_labels
Identifies general objects, locations, activities, animal species, products, and more within an image. Returns descriptive labels with confidence scores. Useful for image categorization, content tagging, and understanding image contents at a high level.
analyze_image
Runs multiple Vision API detection features on a single image in one request. Select any combination of features (labels, objects, faces, landmarks, logos, text, safe search, image properties, crop hints, web detection) to analyze an image comprehensively. More efficient than making separate calls for each feature.
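To illustrate what "one request" could look like underneath, here is a sketch that builds a request body in the shape of the Vision API's images:annotate REST method. The feature type strings are the Vision API's own; the short names in `FEATURE_TYPES` and the helper `build_annotate_request` are hypothetical, and the tool's actual argument names may differ.

```python
from __future__ import annotations

import base64
import json

# Hypothetical short names mapped to Vision API feature type strings.
FEATURE_TYPES = {
    "labels": "LABEL_DETECTION",
    "objects": "OBJECT_LOCALIZATION",
    "faces": "FACE_DETECTION",
    "landmarks": "LANDMARK_DETECTION",
    "logos": "LOGO_DETECTION",
    "text": "TEXT_DETECTION",
    "safe_search": "SAFE_SEARCH_DETECTION",
    "image_properties": "IMAGE_PROPERTIES",
    "crop_hints": "CROP_HINTS",
    "web": "WEB_DETECTION",
}


def build_annotate_request(image_bytes: bytes, features: list[str]) -> dict:
    """Build one images:annotate request body covering several features."""
    unknown = [name for name in features if name not in FEATURE_TYPES]
    if unknown:
        raise ValueError(f"unsupported features: {unknown}")
    return {
        "requests": [
            {
                # Inline image data is sent base64-encoded.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": FEATURE_TYPES[name]} for name in features],
            }
        ]
    }


# Placeholder bytes stand in for a real image file.
body = build_annotate_request(b"fake image bytes", ["labels", "faces", "safe_search"])
print(json.dumps(body, indent=2))
```

All selected features share a single upload of the image, which is where the saving over separate calls comes from.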
detect_image_properties
Analyzes an image to determine its dominant colors, returning RGB values, coverage fraction, and relevance scores. Useful for color palette extraction, design workflows, and image categorization by color.
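As a sketch of palette extraction from such a result, the code below ranks colors by coverage and formats them as hex codes. The field names (`red`, `green`, `blue`, `pixel_fraction`, `score`) are assumptions about the tool's output shape, and the sample values are invented for illustration.

```python
def to_hex(color: dict) -> str:
    """Format an RGB dict with 0-255 channels as a #RRGGBB string."""
    channels = (color["red"], color["green"], color["blue"])
    return "#" + "".join(f"{int(round(c)):02X}" for c in channels)


def palette(colors: list, top_n: int = 3) -> list:
    """Return the hex codes of the top_n colors by coverage fraction."""
    ranked = sorted(colors, key=lambda c: c["pixel_fraction"], reverse=True)
    return [to_hex(c) for c in ranked[:top_n]]


# Illustrative sample data, not real API output.
sample = [
    {"red": 12, "green": 34, "blue": 56, "pixel_fraction": 0.10, "score": 0.30},
    {"red": 255, "green": 128, "blue": 0, "pixel_fraction": 0.45, "score": 0.50},
    {"red": 240, "green": 240, "blue": 240, "pixel_fraction": 0.25, "score": 0.15},
]
print(palette(sample, top_n=2))  # ['#FF8000', '#F0F0F0']
```

Sorting by the relevance score instead of coverage is a one-word change to the sort key, and the two orderings can differ, as they do for the first and third sample entries.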
detect_web
Searches the web for information related to an image. Returns matching web entities, pages containing the image, visually similar images, and best-guess labels describing the image content. Useful for reverse image search, finding image sources, and understanding an image's web presence.
detect_objects
Detects and localizes multiple objects in an image, returning each object's name, confidence score, and bounding box coordinates. Useful for understanding object positions and spatial relationships within an image. Object names are returned in English only.
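The Vision API reports object bounding polygons as normalized vertices in the range 0 to 1 rather than as pixels. Assuming the tool passes those values through unchanged (an assumption, not something stated above), a sketch of converting them to a pixel box looks like this; `to_pixel_box` is a hypothetical helper.

```python
def to_pixel_box(vertices: list, width: int, height: int) -> tuple:
    """Convert normalized polygon vertices to a (left, top, right, bottom) pixel box."""
    # Zero-valued coordinates may be omitted from JSON responses, so default to 0.0.
    xs = [v.get("x", 0.0) for v in vertices]
    ys = [v.get("y", 0.0) for v in vertices]
    return (
        round(min(xs) * width),
        round(min(ys) * height),
        round(max(xs) * width),
        round(max(ys) * height),
    )


# Illustrative vertices for an object in the lower middle of an 800x600 image.
vertices = [
    {"x": 0.25, "y": 0.5},
    {"x": 0.75, "y": 0.5},
    {"x": 0.75, "y": 1.0},
    {"x": 0.25, "y": 1.0},
]
print(to_pixel_box(vertices, width=800, height=600))  # (200, 300, 600, 600)
```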
detect_safe_search
Analyzes an image for explicit or inappropriate content across five categories: adult, spoof, medical, violence, and racy. Returns a likelihood rating for each category. Useful for content moderation and filtering.
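Likelihood ratings are ordered labels rather than numbers, so a moderation rule needs an explicit ordering. The sketch below uses the Vision API's likelihood scale; the lowercase category keys and the `flagged_categories` helper are assumptions about how the tool's output might be consumed, and the sample ratings are invented.

```python
# Vision API likelihood values, from least to most likely.
LIKELIHOOD_ORDER = [
    "UNKNOWN",
    "VERY_UNLIKELY",
    "UNLIKELY",
    "POSSIBLE",
    "LIKELY",
    "VERY_LIKELY",
]


def flagged_categories(ratings: dict, threshold: str = "LIKELY") -> list:
    """Return the categories rated at or above the threshold, sorted by name."""
    cutoff = LIKELIHOOD_ORDER.index(threshold)
    return sorted(
        category
        for category, rating in ratings.items()
        if LIKELIHOOD_ORDER.index(rating) >= cutoff
    )


# Illustrative sample data, not real API output.
sample = {
    "adult": "VERY_UNLIKELY",
    "spoof": "POSSIBLE",
    "medical": "UNLIKELY",
    "violence": "LIKELY",
    "racy": "VERY_LIKELY",
}
print(flagged_categories(sample))                        # ['racy', 'violence']
print(flagged_categories(sample, threshold="POSSIBLE"))  # ['racy', 'spoof', 'violence']
```

Note that this treats UNKNOWN as the lowest rating, so an unrated category is never flagged; a stricter pipeline might route UNKNOWN to manual review instead.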
detect_text
Extracts text from images using optical character recognition (OCR). Supports two modes: standard text detection for photos and general scenes, and document text detection optimized for dense text, documents, and handwriting. Returns the full extracted text along with individual text blocks and their positions.