list_avatars
List Avatars
List AI avatars with optional filters for title, status, and date range. Also supports fetching detailed information for a specific avatar by ID, including processing and consent verification results. Supports pagination.
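As a rough illustration of how a paginated listing could be consumed, the sketch below walks pages until a short page comes back. The `fetch_page(page, page_size)` callable is hypothetical and stands in for whatever actually invokes list_avatars; the real pagination parameter names are not given here and are assumed.

```python
from typing import Callable, Dict, Iterator, List


def iter_all_avatars(
    fetch_page: Callable[[int, int], List[Dict]],
    page_size: int = 20,
) -> Iterator[Dict]:
    """Yield every avatar by requesting pages until one comes back short.

    fetch_page is a hypothetical callable (page number, page size) -> items.
    The 1-based page number and the page_size argument are assumptions.
    """
    page = 1
    while True:
        items = fetch_page(page, page_size)
        yield from items
        # A page shorter than page_size is treated as the last page.
        if len(items) < page_size:
            return
        page += 1
```

Passing the fetcher in as a callable keeps the loop independent of how the tool is actually called.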
create_avatar
Create a new AI avatar from a video recording. The video must be a publicly accessible MP4 file (max 300 MB) with at least 30 seconds of clear footage showing a single face. After creation, consent verification is required before the avatar can be used to generate videos. An optional webhook URL can be provided to receive status updates during processing.
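The constraints above can be checked locally before submitting a video. This is a minimal sketch only: the function name is made up, the size and duration are assumed to be known to the caller already, and "300 MB" is read as 300 × 1024 × 1024 bytes, which is an assumption.

```python
from typing import List

MAX_SIZE_BYTES = 300 * 1024 * 1024  # assumption: "300 MB" interpreted as MiB
MIN_DURATION_SECONDS = 30


def check_avatar_video(url: str, size_bytes: int, duration_seconds: float) -> List[str]:
    """Return a list of problems; an empty list means the basic limits are met.

    Only the stated limits are checked (MP4, size, duration). Face clarity and
    public accessibility of the URL are not verified here.
    """
    problems = []
    # Ignore any query string when looking at the file extension.
    if not url.lower().split("?")[0].endswith(".mp4"):
        problems.append("video should be an MP4 file")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("video exceeds the 300 MB limit")
    if duration_seconds < MIN_DURATION_SECONDS:
        problems.append("video is shorter than 30 seconds")
    return problems
```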
get_photo_avatar_inferences
Retrieve photo avatar video inference details. Can fetch a specific inference by ID or list inferences with optional filters for photo avatar ID, title, status, and date range. Use this to check on the progress of photo avatar video generation.
create_personalized_videos
Generate personalized videos in bulk for a project (Studio API). Each video is customized with per-recipient variable values (e.g., names, custom text). The variable keys depend on how the project was configured. Video generation is asynchronous; use the returned inference IDs to check status or set up a webhook for notifications. Use **List Projects** to see available project tags/variables.
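Because the variable keys depend on the project, it can help to validate recipient data before a bulk request. The sketch below assumes each recipient is a plain dictionary and that the project's variable keys have already been looked up; the function name and the output shape are illustrative, not the tool's actual input schema.

```python
from typing import Dict, List, Set


def build_video_requests(recipients: List[Dict], variable_keys: Set[str]) -> List[Dict]:
    """Keep only the project's variables for each recipient, failing on gaps.

    variable_keys is assumed to come from the project configuration.
    The returned per-recipient dictionaries are a hypothetical shape.
    """
    requests = []
    for index, recipient in enumerate(recipients):
        missing = variable_keys - set(recipient)
        if missing:
            raise ValueError(
                f"recipient {index} is missing variables: {sorted(missing)}"
            )
        # Drop any extra fields that the project does not define.
        requests.append({key: recipient[key] for key in sorted(variable_keys)})
    return requests
```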
delete_lipsyncs
Bulk delete lip-sync video inferences by their inference IDs.
create_lipsync
Create a lip-synchronized video by combining a source video with an audio file. The video must contain a visible face and the audio must contain speech. Alternatively, use the audio from the source video itself. Video generation is asynchronous; use the inference ID to check status. An optional webhook URL can receive completion notifications.
get_sfx_history
Retrieve the history of previously generated sound effects, or fetch audio for a specific sound effect by inference ID. History entries include prompt, generation parameters, and timestamps. Use the inference ID to retrieve the actual audio data.
generate_speech
Convert text into natural-sounding speech using a specified voice. Returns base64-encoded WAV audio data. Supports 22+ Indic languages and English, including code-mixed text. Use the **List Voices** tool first to find available voice IDs.
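Since the audio comes back as base64-encoded WAV data, a caller needs to decode it before saving or playing it. A minimal sketch using only the Python standard library follows; it assumes the decoded bytes are a PCM WAV file that the `wave` module can parse, and the function name is illustrative.

```python
import base64
import io
import wave
from typing import Tuple


def decode_wav(audio_base64: str) -> Tuple[bytes, float]:
    """Decode base64 WAV data and return (raw WAV bytes, duration in seconds).

    Assumes standard PCM WAV content; other encodings may not be readable
    by the standard-library wave module.
    """
    wav_bytes = base64.b64decode(audio_base64)
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        duration = wav_file.getnframes() / wav_file.getframerate()
    return wav_bytes, duration
```

The returned bytes can be written to disk as-is, for example with `open("speech.wav", "wb").write(wav_bytes)`.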
list_projects
List all video projects within a workspace (Studio API). Returns project details including project type, status, available credits, tags, and video URLs. The project ID is required for creating personalized videos and managing webhooks.
create_photo_avatar
Create an AI avatar from a still photo. The image must be available at a publicly accessible URL and show a clear face. Once published, the photo avatar can be used to generate videos with lip-synced speech. An optional webhook URL can receive status updates.
generate_avatar_video
Generate an HD video from a published AI avatar. Provide a text script (up to 2,000 characters) or an audio URL for the avatar to speak. If both are provided, the audio URL takes priority. The avatar must be in "published" status. Video generation is asynchronous; use the inference ID to check status later.
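The input rules above (audio takes priority over script, script limited to 2,000 characters) could be mirrored on the client side as follows. This is a sketch under assumptions: the function name and the `audio_url` / `script` dictionary keys are hypothetical and not the tool's documented parameter names.

```python
from typing import Dict, Optional

MAX_SCRIPT_CHARS = 2000


def choose_speech_input(script: Optional[str], audio_url: Optional[str]) -> Dict[str, str]:
    """Pick the input the avatar would speak, following the stated priority."""
    # The audio URL wins whenever it is present, even if a script is also given.
    if audio_url:
        return {"audio_url": audio_url}
    if not script:
        raise ValueError("provide a text script or an audio URL")
    if len(script) > MAX_SCRIPT_CHARS:
        raise ValueError(f"script has {len(script)} characters; the limit is {MAX_SCRIPT_CHARS}")
    return {"script": script}
```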
generate_photo_avatar_video
Generate a video from a published photo avatar. Provide either an audio URL alone, or both a text script and voice sample URL together. The photo avatar will be animated to speak the provided audio or text. Video generation is asynchronous.
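The two accepted input combinations can be expressed as a small check. The sketch below takes a strict reading in which mixing an audio URL with a script or voice sample is rejected; that strictness, the function name, and the dictionary keys are all assumptions rather than documented behavior.

```python
from typing import Dict, Optional


def photo_avatar_speech_input(
    audio_url: Optional[str] = None,
    script: Optional[str] = None,
    voice_sample_url: Optional[str] = None,
) -> Dict[str, str]:
    """Accept either an audio URL alone, or a script plus voice sample together."""
    if audio_url and not (script or voice_sample_url):
        return {"audio_url": audio_url}
    if script and voice_sample_url and not audio_url:
        return {"script": script, "voice_sample_url": voice_sample_url}
    raise ValueError(
        "provide an audio URL alone, or a text script together with a voice sample URL"
    )
```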
generate_sound_effect
Generate audio sound effects from a text description. Configure duration, number of variations, and a creativity slider. Returns base64-encoded WAV audio for each variation. Examples: "thunderstorm rumbling", "birdsong at dawn", "keyboard typing".
get_tts_history
Retrieve the history of previously generated text-to-speech audio. Returns inference IDs, input text, voice names, timestamps, and audio file URLs. Supports pagination.
list_lipsyncs
List lip-sync video inferences with optional filters, or get details for a specific lip-sync inference by ID. Supports filtering by title, status, and date range with pagination.
list_photo_avatars
List photo avatars with optional filters, or get details for a specific photo avatar by ID. Supports filtering by title, status, and date range with pagination.
list_workspaces
Retrieve all workspaces the authenticated user belongs to (Studio API). Returns workspace IDs, titles, and the user's role in each workspace. The workspace ID is required for project and video operations.
list_voices
Retrieve all available voices for text-to-speech and avatar video generation. Returns voice IDs, names, descriptions, and sample audio URLs. Use a voice ID from this list when generating speech or creating avatar videos.
get_avatar_inferences
Retrieve avatar video inference details. Can fetch a specific inference by ID or list all inferences with optional filters for avatar ID, title, status, and date range. Use this to check on the progress of avatar video generation jobs.
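Checking progress on an asynchronous job usually means polling until a terminal status appears. The sketch below shows one way to do that; `get_status` is a hypothetical callable wrapping the actual status lookup, and the status strings "completed" and "failed" are assumptions, since the real status values are not listed here.

```python
import time
from typing import Callable

# Assumed terminal status names; the service's actual values may differ.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_inference(
    get_status: Callable[[], str],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status until it returns a terminal status or the timeout passes."""
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError("inference did not finish before the timeout")
        sleep(interval_seconds)
```

Where a webhook URL is supported, it avoids the polling loop altogether; the `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.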
get_video_status
Check the generation status of a personalized video (Studio API). Returns the current status, video URL (when complete), permalink, and any error details. Use inference IDs returned from **Create Personalized Videos**.
manage_avatar_consent
Manage the consent verification process for an avatar. Can either retrieve the consent passcode that must be spoken in the consent video, or submit a consent video for verification. Consent is required before an avatar can be used to generate videos.
delete_avatars
Bulk delete avatars and/or avatar video inferences. Deleting an avatar also deletes all associated video inferences. Provide avatar IDs, inference IDs, or both to delete.
Generate personalized videos, text-to-speech audio, AI avatars, lip-synced videos, and sound effects. Create AI avatars from video or photo, then produce HD videos with synthesized speech from text scripts. Convert text to natural-sounding speech in 70+ languages including 22+ Indic languages. Generate lip-synchronized videos by combining audio with source video. Create audio sound effects from text descriptions. Run personalized video campaigns at scale by specifying per-recipient variables for bulk video generation with unique landing pages. Manage voices, avatars, projects, and workspaces. Receive webhook notifications for avatar creation, video generation, and lip-sync job completion status.
Common questions about connecting Ganai to AI agents with Metorial.