delete_configuration
Delete Prompt Configuration
Delete a prompt configuration by its ID.
list_runs
List evaluation/experiment runs in a project. Supports filtering by dataset, name, status, and pagination. Use this to find and compare past experiment results.
delete_dataset
Delete a dataset by its ID.
post_feedback
Post user feedback or quality metrics against a specific event or session. Feedback data is used for monitoring, building evaluation datasets, and improving AI outputs over time.
update_event
Update an existing event's metadata, feedback, metrics, outputs, or other fields. Use this to enrich trace data after the fact, such as attaching user feedback or quality metrics.
get_run
Retrieve full details and results of an experiment run by its ID. Includes event IDs, dataset association, configuration, and metric results.
start_session
Start a new tracing session in HoneyHive. A session represents a complete interaction or request and serves as the root of a trace tree. Child events (model, tool, chain) can be attached to it.
compare_runs
Compare two experiment runs side by side. Shows metric differences, common datapoints, and per-event comparison details. Useful for detecting regressions or improvements between runs.
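To make "metric differences" concrete, here is a minimal sketch of how per-metric deltas between two runs could be computed locally. The function name `metric_deltas` and the metric names and values are hypothetical, chosen only for illustration; they are not the tool's actual response format.

```python
def metric_deltas(baseline, candidate):
    """Return candidate-minus-baseline for every metric present in both runs."""
    shared = sorted(baseline.keys() & candidate.keys())
    return {name: round(candidate[name] - baseline[name], 4) for name in shared}


# Hypothetical aggregated metrics for two runs (made-up values).
baseline = {"accuracy": 0.82, "latency_s": 1.40}
candidate = {"accuracy": 0.86, "latency_s": 1.65, "toxicity": 0.01}

deltas = metric_deltas(baseline, candidate)
print(deltas)
```

Metrics that exist in only one run (here `toxicity`) are skipped, since a difference is only meaningful when both runs report the metric.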
query_events
Search and filter trace events using structured filters. Supports filtering by event type, metadata fields, date ranges, and more. Use for monitoring, debugging, and data exploration.
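As an illustration of what a structured filter expresses, the sketch below applies a list of filters to events in memory. The filter shape (`field`, `operator`, `value`), the operator names, and the event fields are assumptions made for this example and may not match the schema the tool accepts.

```python
import operator

# Hypothetical operator names mapped to Python comparisons.
OPERATORS = {
    "is": operator.eq,
    "is not": operator.ne,
    "greater than": operator.gt,
    "less than": operator.lt,
}


def matches(event, filters):
    """Return True only if the event satisfies every filter."""
    for f in filters:
        if f["field"] not in event:
            return False
        if not OPERATORS[f["operator"]](event[f["field"]], f["value"]):
            return False
    return True


# Made-up events and filters for illustration.
events = [
    {"event_id": "m1", "event_type": "model", "duration_ms": 900},
    {"event_id": "m2", "event_type": "model", "duration_ms": 2400},
    {"event_id": "t1", "event_type": "tool", "duration_ms": 3100},
]
filters = [
    {"field": "event_type", "operator": "is", "value": "model"},
    {"field": "duration_ms", "operator": "greater than", "value": 1000},
]

slow_model_calls = [e["event_id"] for e in events if matches(e, filters)]
print(slow_model_calls)
```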
delete_run
Delete an experiment run by its ID.
update_dataset
Update an existing dataset's name, description, or metadata.
delete_project
Delete a project by name. This removes the project and all associated data.
create_metric
Create a new evaluation metric for a project. Metrics can be code-based (custom), LLM-as-a-judge (model), or human annotation (human). They define quality criteria for scoring AI outputs.
delete_metric
Delete an evaluation metric by its ID.
list_datasets
List datasets in a project, optionally filtering by type or specific dataset ID. Datasets contain curated input/output pairs used for evaluations and experiments.
list_metrics
List all evaluation metrics defined for a project. Metrics are used to score and evaluate AI application outputs during experiments and production monitoring.
create_configuration
Create a new prompt configuration (managed prompt) in a project. Configurations centralize prompt management with versioning and environment-based deployment.
log_event
Log a single trace event (model, tool, or chain) in HoneyHive. Events are nested within sessions to build a distributed trace. Use this to record individual LLM calls, tool invocations, or chain steps.
log_event_batch
Log multiple trace events in a single batch request. Efficient for high-volume ingestion. All events can optionally share a single session.
get_session
Retrieve a session and its full trace tree by session ID. Returns the session event along with all nested child events (model, tool, chain).
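The trace tree described here can be pictured as a flat list of events nested under their parents, with the session at the root. The following is a minimal sketch under assumed field names (`event_id`, `parent_id`, `event_type`); the helper `build_trace_tree` is hypothetical and the real response schema may differ.

```python
from collections import defaultdict


def build_trace_tree(events):
    """Nest a flat list of events under their parents and return the root."""
    children = defaultdict(list)
    root = None
    for event in events:
        if event["parent_id"] is None:
            root = event  # the session event has no parent
        else:
            children[event["parent_id"]].append(event)

    def attach(event):
        nested = [attach(child) for child in children[event["event_id"]]]
        return {**event, "children": nested}

    return attach(root)


# Made-up events: one session, one chain, and two leaf events under the chain.
events = [
    {"event_id": "s1", "parent_id": None, "event_type": "session"},
    {"event_id": "c1", "parent_id": "s1", "event_type": "chain"},
    {"event_id": "m1", "parent_id": "c1", "event_type": "model"},
    {"event_id": "t1", "parent_id": "c1", "event_type": "tool"},
]

tree = build_trace_tree(events)
print(tree["event_type"], [c["event_id"] for c in tree["children"]])
```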
list_projects
List all projects in the workspace, optionally filtering by name. Returns project IDs, names, and descriptions.
get_event
Retrieve a single event by its ID, including all nested child events.
create_dataset
Create a new dataset in a project. Datasets hold curated datapoints for running evaluations and experiments.
add_datapoints_to_dataset
Add raw data records to a dataset with a field mapping. The mapping defines which fields from the raw data should be used as inputs, ground truths, and history.
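A field mapping of this kind can be sketched as follows: each raw record is split into inputs, ground truths, and history according to the listed field names. The mapping keys (`inputs`, `ground_truth`, `history`), the record fields, and the helper `apply_mapping` are assumptions for illustration, not the tool's confirmed parameter names.

```python
def apply_mapping(record, mapping):
    """Split one raw record into sections according to the field mapping."""
    return {
        section: {field: record[field] for field in fields if field in record}
        for section, fields in mapping.items()
    }


# Hypothetical mapping and raw record.
mapping = {
    "inputs": ["question"],
    "ground_truth": ["expected_answer"],
    "history": ["chat_history"],
}
record = {
    "question": "What is 2 + 2?",
    "expected_answer": "4",
    "chat_history": [],
    "source": "faq",  # not named in the mapping, so it is left out
}

datapoint = apply_mapping(record, mapping)
print(datapoint)
```

In this sketch, fields that the mapping does not name are simply dropped; whether the real tool discards or preserves them is not stated in the description above.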
update_configuration
Update an existing prompt configuration's parameters, environments, or other settings.
update_metric
Update an existing metric's definition, criteria, or settings.
delete_event
Delete a single event by its ID.
get_run_result
Retrieve detailed evaluation results for an experiment run, including pass/fail outcomes, metric aggregations, and per-datapoint details.
create_project
Create a new project in HoneyHive. Projects are the top-level organizational unit for traces, evaluations, datasets, and prompts.
create_run
Create a new evaluation/experiment run. Runs associate a set of traced events with a dataset and configuration for structured evaluation and comparison.
list_configurations
List prompt configurations (managed prompts) in a project. Configurations can be filtered by environment and name. Use this to discover available prompts and their deployment status.
update_project
Update an existing project's name or description.
delete_session
Delete a session and all of its events by session ID.
Trace, monitor, and evaluate LLM applications and AI agents. Create and manage projects, and log distributed traces of AI application execution, including model calls, tool invocations, and chain events. Run offline experiments against curated datasets with custom evaluators. Create, version, and manage datasets and prompts. Post user feedback and custom metrics against sessions or events. Query and analyze production traces with aggregated cost, latency, token usage, and quality metrics. Manage annotation queues for collecting human feedback. Configure prompt deployments across environments.
Common questions about connecting HoneyHive to AI agents with Metorial.