Connect Honeyhive to AI agents

Connect Honeyhive to Claude, Codex, Cursor, or other AI agents for your entire team. Metorial adds security, governance, and observability, and gives your team a unified Magic MCP URL to connect.

Supported Tools

delete_configuration

Delete Prompt Configuration

Delete a prompt configuration by its ID.

list_runs

List Experiment Runs

List evaluation/experiment runs in a project. Supports filtering by dataset, name, status, and pagination. Use this to find and compare past experiment results.

delete_dataset

Delete Dataset

Delete a dataset by its ID.

post_feedback

Post Feedback

Post user feedback or quality metrics against a specific event or session. Feedback data is used for monitoring, building evaluation datasets, and improving AI outputs over time.

update_event

Update Event

Update an existing event's metadata, feedback, metrics, outputs, or other fields. Use this to enrich trace data after the fact, such as attaching user feedback or quality metrics.

get_run

Get Experiment Run

Retrieve full details and results of an experiment run by its ID. Includes event IDs, dataset association, configuration, and metric results.

start_session

Start Session

Start a new tracing session in HoneyHive. A session represents a complete interaction or request and serves as the root of a trace tree. Child events (model, tool, chain) can be attached to it.

compare_runs

Compare Experiment Runs

Compare two experiment runs side by side. Shows metric differences, common datapoints, and per-event comparison details. Useful for detecting regressions or improvements between runs.
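
As a rough illustration of what a run comparison involves, the sketch below computes metric differences and the datapoints two runs have in common. The dictionary shape, the field names (`metrics`, `datapoint_ids`), and both helper functions are hypothetical and chosen only for this example; they are not the actual response format of compare_runs.

```python
# Hypothetical run summaries; the real tool output may be shaped differently.
baseline_run = {
    "metrics": {"accuracy": 0.80, "latency_s": 1.5},
    "datapoint_ids": ["dp1", "dp2", "dp3"],
}
candidate_run = {
    "metrics": {"accuracy": 0.85, "latency_s": 1.25, "toxicity": 0.01},
    "datapoint_ids": ["dp2", "dp3", "dp4"],
}


def compare_metrics(baseline, candidate):
    """Return candidate-minus-baseline deltas for metrics present in both runs."""
    shared = sorted(set(baseline) & set(candidate))
    return {name: round(candidate[name] - baseline[name], 6) for name in shared}


def common_datapoints(baseline_ids, candidate_ids):
    """Return the datapoint IDs evaluated in both runs, sorted for stable output."""
    return sorted(set(baseline_ids) & set(candidate_ids))


deltas = compare_metrics(baseline_run["metrics"], candidate_run["metrics"])
shared_ids = common_datapoints(
    baseline_run["datapoint_ids"], candidate_run["datapoint_ids"]
)
print(deltas)
print(shared_ids)
```

Restricting the comparison to shared metrics and shared datapoints keeps the deltas like-for-like, which matters when checking for regressions.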

query_events

Query Events

Search and filter trace events using structured filters. Supports filtering by event type, metadata fields, date ranges, and more. Use for monitoring, debugging, and data exploration.
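
To make "structured filters" concrete, here is a minimal local sketch of how a list of field/operator/value filters could be applied to events. The filter keys (`field`, `operator`, `value`), the operator names, and the event fields are assumptions made for illustration; the real query_events schema may differ.

```python
import operator

# Hypothetical operator names mapped to Python comparison functions.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
}


def get_field(event, dotted_path):
    """Look up a possibly nested field such as 'metadata.env'; None if absent."""
    value = event
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def matches(event, filters):
    """True only if the event satisfies every filter (filters are ANDed).

    In this sketch an event that lacks the filtered field never matches.
    """
    for f in filters:
        actual = get_field(event, f["field"])
        if actual is None or not OPERATORS[f["operator"]](actual, f["value"]):
            return False
    return True


# Hypothetical events and filters.
events = [
    {"event_id": "e1", "event_type": "model", "metadata": {"env": "prod"}, "duration_ms": 900},
    {"event_id": "e2", "event_type": "tool", "metadata": {"env": "prod"}, "duration_ms": 120},
    {"event_id": "e3", "event_type": "model", "metadata": {"env": "dev"}, "duration_ms": 1500},
]
filters = [
    {"field": "event_type", "operator": "eq", "value": "model"},
    {"field": "metadata.env", "operator": "eq", "value": "prod"},
]

matched_ids = [e["event_id"] for e in events if matches(e, filters)]
print(matched_ids)
```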

delete_run

Delete Experiment Run

Delete an experiment run by its ID.

update_dataset

Update Dataset

Update an existing dataset's name, description, or metadata.

delete_project

Delete Project

Delete a project by name. This removes the project and all associated data.

create_metric

Create Metric

Create a new evaluation metric for a project. Metrics can be code-based (custom), LLM-as-a-judge (model), or human annotation (human). They define quality criteria for scoring AI outputs.

delete_metric

Delete Metric

Delete an evaluation metric by its ID.

list_datasets

List Datasets

List datasets in a project, optionally filtering by type or specific dataset ID. Datasets contain curated input/output pairs used for evaluations and experiments.

list_metrics

List Metrics

List all evaluation metrics defined for a project. Metrics are used to score and evaluate AI application outputs during experiments and production monitoring.

create_configuration

Create Prompt Configuration

Create a new prompt configuration (managed prompt) in a project. Configurations centralize prompt management with versioning and environment-based deployment.

log_event

Log Event

Log a single trace event (model, tool, or chain) in HoneyHive. Events are nested within sessions to build a distributed trace. Use this to record individual LLM calls, tool invocations, or chain steps.

log_event_batch

Log Event Batch

Log multiple trace events in a single batch request. Efficient for high-volume ingestion. All events can optionally share a single session.
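
One way to prepare events for batch logging is to split them into fixed-size chunks that share a session ID. The sketch below assumes a hypothetical payload shape (`session_id`, `events`) and an arbitrary batch size; neither is taken from the log_event_batch schema.

```python
def build_batches(events, session_id, batch_size=2):
    """Split events into chunks of at most batch_size, each tagged with one session."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = []
    for start in range(0, len(events), batch_size):
        chunk = events[start:start + batch_size]
        batches.append({"session_id": session_id, "events": chunk})
    return batches


# Hypothetical events: five model calls recorded under the same session.
pending_events = [
    {"event_type": "model", "event_name": f"call_{i}"} for i in range(5)
]
batches = build_batches(pending_events, session_id="session-123", batch_size=2)

for batch in batches:
    print(batch["session_id"], len(batch["events"]))
```

Each resulting payload could then be passed to the batch tool in a single request, rather than logging the events one at a time.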

get_session

Get Session

Retrieve a session and its full trace tree by session ID. Returns the session event along with all nested child events (model, tool, chain).
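
The trace tree can be pictured as a session event at the root with model, tool, and chain events nested beneath it. The sketch below walks such a tree depth-first; the `children`, `event_type`, and `event_name` keys are hypothetical and only illustrate the nesting, not the exact get_session response.

```python
def flatten_trace(event, depth=0):
    """Return (depth, event_type, event_name) rows in depth-first order."""
    rows = [(depth, event["event_type"], event["event_name"])]
    for child in event.get("children", []):
        rows.extend(flatten_trace(child, depth + 1))
    return rows


# Hypothetical session: one chain that makes a model call and a tool call.
session = {
    "event_type": "session",
    "event_name": "support_chat",
    "children": [
        {
            "event_type": "chain",
            "event_name": "answer_question",
            "children": [
                {"event_type": "model", "event_name": "draft_reply"},
                {"event_type": "tool", "event_name": "lookup_order"},
            ],
        }
    ],
}

rows = flatten_trace(session)
for depth, event_type, event_name in rows:
    print("  " * depth + f"{event_type}: {event_name}")
```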

list_projects

List Projects

List all projects in the workspace, optionally filtering by name. Returns project IDs, names, and descriptions.

get_event

Get Event

Retrieve a single event by its ID, including all nested child events.

create_dataset

Create Dataset

Create a new dataset in a project. Datasets hold curated datapoints for running evaluations and experiments.

add_datapoints_to_dataset

Add Datapoints to Dataset

Add raw data records to a dataset with a field mapping. The mapping defines which fields from the raw data should be used as inputs, ground truths, and history.
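
As an illustration of how a field mapping might work, the sketch below sorts the fields of a raw record into inputs, ground truths, and history. The mapping keys (`inputs`, `ground_truth`, `history`) and the record fields are assumptions for this example and may not match the tool's actual parameter names.

```python
def apply_mapping(record, mapping):
    """Split a raw record into datapoint sections according to the mapping.

    Fields named in the mapping but missing from the record are skipped;
    fields not named in any section are dropped.
    """
    datapoint = {}
    for section in ("inputs", "ground_truth", "history"):
        datapoint[section] = {
            field: record[field]
            for field in mapping.get(section, [])
            if field in record
        }
    return datapoint


# Hypothetical raw record and mapping.
raw_record = {
    "question": "What is the refund window?",
    "expected_answer": "30 days",
    "chat_history": ["Hi", "Hello, how can I help?"],
    "internal_note": "not part of the datapoint",
}
mapping = {
    "inputs": ["question"],
    "ground_truth": ["expected_answer"],
    "history": ["chat_history"],
}

datapoint = apply_mapping(raw_record, mapping)
print(datapoint)
```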

update_configuration

Update Prompt Configuration

Update an existing prompt configuration's parameters, environments, or other settings.

update_metric

Update Metric

Update an existing metric's definition, criteria, or settings.

delete_event

Delete Event

Delete a single event by its ID.

get_run_result

Get Run Results

Retrieve detailed evaluation results for an experiment run, including pass/fail outcomes, metric aggregations, and per-datapoint details.

create_project

Create Project

Create a new project in HoneyHive. Projects are the top-level organizational unit for traces, evaluations, datasets, and prompts.

create_run

Create Experiment Run

Create a new evaluation/experiment run. Runs associate a set of traced events with a dataset and configuration for structured evaluation and comparison.

list_configurations

List Prompt Configurations

List prompt configurations (managed prompts) in a project. Configurations can be filtered by environment and name. Use this to discover available prompts and their deployment status.

update_project

Update Project

Update an existing project's name or description.

delete_session

Delete Session

Delete a session and all of its events by session ID.

More integrations teams use with Honeyhive

GitHub

Manage repositories, issues, and pull requests. Create and configure branches, star repositories, review code, and merge changes. Automate CI/CD workflows with GitHub Actions, manage workflow runs, secrets, and artifacts. Track issues with labels, milestones, and assignees. Search across code, repositories, issues, and users. Manage organizations, teams, and memberships. Create and manage projects, gists, packages, deployments, and environments. Access security alerts including code scanning, secret scanning, and Dependabot alerts. Read and write file contents in repositories. Manage webhooks, notifications, and codespaces.

SharePoint

Manage SharePoint sites, document libraries, lists, and files. Create, read, update, and delete lists and list items with custom columns. Upload, download, move, copy, and version files in document libraries. Search across sites, files, folders, lists, and list items using Microsoft Search. Manage permissions at site, list, and item levels with granular access control. Define and manage content types and site columns. Subscribe to webhooks for list and library change notifications. Retrieve site properties and search for sites across Microsoft 365.

Salesforce

Manage CRM data including Accounts, Contacts, Leads, Opportunities, Cases, and custom objects. Create, read, update, and delete records. Query data using SOQL and search across objects using SOSL. Perform bulk data operations for large-scale imports, exports, and migrations. Execute composite requests to batch multiple operations in a single API call. Access analytics, reports, and dashboards. Manage files and attachments associated with records. Interact with Chatter feeds, posts, and groups for social collaboration. Subscribe to real-time change events via Change Data Capture and Platform Events. Manage org metadata including custom objects, fields, layouts, and workflows. Query data using GraphQL for precise data retrieval across related objects.

Airtable

Create, read, update, and delete records in Airtable bases and tables. Manage base schemas including creating tables and fields. Filter records using formulas, sort by fields, and scope queries to specific views. Upsert records to find, create, or update in a single call. Upload attachments to records, read and write record comments, list accessible bases, and receive real-time base change events through webhooks.

Bitbucket

Manage Git repositories, pull requests, and CI/CD pipelines on Bitbucket Cloud. Create, fork, and configure repositories within workspaces and projects. Create, review, approve, merge, and decline pull requests with inline code comments. Browse source code, list commits, and manage branches and tags. Track issues with the built-in issue tracker. Trigger, monitor, and manage Bitbucket Pipelines. List workspace members, configure repository default reviewers and branch restrictions, create and manage repository webhooks, and search code across repositories.

Heroku

Deploy, manage, and scale applications on Heroku's cloud platform. Create and configure apps, scale dynos, provision add-ons (databases, caching, etc.), manage configuration variables, build and release code, add custom domains and SSL certificates, manage collaborators and team permissions, configure pipelines for continuous delivery, set up log drains, and sync data with Salesforce via Heroku Connect. Subscribe to webhooks for real-time notifications on app changes, builds, releases, dyno lifecycle events, and more.

Technical notes for Honeyhive

Trace, monitor, and evaluate LLM applications and AI agents. Create and manage projects, log distributed traces of AI application execution including model calls, tool invocations, and chain events. Run offline experiments against curated datasets with custom evaluators. Create, version, and manage datasets and prompts. Post user feedback and custom metrics against sessions or events. Query and analyze production traces with aggregated cost, latency, token usage, and quality metrics. Manage annotation queues for human feedback collection. Configure prompt deployments across environments.

Connect Honeyhive to production AI agents

See how Metorial gives Honeyhive access the governance, tracing, and security controls teams need.

Frequently asked questions

Common questions about connecting Honeyhive to AI agents with Metorial.

  1. Can Metorial connect Honeyhive to AI agents?
    Yes. Metorial connects AI agents to Honeyhive through a governed integration layer, so teams can use the provider while keeping access controlled and observable.
  2. Metorial is MCP compatible and lets teams expose approved provider tools to MCP-capable agents and clients through a controlled access layer.
  3. Metorial applies policies across users, groups, providers, agents, and individual tools, then records the context around every agent interaction.
  4. Yes. Metorial records provider activity so teams can inspect tool calls, troubleshoot integrations, and give security teams the visibility they need.