Connect Scrapfly to AI agents

Connect Scrapfly to Claude, Codex, Cursor, or other AI agents for your entire team. Metorial adds security, governance, and observability, and gives your team a unified Magic MCP URL to connect.

Supported Tools

get_account_info

Get Account Info

Retrieve Scrapfly account information including current project details, usage statistics, remaining credits, concurrency limits, and subscription information.
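As a rough sketch, an MCP-capable client invokes a tool such as get_account_info by sending a JSON-RPC 2.0 request with the method tools/call. The helper name build_tool_call and the request id are illustrative, and the empty arguments object assumes this tool needs no inputs; check the tool schema exposed by your MCP endpoint.

```python
import json


def build_tool_call(name, arguments=None, request_id=1):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message.

    `build_tool_call` is a hypothetical helper, not part of any SDK.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Assumption: get_account_info takes no arguments.
request = build_tool_call("get_account_info")
print(json.dumps(request, indent=2))
```

In practice an MCP client library usually builds this envelope for you; the sketch only shows what travels over the wire.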

get_crawl_status

Get Crawl Status

Retrieve the current status and progress of a running or completed crawl job. Returns page counts, credit usage, duration, and completion state.

start_crawl

Start Crawl

Start a recursive website crawl from a given URL. The crawler automatically discovers and follows links, respecting configurable limits for page count, depth, duration, and budget. Supports URL path filtering, external domain control, proxy rotation, anti-bot bypass, and multiple content formats.

get_crawl_results

Get Crawl Results

Retrieve the discovered URLs and their extracted content from a completed or running crawl. Returns the list of crawled URLs with metadata and optionally the page contents in the specified format.
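The three crawl tools are typically used together: start the crawl, poll its status, then fetch the results. The sketch below shows that loop under stated assumptions. The call_tool function is supplied by the caller (for example, a thin wrapper around an MCP client), and every argument and field name used here (url, page_limit, crawl_id, state, format) is hypothetical and not taken from the actual tool schema.

```python
import time


def run_crawl(call_tool, url, page_limit=10, poll_seconds=2.0, max_polls=30):
    """Start a crawl, poll until it finishes, then return its results.

    `call_tool(name, arguments)` must send the tool call and return a dict.
    All argument and field names below are assumptions for illustration.
    """
    started = call_tool("start_crawl", {"url": url, "page_limit": page_limit})
    crawl_id = started["crawl_id"]

    for _ in range(max_polls):
        status = call_tool("get_crawl_status", {"crawl_id": crawl_id})
        if status["state"] in ("completed", "failed"):
            break
        time.sleep(poll_seconds)
    else:
        # The loop ran out of polls without seeing a terminal state.
        raise TimeoutError(f"crawl {crawl_id} did not finish in time")

    if status["state"] == "failed":
        raise RuntimeError(f"crawl {crawl_id} failed")
    return call_tool("get_crawl_results", {"crawl_id": crawl_id, "format": "markdown"})


def make_fake_server():
    """Stand-in for a real MCP connection so the sketch runs offline."""
    polls = {"count": 0}

    def call_tool(name, arguments):
        if name == "start_crawl":
            return {"crawl_id": "demo-1"}
        if name == "get_crawl_status":
            polls["count"] += 1
            return {"state": "completed" if polls["count"] >= 2 else "running"}
        if name == "get_crawl_results":
            return {"urls": ["https://example.com/"]}
        raise ValueError(f"unknown tool: {name}")

    return call_tool


results = run_crawl(make_fake_server(), "https://example.com/", poll_seconds=0)
print(results["urls"])
```

Capping the number of polls keeps an agent from waiting indefinitely on a crawl that never reaches a terminal state.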

extract_data

Extract Data

Extract structured data from document content using Scrapfly's standalone Extraction API. Supports three extraction methods: AI auto-extraction with predefined models (product, article, review, real estate), LLM prompt-based extraction with natural language instructions, and custom template-based extraction with CSS/XPath/JMESPath rules. Accepts HTML, XML, JSON, CSV, RSS, Markdown, and plain text input.
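The description above lists three extraction methods. One way to keep an agent's calls unambiguous is to build the arguments with a helper that accepts exactly one of them. This is a sketch only: the helper name, the argument keys (body, content_type, extraction_model, extraction_prompt, extraction_template), and the rule that the methods are mutually exclusive are assumptions, not the documented schema.

```python
def build_extract_arguments(body, content_type, *, model=None, prompt=None, template=None):
    """Build arguments for an extract_data call with exactly one method.

    Key names are hypothetical; adjust them to the real tool schema.
    """
    candidates = {
        "extraction_model": model,        # predefined AI model, e.g. "product"
        "extraction_prompt": prompt,      # natural language instruction
        "extraction_template": template,  # custom CSS/XPath/JMESPath rules
    }
    chosen = {key: value for key, value in candidates.items() if value is not None}
    if len(chosen) != 1:
        raise ValueError("choose exactly one of model, prompt, or template")
    return {"body": body, "content_type": content_type, **chosen}


args = build_extract_arguments(
    "<html><h1>Blue Widget</h1></html>",
    "text/html",
    model="product",
)
print(args)
```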

scrape_webpage

Scrape Webpage

Scrape any web page and retrieve its content. Supports JavaScript rendering for dynamic pages, anti-bot bypass (ASP), proxy rotation across 120+ countries, and multiple output formats (raw HTML, clean HTML, JSON, markdown, text). Can also perform inline data extraction using AI models, LLM prompts, or custom templates during the scrape.
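To make the options above concrete, here is a small sketch that collects scrape settings in a dataclass and validates the output format before building the tool arguments. The field names (render_js, asp, country, format) and the format identifiers are assumptions chosen to mirror the description; the real scrape_webpage schema may differ.

```python
from dataclasses import asdict, dataclass
from typing import Optional

# Assumed identifiers for the five output formats named in the description.
OUTPUT_FORMATS = {"raw", "clean_html", "json", "markdown", "text"}


@dataclass
class ScrapeOptions:
    """Hypothetical argument set for a scrape_webpage call."""

    url: str
    render_js: bool = False        # JavaScript rendering for dynamic pages
    asp: bool = False              # anti-bot bypass
    country: Optional[str] = None  # proxy country code, e.g. "us"
    format: str = "markdown"

    def to_arguments(self):
        if self.format not in OUTPUT_FORMATS:
            raise ValueError(f"unsupported format: {self.format}")
        # Drop unset optional fields so they are not sent at all.
        return {key: value for key, value in asdict(self).items() if value is not None}


options = ScrapeOptions(url="https://example.com/", render_js=True, country="us")
print(options.to_arguments())
```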

capture_screenshot

Capture Screenshot

Capture a screenshot of any web page. Supports full-page captures, viewport-only captures, or targeting specific elements via CSS selectors. Includes options for ad/banner blocking, dark mode, custom viewport resolution, accessibility testing, and JavaScript execution before capture.

More integrations teams use with Scrapfly

GitHub

Manage repositories, issues, and pull requests. Create and configure branches, star repositories, review code, and merge changes. Automate CI/CD workflows with GitHub Actions, manage workflow runs, secrets, and artifacts. Track issues with labels, milestones, and assignees. Search across code, repositories, issues, and users. Manage organizations, teams, and memberships. Create and manage projects, gists, packages, deployments, and environments. Access security alerts including code scanning, secret scanning, and Dependabot alerts. Read and write file contents in repositories. Manage webhooks, notifications, and codespaces.

SharePoint

Manage SharePoint sites, document libraries, lists, and files. Create, read, update, and delete lists and list items with custom columns. Upload, download, move, copy, and version files in document libraries. Search across sites, files, folders, lists, and list items using Microsoft Search. Manage permissions at site, list, and item levels with granular access control. Define and manage content types and site columns. Subscribe to webhooks for list and library change notifications. Retrieve site properties and search for sites across Microsoft 365.

Salesforce

Manage CRM data including Accounts, Contacts, Leads, Opportunities, Cases, and custom objects. Create, read, update, and delete records. Query data using SOQL and search across objects using SOSL. Perform bulk data operations for large-scale imports, exports, and migrations. Execute composite requests to batch multiple operations in a single API call. Access analytics, reports, and dashboards. Manage files and attachments associated with records. Interact with Chatter feeds, posts, and groups for social collaboration. Subscribe to real-time change events via Change Data Capture and Platform Events. Manage org metadata including custom objects, fields, layouts, and workflows. Query data using GraphQL for precise data retrieval across related objects.

Airtable

Create, read, update, and delete records in Airtable bases and tables. Manage base schemas including creating tables and fields. Filter records using formulas, sort by fields, and scope queries to specific views. Upsert records to find, create, or update in a single call. Upload attachments to records, read and write record comments, list accessible bases, and receive real-time base change events through webhooks.

Bitbucket

Manage Git repositories, pull requests, and CI/CD pipelines on Bitbucket Cloud. Create, fork, and configure repositories within workspaces and projects. Create, review, approve, merge, and decline pull requests with inline code comments. Browse source code, list commits, and manage branches and tags. Track issues with the built-in issue tracker. Trigger, monitor, and manage Bitbucket Pipelines. List workspace members, configure repository default reviewers and branch restrictions, create and manage repository webhooks, and search code across repositories.

Heroku

Deploy, manage, and scale applications on Heroku's cloud platform. Create and configure apps, scale dynos, provision add-ons (databases, caching, etc.), manage configuration variables, build and release code, add custom domains and SSL certificates, manage collaborators and team permissions, configure pipelines for continuous delivery, set up log drains, and sync data with Salesforce via Heroku Connect. Subscribe to webhooks for real-time notifications on app changes, builds, releases, dyno lifecycle events, and more.

Technical notes for Scrapfly

Scrape web pages with anti-bot bypass, proxy rotation across 120+ countries, and JavaScript rendering. Capture full-page or targeted screenshots of any website. Extract structured data from web content using AI models, LLM prompts, or custom template rules. Crawl entire websites recursively with configurable depth, limits, and URL filtering. Supports browser automation scenarios, multiple output formats (HTML, JSON, markdown, text), session persistence, caching, and asynchronous processing via webhooks. Manage scraping projects with separate API keys, budgets, and quotas.

Connect Scrapfly to production AI agents

See how Metorial gives Scrapfly access the governance, tracing, and security controls teams need.

Frequently asked questions

Common questions about connecting Scrapfly to AI agents with Metorial.

  1. Can Metorial connect Scrapfly to AI agents?
    Yes. Metorial connects AI agents to Scrapfly through a governed integration layer, so teams can use the provider while keeping access controlled and observable.
  2. Metorial is MCP compatible and lets teams expose approved provider tools to MCP-capable agents and clients through a controlled access layer.
  3. Metorial applies policies across users, groups, providers, agents, and individual tools, then records the context around every agent interaction.
  4. Yes. Metorial records provider activity so teams can inspect tool calls, troubleshoot integrations, and give security teams the visibility they need.