Connect Webscraper Io to AI agents

Connect Webscraper Io to Claude, Codex, Cursor, or other AI agents for your entire team. Metorial adds security, governance, and observability, and gives your team a unified Magic MCP URL to connect.

Supported Tools

get_sitemap

Get Sitemap

Retrieve a single sitemap by its ID, including its full configuration with selectors and start URLs.

get_account

Get Account

Retrieve account information including the user's email, name, and remaining page credits.

get_data_quality

Get Data Quality

Retrieve the data quality report for a scraping job. Reports whether the scraped data meets configurable thresholds for record count, failed/empty page rates, and column fill rates.
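
To make the threshold idea concrete, here is a minimal sketch of how such a report could be evaluated. The field names (`record_count`, `failed_page_rate`, `empty_page_rate`, `column_fill_rates`) and the default limits are assumptions for illustration only, not the tool's actual response schema.

```python
from dataclasses import dataclass


@dataclass
class QualityThresholds:
    # Hypothetical limits, chosen only for this example.
    min_record_count: int = 1
    max_failed_page_rate: float = 0.05
    max_empty_page_rate: float = 0.05
    min_column_fill_rate: float = 0.9


def check_quality(stats: dict, limits: QualityThresholds) -> list[str]:
    """Return human-readable threshold violations (empty list if none)."""
    problems = []
    if stats["record_count"] < limits.min_record_count:
        problems.append("too few records")
    if stats["failed_page_rate"] > limits.max_failed_page_rate:
        problems.append("failed page rate too high")
    if stats["empty_page_rate"] > limits.max_empty_page_rate:
        problems.append("empty page rate too high")
    # Each column is checked separately against the same minimum fill rate.
    for column, rate in stats["column_fill_rates"].items():
        if rate < limits.min_column_fill_rate:
            problems.append(f"column '{column}' fill rate too low")
    return problems
```

An agent could run a check like this on the report before deciding whether to download the data or inspect problematic URLs first.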

get_problematic_urls

Get Problematic URLs

Retrieve URLs that encountered problems during a scraping job, including empty pages, failed requests, and pages with no extracted values.

delete_sitemap

Delete Sitemap

Permanently delete a sitemap and all its associated configuration. This action cannot be undone.

get_scraping_job

Get Scraping Job

Retrieve the full status and statistics of a scraping job, including page counts, execution progress, and timing information.

delete_scraping_job

Delete Scraping Job

Permanently delete a scraping job and its associated data. This action cannot be undone.

list_scraping_jobs

List Scraping Jobs

List scraping jobs with pagination support. Optionally filter by sitemap ID or tag.

create_scraping_job

Create Scraping Job

Execute a sitemap by creating a new scraping job. Configure the driver, proxy, timing, and optionally override start URLs. The job will begin processing and can be monitored using the Get Scraping Job tool.
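
The create-then-monitor flow described above might look roughly like the sketch below. Here `call_tool` is a hypothetical stand-in for whatever function your MCP client uses to invoke a tool, and the argument names (`sitemap_id`, `job_id`), the response fields (`id`, `status`), and the terminal status values are all assumptions rather than the documented schema.

```python
import time
from typing import Callable

# Assumed status values; the real tool may report different ones.
TERMINAL_STATUSES = {"finished", "failed", "stopped"}


def run_and_wait(
    call_tool: Callable[[str, dict], dict],
    sitemap_id: int,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
) -> dict:
    """Create a scraping job, then poll get_scraping_job until it ends."""
    job = call_tool("create_scraping_job", {"sitemap_id": sitemap_id})
    for _ in range(max_polls):
        status = call_tool("get_scraping_job", {"job_id": job["id"]})
        if status["status"] in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job['id']} still running after {max_polls} polls")
```

Bounding the number of polls keeps an agent from waiting indefinitely on a job that never reaches a terminal state.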

manage_scheduler

Manage Scheduler

Enable, disable, or retrieve the cron-based scheduler for a sitemap. When enabled, scraping jobs run automatically at specified intervals. Use action "get" to view current settings, "enable" to configure and activate, or "disable" to turn off.
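
As a sketch of how the three actions differ, the helper below builds an argument dictionary for each one. Only the action names "get", "enable", and "disable" come from the description above; the `sitemap_id` and `cron` argument names, and the choice of a single five-field cron string, are assumptions made for illustration.

```python
from typing import Optional

VALID_ACTIONS = {"get", "enable", "disable"}


def scheduler_arguments(sitemap_id: int, action: str, cron: Optional[str] = None) -> dict:
    """Build hypothetical arguments for a manage_scheduler call."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    args = {"sitemap_id": sitemap_id, "action": action}
    if action == "enable":
        # Assumes a standard five-field cron expression:
        # minute, hour, day of month, month, day of week.
        if cron is None or len(cron.split()) != 5:
            raise ValueError("enable requires a five-field cron expression")
        args["cron"] = cron
    return args
```

For example, `scheduler_arguments(12, "enable", cron="0 6 * * 1")` would describe a run every Monday at 06:00, while "get" and "disable" need only the sitemap ID.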

create_sitemap

Create Sitemap

Create a new sitemap that defines the structure and rules for scraping a website. A sitemap includes start URLs and a tree of CSS selectors that specify what data to extract from each page.
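
To illustrate what "start URLs and a tree of CSS selectors" can look like, here is a small sketch. The field names (`_id`, `startUrl`, `selectors`, `parentSelectors`) are modeled on the sitemap JSON that Web Scraper exports; whether the create_sitemap tool accepts exactly this shape is an assumption, and the site and selectors are made up.

```python
# Hypothetical sitemap: one element selector with two text selectors under it.
sitemap = {
    "_id": "example-shop",
    "startUrl": ["https://example.com/products"],
    "selectors": [
        {"id": "product", "type": "SelectorElement",
         "parentSelectors": ["_root"], "selector": "div.product", "multiple": True},
        {"id": "name", "type": "SelectorText",
         "parentSelectors": ["product"], "selector": "h2", "multiple": False},
        {"id": "price", "type": "SelectorText",
         "parentSelectors": ["product"], "selector": "span.price", "multiple": False},
    ],
}


def selector_tree(sitemap: dict, parent: str = "_root") -> dict:
    """Nest selector IDs under their parents as {id: {child_id: {...}}}.

    Assumes the selector graph has no cycles (no selector is its own ancestor).
    """
    return {
        s["id"]: selector_tree(sitemap, s["id"])
        for s in sitemap["selectors"]
        if parent in s["parentSelectors"]
    }
```

Each selector names its parents rather than its children, so the tree is recovered by following `parentSelectors` from the root.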

update_sitemap

Update Sitemap

Update an existing sitemap's configuration including its name, start URLs, and selectors. The full sitemap definition must be provided.

list_sitemaps

List Sitemaps

List all sitemaps in your account with pagination support. Optionally filter by tag name.

download_scraped_data

Download Scraped Data

Download the scraped data from a completed scraping job. Returns data as JSON records or raw CSV text based on the chosen format.
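
Because the result can arrive in two formats, downstream code usually normalizes it first. The sketch below assumes the JSON format arrives as a string containing an array of objects and the CSV format as text with a header row; both are assumptions about the payload, not a documented contract.

```python
import csv
import io
import json


def parse_scraped_data(payload: str, fmt: str) -> list[dict]:
    """Normalize a downloaded payload into a list of row dictionaries."""
    if fmt == "json":
        records = json.loads(payload)
        if not isinstance(records, list):
            raise ValueError("expected a JSON array of records")
        return records
    if fmt == "csv":
        # DictReader uses the first row as column names; all values are strings.
        return list(csv.DictReader(io.StringIO(payload)))
    raise ValueError(f"unsupported format: {fmt!r}")
```

Note that CSV values come back as strings, so numeric columns need explicit conversion if the JSON format is not used.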

More integrations teams use with Webscraper Io

GitHub

Manage repositories, issues, and pull requests. Create and configure branches, star repositories, review code, and merge changes. Automate CI/CD workflows with GitHub Actions, manage workflow runs, secrets, and artifacts. Track issues with labels, milestones, and assignees. Search across code, repositories, issues, and users. Manage organizations, teams, and memberships. Create and manage projects, gists, packages, deployments, and environments. Access security alerts including code scanning, secret scanning, and Dependabot alerts. Read and write file contents in repositories. Manage webhooks, notifications, and codespaces.

SharePoint

Manage SharePoint sites, document libraries, lists, and files. Create, read, update, and delete lists and list items with custom columns. Upload, download, move, copy, and version files in document libraries. Search across sites, files, folders, lists, and list items using Microsoft Search. Manage permissions at site, list, and item levels with granular access control. Define and manage content types and site columns. Subscribe to webhooks for list and library change notifications. Retrieve site properties and search for sites across Microsoft 365.

Salesforce

Manage CRM data including Accounts, Contacts, Leads, Opportunities, Cases, and custom objects. Create, read, update, and delete records. Query data using SOQL and search across objects using SOSL. Perform bulk data operations for large-scale imports, exports, and migrations. Execute composite requests to batch multiple operations in a single API call. Access analytics, reports, and dashboards. Manage files and attachments associated with records. Interact with Chatter feeds, posts, and groups for social collaboration. Subscribe to real-time change events via Change Data Capture and Platform Events. Manage org metadata including custom objects, fields, layouts, and workflows. Query data using GraphQL for precise data retrieval across related objects.

Airtable

Create, read, update, and delete records in Airtable bases and tables. Manage base schemas including creating tables and fields. Filter records using formulas, sort by fields, and scope queries to specific views. Upsert records to find, create, or update in a single call. Upload attachments to records, read and write record comments, list accessible bases, and receive real-time base change events through webhooks.

Bitbucket

Manage Git repositories, pull requests, and CI/CD pipelines on Bitbucket Cloud. Create, fork, and configure repositories within workspaces and projects. Create, review, approve, merge, and decline pull requests with inline code comments. Browse source code, list commits, and manage branches and tags. Track issues with the built-in issue tracker. Trigger, monitor, and manage Bitbucket Pipelines. List workspace members, configure repository default reviewers and branch restrictions, create and manage repository webhooks, and search code across repositories.

Heroku

Deploy, manage, and scale applications on Heroku's cloud platform. Create and configure apps, scale dynos, provision add-ons (databases, caching, etc.), manage configuration variables, build and release code, add custom domains and SSL certificates, manage collaborators and team permissions, configure pipelines for continuous delivery, set up log drains, and sync data with Salesforce via Heroku Connect. Subscribe to webhooks for real-time notifications on app changes, builds, releases, dyno lifecycle events, and more.

Technical notes for Webscraper Io

Create, manage, and execute web scraping jobs to extract structured data from websites. Define sitemaps with CSS selectors and start URLs, run scraping jobs with configurable drivers (fast or full JavaScript), proxies, and timing controls. Monitor job status and data quality, download scraped data in JSON or CSV format, and schedule recurring scraping jobs using cron expressions. Receive webhook notifications when scraping jobs complete. Retrieve account information and remaining page credits.

Connect Webscraper Io to production AI agents

See how Metorial gives Webscraper Io access the governance, tracing, and security controls teams need.

Frequently asked questions

Common questions about connecting Webscraper Io to AI agents with Metorial.

  1. Can Metorial connect Webscraper Io to AI agents?
    Yes. Metorial connects AI agents to Webscraper Io through a governed integration layer, so teams can use the provider while keeping access controlled and observable.
  2. Metorial is MCP compatible and lets teams expose approved provider tools to MCP-capable agents and clients through a controlled access layer.
  3. Metorial applies policies across users, groups, providers, agents, and individual tools, then records the context around every agent interaction.
  4. Yes. Metorial records provider activity so teams can inspect tool calls, troubleshoot integrations, and give security teams the visibility they need.