get_sitemap
Get Sitemap
Retrieve a single sitemap by its ID, including its full configuration with selectors and start URLs.
get_account
Retrieve account information including the user's email, name, and remaining page credits.
get_data_quality
Retrieve the data quality report for a scraping job. Reports whether the scraped data meets configurable thresholds for record count, failed/empty page rates, and column fill rates.
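A minimal sketch of the kind of threshold check this report describes. The field names (`record_count`, `failed_page_pct`, `empty_page_pct`, `column_fill_pct`) and the default limits are assumptions for illustration, not the tool's actual response schema.

```python
from dataclasses import dataclass


@dataclass
class QualityThresholds:
    # Hypothetical limits; the real tool exposes its own configurable thresholds.
    min_record_count: int = 1
    max_failed_page_pct: float = 5.0
    max_empty_page_pct: float = 5.0
    min_column_fill_pct: float = 90.0


def check_quality(stats: dict, limits: QualityThresholds) -> list:
    """Return a list of threshold violations; an empty list means the job passes."""
    problems = []
    if stats["record_count"] < limits.min_record_count:
        problems.append("too few records")
    if stats["failed_page_pct"] > limits.max_failed_page_pct:
        problems.append("too many failed pages")
    if stats["empty_page_pct"] > limits.max_empty_page_pct:
        problems.append("too many empty pages")
    # Each column is checked separately against the fill-rate limit.
    for column, fill_pct in stats["column_fill_pct"].items():
        if fill_pct < limits.min_column_fill_pct:
            problems.append(f"column '{column}' under-filled")
    return problems
```

An empty result would mean every assumed threshold was met; any entries name the checks that failed.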
get_problematic_urls
Retrieve URLs that encountered problems during a scraping job, including empty pages, failed requests, and pages with no extracted values.
delete_sitemap
Permanently delete a sitemap and all its associated configuration. This action cannot be undone.
get_scraping_job
Retrieve the full status and statistics of a scraping job, including page counts, execution progress, and timing information.
delete_scraping_job
Permanently delete a scraping job and its associated data. This action cannot be undone.
list_scraping_jobs
List scraping jobs with pagination support. Optionally filter by sitemap ID or tag.
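Paginated listing can be walked with a simple loop. The sketch below assumes a caller-supplied `call_tool(name, arguments)` function and a response shaped like `{"data": [...], "last_page": N}`; the argument names `page` and `sitemap_id` are likewise assumptions.

```python
from typing import Callable, Iterator, Optional


def iter_all_jobs(
    call_tool: Callable[[str, dict], dict],
    sitemap_id: Optional[int] = None,
) -> Iterator[dict]:
    """Yield every scraping job, requesting one page at a time."""
    page = 1
    while True:
        args = {"page": page}  # assumed pagination argument
        if sitemap_id is not None:
            args["sitemap_id"] = sitemap_id  # assumed filter argument
        result = call_tool("list_scraping_jobs", args)
        yield from result["data"]
        # Stop once the (assumed) last_page marker has been reached.
        if page >= result["last_page"]:
            break
        page += 1
```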
create_scraping_job
Execute a sitemap by creating a new scraping job. Configure the driver, proxy, timing, and optionally override start URLs. The job will begin processing and can be monitored using the Get Scraping Job tool.
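The create-then-monitor flow might look like the following sketch. It assumes a caller-supplied `call_tool(name, arguments)` function; the argument names (`sitemap_id`, `driver`, `scraping_job_id`), the `id` and `status` response fields, and the terminal status values are illustrative guesses rather than the documented schema.

```python
import time
from typing import Callable

# Assumed terminal status names; check the real job status values before relying on these.
TERMINAL_STATES = {"finished", "failed", "stopped"}


def run_and_wait(
    call_tool: Callable[[str, dict], dict],
    sitemap_id: int,
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Create a scraping job, then poll it until it reaches a terminal state."""
    job = call_tool("create_scraping_job", {"sitemap_id": sitemap_id, "driver": "fast"})
    job_id = job["id"]
    for _ in range(max_polls):
        status = call_tool("get_scraping_job", {"scraping_job_id": job_id})
        if status["status"] in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the polling loop testable without real delays.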
manage_scheduler
Enable, disable, or retrieve the cron-based scheduler for a sitemap. When enabled, scraping jobs run automatically at specified intervals. Use action "get" to view current settings, "enable" to configure and activate, or "disable" to turn off.
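As a rough illustration of the three actions, the helper below builds an argument dictionary for each one. Only the action names "get", "enable", and "disable" come from the description above; the `cron_*` field names and the five-field cron string are assumptions.

```python
from typing import Optional

CRON_FIELDS = ("cron_minute", "cron_hour", "cron_day", "cron_month", "cron_weekday")


def scheduler_args(sitemap_id: int, action: str, cron: Optional[str] = None) -> dict:
    """Build (hypothetical) manage_scheduler arguments for one action."""
    if action not in ("get", "enable", "disable"):
        raise ValueError(f"unknown action: {action!r}")
    args = {"sitemap_id": sitemap_id, "action": action}
    if action == "enable":
        # Enabling needs a schedule: split a standard five-field cron expression.
        parts = (cron or "").split()
        if len(parts) != len(CRON_FIELDS):
            raise ValueError("enable requires a five-field cron expression")
        args.update(zip(CRON_FIELDS, parts))
    return args
```

For example, `scheduler_args(5, "enable", "0 6 * * 1")` would describe a run at 06:00 every Monday under these assumptions.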
create_sitemap
Create a new sitemap that defines the structure and rules for scraping a website. A sitemap includes start URLs and a tree of CSS selectors that specify what data to extract from each page.
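To make the "start URLs plus a tree of selectors" idea concrete, here is a small sketch. The key names (`_id`, `startUrl`, `selectors`, `parentSelectors`, `_root`) follow the sitemap JSON commonly exported by Web Scraper but should be treated as assumptions here; the site, selector IDs, and the `orphan_selectors` helper are hypothetical.

```python
# A hypothetical sitemap: one element selector per product, with two text children.
sitemap = {
    "_id": "example-shop",
    "startUrl": ["https://example.com/products"],
    "selectors": [
        {"id": "product", "type": "SelectorElement", "parentSelectors": ["_root"],
         "selector": "div.product", "multiple": True},
        {"id": "name", "type": "SelectorText", "parentSelectors": ["product"],
         "selector": "h2", "multiple": False},
        {"id": "price", "type": "SelectorText", "parentSelectors": ["product"],
         "selector": "span.price", "multiple": False},
    ],
}


def orphan_selectors(sitemap: dict) -> list:
    """Return IDs of selectors whose parent is neither _root nor another selector."""
    known = {"_root"} | {s["id"] for s in sitemap["selectors"]}
    return [
        s["id"]
        for s in sitemap["selectors"]
        if any(parent not in known for parent in s["parentSelectors"])
    ]
```

Checking for orphans before creating the sitemap catches a selector tree that is not actually connected to the root.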
update_sitemap
Update an existing sitemap's configuration, including its name, start URLs, and selectors. The full sitemap definition must be provided.
list_sitemaps
List all sitemaps in your account with pagination support. Optionally filter by tag name.
download_scraped_data
Download the scraped data from a completed scraping job. Returns data as JSON records or raw CSV text based on the chosen format.
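Because the result may arrive either as JSON records or as raw CSV text, a caller may want to normalize both into a list of dictionaries. The sketch below assumes the format names "json" and "csv" and that CSV output carries a header row; `to_records` is a hypothetical helper, not part of the tool.

```python
import csv
import io
import json


def to_records(payload, fmt: str) -> list:
    """Normalize downloaded data into a list of dicts, whichever format was chosen."""
    if fmt == "json":
        # Accept either already-parsed records or a JSON array string.
        if isinstance(payload, str):
            payload = json.loads(payload)
        return list(payload)
    if fmt == "csv":
        # DictReader uses the first row as column names; all values stay strings.
        return list(csv.DictReader(io.StringIO(payload)))
    raise ValueError(f"unsupported format: {fmt!r}")
```

Note that CSV values come back as strings, so numeric columns need explicit conversion if the two formats are to be compared.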
Create, manage, and execute web scraping jobs to extract structured data from websites. Define sitemaps with CSS selectors and start URLs, run scraping jobs with configurable drivers (fast or full JavaScript), proxies, and timing controls. Monitor job status and data quality, download scraped data in JSON or CSV format, and schedule recurring scraping jobs using cron expressions. Receive webhook notifications when scraping jobs complete. Retrieve account information and remaining page credits.
Common questions about connecting Webscraper Io to AI agents with Metorial.