Firecrawl

Connect AI Agents to Firecrawl

Automate workflows and connect AI agents to Firecrawl. Metorial is built for developers, handling OAuth, compliance, observability, and more.

Firecrawl on Metorial

The Firecrawl integration lets you scrape websites, extract structured data, and convert web pages into LLM-ready formats directly from your MCP-enabled applications.

Deploy on Metorial

Combine Firecrawl with other tools

Metorial has 600+ integrations available. Here are some related ones you might find interesting.

Hackernews

The Hackernews integration lets you search and retrieve stories, comments, and user data from Hackernews directly within your workflow, enabling you to analyze trends, monitor discussions, and gather insights from the tech community.

Exa

The Exa integration lets you search the web using neural search capabilities and retrieve high-quality, AI-ready content directly within your MCP-enabled applications.

Google Calendar

The Google Calendar integration lets you view, create, and manage calendar events directly from your workflow, enabling seamless scheduling and time management without switching contexts.

Google Drive

The Google Drive integration lets you search, read, create, and manage files and folders in your Drive directly through AI interactions. Use it to organize documents, retrieve file contents, share files, and automate common Drive tasks without switching to your browser.

Microsoft 365

The Microsoft 365 integration lets you access and manage your emails, calendar events, documents, and collaborate across Word, Excel, PowerPoint, and Teams directly from your workspace. Use it to read and send messages, schedule meetings, edit files in OneDrive and SharePoint, and streamline your productivity workflows.

Neon

The Neon integration lets you connect to your Neon Postgres databases to query data, inspect schemas, and manage database operations directly from your AI assistant.

Supabase

The Supabase integration lets you query and manage your database, authentication, and storage directly from your AI assistant, enabling natural language database operations and real-time data access.

Linear

The Linear integration lets you create, update, and search issues directly from your workspace, enabling seamless project management and task tracking without leaving your development environment.

Tavily

The Tavily integration lets you perform AI-powered web searches and retrieve real-time information from across the internet directly within your MCP-enabled applications, enabling your AI assistants to access current data and factual content for more accurate and up-to-date responses.

Connect anything. Anywhere.

Supported tools and capabilities

Metorial helps you connect AI agents to Firecrawl with various tools and resources. Tools let you perform specific actions, while resources provide read-only access to data and information.

Help & Documentation

Find guides and articles to help you get started with Firecrawl on Metorial.

More about Firecrawl

Firecrawl MCP Server

The Firecrawl MCP Server provides powerful web scraping and crawling capabilities through the Model Context Protocol. It enables you to extract content from web pages in multiple formats, perform advanced browser automation tasks, and crawl entire websites with sophisticated filtering and control options. Whether you need to scrape a single page or systematically harvest data from an entire domain, this server offers the tools to get structured, clean data from the web.

Overview

Firecrawl is a comprehensive web scraping solution that goes beyond simple HTML retrieval. It handles JavaScript-heavy sites, performs browser automation, extracts structured data using AI, and provides multiple output formats including markdown, HTML, screenshots, and custom JSON schemas. The server supports both single-page scraping and large-scale website crawling with features like proxy rotation, ad-blocking, mobile emulation, and intelligent content extraction.
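
For orientation, here is a minimal sketch of driving this server from the official TypeScript MCP SDK. The stdio transport and the firecrawl-mcp package name are illustrative assumptions (on Metorial the connection is provisioned for you); only the tool names and argument shapes come from the reference below.

```typescript
// Minimal sketch: connect an MCP client and call scrape_url.
// The "firecrawl-mcp" package name and stdio setup are assumptions for illustration.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "firecrawl-mcp"],
  env: { FIRECRAWL_API_KEY: process.env.FIRECRAWL_API_KEY ?? "" },
});

const client = new Client({ name: "firecrawl-demo", version: "1.0.0" });
await client.connect(transport);

// Tool name and arguments follow the reference below.
const result = await client.callTool({
  name: "scrape_url",
  arguments: { url: "https://example.com", formats: ["markdown"] },
});
console.log(result.content);
```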

Tools

scrape_url

Scrape a single URL and return its content in various formats. This is your primary tool for extracting data from individual web pages.

Parameters:

  • url (required): The URL to scrape
  • formats: Output formats for the scraped content. Options include:
    • markdown: Clean markdown representation of the page
    • html: Cleaned HTML content
    • rawHtml: Original unprocessed HTML
    • links: All links found on the page
    • screenshot: Visual capture of the page
    • summary: AI-generated summary of the content
    • json: Structured data extraction using a custom schema with optional prompt
  • actions: Sequence of browser automation steps to perform before scraping:
    • wait: Pause for a specified duration or until a selector appears
    • click: Click on elements matching a CSS selector
    • write: Type text into input fields
    • press: Press keyboard keys
    • scroll: Scroll the page up or down
    • screenshot: Capture a screenshot at this point
    • executeJavascript: Run custom JavaScript code
    • scrape: Extract content at this point
    • pdf: Generate a PDF of the page
  • proxy: Proxy configuration for the request (basic, stealth, or auto)
  • mobile: Emulate a mobile device
  • maxAge: Return a cached version if it is younger than the specified age in milliseconds (default: 2 days)
  • headers: Custom HTTP headers including cookies and user-agent
  • timeout: Request timeout in milliseconds
  • waitFor: Delay before fetching content
  • blockAds: Enable ad-blocking and cookie popup removal
  • location: Geographic settings with country code and languages
  • excludeTags: HTML tags to exclude from output
  • includeTags: HTML tags to include in output
  • onlyMainContent: Extract only the main content, excluding navigation and footers
  • removeBase64Images: Strip base64-encoded images from output
  • skipTlsVerification: Skip TLS certificate verification
  • storeInCache: Store the page in Firecrawl's cache
  • zeroDataRetention: Enable zero data retention mode for privacy
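
As a sketch, a scrape_url call combining several of these options might look like the following; the target URL is a placeholder, and client is the connected MCP client from the Overview sketch.

```typescript
// Hypothetical scrape_url arguments; field names follow the parameter list above.
const scrapeArgs = {
  url: "https://example.com/pricing",             // placeholder target
  formats: ["markdown", "links"],                 // clean markdown plus outbound links
  onlyMainContent: true,                          // drop navigation, footers, sidebars
  blockAds: true,                                 // strip ads and cookie popups
  timeout: 30_000,                                // fail the request after 30 seconds
  location: { country: "US", languages: ["en"] }, // assumed shape for geographic settings
};

const page = await client.callTool({ name: "scrape_url", arguments: scrapeArgs });
```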

start_crawl

Start a crawl job to systematically spider an entire website or domain. This tool initiates a background job that discovers and scrapes multiple pages according to your specifications.

Parameters:

  • url (required): The starting URL for the crawl
  • limit: Maximum number of pages to crawl (default: 10000)
  • includePaths: Array of regex patterns for URL paths to include
  • excludePaths: Array of regex patterns for URL paths to exclude
  • maxDiscoveryDepth: Maximum depth to crawl based on link discovery order
  • allowSubdomains: Follow links to subdomains of the starting domain
  • crawlEntireDomain: Follow links to sibling and parent URLs, not just child pages
  • allowExternalLinks: Follow links to external websites
  • ignoreQueryParameters: Treat URLs with different query parameters as the same page
  • sitemap: Sitemap strategy (include to use sitemap.xml or skip to discover from links)
  • delay: Delay in seconds between individual page scrapes
  • maxConcurrency: Maximum number of pages to scrape simultaneously
  • webhook: Webhook configuration for receiving real-time crawl events:
    • url: Webhook endpoint URL
    • events: Events to subscribe to (started, page, completed, failed)
    • headers: Custom headers for webhook requests
    • metadata: Additional metadata to include in webhook payloads
  • scrapeOptions: All scraping options from scrape_url apply to each crawled page
  • prompt: Natural language prompt to automatically generate crawler options
  • zeroDataRetention: Enable zero data retention mode
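
A sketch of starting a scoped crawl with path filtering and a webhook; the URLs are placeholders and the argument shape mirrors the parameter list above.

```typescript
// Hypothetical start_crawl arguments; field names follow the parameter list above.
const crawlArgs = {
  url: "https://example.com",
  limit: 200,                       // cap the job well below the 10000-page default
  includePaths: ["^/blog/.*"],      // regex: only crawl the blog section
  excludePaths: ["^/blog/tag/.*"],  // skip tag index pages
  maxDiscoveryDepth: 3,
  delay: 1,                         // one second between page scrapes
  webhook: {
    url: "https://hooks.example.com/crawl", // placeholder endpoint
    events: ["started", "page", "completed", "failed"],
  },
  scrapeOptions: { formats: ["markdown"], onlyMainContent: true },
};

// The response carries the job ID consumed by get_crawl_status and cancel_crawl.
const job = await client.callTool({ name: "start_crawl", arguments: crawlArgs });
```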

get_crawl_status

Check the status and retrieve results from an ongoing or completed crawl job.

Parameters:

  • crawlId (required): The crawl job ID returned from start_crawl
  • limit: Maximum number of results to return per page
  • next: Pagination cursor from a previous response to retrieve additional results

Returns: Current status, progress metrics, credits used, and scraped data from all pages.

cancel_crawl

Stop a crawl job that is currently in progress.

Parameters:

  • crawlId (required): The crawl job ID to cancel
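
These two tools pair naturally: poll get_crawl_status until the job finishes, follow the pagination cursor, and fall back to cancel_crawl if a deadline passes. A sketch, assuming the tool result's text content parses to JSON with status and next fields:

```typescript
// Hypothetical polling loop; the payload shape (status, next) is an assumption.
const crawlId = "job-123";                 // placeholder ID returned by start_crawl
const deadline = Date.now() + 10 * 60_000; // give up after ten minutes

let cursor: string | undefined;
while (true) {
  const res = await client.callTool({
    name: "get_crawl_status",
    arguments: { crawlId, limit: 100, next: cursor },
  });
  const page = JSON.parse((res.content as any)[0].text); // assumed payload shape
  if (page.status === "failed") break;                   // assumed terminal state
  if (page.status === "completed" && !page.next) break;  // all results retrieved
  cursor = page.next ?? cursor;

  if (Date.now() > deadline) {
    await client.callTool({ name: "cancel_crawl", arguments: { crawlId } });
    break;
  }
  await new Promise((r) => setTimeout(r, 5_000)); // poll every five seconds
}
```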

Resource Templates

The Firecrawl MCP Server provides resource templates for accessing scraped content and crawl job data through a URI-based interface.

scraped-page

Access the scraped content of a specific URL.

URI Template: firecrawl://scraped/{url}

Use this resource to retrieve previously scraped content for a given URL. The URL should be properly encoded.

crawl-job

Access information about a specific crawl job including its status and metadata.

URI Template: firecrawl://crawl/{crawlId}

Retrieve comprehensive information about a crawl job, including its current state, configuration, and summary statistics.

crawl-pages

Access all pages discovered and scraped during a crawl job.

URI Template: firecrawl://crawl/{crawlId}/pages

Get the complete collection of pages from a crawl job, including their content in the requested formats.

crawl-page

Access a specific page from a crawl job by its index position.

URI Template: firecrawl://crawl/{crawlId}/page/{pageIndex}

Retrieve an individual page from a crawl job's results using its zero-based index.
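
With a standard MCP client, these resources are read via readResource using URIs built from the templates above. A short sketch; the crawl ID is a placeholder, and note the URL encoding that the scraped-page template calls for:

```typescript
// Build URIs from the templates above; the target URL must be encoded.
const pageUri = `firecrawl://scraped/${encodeURIComponent("https://example.com/pricing")}`;
const jobUri = "firecrawl://crawl/job-123"; // placeholder crawl ID
const firstPageUri = `${jobUri}/page/0`;    // zero-based page index

// readResource is part of the standard MCP client API.
const scraped = await client.readResource({ uri: pageUri });
console.log(scraped.contents[0]);
```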

Key Features

Multiple Output Formats

Extract web content in the format that best suits your needs. Convert web pages to clean markdown for LLM consumption, preserve HTML structure for parsing, capture visual screenshots, or extract structured data using custom JSON schemas with AI-powered extraction.

Advanced Browser Automation

Perform complex interactions with web pages before scraping. Click buttons, fill forms, scroll to load dynamic content, wait for elements to appear, and execute custom JavaScript. These actions enable scraping of JavaScript-heavy applications and content behind interactions.
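
For instance, an actions sequence that dismisses a cookie banner, scrolls to trigger lazy loading, and captures a screenshot might look like the sketch below; the selectors and exact action field names are illustrative assumptions based on the action types listed earlier.

```typescript
// Hypothetical actions sequence for scrape_url; selectors and field names are assumed.
const interactiveScrape = {
  url: "https://example.com/app",
  formats: ["markdown", "screenshot"],
  actions: [
    { type: "wait", selector: "#cookie-banner" },   // wait until the banner appears
    { type: "click", selector: "#accept-cookies" }, // dismiss it
    { type: "scroll", direction: "down" },          // trigger lazy-loaded content
    { type: "wait", milliseconds: 1500 },           // let the new content render
    { type: "screenshot" },                         // capture the final state
  ],
};

await client.callTool({ name: "scrape_url", arguments: interactiveScrape });
```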

Intelligent Content Extraction

Use AI-powered extraction to get only the content you need. The onlyMainContent option removes navigation, footers, and sidebars automatically. Custom JSON schemas with prompts allow you to extract specific structured data points using natural language instructions.
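
A sketch of such an extraction; the object form of the json format entry, the schema, and the prompt are illustrative assumptions rather than a confirmed request shape:

```typescript
// Hypothetical structured extraction via the json format; the schema is illustrative.
const extractArgs = {
  url: "https://example.com/products/widget",
  formats: [
    {
      type: "json",
      prompt: "Extract the product name, price, and availability.",
      schema: {
        type: "object",
        properties: {
          name: { type: "string" },
          price: { type: "number" },
          inStock: { type: "boolean" },
        },
        required: ["name", "price"],
      },
    },
  ],
  onlyMainContent: true,
};

await client.callTool({ name: "scrape_url", arguments: extractArgs });
```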

Large-Scale Crawling

Systematically crawl entire websites with sophisticated control over what gets scraped. Use path filtering with regex patterns to include or exclude specific sections. Control crawl depth, handle subdomains, and manage concurrency for efficient data collection. Real-time webhook notifications keep you informed of progress.

Privacy and Performance Options

Choose your proxy type based on needs: basic for speed, stealth for reliability, or auto for automatic fallback. Enable zero data retention for sensitive operations. Use caching to avoid redundant requests. Block ads and cookie popups for cleaner extraction and faster processing.

Geographic and Device Emulation

Scrape as if you're browsing from different countries with location settings. Emulate mobile devices to see mobile-optimized content. Set custom headers to match specific browser configurations.

Use Cases

This MCP server excels at research and data collection tasks. Use it to monitor competitor websites, aggregate news and articles, extract product information from e-commerce sites, collect real estate listings, gather job postings, archive web content, validate web page changes, or build datasets for machine learning. The combination of single-page scraping and site-wide crawling makes it suitable for both targeted extraction and comprehensive data harvesting operations.

The structured data extraction with custom schemas is particularly powerful for transforming unstructured web content into clean, typed data that can be directly used in applications or analysis pipelines. The browser automation capabilities enable scraping of modern single-page applications that traditional scrapers cannot handle.

Ready to build with Metorial?

Let's take your AI-powered applications to the next level, together.

About Metorial

Metorial provides developers with instant access to 600+ MCP servers for building AI agents that can interact with real-world tools and services. Built on MCP, Metorial simplifies agent tool integration by offering pre-configured connections to popular platforms like Google Drive, Slack, GitHub, Notion, and hundreds of other APIs. Our platform supports all major AI agent frameworks—including LangChain, AutoGen, CrewAI, and LangGraph—enabling developers to add tool calling capabilities to their agents in just a few lines of code. By eliminating the need for custom integration code, Metorial helps AI developers move from prototype to production faster while maintaining security and reliability. Whether you're building autonomous research agents, customer service bots, or workflow automation tools, Metorial's MCP server library provides the integrations you need to connect your agents to the real world.

Star us on GitHub