Firecrawl Web Scraping Server

by krieg2065

About

Firecrawl is a web scraping and crawling platform that extracts content from websites with JavaScript rendering support and AI-powered structured data extraction capabilities.

Key features of Firecrawl:

  • Scrape single pages or crawl entire websites with automatic URL discovery
  • Render JavaScript-heavy pages for dynamic content extraction
  • Batch process URLs with automatic rate limiting and exponential backoff retries
  • Perform deep research using LLM-powered structured data extraction
  • Execute web searches with integrated content extraction
  • Filter extracted content with smart tag inclusion and exclusion rules
  • Simulate mobile and desktop viewports for testing responsive sites
  • Monitor API credit usage and handle rate limits automatically
  • Deploy with cloud-hosted or self-managed instances

Tools 10

firecrawl_scrape

Scrape a single webpage with advanced options for content extraction. Supports various formats including markdown, HTML, and screenshots. Can execute custom actions like clicking or scrolling before scraping.

firecrawl_map

Discover URLs from a starting point. Can use both sitemap.xml and HTML link discovery.

firecrawl_crawl

Start an asynchronous crawl of multiple pages from a starting URL. Supports depth control, path filtering, and webhook notifications.

firecrawl_batch_scrape

Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.

firecrawl_check_batch_status

Check the status of a batch scraping job.

firecrawl_check_crawl_status

Check the status of a crawl job.

firecrawl_search

Search and retrieve content from web pages with optional scraping. Returns SERP results by default (url, title, description) or full page content when scrapeOptions are provided.

firecrawl_extract

Extract structured information from web pages using an LLM. Supports both cloud AI and self-hosted LLM extraction.

firecrawl_deep_research

Conduct deep research on a query using web crawling, search, and AI analysis.

firecrawl_generate_llmstxt

Generate a standardized LLMs.txt file for a given URL, which provides context about how LLMs should interact with the website.
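MCP clients invoke tools such as these by sending a JSON-RPC 2.0 `tools/call` request to the server. The sketch below only shows the shape of such a request. The helper name `build_tool_call` is hypothetical, and the argument names (`url`, `formats`, `query`) are assumptions for illustration; only `scrapeOptions` is named in the descriptions above. Check the schema each tool advertises before relying on them.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool: str, arguments: dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names are illustrative assumptions, not a confirmed schema.
scrape_request = build_tool_call(
    1,
    "firecrawl_scrape",
    {"url": "https://example.com", "formats": ["markdown"]},
)

# Passing scrapeOptions asks for page content instead of plain SERP results.
search_request = build_tool_call(
    2,
    "firecrawl_search",
    {"query": "web scraping", "scrapeOptions": {"formats": ["markdown"]}},
)

print(scrape_request)
print(search_request)
```

In practice an MCP client (Cursor, Windsurf, Claude Desktop) builds these messages for you; the sketch is only meant to show what travels over the wire.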

README

Firecrawl MCP Server

A Model Context Protocol (MCP) server implementation that integrates with Firecrawl for web scraping capabilities.

> Big thanks to @vrknetha, @cawstudios for the initial implementation!
>
> You can also play around with our MCP Server on MCP.so's playground. Thanks to MCP.so for hosting and @gstarwd for integrating our server.

Features

  • Scrape, crawl, search, extract, deep research and batch scrape support
  • Web scraping with JS rendering
  • URL discovery and crawling
  • Web search with content extraction
  • Automatic retries with exponential backoff
  • Efficient batch processing with built-in rate limiting
  • Credit usage monitoring for cloud API
  • Comprehensive logging system
  • Support for cloud and self-hosted Firecrawl instances
  • Mobile/Desktop viewport support
  • Smart content filtering with tag inclusion/exclusion

Installation

    Running with npx

    env FIRECRAWL_API_KEY=fc-YOUR_API_KEY npx -y firecrawl-mcp
    

    Manual Installation

    npm install -g firecrawl-mcp
    

    Running on Cursor

    Configuring Cursor 🖥️

    Note: Requires Cursor version 0.45.6+. For the most up-to-date configuration instructions, please refer to the official Cursor documentation on configuring MCP servers: Cursor MCP Server Configuration Guide

    To configure Firecrawl MCP in Cursor v0.45.6

    1. Open Cursor Settings
    2. Go to Features > MCP Servers
    3. Click "+ Add New MCP Server"
    4. Enter the following:
       - Name: "firecrawl-mcp" (or your preferred name)
       - Type: "command"
       - Command: env FIRECRAWL_API_KEY=your-api-key npx -y firecrawl-mcp

    To configure Firecrawl MCP in Cursor v0.48.6

    1. Open Cursor Settings
    2. Go to Features > MCP Servers
    3. Click "+ Add new global MCP server"
    4. Enter the following code:

       {
         "mcpServers": {
           "firecrawl-mcp": {
             "command": "npx",
             "args": ["-y", "firecrawl-mcp"],
             "env": {
               "FIRECRAWL_API_KEY": "YOUR-API-KEY"
             }
           }
         }
       }
       

    > If you are using Windows and are running into issues, try cmd /c "set FIRECRAWL_API_KEY=your-api-key && npx -y firecrawl-mcp"

    Replace your-api-key with your Firecrawl API key. If you don't have one yet, you can create an account and get it from https://www.firecrawl.dev/app/api-keys
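If you already have other MCP servers configured, the Firecrawl entry has to be merged into the existing `mcpServers` object rather than replacing it. A minimal sketch of that merge follows. The helper names and the `other-server` entry are hypothetical and exist only for illustration; the `command`, `args`, and `env` values are the ones from the JSON above.

```python
import json


def firecrawl_server_entry(api_key: str) -> dict:
    """Return the server definition used in the JSON config above."""
    return {
        "command": "npx",
        "args": ["-y", "firecrawl-mcp"],
        "env": {"FIRECRAWL_API_KEY": api_key},
    }


def add_server(config: dict, name: str, api_key: str) -> dict:
    """Return a copy of `config` with the Firecrawl entry added under mcpServers."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = firecrawl_server_entry(api_key)
    return {**config, "mcpServers": servers}


# "other-server" is a made-up pre-existing entry, shown to illustrate the merge.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
merged = add_server(existing, "firecrawl-mcp", "YOUR-API-KEY")
print(json.dumps(merged, indent=2))
```

The function returns a new dictionary instead of mutating its input, so the original configuration stays available if the result needs to be discarded.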

    After adding, refresh the MCP server list to see the new tools. The Composer Agent will automatically use Firecrawl MCP when appropriate, but you can explicitly request it by describing your web scraping needs. Access the Composer via Command+L (Mac), select "Agent" next to the submit button, and enter your query.

    Running on Windsurf

    Add this to your ./codeium/windsurf/model_config.json:

    {
      "mcpServers": {
        "mcp-server-firecrawl": {
          "command": "npx",
          "args": ["-y", "firecrawl-mcp"],
          "env": {
            "FIRECRAWL_API_KEY": "YOUR_API_KEY"
          }
        }
      }
    }
    

    Installing via Smithery (Legacy)

    To install Firecrawl for Claude Desktop automatically via Smithery:

    npx -y @smithery/cli install @mendableai/mcp-server-firecrawl --client claude
    

    Configuration

    Environment Variables

    Required for Cloud API

  • FIRECRAWL_API_KEY: Your Firecrawl API key
      - Required when using cloud API (default)
      - Optional when using self-hosted instance with FIRECRAWL_API_URL
  • FIRECRAWL_API_URL (Optional): Custom API endpoint for self-hosted instances
      - Example: https://firecrawl.your-domain.com
      - If not provided, the cloud API will be used (requires API key)
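The rule described above (an API key is mandatory for the cloud API and optional once a custom endpoint is set) can be sketched as a small validation function. This is an illustration of the documented rule, not the server's actual startup code; the function name and the returned dictionary shape are hypothetical.

```python
from typing import Mapping


def resolve_firecrawl_config(env: Mapping[str, str]) -> dict:
    """Apply the documented rule: the key is required unless a custom URL is set."""
    api_key = env.get("FIRECRAWL_API_KEY") or None
    api_url = env.get("FIRECRAWL_API_URL") or None
    if api_url is None and api_key is None:
        raise ValueError(
            "FIRECRAWL_API_KEY is required when FIRECRAWL_API_URL is not set"
        )
    # Hypothetical result shape, chosen only for this sketch.
    return {"api_key": api_key, "api_url": api_url, "self_hosted": api_url is not None}


cloud = resolve_firecrawl_config({"FIRECRAWL_API_KEY": "fc-YOUR_API_KEY"})
self_hosted = resolve_firecrawl_config(
    {"FIRECRAWL_API_URL": "https://firecrawl.your-domain.com"}
)
print(cloud["self_hosted"], self_hosted["self_hosted"])
```

Passing the environment in as a mapping (for example `os.environ`) rather than reading it inside the function keeps the check easy to test.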

    Optional Configuration

    Retry Configuration

  • FIRECRAWL_RETRY_MAX_ATTEMPTS: Maximum number of retry attempts (default: 3)
  • FIRECRAWL_RETRY_INITIAL_DELAY: Initial delay in milliseconds before first retry (default: 1000)
  • FIRECRAWL_RETRY_MAX_DELAY: Maximum delay in milliseconds between retries (default: 10000)
  • FIRECRAWL_RETRY_BACKOFF_FACTOR: Exponential backoff multiplier (default: 2)
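To see how these four settings interact, the sketch below computes a retry schedule. It assumes the common formula `initial_delay * backoff_factor ** n`, capped at the maximum delay, and that the maximum attempts value counts retries. Neither detail is confirmed by the text above, and the function name is hypothetical.

```python
def retry_delays(
    max_attempts: int = 3,
    initial_delay_ms: int = 1000,
    max_delay_ms: int = 10000,
    backoff_factor: int = 2,
) -> list[int]:
    """Delay before each retry, assuming initial * factor**n capped at the maximum."""
    return [
        min(initial_delay_ms * backoff_factor**n, max_delay_ms)
        for n in range(max_attempts)
    ]


# Documented defaults.
print(retry_delays())  # [1000, 2000, 4000]

# More attempts, a longer initial delay, a higher cap, and a steeper factor.
print(retry_delays(5, 2000, 30000, 3))  # [2000, 6000, 18000, 30000, 30000]
```

Under this assumed formula, a larger backoff factor reaches the cap sooner, so raising `FIRECRAWL_RETRY_MAX_DELAY` alongside it is what actually lengthens the later waits.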

    Credit Usage Monitoring

  • FIRECRAWL_CREDIT_WARNING_THRESHOLD: Credit usage warning threshold (default: 1000)
  • FIRECRAWL_CREDIT_CRITICAL_THRESHOLD: Credit usage critical threshold (default: 100)
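One plausible reading of these two thresholds is that they are compared against the number of credits remaining, which would explain why the warning default (1000) is higher than the critical default (100). The sketch below encodes that interpretation. It is an assumption rather than confirmed server behavior, and the function name is hypothetical.

```python
def credit_level(
    remaining_credits: int,
    warning_threshold: int = 1000,
    critical_threshold: int = 100,
) -> str:
    """Classify remaining credits, assuming thresholds apply to the remaining balance."""
    if remaining_credits <= critical_threshold:
        return "critical"
    if remaining_credits <= warning_threshold:
        return "warning"
    return "ok"


for balance in (5000, 1000, 100):
    print(balance, credit_level(balance))
```

The critical check runs first because any balance at or below the critical threshold is also at or below the warning threshold.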

    Configuration Examples

    For cloud API usage with custom retry and credit monitoring:

    ```bash
    # Required for cloud API
    export FIRECRAWL_API_KEY=your-api-key

    # Optional retry configuration
    export FIRECRAWL_RETRY_MAX_ATTEMPTS=5        # Increase max retry attempts
    export FIRECRAWL_RETRY_INITIAL_DELAY=2000    # Start with 2s delay
    export FIRECRAWL_RETRY_MAX_DELAY=30000       # Maximum 30s delay
    export FIRECRAWL_RETRY_BACKOFF_FACTOR=3      # More aggressive backoff

    # Optional credit monitoring
    export FIRECRAWL_CR
    ```
