Price Per Token

Fetcher MCP Server

by jae-jae

About

Fetcher MCP is a web scraping and content extraction server powered by Playwright headless browser technology. It retrieves web pages with full JavaScript execution support and automatically extracts clean, readable content by filtering out ads, navigation elements, and other non-essential page components.

Key features of Fetcher MCP:

  • JavaScript rendering for handling dynamic web applications, SPAs, and modern JavaScript-heavy sites
  • Intelligent content extraction using the Readability algorithm to isolate main article content
  • Flexible output formats including clean Markdown and raw HTML
  • Parallel batch processing via the `fetch_urls` tool for concurrent fetching of multiple URLs
  • Resource optimization that blocks unnecessary assets (images, stylesheets, fonts, media) to reduce bandwidth
  • Configurable parameters for timeouts, content extraction behavior, and output formatting
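The parallel batch processing mentioned above can be pictured as bounded concurrent fetching. The following is a minimal sketch in Python, not the server's actual implementation: `fake_fetch`, `fetch_all`, `run_batch`, and the concurrency limit are hypothetical names chosen for illustration, and the network call is replaced by a stub so the example runs offline.

```python
import asyncio


async def fake_fetch(url: str) -> str:
    """Hypothetical stand-in for a real page fetch; performs no network access."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return f"content of {url}"


async def fetch_all(urls: list[str], limit: int = 3) -> dict[str, str]:
    """Fetch all URLs concurrently, with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> tuple[str, str]:
        async with semaphore:
            try:
                return url, await fake_fetch(url)
            except Exception as exc:  # one failure should not abort the batch
                return url, f"error: {exc}"

    pairs = await asyncio.gather(*(bounded(u) for u in urls))
    return dict(pairs)


def run_batch(urls: list[str]) -> dict[str, str]:
    """Synchronous entry point for the sketch."""
    return asyncio.run(fetch_all(urls))
```

Capping the number of in-flight requests with a semaphore is one common design choice for batch fetching, since each page load in a headless browser is comparatively expensive.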

README

Fetcher MCP

MCP server for fetching web page content using a Playwright headless browser.

> 🌟 Recommended: OllaMan - Powerful Ollama AI Model Manager.

Advantages

  • JavaScript Support: Unlike traditional web scrapers, Fetcher MCP uses Playwright to execute JavaScript, making it capable of handling dynamic web content and modern web applications.
  • Intelligent Content Extraction: Built-in Readability algorithm automatically extracts the main content from web pages, removing ads, navigation, and other non-essential elements.
  • Flexible Output Format: Supports both HTML and Markdown output formats, making it easy to integrate with various downstream applications.
  • Parallel Processing: The fetch_urls tool enables concurrent fetching of multiple URLs, significantly improving efficiency for batch operations.
  • Resource Optimization: Automatically blocks unnecessary resources (images, stylesheets, fonts, media) to reduce bandwidth usage and improve performance.
  • Robust Error Handling: Comprehensive error handling and logging ensure reliable operation even when dealing with problematic web pages.
  • Configurable Parameters: Fine-grained control over timeouts, content extraction, and output formatting to suit different use cases.
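The resource optimization described in the list above amounts to a filter on request types. The sketch below is an assumption about how such a filter might look, not code taken from Fetcher MCP: `BLOCKED_RESOURCE_TYPES`, `should_block`, and `plan_requests` are hypothetical names, and the type strings follow Playwright's resource-type naming.

```python
# Resource types the text lists as blocked: images, stylesheets, fonts, media.
# The strings follow Playwright's request resource-type naming.
BLOCKED_RESOURCE_TYPES = frozenset({"image", "stylesheet", "font", "media"})


def should_block(resource_type: str) -> bool:
    """Return True when a request of this type can be skipped."""
    return resource_type in BLOCKED_RESOURCE_TYPES


def plan_requests(requests: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Split (url, resource_type) pairs into allowed and blocked URL lists."""
    plan: dict[str, list[str]] = {"allowed": [], "blocked": []}
    for url, resource_type in requests:
        key = "blocked" if should_block(resource_type) else "allowed"
        plan[key].append(url)
    return plan
```

In a Playwright-based fetcher, a predicate like this would typically sit inside a request-routing handler that aborts blocked requests and lets the rest continue.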
    Quick Start

    Run directly with npx:

    npx -y fetcher-mcp
    

    First-time setup: install the required browser by running the following command in your terminal:

    npx playwright install chromium
    

    HTTP and SSE Transport

    Use the --transport=http parameter to start the Streamable HTTP endpoint and the SSE endpoint at the same time:

    npx -y fetcher-mcp --log --transport=http --host=0.0.0.0 --port=3000
    

    After startup, the server provides the following endpoints:

  • /mcp - Streamable HTTP endpoint (modern MCP protocol)
  • /sse - SSE endpoint (legacy MCP protocol)
    Clients can choose which endpoint to connect to based on their needs.
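As a rough illustration of how a client might derive the two endpoint addresses from the --host and --port values, here is a small Python sketch. The helper name `endpoint_urls` is hypothetical, and the sketch assumes plain HTTP with no TLS and no path prefix.

```python
def endpoint_urls(host: str, port: int) -> dict[str, str]:
    """Build client-facing URLs for the /mcp and /sse endpoints."""
    # 0.0.0.0 is a bind address (listen on all interfaces); a client on the
    # same machine would normally connect through localhost instead.
    client_host = "localhost" if host == "0.0.0.0" else host
    base = f"http://{client_host}:{port}"
    return {
        "streamable_http": f"{base}/mcp",  # modern MCP transport
        "sse": f"{base}/sse",  # legacy MCP transport
    }
```

A client that supports Streamable HTTP would use the `/mcp` address, while an older SSE-only client would use `/sse`.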

    Debug Mode

    Run with the --debug option to show the browser window for debugging:

    npx -y fetcher-mcp --debug
    

    MCP Configuration

    Configure this MCP server in Claude Desktop:

    On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

    On Windows: %APPDATA%/Claude/claude_desktop_config.json

    {
      "mcpServers": {
        "fetcher": {
          "command": "npx",
          "args": ["-y", "fetcher-mcp"]
        }
      }
    }
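If the config file already lists other servers, the entry has to be merged in rather than pasted over the file. The following Python sketch shows one way to do that. The function names `add_fetcher_server` and `merge_config_text` are hypothetical, and the sketch works on JSON text so that reading and writing the actual file is left to the caller.

```python
import copy
import json


def add_fetcher_server(config: dict) -> dict:
    """Return a copy of `config` with the fetcher entry added under mcpServers."""
    updated = copy.deepcopy(config)
    servers = updated.setdefault("mcpServers", {})
    servers["fetcher"] = {"command": "npx", "args": ["-y", "fetcher-mcp"]}
    return updated


def merge_config_text(existing_text: str) -> str:
    """Parse existing JSON text (possibly empty), add the entry, re-serialize."""
    config = json.loads(existing_text) if existing_text.strip() else {}
    return json.dumps(add_fetcher_server(config), indent=2)
```

Working on a deep copy keeps the original dictionary untouched, which makes it easy to compare before and after or to back out of the change.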
    

    Docker Deployment

    Running with Docker

    docker run -p 3000:3000 ghcr.io/jae-jae/fetcher-mcp:latest
    

    Deploying with Docker Compose

    Create a docker-compose.yml file:

    version: "3.8"

    services:
      fetcher-mcp:
        image: ghcr.io/jae-jae/fetcher-mcp:latest
        container_name: fetcher-mcp
        restart: unless-stopped
        ports:
          - "3000:3000"
        environment:
          - NODE_ENV=production
        # Using host network mode on Linux hosts can improve browser access efficiency
        # network_mode: "host"
        volumes:
          # For Playwright, may need to share certain system paths
          - /tmp:/tmp
        # Health check
        healthcheck:
          test: ["CMD", "wget", "--spider", "-q", "http://localhost:3000"]
          interval: 30s
          timeout: 10s
          retries: 3

    Then run:

    docker-compose up -d
    

    Features

  • fetch_url - Retrieve web page content from a specified URL
    - Uses Playwright headless browser to parse JavaScript
    - Supports intelligent extraction of main content and conversion to Markdown
    - Supports the following parameters:
      - url: The URL of the web page to fetch (required parameter)
      - timeout: Page loading timeout in milliseconds, default is 30000 (30 seconds)
      - waitUntil: Specifies when navigation is considered complete, options: 'load', 'domcontentloaded', 'networkidle', 'commit', default is 'load'
      - extractContent: Whether to intelligently extract the main content, default is true
      - maxLength: Maximum length of returned content (in characters), default is no limit
      - returnHtml: Whether to return HTML content instead of Markdown, default is false
      - waitForNavigation: Whether to wait for additional navigation after initial page load (useful for sites with anti-bot verification), default is false
      - `navigationTimeout
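To make the fetch_url parameters concrete, the sketch below builds the kind of JSON-RPC 2.0 `tools/call` message an MCP client would send to invoke the tool. It is an illustration under stated assumptions rather than the project's own client code: `build_fetch_url_request`, `DEFAULTS`, and `WAIT_UNTIL_OPTIONS` are hypothetical names, and only the parameters fully described above are covered, since the parameter list in the text is cut off. Other overrides are passed through unchanged.

```python
from typing import Any

# Allowed values for waitUntil, as listed in the parameter description.
WAIT_UNTIL_OPTIONS = ("load", "domcontentloaded", "networkidle", "commit")

# Documented defaults; maxLength has no default limit, so it is omitted.
DEFAULTS: dict[str, Any] = {
    "timeout": 30000,
    "waitUntil": "load",
    "extractContent": True,
    "returnHtml": False,
    "waitForNavigation": False,
}


def build_fetch_url_request(url: str, request_id: int = 1, **overrides: Any) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` message for the fetch_url tool."""
    if not url:
        raise ValueError("url is required")
    # Later entries win, so explicit overrides replace the documented defaults.
    arguments = {"url": url, **DEFAULTS, **overrides}
    if arguments["waitUntil"] not in WAIT_UNTIL_OPTIONS:
        raise ValueError(f"waitUntil must be one of {WAIT_UNTIL_OPTIONS}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "fetch_url", "arguments": arguments},
    }
```

For example, `build_fetch_url_request("https://example.com", maxLength=5000, returnHtml=True)` would request raw HTML truncated to 5000 characters while keeping the default 30-second timeout.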

    Related MCP Servers

    AI Research Assistant

    hamid-vakilzadeh

    AI Research Assistant provides comprehensive access to millions of academic papers through the Semantic Scholar and arXiv databases. This MCP server enables AI coding assistants to perform intelligent literature searches, citation network analysis, and paper content extraction without requiring an API key.

    Key features include:

  • Advanced paper search with multi-filter support by year ranges, citation thresholds, field of study, and publication type
  • Title matching with confidence scoring for finding specific papers
  • Batch operations supporting up to 500 papers per request
  • Citation analysis and network exploration for understanding research relationships
  • Full-text PDF extraction from arXiv and Wiley open-access content (Wiley TDM token required for institutional access)
  • Rate limits of 100 requests per 5 minutes with options to request higher limits through Semantic Scholar

    Web & Search
    Linkup

    LinkupPlatform

    Linkup is a real-time web search and content extraction service that enables AI assistants to search the web and retrieve information from trusted sources. It provides source-backed answers with citations, making it ideal for fact-checking, news gathering, and research tasks.

    Key features of Linkup:

  • Real-time web search using natural language queries to find current information, news, and data
  • Page fetching to extract and read content from any webpage URL
  • Search depth modes: Standard for direct-answer queries and Deep for complex research across multiple sources
  • Source-backed results with citations and context from relevant, trustworthy websites
  • JavaScript rendering support for accessing dynamic content on JavaScript-heavy pages

    Web & Search
    Math-MCP

    EthanHenrickson

    Math-MCP is a computation server that enables Large Language Models (LLMs) to perform accurate numerical calculations through the Model Context Protocol. It provides precise mathematical operations via a simple API to overcome LLM limitations in arithmetic and statistical reasoning.

    Key features of Math-MCP:

  • Basic arithmetic operations: addition, subtraction, multiplication, division, modulo, and bulk summation
  • Statistical analysis functions: mean, median, mode, minimum, and maximum calculations
  • Rounding utilities: floor, ceiling, and nearest integer rounding
  • Trigonometric functions: sine, cosine, tangent, and their inverses with degrees and radians conversion support

    Developer Tools