Supadata: Web & Video data API for makers


by supadata-ai


About

Supadata is a data extraction service that converts video content and websites into structured data without the complexity of building custom scrapers or transcription pipelines. Key features of Supadata:

  • Video transcript extraction from major platforms including YouTube, TikTok, Instagram, and X (Twitter), plus support for direct file URLs
  • Web scraping capabilities for extracting content from specific pages
  • Website crawling to extract content from multiple pages automatically
  • Site mapping to discover all available URLs on a website
  • Built-in automatic retries and rate limiting for reliable data collection

README

Supadata MCP Server

A Model Context Protocol (MCP) server that integrates with Supadata for video transcript extraction, web scraping, crawling, and site discovery.

Features

  • Video transcript extraction from YouTube, TikTok, Instagram, Twitter, and file URLs
  • Web scraping, crawling, and URL discovery
  • Automatic retries and rate limiting

    Installation

    Connect your AI assistant to Supadata's MCP server to enable transcript extraction and web scraping capabilities directly in your workflow.

    Claude Code

    claude mcp add --transport http supadata https://api.supadata.ai/mcp \
      --header "x-api-token: YOUR_SUPADATA_API_TOKEN"
    

    Claude Desktop

    Add to your config file:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

    {
      "mcpServers": {
        "supadata": {
          "url": "https://api.supadata.ai/mcp",
          "headers": {
            "x-api-token": "YOUR_SUPADATA_API_TOKEN"
          }
        }
      }
    }
    

    Cursor

    Add to .cursor/mcp.json in your project root (or global config):

    {
      "mcpServers": {
        "supadata": {
          "url": "https://api.supadata.ai/mcp",
          "headers": {
            "x-api-token": "YOUR_SUPADATA_API_TOKEN"
          }
        }
      }
    }
    

    Windsurf

    Add to ~/.codeium/windsurf/mcp_config.json:

    {
      "mcpServers": {
        "supadata": {
          "serverUrl": "https://api.supadata.ai/mcp",
          "headers": {
            "x-api-token": "YOUR_SUPADATA_API_TOKEN"
          }
        }
      }
    }
    

    VS Code + Copilot

    Add to your VS Code settings.json:

    {
      "mcp": {
        "servers": {
          "supadata": {
            "url": "https://api.supadata.ai/mcp",
            "headers": {
              "x-api-token": "YOUR_SUPADATA_API_TOKEN"
            }
          }
        }
      }
    }
    

    Cline (VS Code Extension)

    Open Cline settings and add to the MCP Servers configuration:

    {
      "supadata": {
        "url": "https://api.supadata.ai/mcp",
        "headers": {
          "x-api-token": "YOUR_SUPADATA_API_TOKEN"
        }
      }
    }
    

    ---

    Replace YOUR_SUPADATA_API_TOKEN with your API token from supadata.ai.
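
The client configurations above differ only in their wrapper keys and in Windsurf's use of "serverUrl" instead of "url". The sketch below generates each variant from one function to make those differences explicit. `buildConfig` and `ClientName` are hypothetical helpers written for illustration; they are not part of Supadata or of any client.

```typescript
// Sketch only: buildConfig and ClientName are hypothetical, not a Supadata API.
type ClientName = "claude-desktop" | "cursor" | "windsurf" | "vscode" | "cline";

const SUPADATA_MCP_URL = "https://api.supadata.ai/mcp";

function buildConfig(client: ClientName, token: string): Record<string, unknown> {
  const headers = { "x-api-token": token };

  // Windsurf names the endpoint field "serverUrl"; the other clients use "url".
  const entry =
    client === "windsurf"
      ? { serverUrl: SUPADATA_MCP_URL, headers }
      : { url: SUPADATA_MCP_URL, headers };

  switch (client) {
    case "vscode":
      // VS Code nests servers under "mcp" -> "servers".
      return { mcp: { servers: { supadata: entry } } };
    case "cline":
      // Cline takes the server entry directly.
      return { supadata: entry };
    default:
      // Claude Desktop, Cursor, and Windsurf use "mcpServers".
      return { mcpServers: { supadata: entry } };
  }
}

console.log(JSON.stringify(buildConfig("windsurf", "YOUR_SUPADATA_API_TOKEN"), null, 2));
```

The printed JSON can be pasted into the matching config file once the placeholder token is replaced.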

    Configuration

    Environment Variables

  • SUPADATA_API_KEY: Your Supadata API key

    System Configuration

    The server includes configurable retry and rate limiting parameters:

    const CONFIG = {
      retry: {
        maxAttempts: 3,           // Number of retry attempts
        initialDelay: 1000,       // Initial delay (milliseconds)
        maxDelay: 10000,          // Maximum delay between retries (milliseconds)
        backoffFactor: 2          // Exponential backoff multiplier
      }
    };
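
To see what these values imply, the sketch below computes a capped exponential backoff schedule from them. It assumes the common formula `initialDelay * backoffFactor^(attempt - 1)`, capped at `maxDelay`; the source does not state the server's exact formula or whether it adds jitter, so `backoffDelay` is a hypothetical helper, not the server's code.

```typescript
// Sketch only: backoffDelay is a hypothetical helper; the real schedule may differ.
interface RetryConfig {
  maxAttempts: number;
  initialDelay: number; // milliseconds
  maxDelay: number; // milliseconds
  backoffFactor: number;
}

const retry: RetryConfig = {
  maxAttempts: 3,
  initialDelay: 1000,
  maxDelay: 10000,
  backoffFactor: 2,
};

// Delay to wait after failed attempt number `attempt` (1-based).
function backoffDelay(attempt: number, cfg: RetryConfig): number {
  const raw = cfg.initialDelay * Math.pow(cfg.backoffFactor, attempt - 1);
  return Math.min(raw, cfg.maxDelay);
}

// Five attempts are shown only to illustrate where the cap takes effect.
const delays = [1, 2, 3, 4, 5].map((attempt) => backoffDelay(attempt, retry));
console.log(delays); // [1000, 2000, 4000, 8000, 10000]
```

With `maxAttempts: 3`, only the first two delays (1000 ms and 2000 ms) would separate the attempts under this assumed formula.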
    

    How to Choose a Tool

    Select the right tool based on your needs:

  • Transcript: Extract video transcripts from platforms and file URLs
  • Scrape: Extract content from a single page when you know the exact URL
  • Map: Discover all available URLs on a website
  • Crawl: Extract content from multiple related pages comprehensively

    | Tool       | Best for                    | Returns         |
    |------------|-----------------------------|-----------------|
    | transcript | Video transcript extraction | text/markdown   |
    | scrape     | Single page content         | markdown/html   |
    | map        | URL discovery on a site     | URL[]           |
    | crawl      | Multi-page extraction       | markdown/html[] |
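
The selection guidance above can be read as a small decision procedure. The sketch below encodes it; `Need` and `chooseTool` are hypothetical names invented for this example, and the three yes/no questions are one possible reading of the guidance rather than an official rule.

```typescript
// Sketch only: Need and chooseTool are hypothetical, for illustration.
type Tool =
  | "supadata_transcript"
  | "supadata_scrape"
  | "supadata_map"
  | "supadata_crawl";

interface Need {
  isVideo: boolean; // the source is a video platform or file URL
  wantsContent: boolean; // false when only a list of URLs is needed
  singlePage: boolean; // true when the exact page URL is known
}

function chooseTool(need: Need): Tool {
  if (need.isVideo) return "supadata_transcript";
  if (!need.wantsContent) return "supadata_map";
  return need.singlePage ? "supadata_scrape" : "supadata_crawl";
}

console.log(chooseTool({ isVideo: false, wantsContent: true, singlePage: false }));
// prints "supadata_crawl"
```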

    Available Tools

    Transcript (supadata_transcript)

    Extract transcripts from supported video platforms (YouTube, TikTok, Instagram, Twitter) and file URLs.

    Usage:

    supadata_transcript --url "https://youtube.com/watch?v=example" --lang "en"
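
MCP clients normally issue this call for you, but it can help to see roughly what travels over the wire. The sketch below builds a JSON-RPC `tools/call` request as defined by the Model Context Protocol. It assumes the tool's argument names (`url`, `lang`) mirror the flags in the usage line above; the server's actual input schema is not shown in this document, and `toolCall` is a hypothetical helper.

```typescript
// Sketch only: toolCall is a hypothetical helper; argument names are assumed.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = toolCall(1, "supadata_transcript", {
  url: "https://youtube.com/watch?v=example",
  lang: "en",
});

console.log(JSON.stringify(request, null, 2));
```

The same helper would apply to the other tools by swapping the name and arguments.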
    

    Check Transcript Status (supadata_check_transcript_status)

    Check the progress of a transcript extraction job using the job ID.

    Usage:

    supadata_check_transcript_status --id "550e8400-e29b-41d4-a716-446655440000"
    

    Scrape (supadata_scrape)

    Extract content from a single URL with advanced options.

    Usage:

    supadata_scrape --url "https://example.com" --lang "en"
    

    Map (supadata_map)

    Discover all indexed URLs on a website to find relevant pages before scraping.

    Usage:

    supadata_map --url "https://example.com"
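
Since Map is meant to find relevant pages before scraping, a typical next step is to narrow its result down. The sketch below deduplicates a URL list and keeps only entries under a given prefix. `selectUrls` is a hypothetical helper, and the input is assumed to be a plain array of URL strings, matching the `URL[]` return type listed earlier.

```typescript
// Sketch only: selectUrls is a hypothetical helper operating on sample data.
function selectUrls(urls: string[], prefix: string, limit: number): string[] {
  const unique = Array.from(new Set(urls)); // drop exact duplicates, keep order
  return unique.filter((u) => u.startsWith(prefix)).slice(0, limit);
}

const mapped = [
  "https://example.com/blog/a",
  "https://example.com/about",
  "https://example.com/blog/a",
  "https://example.com/blog/b",
];

console.log(selectUrls(mapped, "https://example.com/blog/", 10));
// [ 'https://example.com/blog/a', 'https://example.com/blog/b' ]
```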
    

    Crawl (supadata_crawl)

    Start an asynchronous crawl job to extract content from multiple pages on a site.

    Usage:

    supadata_crawl --url "https://example.com/blog" --limit 100
    

    Check Crawl Status (supadata_check_crawl_status)

    Check the progress of a crawl job using the job ID.

    Usage:

    supadata_check_crawl_status --id "550e8400-e29b-41d4-a716-446655440000"
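
Because crawl jobs are asynchronous, a caller usually polls the status tool until the job reaches a terminal state. The sketch below shows one way to structure that loop. Everything in it is an assumption for illustration: `waitForJob`, the `JobStatus` shape, and the "completed" and "failed" status strings are not documented in this source, and `check` stands in for whatever function actually invokes the status tool.

```typescript
// Sketch only: the status shape and terminal state names are assumptions.
interface JobStatus {
  status: string;
}

type CheckStatus = (id: string) => Promise<JobStatus>;
type Sleep = (ms: number) => Promise<void>;

const defaultSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function waitForJob(
  id: string,
  check: CheckStatus,
  intervalMs: number = 2000,
  maxPolls: number = 30,
  sleep: Sleep = defaultSleep
): Promise<JobStatus> {
  for (let poll = 1; poll <= maxPolls; poll++) {
    const job = await check(id);
    // Stop as soon as the job reports an (assumed) terminal state.
    if (job.status === "completed" || job.status === "failed") {
      return job;
    }
    // Wait between polls, but not after the final one.
    if (poll < maxPolls) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Job ${id} did not finish after ${maxPolls} polls`);
}
```

Bounding the loop with `maxPolls` keeps a stalled job from blocking the caller indefinitely; the same loop would apply to transcript jobs.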
    

    Development

    # Install dependencies
    npm install

    # Build
    npm run build

    # Run tests
    npm test

    Contributing

    1. Fork the repository
    2. Create your feature branch
    3. Run tests: npm test
    4. Submit a pull request

    License

    MIT License - see LICENSE file for details

    Related MCP Servers

    AI Research Assistant


    hamid-vakilzadeh

    AI Research Assistant provides comprehensive access to millions of academic papers through the Semantic Scholar and arXiv databases. This MCP server enables AI coding assistants to perform intelligent literature searches, citation network analysis, and paper content extraction without requiring an API key. Key features include:

  • Advanced paper search with multi-filter support by year ranges, citation thresholds, field of study, and publication type
  • Title matching with confidence scoring for finding specific papers
  • Batch operations supporting up to 500 papers per request
  • Citation analysis and network exploration for understanding research relationships
  • Full-text PDF extraction from arXiv and Wiley open-access content (Wiley TDM token required for institutional access)
  • Rate limits of 100 requests per 5 minutes with options to request higher limits through Semantic Scholar

    Web & Search
    Linkup


    LinkupPlatform

    Linkup is a real-time web search and content extraction service that enables AI assistants to search the web and retrieve information from trusted sources. It provides source-backed answers with citations, making it ideal for fact-checking, news gathering, and research tasks. Key features of Linkup:

  • Real-time web search using natural language queries to find current information, news, and data
  • Page fetching to extract and read content from any webpage URL
  • Search depth modes: Standard for direct-answer queries and Deep for complex research across multiple sources
  • Source-backed results with citations and context from relevant, trustworthy websites
  • JavaScript rendering support for accessing dynamic content on JavaScript-heavy pages

    Web & Search
    Math-MCP


    EthanHenrickson

    Math-MCP is a computation server that enables Large Language Models (LLMs) to perform accurate numerical calculations through the Model Context Protocol. It provides precise mathematical operations via a simple API to overcome LLM limitations in arithmetic and statistical reasoning. Key features of Math-MCP:

  • Basic arithmetic operations: addition, subtraction, multiplication, division, modulo, and bulk summation
  • Statistical analysis functions: mean, median, mode, minimum, and maximum calculations
  • Rounding utilities: floor, ceiling, and nearest integer rounding
  • Trigonometric functions: sine, cosine, tangent, and their inverses with degrees and radians conversion support

    Developer Tools