scraperapi-mcp-server

A local MCP server that lets AI agents bypass bot detection, geo-restrictions, and JavaScript rendering challenges when scraping the web, backed by ScraperAPI's services.

ScraperAPI MCP server

The ScraperAPI MCP server lets LLM clients send web scraping requests through ScraperAPI's services and process the results.

Features

  • Full implementation of the Model Context Protocol specification
  • Seamless integration with ScraperAPI for web scraping
  • Simple setup with Python or Docker

Architecture

          ┌───────────────┐     ┌───────────────────────┐     ┌───────────────┐
          │  LLM Client   │────▶│  Scraper MCP Server   │────▶│    AI Model   │
          └───────────────┘     └───────────────────────┘     └───────────────┘
                                            │
                                            ▼
                                  ┌──────────────────┐
                                  │  ScraperAPI API  │
                                  └──────────────────┘
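Once connected over stdio, the client drives the server with JSON-RPC messages as defined by the Model Context Protocol. As an illustration (the argument values are made up), a `tools/call` request invoking the server's scrape tool would look like this:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "scrape",
    "arguments": { "url": "https://example.com" }
  }
}
```

Your LLM client builds and sends these messages for you; you never write them by hand, but they are useful to know when debugging the server over stdio.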

Installation

The ScraperAPI MCP Server is designed to run as a local server on your machine; your LLM client launches it automatically once configured.

Prerequisites

  • Python 3.11+
  • Docker (optional)

Using Python

Install the package:

pip install scraperapi-mcp-server

Add this to your client configuration file:

{
  "mcpServers": {
    "ScraperAPI": {
      "command": "python",
      "args": ["-m", "scraperapi_mcp_server"],
      "env": {
        "API_KEY": "<YOUR_SCRAPERAPI_API_KEY>"
      }
    }
  }
}

Using Docker

Add this to your client configuration file:

{
  "mcpServers": {
    "ScraperAPI": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "-e",
        "API_KEY=${API_KEY}",
        "--rm",
        "scraperapi-mcp-server"]
    }
  }
}


[!TIP]

If the command fails (for example, with a "package not found" error when starting the server), double-check the command path. To find the correct path, activate your virtual environment first, then run:

which <YOUR_COMMAND>
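For instance, on macOS/Linux (paths are illustrative and machine-specific):

```shell
# Activate the virtual environment if present, then resolve the
# interpreter's absolute path; use that path as "command" in the client
# configuration instead of a bare "python".
[ -f .venv/bin/activate ] && . .venv/bin/activate
command -v python || command -v python3
```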

API Reference

Available Tools

  • scrape
    • Scrape a URL from the internet using ScraperAPI
    • Parameters:
      • url (string, required): URL to scrape
      • render (boolean, optional): Whether to render the page using JavaScript. Defaults to False. Set to True only if the page requires JavaScript rendering to display its content.
      • country_code (string, optional): Activate country geotargeting (ISO 2-letter code)
      • premium (boolean, optional): Activate premium residential and mobile IPs
      • ultra_premium (boolean, optional): Activate advanced bypass mechanisms. Cannot be combined with premium
      • device_type (string, optional): Set request to use mobile or desktop user agents
      • output_format (string, optional): Instructs the API which file type to return in the response.
      • autoparse (boolean, optional): Activate automatic parsing for supported websites. Defaults to False. Set to True only if you want structured output (CSV or JSON).
    • Returns: The scraped content as a string
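Under the hood, these tool parameters presumably map onto ScraperAPI's HTTP query parameters. A minimal sketch of that mapping (the helper name and exact behavior are assumptions, not the server's actual code):

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.scraperapi.com/"

def build_request_url(api_key: str, url: str, **options) -> str:
    """Sketch: turn scrape-tool arguments into a ScraperAPI request URL.

    Booleans become lowercase strings ("true"/"false") as HTTP query
    parameters; options left as None are simply omitted.
    """
    params = {"api_key": api_key, "url": url}
    for key, value in options.items():
        if value is None:
            continue
        params[key] = str(value).lower() if isinstance(value, bool) else value
    return API_ENDPOINT + "?" + urlencode(params)

# Example: geotargeted request with JavaScript rendering enabled
print(build_request_url("KEY", "https://example.com", render=True, country_code="us"))
```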

Prompt templates

  • Please scrape this URL <URL>. If you receive a 500 server error, identify the website's target country and add the corresponding country_code to overcome geo-restrictions. If errors continue, upgrade the request to premium proxies by adding premium=true. For persistent failures, set ultra_premium=true to use enhanced anti-blocking measures.
  • Can you scrape URL <URL> to extract <SPECIFIC_DATA>? If the response is missing or incomplete <SPECIFIC_DATA>, set render=true to enable JavaScript rendering.

Configuration

Settings

  • API_KEY: Your ScraperAPI API key.

Configure Claude Desktop App & Claude Code

Claude Desktop:

  1. Open Claude Desktop and click the settings icon
  2. Select the "Developer" tab
  3. Click "Edit Config" and paste the JSON configuration file

Claude Code:

  1. Add the server manually to your .claude/settings.json with the JSON configuration file, or run:
    claude mcp add scraperapi -e API_KEY=<YOUR_SCRAPERAPI_API_KEY> -- python -m scraperapi_mcp_server
    

Configure Cursor Editor

  1. Open Cursor
  2. Access the Settings Menu
  3. Open Cursor Settings
  4. Go to Tools & Integrations section
  5. Click '+ Add MCP Server'
  6. Choose Manual and paste the JSON configuration file


Configure Windsurf Editor

  1. Open Windsurf
  2. Access the Settings Menu
  3. Click on the Cascade settings
  4. Click on the MCP server section
  5. Click on the gear icon, the mcp_config.json file will open
  6. Paste the JSON configuration file


Configure Cline (VS code extension)

  1. Open VS Code and click the Cline icon in the activity bar to open the Cline panel
  2. Click the MCP Servers icon in the top navigation bar of the Cline pane
  3. Select the "Configure" tab
  4. Click "Configure MCP Servers" at the bottom of the pane — this opens cline_mcp_settings.json
  5. Paste the JSON configuration file


Development

Local setup

  1. Clone the repository:

    git clone https://github.com/scraperapi/scraperapi-mcp
    cd scraperapi-mcp
    
  2. Install dependencies:

    • Using Poetry:
      poetry install
      
    • Using pip:
      # Create virtual environment and activate it
      python -m venv .venv
      source .venv/bin/activate # macOS/Linux
      # OR
      .venv\Scripts\activate # Windows
      
      # Install the local package in editable mode
      pip install -e .
      
    • Using Docker:
      # Build the Docker image locally
      docker build -t scraperapi-mcp-server .
      

Run the server

  • Using Python:
    python -m scraperapi_mcp_server
    
  • Using Docker:
    # Run the Docker container with your API key
    docker run -e API_KEY=<YOUR_SCRAPERAPI_API_KEY> scraperapi-mcp-server
    

Debug

python3 -m scraperapi_mcp_server --debug

Testing

This project uses pytest for testing.

Install Test Dependencies

  • Using Poetry:
    poetry install --with dev
    
  • Using pip:
    pip install -e .
    pip install pytest pytest-mock pytest-asyncio
    

Running Tests

# Run All Tests
pytest

# Run Specific Test
pytest <TEST_FILE_PATH>
