MCP HTTP Fetcher Server
Fetches web pages from HTTP/HTTPS URLs and converts them to Markdown format. Supports both SSE and Stdio protocols for web deployments, Kubernetes environments, and desktop clients.
README
MCP HTTP Fetcher Server
A Model Context Protocol (MCP) server that fetches web pages and converts them to Markdown format. This server supports both SSE (Server-Sent Events) and Stdio protocols, making it suitable for web deployments, Kubernetes environments, desktop clients, and sidecar patterns.
Features
- ✅ Dual Protocol Support: SSE (default) and Stdio protocols
- ✅ HTTP/HTTPS URL fetching: Download content from any web URL
- ✅ HTML to Markdown conversion: Clean conversion using html2text
- ✅ MCP Protocol compliance: Follows MCP server standards
- ✅ Kubernetes ready: SSE protocol with health checks and graceful scaling
- ✅ Sidecar friendly: Stdio protocol for process-to-process communication
- ✅ Error handling: Robust error handling for network issues
- ✅ Docker support: Ready-to-run Docker container
- ✅ Async operation: Non-blocking HTTP requests
- ✅ Modular architecture: Clean separation of concerns for maintainability
Quick Start
SSE Protocol (Default - Web/Kubernetes Deployments)
# Start with SSE protocol (default)
python app/server.py
# Custom host/port for Kubernetes
python app/server.py --protocol sse --host 0.0.0.0 --port 8000
Stdio Protocol (Desktop Clients/Sidecars)
# Start with stdio protocol
python app/server.py --protocol stdio
Protocol Overview
| Protocol | Use Case | Deployment | Scaling |
|---|---|---|---|
| SSE (Default) | Web apps, APIs, Kubernetes | HTTP server with endpoints | Horizontal scaling, load balancing |
| Stdio | Desktop clients, sidecars | Process communication | Single client per process |
For detailed protocol documentation, see docs/PROTOCOLS.md.
Project Structure
The project follows Python best practices with a clean, modular architecture:
mcp_fetcher_http/
├── app/ # Main application package
│ ├── core/ # Core business logic
│ │ ├── fetcher.py # URL fetching functionality
│ │ └── converter.py # HTML to Markdown conversion
│ ├── protocols/ # Protocol implementations
│ │ ├── base.py # Abstract protocol interface
│ │ ├── stdio.py # Standard I/O protocol implementation
│ │ └── sse.py # Server-Sent Events protocol implementation
│ └── server.py # New modular server entry point
├── docs/ # Documentation
│ └── PROTOCOLS.md # Comprehensive protocol guide
├── tests/ # Test suite
│ ├── test_fetcher.py # Tests for URL fetching
│ ├── test_converter.py # Tests for HTML conversion
│ └── test_server.py # Integration tests
├── examples/ # Example usage and demos
│ ├── demo.py # Functionality demonstration
│ └── mcp_config_example.json
├── server.py # Backward-compatible entry point
├── requirements.txt # Python dependencies
└── Dockerfile # Container configuration
Entry Points
python app/server.py- Primary entry point with full protocol supportpython server.py- Legacy entry point (limited protocol options)./run_server.sh- Convenience script with dependency management
Installation
Option 1: Direct Python Installation
- Clone the repository:
git clone https://github.com/photonn/mcp_fetcher_http.git
cd mcp_fetcher_http
- Install dependencies:
pip install -r requirements.txt
- Run the server:
# SSE protocol (default - for web/Kubernetes deployments)
python app/server.py
# Stdio protocol (for desktop clients)
python app/server.py --protocol stdio
# Custom SSE configuration
python app/server.py --protocol sse --host 0.0.0.0 --port 8080
# OR use the convenience script:
./run_server.sh
Option 2: Docker Installation
- Build the Docker image:
docker build -t mcp-fetcher-http .
- Run the container:
# SSE protocol (default)
docker run -p 8000:8000 mcp-fetcher-http
# Stdio protocol
docker run -it mcp-fetcher-http python app/server.py --protocol stdio
Note: Docker build may require internet access to install Python packages. In restricted environments, use the direct Python installation method.
Usage
The server exposes a single MCP tool called fetch_url that accepts a URL parameter and returns the page content converted to Markdown.
MCP Client Configuration
For SSE Protocol (Web/Kubernetes deployments)
{
"mcpServers": {
"http-fetcher": {
"command": "curl",
"args": ["-N", "-H", "Accept: text/event-stream", "http://localhost:8000/sse"],
"description": "HTTP fetcher using SSE protocol"
}
}
}
For Stdio Protocol (Desktop clients)
{
"mcpServers": {
"http-fetcher": {
"command": "python",
"args": ["/path/to/mcp_fetcher_http/app/server.py", "--protocol", "stdio"],
"description": "HTTP fetcher using stdio protocol"
}
}
}
Command Line Options
python app/server.py --help
Available options:
--protocol {stdio,sse}- Communication protocol (default: sse)--host HOST- Host to bind SSE server (default: localhost)--port PORT- Port for SSE server (default: 8000)--endpoint ENDPOINT- SSE message endpoint (default: /messages)--server-name NAME- Server identifier--server-version VERSION- Server version
Tool Schema
{
"name": "fetch_url",
"description": "Fetch a web page and convert it to Markdown format",
"inputSchema": {
"type": "object",
"properties": {
"url": {
"type": "string",
"description": "The URL of the web page to fetch and convert to Markdown"
}
},
"required": ["url"]
}
}
Example Usage
When connected to an MCP client, you can use the tool like this:
{
"name": "fetch_url",
"arguments": {
"url": "https://example.com"
}
}
The server will return the page content converted to Markdown format.
Testing
Run the included test scripts to verify functionality:
# Run all tests (recommended)
python tests/test_server.py # Integration tests
python tests/test_fetcher.py # URL fetching tests
python tests/test_converter.py # HTML conversion tests
# Run demo to see HTML to Markdown conversion
python examples/demo.py
Testing the MCP Server
The server communicates via stdin/stdout using the MCP protocol. For basic testing:
- Start the server:
python server.py(orpython app/server.py) - The server will wait for MCP protocol messages on stdin
- Send a list_tools request to see available tools
- Send a call_tool request with the fetch_url tool
For easier testing, use an MCP client like Claude Desktop or develop a simple test client.
Configuration
The server includes the following configurable aspects:
- Timeout: 30-second timeout for HTTP requests
- Content types: Accepts HTML and XML content types
- Markdown conversion: Preserves links and images in output
Error Handling
The server handles various error conditions:
- Invalid URLs
- Network connectivity issues
- HTTP error responses (4xx, 5xx)
- Content parsing errors
- Timeout conditions
Security Considerations
- The server validates URLs before processing
- Runs in a non-root Docker container
- Has reasonable timeout limits to prevent hanging
- Does not execute or process JavaScript content
Dependencies
mcp>=1.0.0- Model Context Protocol server frameworkaiohttp>=3.9.0- Async HTTP client for fetching URLshtml2text>=2020.1.16- HTML to Markdown conversiontyping-extensions>=4.0.0- Type hints support
License
Licensed under the Apache License, Version 2.0. See the LICENSE file for details.
推荐服务器
Baidu Map
百度地图核心API现已全面兼容MCP协议,是国内首家兼容MCP协议的地图服务商。
Playwright MCP Server
一个模型上下文协议服务器,它使大型语言模型能够通过结构化的可访问性快照与网页进行交互,而无需视觉模型或屏幕截图。
Magic Component Platform (MCP)
一个由人工智能驱动的工具,可以从自然语言描述生成现代化的用户界面组件,并与流行的集成开发环境(IDE)集成,从而简化用户界面开发流程。
Audiense Insights MCP Server
通过模型上下文协议启用与 Audiense Insights 账户的交互,从而促进营销洞察和受众数据的提取和分析,包括人口统计信息、行为和影响者互动。
VeyraX
一个单一的 MCP 工具,连接你所有喜爱的工具:Gmail、日历以及其他 40 多个工具。
graphlit-mcp-server
模型上下文协议 (MCP) 服务器实现了 MCP 客户端与 Graphlit 服务之间的集成。 除了网络爬取之外,还可以将任何内容(从 Slack 到 Gmail 再到播客订阅源)导入到 Graphlit 项目中,然后从 MCP 客户端检索相关内容。
Kagi MCP Server
一个 MCP 服务器,集成了 Kagi 搜索功能和 Claude AI,使 Claude 能够在回答需要最新信息的问题时执行实时网络搜索。
e2b-mcp-server
使用 MCP 通过 e2b 运行代码。
Neon MCP Server
用于与 Neon 管理 API 和数据库交互的 MCP 服务器
Exa MCP Server
模型上下文协议(MCP)服务器允许像 Claude 这样的 AI 助手使用 Exa AI 搜索 API 进行网络搜索。这种设置允许 AI 模型以安全和受控的方式获取实时的网络信息。