LandingAI ADE MCP Server
Enables extraction of text, tables, and structured data from PDFs, images, and office documents using LandingAI's Agentic Document Extraction API. Supports both direct parsing and background job processing for large files with privacy-focused processing.
A Model Context Protocol (MCP) server providing direct integration with LandingAI's Agentic Document Extraction (ADE) API. Extract text, tables, and structured data from PDFs, images, and office documents.
Features
- 📄 Document Parsing - Parse entire documents and return markdown output
- 🔍 Data Extraction - Extract structured data using JSON schemas
- ⚡ Parse Jobs - Handle large documents with background processing
- 🛡️ Zero Data Retention - Privacy-focused processing support
Installation
Prerequisites
- Python 3.9 or higher
- A LandingAI API key (obtained from LandingAI)
Option 1: Using uv (Recommended - Simplest)
uv is a fast Python package manager that handles virtual environments automatically.
Install uv (if not already installed)
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or with Homebrew
brew install uv
Set up the project
# Clone the repository
git clone https://github.com/avaxia8/landingai-ade-mcp.git
cd landingai-ade-mcp
# Install dependencies with uv
uv sync
# Or if starting fresh:
uv init
uv add fastmcp httpx pydantic python-multipart aiofiles
Option 2: Using pip with Virtual Environment
# Clone the repository
git clone https://github.com/avaxia8/landingai-ade-mcp.git
cd landingai-ade-mcp
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
# On macOS/Linux:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Configuration
Set Your API Key
export LANDINGAI_API_KEY="your-api-key-here"
Claude Desktop Configuration
Configuration File Location
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Linux: ~/.config/claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
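The locations above can be resolved programmatically. The following is a minimal sketch, not part of the server: the function name claude_config_path is hypothetical, and it only encodes the three paths listed here.

```python
from pathlib import PurePosixPath, PureWindowsPath

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(system: str, home: str, appdata: str = "") -> str:
    """Return the Claude Desktop config path for a platform name.

    `system` is expected to be a value such as platform.system() returns
    ("Darwin", "Linux", "Windows"). This helper is hypothetical and only
    mirrors the three locations listed in this README.
    """
    if system == "Darwin":
        return str(PurePosixPath(home) / "Library" / "Application Support" / "Claude" / CONFIG_NAME)
    if system == "Linux":
        return str(PurePosixPath(home) / ".config" / "claude" / CONFIG_NAME)
    if system == "Windows":
        # %APPDATA% is passed in explicitly so the function stays testable
        return str(PureWindowsPath(appdata) / "Claude" / CONFIG_NAME)
    raise ValueError(f"Unsupported platform: {system}")
```

In practice it could be called as claude_config_path(platform.system(), str(Path.home()), os.environ.get("APPDATA", "")).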
Configuration Examples
Using uv (Recommended)
{
  "mcpServers": {
    "landingai-ade-mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/landingai-ade-mcp",
        "run",
        "python",
        "-m",
        "server"
      ],
      "env": {
        "LANDINGAI_API_KEY": "your-api-key-here"
      }
    }
  }
}
Using Virtual Environment
{
  "mcpServers": {
    "landingai-ade-mcp": {
      "command": "/path/to/landingai-ade-mcp/venv/bin/python",
      "args": [
        "/path/to/landingai-ade-mcp/server.py"
      ],
      "env": {
        "LANDINGAI_API_KEY": "your-api-key-here"
      }
    }
  }
}
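If the Claude Desktop config already lists other servers, the entry has to be merged in rather than pasted over the file. A minimal sketch of that merge for the uv variant follows; add_server_entry is a hypothetical helper, and the "other-server" entry in the demo is made up for illustration.

```python
import json
from typing import Any, Dict


def add_server_entry(config: Dict[str, Any], project_dir: str, api_key: str) -> Dict[str, Any]:
    """Return a copy of `config` with the uv-based landingai-ade-mcp entry added.

    Existing entries under "mcpServers" are preserved. Hypothetical helper,
    mirroring the "Using uv" example in this README.
    """
    servers = dict(config.get("mcpServers", {}))
    servers["landingai-ade-mcp"] = {
        "command": "uv",
        "args": ["--directory", project_dir, "run", "python", "-m", "server"],
        "env": {"LANDINGAI_API_KEY": api_key},
    }
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    # "other-server" is an invented placeholder for a pre-existing entry
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["index.js"]}}}
    merged = add_server_entry(existing, "/path/to/landingai-ade-mcp", "your-api-key-here")
    print(json.dumps(merged, indent=2))
```

The function returns a new dictionary instead of mutating its input, so the original config can still be written back unchanged if something goes wrong.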
After Configuration
- Save the configuration file
- Restart Claude Desktop completely (quit and reopen)
- The server should appear as "landingai-ade-mcp" in your MCP servers
Available Tools
parse_document
Parse entire documents and return markdown output.
# Parse a local file
result = await parse_document(
    document_path="/path/to/document.pdf",
    model="dpt-2-latest",  # optional
    split="page"  # optional, for page-level splits
)
# Parse from URL
result = await parse_document(
    document_url="https://example.com/document.pdf"
)
extract_data
Extract structured data from markdown using a JSON schema.
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"}
    }
}
# Extract from markdown content string
result = await extract_data(
    schema=schema,
    markdown="# Invoice\nInvoice #123\nTotal: $100.00"
)
# Or extract from a markdown file
result = await extract_data(
    schema=schema,
    markdown="/path/to/document.md"  # Will detect if it's a file path
)
# Or extract from URL
result = await extract_data(
    schema=schema,
    markdown_url="https://example.com/document.md"
)
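Once fields have been extracted, it can help to confirm that they match the types the schema declares. The sketch below is a standard-library check, not the server's validation logic: find_type_mismatches is a hypothetical helper, it only looks at top-level "properties", and it assumes you have already pulled the extracted fields out of the tool result as a plain dict (the result shape is not specified here).

```python
from typing import Any, Dict, List

# JSON Schema type names mapped to Python types (top-level check only)
_PY_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def find_type_mismatches(data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """List missing or wrongly typed top-level fields of `data`."""
    problems = []
    for name, spec in schema.get("properties", {}).items():
        if name not in data:
            problems.append(f"{name}: missing")
            continue
        type_name = spec.get("type")
        expected = _PY_TYPES.get(type_name)
        if expected is None:
            continue  # unknown or absent type: nothing to check
        value = data[name]
        # bool is a subclass of int in Python, so reject it for numeric fields
        is_bool_as_number = isinstance(value, bool) and type_name in ("number", "integer")
        if is_bool_as_number or not isinstance(value, expected):
            problems.append(f"{name}: expected {type_name}, got {type(value).__name__}")
    return problems
```

An empty list means every declared property is present with a matching type. For full JSON Schema validation a dedicated validator library would be the better choice.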
create_parse_job
Create a background parse job. Recommended for large documents (over 50MB).
job = await create_parse_job(
    document_path="/path/to/large_document.pdf",
    split="page"  # optional
)
job_id = job["job_id"]
get_parse_job_status
Check status and retrieve results of a parse job.
status = await get_parse_job_status(job_id)
# Check status
if status["status"] == "completed":
    # For small files, data is included directly
    # For large files (>1MB), data is auto-fetched from output_url
    results = status["data"]
elif status["status"] == "processing":
    print(f"Progress: {status['progress'] * 100:.1f}%")
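A job usually needs to be checked more than once before it completes. The following is a minimal polling sketch under stated assumptions: wait_for_job is a hypothetical helper, the status function is passed in so the loop can be exercised without the API, and only the "completed" and "processing" states shown above are known. Names of failure states are not documented here, so add them to terminal_states if the API reports any.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Tuple


async def wait_for_job(
    get_status: Callable[[str], Awaitable[Dict[str, Any]]],
    job_id: str,
    interval: float = 2.0,
    max_checks: int = 150,
    terminal_states: Tuple[str, ...] = ("completed",),
) -> Dict[str, Any]:
    """Poll `get_status(job_id)` until the job reaches a terminal state.

    Raises TimeoutError after `max_checks` polls. Any state not listed in
    `terminal_states` is treated as "still running" (an assumption).
    """
    for _ in range(max_checks):
        status = await get_status(job_id)
        if status["status"] in terminal_states:
            return status
        await asyncio.sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish after {max_checks} checks")
```

With the tool from this section it would be called as await wait_for_job(get_parse_job_status, job_id).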
list_parse_jobs
List all parse jobs with filtering and pagination.
jobs = await list_parse_jobs(
    page=0,  # optional, default 0
    pageSize=10,  # optional, 1-100, default 10
    status="completed"  # optional filter
)
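To walk through every page rather than just the first, a loop can keep requesting pages until one comes back short. This is a sketch under assumptions: collect_jobs is a hypothetical helper, and fetch_page is a callable you supply that calls list_parse_jobs and returns the list of jobs for one page (the response shape is not specified here, so that adaptation is left to the caller).

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

FetchPage = Callable[[int, int], Awaitable[List[Dict[str, Any]]]]


async def collect_jobs(
    fetch_page: FetchPage,
    page_size: int = 10,
    max_pages: int = 50,
) -> List[Dict[str, Any]]:
    """Gather jobs from consecutive pages, starting at page 0.

    Stops at the first page with fewer than `page_size` items, or after
    `max_pages` pages as a safety cap.
    """
    if not 1 <= page_size <= 100:
        raise ValueError("pageSize must be between 1 and 100")
    jobs: List[Dict[str, Any]] = []
    for page in range(max_pages):
        batch = await fetch_page(page, page_size)
        jobs.extend(batch)
        if len(batch) < page_size:
            break  # a short page means there is nothing further to fetch
    return jobs
```

The max_pages cap keeps the loop bounded if the API ever returns full pages indefinitely.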
process_folder
Process all supported files in a folder - parse documents or extract structured data.
Supported formats:
- Images: APNG, BMP, DCX, DDS, DIB, GD, GIF, ICNS, JP2, JPEG, JPG, PCX, PNG, PPM, PSD, TGA, TIFF, WEBP
- Documents: PDF, DOC, DOCX, PPT, PPTX, ODP, ODT
# Parse all PDFs in a folder
result = await process_folder(
    folder_path="/path/to/documents",
    operation="parse",  # or "extract" for structured data
    file_types="pdf",  # optional filter
    model="dpt-2-latest"
)
# Extract structured data from all documents
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "date": {"type": "string"}
    }
}
result = await process_folder(
    folder_path="/path/to/invoices",
    operation="extract",
    schema=schema,
    file_types="pdf,jpg"  # Process PDFs and images
)
# Process everything with defaults
result = await process_folder(
    folder_path="/path/to/mixed_documents"
)
Features:
- Automatic file size detection (uses direct parsing for <50MB, jobs for larger)
- Concurrent processing with rate limiting
- Progress tracking for long-running operations
- Organized output in the ade_results folder
- Aggregated data for extraction operations
- Continues processing even if individual files fail
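Two of these behaviors, bounded concurrency and continuing after individual failures, follow a common asyncio pattern. The sketch below illustrates that pattern only; it is not the server's implementation. process_all and its worker argument are hypothetical, and a semaphore caps the number of files in flight, which is a simpler stand-in for true rate limiting.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


async def process_all(
    paths: List[str],
    worker: Callable[[str], Awaitable[Any]],
    max_concurrent: int = 3,
) -> List[Dict[str, Any]]:
    """Run `worker` on every path with at most `max_concurrent` in flight.

    A failure in one file is recorded and does not stop the others.
    Results are returned in the same order as `paths`.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run_one(path: str) -> Dict[str, Any]:
        async with semaphore:
            try:
                return {"file": path, "status": "ok", "result": await worker(path)}
            except Exception as exc:
                # Record the error and carry on with the remaining files
                return {"file": path, "status": "error", "error": str(exc)}

    return await asyncio.gather(*(run_one(p) for p in paths))
```

Catching the exception inside run_one, rather than relying on gather alone, keeps each failure attached to the file that caused it.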
health_check
Check server status and API connectivity.
health = await health_check()
# Returns server status, API connectivity, available tools
File Size Guidelines
- < 50MB: Use parse_document directly
- > 50MB: Always use create_parse_job
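The guideline can be applied automatically before choosing a tool. This is a minimal sketch: choose_tool is a hypothetical helper, "MB" is assumed to mean 1024 * 1024 bytes, and a file of exactly 50MB is sent to a parse job, since the guideline does not say which side that case falls on.

```python
import os

# Assumption: 50MB is interpreted as 50 * 1024 * 1024 bytes
SIZE_LIMIT_BYTES = 50 * 1024 * 1024


def choose_tool(size_bytes: int) -> str:
    """Return the name of the tool suggested for a file of this size."""
    if size_bytes < SIZE_LIMIT_BYTES:
        return "parse_document"
    # Files at or above the limit go to background jobs (assumption for == limit)
    return "create_parse_job"


def choose_tool_for_file(path: str) -> str:
    """Look up the file size on disk and apply the same rule."""
    return choose_tool(os.path.getsize(path))
```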
Error Handling
result = await parse_document(document_path="/path/to/file.pdf")
if result.get("status") == "error":
    print(f"Error: {result['error']}")
    print(f"Status Code: {result.get('status_code')}")
else:
    # Process successful result
    markdown = result["markdown"]
Common Error Codes
- 401: Invalid API key
- 413: File too large (use parse jobs)
- 422: Validation error
- 429: Rate limit exceeded
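These codes can be turned into actionable messages in calling code. The sketch below assumes error results carry the "status", "status_code", and "error" keys used in the Error Handling example; describe_error and the wording of the hints are illustrative, not part of the server.

```python
from typing import Any, Dict

# Hints paraphrase the error list in this README
ERROR_HINTS = {
    401: "Invalid API key: check LANDINGAI_API_KEY",
    413: "File too large: use create_parse_job instead",
    422: "Validation error: check the arguments and schema",
    429: "Rate limit exceeded: wait before retrying",
}


def describe_error(result: Dict[str, Any]) -> str:
    """Return a one-line description of an error result, or "" on success."""
    if result.get("status") != "error":
        return ""
    code = result.get("status_code")
    hint = ERROR_HINTS.get(code, "Unrecognized error code")
    return f"{code}: {hint} ({result.get('error', 'no message')})"
```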
Troubleshooting
Common Issues and Solutions
"Could not connect to MCP server"
- Python not found: Make sure the Python path in your config is correct
# Find your Python path
which python3
- Module not found errors: Dependencies aren't installed in the Python environment
  - If using uv: Run uv sync in the project directory
  - If using venv: Activate it and run pip install -r requirements.txt
  - Check that the Python path in config matches your environment
- spawn python ENOENT: The system can't find Python
  - Use the full path to Python (e.g., /usr/bin/python3 instead of just python)
  - For virtual environments, use the full path to the venv's Python
"Server disconnected"
- Check the server can run manually:
cd /path/to/landingai-ade-mcp
python server.py
# Should see: "Starting LandingAI ADE MCP Server"
- Check API key is set:
echo $LANDINGAI_API_KEY
- Check dependencies are installed:
python -c "import fastmcp, httpx, pydantic"
# Should complete without errors
"ModuleNotFoundError: No module named 'fastmcp'"
This means fastmcp isn't installed in the Python environment being used:
- If using virtual environment: The config is pointing to the wrong Python
- Solution: Use uv or ensure the Python path matches your environment
Platform-Specific Issues
macOS: If you installed Python with Homebrew, the path might be /opt/homebrew/bin/python3 (Apple Silicon) or /usr/local/bin/python3 (Intel)
Windows: In the JSON config, use forward slashes in paths or escape backslashes: C:/path/to/python.exe or C:\\path\\to\\python.exe
Linux: Some systems use python3 instead of python. Always use python3 for clarity.
Debug Steps
- Test the server standalone:
python server.py
- Check MCP communication:
echo '{"jsonrpc": "2.0", "method": "initialize", "id": 1}' | python server.py
- Verify configuration:
  - Open Claude Desktop developer settings
  - Check the logs for specific error messages
  - Ensure all paths are absolute, not relative
- Validate API key:
python -c "import os; print('API Key set:', bool(os.environ.get('LANDINGAI_API_KEY')))"
API Documentation