MarkItDown MCP

Converts documents, images, audio, and web content to Markdown without requiring Docker. Supported inputs include PDF, Word, Excel, PowerPoint, images, audio files, and web URLs.

MarkItDown-MCP-NPX

NPX wrapper for Microsoft's MarkItDown MCP server - No Docker Required!

This package provides an NPX-compatible wrapper for Microsoft's markitdown-mcp, allowing you to run the MarkItDown MCP server without Docker dependencies.

✨ Features

  • 🚀 No Docker Required: Run directly with "node"
  • 🔧 Automatic Setup: Handles Python environment and dependencies automatically
  • 🔄 Full Compatibility: Runs the same markitdown-mcp server as the original Docker version
  • 💻 Cross-Platform: Works on Windows, macOS, and Linux
  • ⚡ Fast: Reuses the virtual environment after first setup

📋 Prerequisites

Required

  • Node.js 16+: Required for running the local script
  • Python 3.10+: Required for MarkItDown functionality
  • Internet Connection: For initial package installation

Optional (for enhanced functionality)

  • FFmpeg: For audio file processing and transcription (.mp3, .wav files)
  • ExifTool: For advanced image metadata extraction

💡 Note: MarkItDown handles most file types (PDF, Word, Excel, basic images) without the optional dependencies. They are needed only for audio files and advanced image metadata.

Windows users: See WINDOWS_SETUP.md for easy installation of optional dependencies.

🚀 Quick Start

Using Local Installation (Recommended)

# Run directly with node (no installation required after setup)
node C:\Users\YOUR_USERNAME\MCP\markitdown-mcp-npx\bin\markitdown-mcp-npx.js

# Run with HTTP transport
node C:\Users\YOUR_USERNAME\MCP\markitdown-mcp-npx\bin\markitdown-mcp-npx.js --http --host 127.0.0.1 --port 3001

# Run with specific arguments
node C:\Users\YOUR_USERNAME\MCP\markitdown-mcp-npx\bin\markitdown-mcp-npx.js --help

Future NPX Usage (if published to NPM)

# Once published to NPM, you could use:
npx markitdown-mcp-npx
npx markitdown-mcp-npx --http --host 127.0.0.1 --port 3001

Local Installation

# Clone or download this package
git clone <this-repo-url>
cd markitdown-mcp-npx

# Install dependencies
npm install

# Run locally
npm start

🔧 Configuration for Claude Desktop

The local version can be used as a drop-in replacement for the Docker version in Claude Desktop.

Claude Desktop Configuration

For Local Installation:

{
  "mcpServers": {
    "markitdown": {
      "command": "node",
      "args": [
        "C:\\Users\\YOUR_USERNAME\\MCP\\markitdown-mcp-npx\\bin\\markitdown-mcp-npx.js"
      ]
    }
  }
}

For NPX version (if published to NPM):

{
  "mcpServers": {
    "markitdown": {
      "command": "npx",
      "args": [
        "markitdown-mcp-npx"
      ]
    }
  }
}

For HTTP transport:

{
  "mcpServers": {
    "markitdown": {
      "command": "node",
      "args": [
        "C:\\Users\\YOUR_USERNAME\\MCP\\markitdown-mcp-npx\\bin\\markitdown-mcp-npx.js",
        "--http",
        "--host",
        "127.0.0.1",
        "--port",
        "3001"
      ]
    }
  }
}
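
Hand-writing the escaped Windows path is error-prone. As a minimal sketch, the entry can be generated with `JSON.stringify`, which takes care of the backslash escaping. The helper name `buildServerEntry` is hypothetical and not part of this package; the path is the placeholder used above.

```javascript
// Sketch: generate a Claude Desktop "mcpServers" entry for the local script.
// buildServerEntry is a hypothetical helper, for illustration only.
function buildServerEntry(scriptPath, extraArgs = []) {
  return {
    mcpServers: {
      markitdown: {
        command: "node",
        args: [scriptPath, ...extraArgs],
      },
    },
  };
}

// JSON.stringify doubles each backslash, so a Windows path becomes valid JSON.
const scriptPath =
  "C:\\Users\\YOUR_USERNAME\\MCP\\markitdown-mcp-npx\\bin\\markitdown-mcp-npx.js";
const configJson = JSON.stringify(
  buildServerEntry(scriptPath, ["--http", "--host", "127.0.0.1", "--port", "3001"]),
  null,
  2
);
console.log(configJson);
```

The printed JSON can be merged into the existing Claude Desktop configuration file by hand.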

Comparison with Docker Version

| Feature | Docker Version | Local Node Version |
| --- | --- | --- |
| Setup | Requires Docker | Requires Node.js + Python |
| Command | docker run ... | node path/to/markitdown-mcp-npx.js |
| Dependencies | Isolated in container | Managed in virtual environment |
| Performance | Container overhead | Direct execution |
| File Access | Requires volume mounts | Direct file system access |

📖 Usage Examples

Basic STDIO Mode (Default)

node C:\Users\YOUR_USERNAME\MCP\markitdown-mcp-npx\bin\markitdown-mcp-npx.js

HTTP/SSE Mode

node C:\Users\YOUR_USERNAME\MCP\markitdown-mcp-npx\bin\markitdown-mcp-npx.js --http --host 127.0.0.1 --port 3001

With Custom Host/Port

node C:\Users\YOUR_USERNAME\MCP\markitdown-mcp-npx\bin\markitdown-mcp-npx.js --http --host 0.0.0.0 --port 8080

🛠️ Available Options

Usage: markitdown-mcp-npx [options]

Options:
  --http           Run with Streamable HTTP and SSE transport (default: STDIO)
  --sse            Alias for --http (deprecated)
  --host HOST      Host to bind to (default: 127.0.0.1)
  --port PORT      Port to listen on (default: 3001)
  --help           Show help message
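
To illustrate how these flags and defaults fit together, here is a minimal parser sketch. It is not the wrapper's actual implementation: `parseOptions` and `requireValue` are hypothetical helpers that mirror only the options listed above.

```javascript
// Sketch of parsing the documented flags. parseOptions is a hypothetical
// helper, not the wrapper's real parser.
function parseOptions(argv) {
  const opts = { http: false, host: "127.0.0.1", port: 3001, help: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--http" || arg === "--sse") {
      opts.http = true; // --sse is documented as a deprecated alias for --http
    } else if (arg === "--host") {
      opts.host = requireValue(argv, ++i, "--host");
    } else if (arg === "--port") {
      const port = Number(requireValue(argv, ++i, "--port"));
      if (!Number.isInteger(port) || port < 1 || port > 65535) {
        throw new Error("--port must be an integer between 1 and 65535");
      }
      opts.port = port;
    } else if (arg === "--help") {
      opts.help = true;
    } else {
      throw new Error(`Unknown option: ${arg}`);
    }
  }
  return opts;
}

// Return the value following a flag, or fail if the flag is the last argument.
function requireValue(argv, index, flag) {
  if (index >= argv.length) throw new Error(`${flag} requires a value`);
  return argv[index];
}

console.log(parseOptions(["--http", "--port", "8080"]));
```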

🔍 How It Works

  1. Environment Detection: Automatically detects Python 3.10+ installation
  2. Virtual Environment: Creates isolated Python environment in temp directory
  3. Package Installation: Installs markitdown-mcp and dependencies
  4. Process Management: Spawns and manages the Python MCP server process
  5. Signal Handling: Properly handles termination signals
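
Steps 1 and 2 can be sketched as follows. This is an illustration under assumptions, not the wrapper's source: the helper names, the candidate command list, and the `markitdown-mcp-venv` directory name are all hypothetical.

```javascript
const os = require("os");
const path = require("path");
const { spawnSync } = require("child_process");

// Parse output such as "Python 3.11.4" into [major, minor], or null if unrecognized.
function parsePythonVersion(output) {
  const match = /Python (\d+)\.(\d+)/.exec(output);
  return match ? [Number(match[1]), Number(match[2])] : null;
}

// Step 1: Python 3.10+ is the documented minimum.
function isSupportedPython(version) {
  if (!version) return false;
  const [major, minor] = version;
  return major > 3 || (major === 3 && minor >= 10);
}

// Try common interpreter names and return the first that reports 3.10 or newer.
function findPython(candidates = ["python3", "python"]) {
  for (const cmd of candidates) {
    const result = spawnSync(cmd, ["--version"], { encoding: "utf8" });
    if (result.error) continue; // command is not on PATH
    const version = parsePythonVersion(`${result.stdout}${result.stderr}`);
    if (isSupportedPython(version)) return cmd;
  }
  return null;
}

// Step 2: location of the interpreter inside a venv (the layout differs on Windows).
function venvPythonPath(venvDir, platform = process.platform) {
  return platform === "win32"
    ? path.win32.join(venvDir, "Scripts", "python.exe")
    : path.posix.join(venvDir, "bin", "python");
}

const venvDir = path.join(os.tmpdir(), "markitdown-mcp-venv"); // hypothetical name
console.log("Python command:", findPython());
console.log("Venv interpreter would be:", venvPythonPath(venvDir));
```

In a full implementation, the remaining steps would create the environment with `python -m venv`, install markitdown-mcp with pip inside it, and spawn the server as a child process whose termination signals are forwarded.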

🐛 Troubleshooting

Python Not Found

Error: Python 3.10+ is required but not found

Solution: Install Python 3.10+ and ensure it's in your PATH

Permission Errors

Error: Failed to create virtual environment

Solution: Check write permissions to your temp directory

Installation Failures

Error: Failed to install markitdown-mcp

Solution: Check internet connectivity and proxy settings

Port Already in Use

Error: Port 3001 already in use

Solution: Use a different port with --port <number>
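
Before starting HTTP mode, you can check whether a port is free. A minimal sketch using Node's built-in `net` module follows; `isPortFree` is a hypothetical helper, not part of the wrapper.

```javascript
const net = require("net");

// Resolve true if a TCP server can bind host:port, false if the bind fails
// (for example with EADDRINUSE).
function isPortFree(port, host = "127.0.0.1") {
  return new Promise((resolve) => {
    const server = net.createServer();
    server.once("error", () => resolve(false));
    server.once("listening", () => server.close(() => resolve(true)));
    server.listen(port, host);
  });
}

isPortFree(3001).then((free) => {
  console.log(free ? "Port 3001 is free" : "Port 3001 is in use; try --port <number>");
});
```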

🧪 Testing with MCP Inspector

You can test the server using the MCP Inspector:

# Start the inspector
npx @modelcontextprotocol/inspector

# For STDIO mode:
# - Transport: STDIO
# - Command: node
# - Args: C:\Users\YOUR_USERNAME\MCP\markitdown-mcp-npx\bin\markitdown-mcp-npx.js

# For HTTP mode:
# - Transport: Streamable HTTP
# - URL: http://127.0.0.1:3001/mcp

📂 File Structure

markitdown-mcp-npx/
├── package.json              # NPM package configuration
├── index.js                  # Main entry point
├── bin/
│   └── markitdown-mcp-npx.js # Node.js executable script
└── README.md                 # This file

🔐 Security Considerations

  • The server runs with the same privileges as the user executing it
  • No authentication is provided for HTTP/SSE modes
  • For HTTP mode, bind to localhost unless specifically needed otherwise
  • Virtual environments provide isolation for Python dependencies
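
Because HTTP mode has no authentication, a launcher script might warn before binding to a non-loopback address. A small sketch, where `isLoopbackHost` is a hypothetical helper and not something the wrapper is known to provide:

```javascript
const net = require("net");

// Return true for hosts that are reachable only from the local machine.
function isLoopbackHost(host) {
  if (host === "localhost" || host === "::1") return true;
  // Any IPv4 address in 127.0.0.0/8 is loopback.
  return net.isIPv4(host) && host.startsWith("127.");
}

for (const host of ["127.0.0.1", "0.0.0.0"]) {
  if (!isLoopbackHost(host)) {
    console.warn(`Warning: ${host} exposes the unauthenticated server to the network`);
  }
}
```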

🆚 vs. Docker Version

Advantages of Local Node Version:

  • ✅ No Docker installation required
  • ✅ Direct file system access (no volume mounts)
  • ✅ Faster startup (no container overhead)
  • ✅ Easier to debug and troubleshoot

Advantages of Docker Version:

  • ✅ Complete isolation
  • ✅ Consistent environment across systems
  • ✅ No Python installation required on host

📄 License

This project follows the same MIT license as the original markitdown project.

🔧 Expected Tool Behavior

✓ Single Tool: MarkItDown MCP provides exactly 1 tool called convert_to_markdown
✓ Universal Converter: This one tool handles all file types:

  • 📄 Documents: PDF, Word (.docx), Excel (.xlsx), PowerPoint (.pptx)
  • 🖼️ Images: JPG, PNG, GIF, etc. (with OCR support)
  • 🎧 Audio: MP3, WAV (with transcription if FFmpeg is installed)
  • 🌐 Web: HTTP/HTTPS URLs
  • 🗃️ Archives: ZIP files
  • 📊 Data: CSV, JSON, XML

✓ URI Parameter: Accepts http:, https:, file:, or data: URIs

💡 Note: Seeing "1 tools available" in Claude Desktop is correct behavior!
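
Local files and in-memory content have to be expressed as URIs before they are passed to convert_to_markdown. A sketch using Node built-ins follows; the helper names and the sample file name are illustrative only.

```javascript
const { pathToFileURL } = require("url");

// Build a file: URI from a local path (resolved against the current directory).
function toFileUri(filePath) {
  return pathToFileURL(filePath).href;
}

// Build a base64 data: URI from in-memory content.
function toDataUri(content, mimeType = "text/plain") {
  return `data:${mimeType};base64,${Buffer.from(content).toString("base64")}`;
}

console.log(toFileUri("report.pdf"));
console.log(toDataUri("hello"));
```

`pathToFileURL` percent-encodes characters such as spaces, which avoids malformed file: URIs for paths like `My Documents`.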

🚫 Troubleshooting

FFmpeg Warning

RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work

This warning is harmless! It means:

  • ✅ MarkItDown is working correctly
  • ✅ Non-audio file types (PDF, Word, Excel, images) are unaffected
  • ⚠️ Audio files (.mp3, .wav) processing will be limited

To resolve: Install FFmpeg (see WINDOWS_SETUP.md for Windows)

🤝 Contributing

This is an unofficial wrapper for Microsoft's MarkItDown MCP server. For issues with the core MarkItDown functionality, please refer to the original repository.

For issues specific to this wrapper:

  1. Check the troubleshooting section
  2. Verify your Python and Node.js installations
  3. Test with the MCP Inspector

🙏 Acknowledgments

  • Microsoft AutoGen Team: For creating the original MarkItDown and MCP server
  • Model Context Protocol: For the MCP specification
  • Claude Desktop: For MCP integration

Note: This is an unofficial wrapper for MarkItDown MCP. For the official Docker version, visit the original repository.
