doc-ops-mcp
A universal MCP server for document processing, conversion, and automation. Handle PDF, DOCX, HTML, Markdown, and more through a unified API and toolset.
Document Operations MCP Server
Demo
Video
https://github.com/user-attachments/assets/43dfeeec-8097-413e-8519-a7de98e31136
In this demo, we showcase how to:
- Configure doc-ops-mcp in MCP clients
- Convert DOCX documents to PDF format
- Add default watermarks to converted PDF files
Table of Contents
- Quick Start
- System Architecture
- Optional Integration
- Features
- Open Source Licenses
- Future Roadmap
- Docker Deployment
- Development Guide
- Troubleshooting
- Contributing
1. Quick Start
First, add the Document Operations MCP server to your MCP client.
Standard config works in most MCP clients:
{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/your/output/directory",
        "CACHE_DIR": "/path/to/your/cache/directory"
      }
    }
  }
}
<details> <summary>Claude Desktop</summary>
Follow the MCP install guide and use the standard config above.
</details>
<details> <summary>VS Code</summary>
Follow the MCP install guide and use the standard config above.
</details>
<details> <summary>Cursor</summary>
Go to Cursor Settings -> MCP -> Add new MCP Server. Choose any name, select the command type, and enter the command npx -y doc-ops-mcp.
</details>
<details> <summary>Other MCP Clients</summary>
For other MCP clients, use the standard config above and refer to your client's documentation for MCP server installation.
</details>
Configuration
The Document Operations MCP server supports configuration through environment variables. These can be provided in the MCP client configuration as part of the "env" object:
{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/your/output/directory",
        "CACHE_DIR": "/path/to/your/cache/directory",
        "WATERMARK_IMAGE": "/path/to/watermark.png",
        "QR_CODE_IMAGE": "/path/to/qrcode.png"
      }
    }
  }
}
Supported Document Operations
| Format | Convert to PDF | Convert to DOCX | Convert to HTML | Convert to Markdown | Content Rewriting | Watermark/QR Code |
|---|---|---|---|---|---|---|
| PDF | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
| DOCX | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| HTML | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ |
| Markdown | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
Rewriting Features:
- Content Replacement: Support batch text replacement and regular expression replacement
- Format Adjustment: Modify document structure, heading levels, and style formatting
- Smart Rewriting: Content optimization while preserving original document format
Usage Examples
Format Conversion:
Convert /Users/docs/report.docx to PDF
Convert /Users/docs/article.md to HTML
Convert /Users/docs/presentation.html to DOCX
Convert /Users/docs/readme.md to PDF (with theme styling)
Document Rewriting:
Rewrite company names in /Users/docs/contract.md
Batch replace terminology in /Users/docs/manual.docx
Adjust heading levels in /Users/docs/article.html
Update dates and version numbers in /Users/docs/policy.md
PDF Enhancement:
Add watermark to /Users/docs/document.pdf
Add QR code to /Users/docs/report.pdf
Add company logo watermark to /Users/docs/invoice.pdf
Environment Variables
The server supports environment variables for controlling output paths and PDF enhancement features:
Core Directories
- OUTPUT_DIR: Controls where all generated files are saved (default: ~/Documents)
- CACHE_DIR: Directory for temporary and cache files (default: ~/.cache/doc-ops-mcp)
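The directory defaults above can be sketched as a small resolver. This is only an illustration of the documented fallback behavior, not the server's actual code; the function name `resolveDirectories` and the `DirectoryConfig` type are hypothetical.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical shape for the two directory settings described above.
interface DirectoryConfig {
  outputDir: string;
  cacheDir: string;
}

// Resolve OUTPUT_DIR and CACHE_DIR, falling back to the documented defaults.
// `env` and `home` are parameters so the function can be tested in isolation.
function resolveDirectories(
  env: Record<string, string | undefined>,
  home: string = os.homedir(),
): DirectoryConfig {
  return {
    outputDir: env["OUTPUT_DIR"] || path.join(home, "Documents"),
    cacheDir: env["CACHE_DIR"] || path.join(home, ".cache", "doc-ops-mcp"),
  };
}
```

In a real process, the call would typically be `resolveDirectories(process.env)`.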
PDF Enhancement Features
- WATERMARK_IMAGE: Default watermark image path for PDF files
  - Automatically added to all PDF conversions
  - Supported formats: PNG, JPG
  - If not set, the default text watermark "doc-ops-mcp" will be used
- QR_CODE_IMAGE: Default QR code image path for PDF files
  - Added to PDFs only when explicitly requested (addQrCode=true)
  - Supported formats: PNG, JPG
  - If not set, QR code functionality will be unavailable
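The watermark and QR code rules above could be expressed roughly as follows. This is a sketch under assumptions: `planEnhancements`, `Watermark`, and `EnhancementPlan` are hypothetical names, and throwing an error when `QR_CODE_IMAGE` is missing is one possible reading of "unavailable", not confirmed behavior.

```typescript
// Hypothetical description of what gets stamped onto a generated PDF.
type Watermark =
  | { kind: "image"; path: string }
  | { kind: "text"; text: string };

interface EnhancementPlan {
  watermark: Watermark;
  qrCodePath?: string; // present only when a QR code should be added
}

function planEnhancements(
  env: Record<string, string | undefined>,
  addQrCode = false,
): EnhancementPlan {
  // Image watermark if configured, otherwise the documented text fallback.
  const image = env["WATERMARK_IMAGE"];
  const watermark: Watermark = image
    ? { kind: "image", path: image }
    : { kind: "text", text: "doc-ops-mcp" };

  // QR codes are opt-in per request.
  if (!addQrCode) return { watermark };

  const qrCodePath = env["QR_CODE_IMAGE"];
  if (!qrCodePath) {
    // Assumption: treat a missing QR image as an error rather than a silent skip.
    throw new Error("addQrCode=true requires QR_CODE_IMAGE to be set");
  }
  return { watermark, qrCodePath };
}
```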
Output Path Rules:
- If outputPath is not provided → files are saved to OUTPUT_DIR with auto-generated names
- If outputPath is relative → it is resolved relative to OUTPUT_DIR
- If outputPath is absolute → it is used as-is, ignoring OUTPUT_DIR
See OUTPUT_PATH_CONTROL.md for detailed documentation.
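The three output path rules above map naturally onto Node's `path` module. The following is a minimal sketch, assuming the auto-generated file name is supplied by the caller; `resolveOutputPath` and its parameters are hypothetical names, not the server's API.

```typescript
import * as path from "node:path";

// Apply the documented output path rules.
// `autoName` stands in for whatever auto-generated file name the server would pick.
function resolveOutputPath(
  outputDir: string,
  autoName: string,
  outputPath?: string,
): string {
  // Rule 1: no outputPath, so save into OUTPUT_DIR under a generated name.
  if (!outputPath) return path.join(outputDir, autoName);
  // Rule 3: absolute paths are used as-is.
  if (path.isAbsolute(outputPath)) return outputPath;
  // Rule 2: relative paths are resolved against OUTPUT_DIR.
  return path.resolve(outputDir, outputPath);
}
```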
2. System Architecture
Document Operations MCP Server uses a pure JavaScript architecture that covers document reading, conversion, and enhancement:
┌─────────────────────────────────────────────────────────────┐
│ MCP Client Layer │
│ (Claude Desktop, Cursor, VS Code, etc.) │
└─────────────────────┬───────────────────────────────────────┘
│ JSON-RPC 2.0
┌─────────────────────┴───────────────────────────────────────┐
│ Doc-Ops-MCP Server │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │
│ │ Tool Router │ │ Request │ │ Response │ │
│ │ & Handler │ │ Validator │ │ Formatter │ │
│ └────────┬────────┘ └────────┬────────┘ └──────┬──────┘ │
│ │ │ │ │
│ ┌────────┴────────────────────┴──────────────────┴─────┐ │
│ │ Document Processing Engine │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Document │ │ Format │ │ Style │ │ │
│ │ │ Reader │ │ Converter │ │ Processor │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ PDF │ │ Watermark/ │ │ Conversion │ │ │
│ │ │ Enhancement │ │ QR Code │ │ Planner │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
└────┴───────────────────────────────────────────────────────┴─┘
│
┌───────────────────────────┴─────────────────────────────────┐
│ Core Dependencies Layer │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ pdf-lib │ │word-extractor│ │ marked │ │
│ │ (PDF Tools) │ │(DOCX Reader)│ │ (Markdown) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ cheerio │ │ jszip │ │ docx │ │
│ │(HTML Parser)│ │(ZIP Handler)│ │(DOCX Gen.) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ ┌─────────────┐ ┌─────────────┐ │
│ │ xml2js │ │Custom OOXML │ │
│ │(XML Parser) │ │ Parser │ │
│ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────┘
Architecture Overview
Core Features:
- Pure JavaScript implementation with no external system dependencies
- Complete document reading, conversion, and style processing capabilities
- Built-in PDF watermark and QR code addition functionality
- Intelligent conversion planning and path optimization
Conversion Flow:
- Direct Conversion: Supports direct conversion between most formats
- Multi-step Conversion: Complex conversions achieved through intermediate formats
- Style Preservation: Uses OOXML parser to ensure complete style integrity
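To illustrate the difference between direct and multi-step conversion, here is a sketch of a planner that searches a graph of format conversions. The edge list in `DIRECT` is an assumption made for the example and is not a statement of the server's actual routes; `planPath` is a hypothetical name.

```typescript
type Format = "pdf" | "docx" | "html" | "markdown";

// Assumed direct conversions, for illustration only.
const DIRECT: Record<Format, Format[]> = {
  pdf: [],
  docx: ["html", "markdown", "pdf"],
  html: ["markdown", "pdf"],
  markdown: ["html", "docx", "pdf"],
};

// Breadth-first search: returns the shortest chain of formats from source
// to target, or null when no chain exists.
function planPath(source: Format, target: Format): Format[] | null {
  if (source === target) return [source];
  const previous = new Map<Format, Format>();
  const seen = new Set<Format>([source]);
  const queue: Format[] = [source];

  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const next of DIRECT[current]) {
      if (seen.has(next)) continue;
      seen.add(next);
      previous.set(next, current);
      if (next === target) {
        // Walk back through `previous` to rebuild the route.
        const route: Format[] = [target];
        let step: Format | undefined = current;
        while (step !== undefined) {
          route.unshift(step);
          step = previous.get(step);
        }
        return route;
      }
      queue.push(next);
    }
  }
  return null;
}
```

With these assumed edges, a request with no direct route (such as HTML to DOCX) is planned through an intermediate format, while a request with no route at all yields `null`.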
3. Optional Integration
This server can work with playwright-mcp for enhanced PDF conversion capabilities. Please refer to the official playwright-mcp documentation for detailed configuration.
🔧 PDF Conversion Workflow
This server supports complete PDF conversion functionality:
- Document Parsing: Use OOXML parser to ensure complete style preservation
- Format Conversion: Convert documents to high-quality HTML format
- PDF Generation: Built-in converter or optionally work with playwright-mcp
- Enhancement Processing: Automatically add watermarks and QR codes (if configured)
How It Works
This server uses an intelligent conversion architecture:
- Smart Planning: plan_conversion analyzes conversion requirements and selects optimal paths
- Format Conversion: Use specialized converters to handle various document formats
- Style Preservation: Ensure style integrity through the OOXML parser
- Enhancement Processing: Automatically add watermarks, QR codes, and other enhancements
- Optional Integration: Support working with playwright-mcp for enhanced capabilities
4. Features
MCP Tools
Core Document Tools
| Tool Name | Description | Input Parameters | External Dependencies |
|---|---|---|---|
| read_document | Read document content | filePath: Document path<br>extractMetadata: Extract metadata<br>preserveFormatting: Preserve formatting | None |
| write_document | Write document content | content: Document content<br>outputPath: Output file path<br>encoding: File encoding | None |
| convert_document | Smart document conversion | inputPath: Input file path<br>outputPath: Output file path<br>preserveFormatting: Preserve formatting | None |
| plan_conversion | Conversion planner | sourceFormat: Source format<br>targetFormat: Target format<br>preserveStyles: Preserve styles<br>quality: Conversion quality | None |
read_document
Read various document formats including PDF, DOCX, DOC, HTML, MD, and more.
Parameters:
- filePath (string, required) - Document path to read
- extractMetadata (boolean, optional) - Extract document metadata, defaults to false
- preserveFormatting (boolean, optional) - Preserve formatting (HTML output), defaults to false
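MCP clients normally build tool calls for you, but it can help to see what one looks like on the wire. The sketch below constructs a JSON-RPC 2.0 `tools/call` request for read_document using the parameters listed above; the helper `buildToolCall` and the file path are illustrative assumptions.

```typescript
// Shape of an MCP tool invocation carried over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments in a request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example: read a DOCX file and ask for its metadata as well.
const request = buildToolCall(1, "read_document", {
  filePath: "/Users/docs/report.docx",
  extractMetadata: true,
});
```

The other tools in this section follow the same pattern with their own tool name and arguments.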
write_document
Write content to document files in specified formats.
Parameters:
- content (string, required) - Content to write
- outputPath (string, optional) - Output file path (auto-generated if not provided)
- encoding (string, optional) - File encoding, defaults to utf-8
convert_document
Convert documents between formats with enhanced style preservation.
Parameters:
- inputPath (string, required) - Input file path
- outputPath (string, optional) - Output file path (auto-generated if not provided)
- preserveFormatting (boolean, optional) - Preserve formatting, defaults to true
- useInternalPlaywright (boolean, optional) - Use built-in Playwright for PDF conversion, defaults to false
convert_docx_to_pdf
Convert DOCX to PDF with automatic watermark addition (if configured).
Parameters:
- docxPath (string, required) - DOCX file path
- outputPath (string, optional) - Output PDF path (auto-generated if not provided)
- addQrCode (boolean, optional) - Whether to add QR code, defaults to false
- preserveFormatting (boolean, optional) - Preserve original formatting, defaults to true
- chineseFont (string, optional) - Chinese font, defaults to Microsoft YaHei
convert_markdown_to_pdf
Convert Markdown to PDF with automatic watermark addition (if configured).
Parameters:
- markdownPath (string, required) - Markdown file path
- outputPath (string, optional) - Output PDF path (auto-generated if not provided)
- theme (string, optional) - Theme style, defaults to "github"
- includeTableOfContents (boolean, optional) - Include table of contents, defaults to false
- addQrCode (boolean, optional) - Whether to add QR code, defaults to false
convert_markdown_to_html
Convert Markdown to HTML.
Parameters:
- markdownPath (string, required) - Markdown file path
- outputPath (string, optional) - Output HTML path (auto-generated if not provided)
- theme (string, optional) - Theme style, defaults to "github"
- includeTableOfContents (boolean, optional) - Include table of contents, defaults to false
convert_markdown_to_docx
Convert Markdown to DOCX.
Parameters:
- markdownPath (string, required) - Markdown file path
- outputPath (string, optional) - Output DOCX path (auto-generated if not provided)
convert_html_to_markdown
Convert HTML to Markdown.
Parameters:
- htmlPath (string, required) - HTML file path
- outputPath (string, optional) - Output Markdown path (auto-generated if not provided)
plan_conversion
🎯 Smart Conversion Planner - Analyze conversion requirements and generate optimal conversion plans.
Parameters:
- sourceFormat (string, required) - Source file format (pdf, docx, html, markdown, md, txt, doc)
- targetFormat (string, required) - Target file format (pdf, docx, html, markdown, md, txt, doc)
- sourceFile (string, optional) - Source file path (for generating specific conversion parameters)
- preserveStyles (boolean, optional) - Whether to preserve style formatting, defaults to true
- includeImages (boolean, optional) - Whether to include images, defaults to true
- theme (string, optional) - Conversion theme, defaults to github
- quality (string, optional) - Conversion quality requirement (fast, balanced, high), defaults to balanced
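The parameter list above implies some validation and defaulting before a plan can be produced. Here is one way that step might look; it is a sketch based only on the documented values and defaults, and `normalizePlanArgs` with its types is a hypothetical name rather than the server's implementation.

```typescript
const FORMATS = ["pdf", "docx", "html", "markdown", "md", "txt", "doc"] as const;
const QUALITIES = ["fast", "balanced", "high"] as const;
type PlanFormat = (typeof FORMATS)[number];
type Quality = (typeof QUALITIES)[number];

// Raw arguments as a client might send them.
interface PlanConversionInput {
  sourceFormat: string;
  targetFormat: string;
  sourceFile?: string;
  preserveStyles?: boolean;
  includeImages?: boolean;
  theme?: string;
  quality?: string;
}

// Arguments after validation, with documented defaults filled in.
interface PlanConversionArgs {
  sourceFormat: PlanFormat;
  targetFormat: PlanFormat;
  sourceFile?: string;
  preserveStyles: boolean;
  includeImages: boolean;
  theme: string;
  quality: Quality;
}

function isFormat(value: string): value is PlanFormat {
  return (FORMATS as readonly string[]).includes(value);
}

function isQuality(value: string): value is Quality {
  return (QUALITIES as readonly string[]).includes(value);
}

function normalizePlanArgs(input: PlanConversionInput): PlanConversionArgs {
  const { sourceFormat, targetFormat } = input;
  if (!isFormat(sourceFormat)) throw new Error(`Unsupported sourceFormat: ${sourceFormat}`);
  if (!isFormat(targetFormat)) throw new Error(`Unsupported targetFormat: ${targetFormat}`);
  const quality = input.quality ?? "balanced";
  if (!isQuality(quality)) throw new Error(`Unsupported quality: ${quality}`);
  return {
    sourceFormat,
    targetFormat,
    sourceFile: input.sourceFile,
    preserveStyles: input.preserveStyles ?? true,
    includeImages: input.includeImages ?? true,
    theme: input.theme ?? "github",
    quality,
  };
}
```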
process_pdf_post_conversion
Parameters:
- playwrightPdfPath (string, required) - Generated PDF file path
- targetPath (string, optional) - Target PDF file path (auto-generated if not provided)
- addWatermark (boolean, optional) - Whether to add watermark, defaults to false
- addQrCode (boolean, optional) - Whether to add QR code, defaults to false
- watermarkImage (string, optional) - Watermark image path
- qrCodePath (string, optional) - QR code image path
PDF Enhancement Tools
add_watermark
🎨 PDF Watermark Addition Tool - Add image or text watermarks to PDF documents.
Parameters:
- pdfPath (string, required) - PDF file path
- watermarkImage (string, optional) - Watermark image path (PNG/JPG)
- watermarkText (string, optional) - Watermark text content
- watermarkImageScale (number, optional) - Image scale ratio, defaults to 0.25
- watermarkImageOpacity (number, optional) - Image opacity, defaults to 0.6
- watermarkImagePosition (string, optional) - Image position, defaults to fullscreen
add_qrcode
📱 PDF QR Code Addition Tool - Add QR codes to PDF documents.
Parameters:
- pdfPath (string, required) - PDF file path
- qrCodePath (string, optional) - QR code image path
- qrScale (number, optional) - QR code scale ratio, defaults to 0.15
- qrOpacity (number, optional) - QR code opacity, defaults to 1.0
- qrPosition (string, optional) - QR code position, defaults to bottom-center
- addText (boolean, optional) - Whether to add explanatory text, defaults to true
System Requirements
- Node.js ≥ 18.0.0
- Zero external system dependencies - All processing via npm packages
- Optional Integration: playwright-mcp for enhanced PDF conversion
Core Technology Stack
- pdf-lib - PDF operations and enhancement
- word-extractor - DOCX document text extraction
- marked - Markdown parsing and rendering
- cheerio - HTML parsing and manipulation
- docx - DOCX document generation
- jszip - ZIP file processing
- xml2js - XML parsing and conversion
- Custom OOXML Parser - Advanced DOCX style preservation
Installation
# Global installation
npm install -g doc-ops-mcp
# Or using pnpm
pnpm add -g doc-ops-mcp
# Or using bun
bun add -g doc-ops-mcp
Architecture Components
- MCP Server Core: Handles JSON-RPC 2.0 communication and tool registration
- Smart Router: Routes requests to optimal processing modules
- Conversion Engine: Contains specialized converters for different document types
- Style Processor: Ensures style preservation during format conversion
- Security Module: Provides path validation and content security handling
5. Open Source Licenses
Project License
- This Project: MIT License
- Compatibility: Available for commercial and non-commercial use
Third-Party Dependencies
| Library | Version | License | Purpose |
|---|---|---|---|
| pdf-lib | ^1.17.1 | MIT | PDF document manipulation |
| word-extractor | ^1.0.4 | MIT | DOCX document text extraction |
| marked | ^15.0.12 | MIT | Markdown parsing and rendering |
| cheerio | ^1.0.0-rc.12 | MIT | HTML parsing and manipulation |
| docx | ^9.5.1 | Apache-2.0 | DOCX document generation |
| jszip | ^3.10.1 | MIT | ZIP file processing |
| xml2js | ^0.6.2 | MIT | XML parsing and conversion |
License Compatibility
- ✅ Commercial Use: All dependencies support commercial use
- ✅ Distribution: Free to distribute and modify
- ✅ Patent Protection: Apache-2.0 provides patent protection
- ⚠️ Notice: Original license notices must be retained
6. Future Roadmap
Core Features
- 🔄 Enhanced Conversion Quality: Improve style preservation for complex documents
- 📊 Excel Support: Complete Excel read/write and conversion functionality
- 🎨 Template System: Support for custom document templates
- 🔍 OCR Integration: Image text recognition capabilities
System Improvements
- 🌐 Multi-language Support: Internationalization and localization
- 🔐 Security Enhancements: Document encryption and access control
- ⚡ Performance Optimization: Large file handling and memory optimization
- 🔌 Plugin System: Extensible processor architecture
Version Roadmap
- v2.0: Complete Excel support and template system
- v3.0: OCR integration and multi-language support
- v4.0: Advanced security features and plugin system
7. Docker Deployment
Quick Start
Using Pre-built Image
# Pull the latest image
docker pull docops/doc-ops-mcp:latest
# Run with default configuration
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
docops/doc-ops-mcp:latest
Building from Source
# Clone the repository
git clone https://github.com/JefferyMunoz/doc-ops-mcp.git
cd doc-ops-mcp
# Build the Docker image
docker build -t doc-ops-mcp .
# Run the container
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
-v $(pwd)/documents:/app/documents \
doc-ops-mcp
Docker Compose Deployment
Create a docker-compose.yml file:
version: '3.8'
services:
  doc-ops-mcp:
    image: docops/doc-ops-mcp:latest
    container_name: doc-ops-mcp
    ports:
      - "3000:3000"
    volumes:
      - ./documents:/app/documents
      - ./config:/app/config
    environment:
      - NODE_ENV=production
      - PORT=3000
    restart: unless-stopped
  # Optional: Add Nginx for reverse proxy
  nginx:
    image: nginx:alpine
    container_name: doc-ops-nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - doc-ops-mcp
    restart: unless-stopped
Environment Variables
| Variable | Description | Default |
|---|---|---|
| PORT | Server port | 3000 |
| NODE_ENV | Environment mode | production |
| LOG_LEVEL | Logging level | info |
| MAX_FILE_SIZE | Maximum file size (MB) | 50 |
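As an illustration of how the variables in this table might be read at startup, the sketch below parses them with the listed defaults. The names `loadServerConfig` and `ServerConfig` are hypothetical, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```typescript
interface ServerConfig {
  port: number;
  nodeEnv: string;
  logLevel: string;
  maxFileSizeBytes: number;
}

// Parse a positive integer setting, using `fallback` when it is unset.
function parsePositiveInt(
  raw: string | undefined,
  fallback: number,
  name: string,
): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadServerConfig(env: Record<string, string | undefined>): ServerConfig {
  return {
    port: parsePositiveInt(env["PORT"], 3000, "PORT"),
    nodeEnv: env["NODE_ENV"] || "production",
    logLevel: env["LOG_LEVEL"] || "info",
    // Assumption: MAX_FILE_SIZE is in megabytes of 1024 * 1024 bytes.
    maxFileSizeBytes:
      parsePositiveInt(env["MAX_FILE_SIZE"], 50, "MAX_FILE_SIZE") * 1024 * 1024,
  };
}
```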
Volume Mounts
Mount local directories for persistent storage:
# Documents directory for file processing
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
-v $(pwd)/documents:/app/documents \
-v $(pwd)/output:/app/output \
doc-ops-mcp
Docker Configuration Examples
Production Deployment
# Production setup with Docker Swarm
docker swarm init
docker stack deploy -c docker-compose.yml doc-ops
# Scale the service (stack services are named <stack>_<service>)
docker service scale doc-ops_doc-ops-mcp=3
Health Checks
The container includes built-in health checks:
# Check container health
docker ps
# View health check logs
docker inspect --format='{{.State.Health.Status}}' doc-ops-mcp
# Manual health check
docker exec doc-ops-mcp curl -f http://localhost:3000/health || exit 1
8. Development Guide
Local Development
# Clone the repository
git clone https://github.com/JefferyMunoz/doc-ops-mcp.git
cd doc-ops-mcp
# Install dependencies
npm install
# Run in development mode
npm run dev
# Build the project
npm run build
# Run tests
npm test
Project Structure
src/
├── index.ts # MCP server entry point
├── tools/ # Tool implementations
│ ├── documentConverter.ts
│ ├── pdfTools.ts
│ └── ...
├── types/ # Type definitions
└── utils/ # Utility functions
Adding New Tools
- Create a new tool file in src/tools/
- Implement the tool logic
- Register the tool in src/index.ts
- Add test cases
- Update documentation
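The steps above can be pictured with a small, self-contained sketch. It does not use the project's real registration code or the MCP SDK; `ToolDefinition`, `registerTool`, `registry`, and the `count_words` tool are all hypothetical and only show the general shape of a tool (name, description, JSON Schema input, handler).

```typescript
// Hypothetical tool shape: metadata plus an async handler.
interface ToolDefinition<Args> {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
  handler: (args: Args) => Promise<string>;
}

// Hypothetical in-memory registry standing in for registration in src/index.ts.
const registry = new Map<string, ToolDefinition<any>>();

function registerTool<Args>(tool: ToolDefinition<Args>): void {
  if (registry.has(tool.name)) {
    throw new Error(`Tool already registered: ${tool.name}`);
  }
  registry.set(tool.name, tool);
}

// Example tool that would live in its own file under src/tools/.
const countWords: ToolDefinition<{ text: string }> = {
  name: "count_words",
  description: "Count whitespace-separated words in a string",
  inputSchema: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"],
  },
  handler: async ({ text }) =>
    String(text.trim().split(/\s+/).filter(Boolean).length),
};

registerTool(countWords);
```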
9. Troubleshooting
Common Issues
- Port conflicts: Change the host port in docker-compose.yml
- Permission issues: Ensure volume mounts have correct permissions
- Memory issues: Increase Docker memory allocation
Debug Mode
# Run with debug logging
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
-e LOG_LEVEL=debug \
doc-ops-mcp
# View logs
docker logs -f doc-ops-mcp
10. Contributing
How to Contribute
- Fork the Project
- Create a Feature Branch (git checkout -b feature/AmazingFeature)
- Commit Your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Intellectual Property License
By submitting a Pull Request, you agree that all contributions submitted through Pull Requests will be licensed under the MIT License. This means:
- You grant the project maintainers and users the right to use, modify, and distribute your contributions under the MIT License
- You confirm that you have the right to make these contributions
- You understand that your contributions will become part of the open source project
- You waive any claims to exclusive ownership of the contributed code
If you cannot agree to these terms, please do not submit a Pull Request.
Code Standards
- Use TypeScript
- Follow ESLint configuration
- Add appropriate tests
- Update relevant documentation
Reporting Issues
- Use GitHub Issues
- Provide detailed error information and reproduction steps
- Include system environment information
License
This project is licensed under the MIT License - see the LICENSE file for details.