# ⚡ WarpGBM MCP Service

**GPU-accelerated gradient boosting as a cloud MCP service**

Train on A10G GPUs • Get `artifact_id` for <100ms cached predictions • Download portable artifacts
<div align="center">
🌐 Live Service • 📖 API Docs • 🤖 Agent Guide • 🐍 Python Package
</div>
## 🎯 What is This?
Outsource your GBDT workload to the world's fastest GPU implementation.
WarpGBM MCP is a stateless cloud service that gives AI agents instant access to GPU-accelerated gradient boosting. Built on WarpGBM (91+ ⭐), this service handles training on NVIDIA A10G GPUs while you receive portable model artifacts and benefit from smart 5-minute caching.
## 🏗️ How It Works (The Smart Cache Workflow)

```mermaid
graph LR
    A[Train on GPU] --> B[Get artifact_id + model]
    B --> C[5min Cache]
    C --> D[<100ms Predictions]
    B --> E[Download Artifact]
    E --> F[Use Anywhere]
```
- **Train**: POST your data → train on an A10G GPU → get `artifact_id` + portable artifact
- **Fast Path**: Use `artifact_id` → sub-100ms cached predictions (5min TTL)
- **Slow Path**: Use `model_artifact_joblib` → download and use anywhere (see the Python sketch below)

**Architecture:** 🔒 Stateless • 🚀 No model storage • 💾 You own your artifacts
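For agents that speak HTTP directly, the whole loop fits in a few lines. Here is a minimal sketch in Python using `requests`, assuming the request/response fields shown in the examples below (`artifact_id`, `predictions`, `model_artifact_joblib`):

```python
import requests

BASE = "https://warpgbm.ai"

# 1. Train: POST data, get back an artifact_id + portable artifact
train = requests.post(f"{BASE}/train", json={
    "X": [[5.1, 3.5, 1.4, 0.2], [6.7, 3.1, 4.4, 1.4]],  # use 60+ rows in practice
    "y": [0, 1],
    "model_type": "warpgbm",
    "objective": "multiclass",
}, timeout=120).json()

# 2. Fast path: cached predictions while the 5-minute TTL is live
preds = requests.post(f"{BASE}/predict_from_artifact", json={
    "artifact_id": train["artifact_id"],
    "X": [[5.0, 3.4, 1.5, 0.2]],
}, timeout=30).json()
print(preds["predictions"])

# 3. Slow path: keep train["model_artifact_joblib"] and use it anywhere
```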
## ⚡ Quick Start

### For AI Agents (MCP)

Add to your MCP settings (e.g., `.cursor/mcp.json`):

```json
{
  "mcpServers": {
    "warpgbm": {
      "url": "https://warpgbm.ai/mcp/sse"
    }
  }
}
```
### For Developers (REST API)

```bash
# 1. Train a model
curl -X POST https://warpgbm.ai/train \
  -H "Content-Type: application/json" \
  -d '{
    "X": [[5.1,3.5,1.4,0.2], [6.7,3.1,4.4,1.4], ...],
    "y": [0, 1, 2, ...],
    "model_type": "warpgbm",
    "objective": "multiclass"
  }'

# Response includes artifact_id for fast predictions
# {"artifact_id": "abc-123", "model_artifact_joblib": "H4sIA..."}

# 2. Make fast predictions (cached, <100ms)
curl -X POST https://warpgbm.ai/predict_from_artifact \
  -H "Content-Type: application/json" \
  -d '{
    "artifact_id": "abc-123",
    "X": [[5.0,3.4,1.5,0.2]]
  }'
```
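The `model_artifact_joblib` string is yours to keep. Here is a hedged sketch of loading it locally: the `H4sIA...` prefix in the examples is the gzip magic bytes in base64, so this assumes the artifact is a base64-encoded, gzip-compressed joblib dump (verify the exact packaging against the API docs):

```python
import base64
import gzip
import io

import joblib  # pip install joblib

def load_artifact(model_artifact_joblib: str):
    """Decode a portable artifact into a live model object (packaging assumed)."""
    raw = base64.b64decode(model_artifact_joblib)  # undo base64 encoding
    blob = gzip.decompress(raw)                    # undo gzip compression
    return joblib.load(io.BytesIO(blob))           # deserialize with joblib

# model = load_artifact(response["model_artifact_joblib"])
# model.predict([[5.0, 3.4, 1.5, 0.2]])  # no cloud round-trip needed
```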
## 🚀 Key Features
| Feature | Description |
|---|---|
| 🎯 Multi-Model | WarpGBM (GPU) + LightGBM (CPU) |
| ⚡ Smart Caching | artifact_id → 5min cache → <100ms inference |
| 📦 Portable Artifacts | Download joblib models, use anywhere |
| 🤖 MCP Native | Direct tool integration for AI agents |
| 💰 X402 Payments | Optional micropayments (Base network) |
| 🔒 Stateless | No data storage, you own your models |
| 🌐 Production Ready | Deployed on Modal with custom domain |
## 🐍 Python Package vs MCP Service
This repo is the MCP service wrapper. For production ML workflows, consider using the WarpGBM Python package directly:
| Feature | MCP Service (This Repo) | Python Package |
|---|---|---|
| Installation | None needed | pip install git+https://... |
| GPU | Cloud (pay-per-use) | Your GPU (free) |
| Control | REST API parameters | Full Python API |
| Features | Train, predict, upload | + Cross-validation, callbacks, feature importance |
| Best For | Quick experiments, demos | Production pipelines, research |
| Cost | $0.01 per training | Free (your hardware) |
Use this MCP service for: Quick tests, prototyping, agents without local GPU
Use Python package for: Production ML, research, cost savings, full control
## 📡 Available Endpoints

### Core Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | `/models` | List available model backends |
| POST | `/train` | Train model, get `artifact_id` + model |
| POST | `/predict_from_artifact` | Fast predictions (`artifact_id` or model) |
| POST | `/predict_proba_from_artifact` | Probability predictions |
| POST | `/upload_data` | Upload CSV/Parquet for training |
| POST | `/feedback` | Submit feedback to improve service |
| GET | `/healthz` | Health check with GPU status |
### MCP Integration

| Method | Endpoint | Description |
|---|---|---|
| SSE | `/mcp/sse` | MCP Server-Sent Events endpoint |
| GET | `/.well-known/mcp.json` | MCP capability manifest |
| GET | `/.well-known/x402` | X402 pricing manifest |
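A quick way to probe the discovery endpoints before training; the paths come from the tables above, and the printed response shapes are illustrative rather than a schema guarantee:

```python
import requests

BASE = "https://warpgbm.ai"

print(requests.get(f"{BASE}/models").json())                # available model backends
print(requests.get(f"{BASE}/healthz").json())               # health check + GPU status
print(requests.get(f"{BASE}/.well-known/mcp.json").json())  # MCP capability manifest
```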
## 💡 Complete Example: Iris Dataset

```bash
# 1. Train WarpGBM on Iris (60 samples recommended for proper binning)
curl -X POST https://warpgbm.ai/train \
  -H "Content-Type: application/json" \
  -d '{
    "X": [[5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
          [7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
          [6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
          [7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
          [5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
          [7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
          [6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
          [7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5],
          [5.1,3.5,1.4,0.2], [4.9,3,1.4,0.2], [4.7,3.2,1.3,0.2], [4.6,3.1,1.5,0.2], [5,3.6,1.4,0.2],
          [7,3.2,4.7,1.4], [6.4,3.2,4.5,1.5], [6.9,3.1,4.9,1.5], [5.5,2.3,4,1.3], [6.5,2.8,4.6,1.5],
          [6.3,3.3,6,2.5], [5.8,2.7,5.1,1.9], [7.1,3,5.9,2.1], [6.3,2.9,5.6,1.8], [6.5,3,5.8,2.2],
          [7.6,3,6.6,2.1], [4.9,2.5,4.5,1.7], [7.3,2.9,6.3,1.8], [6.7,2.5,5.8,1.8], [7.2,3.6,6.1,2.5]],
    "y": [0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
          0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2,
          0,0,0,0,0, 1,1,1,1,1, 2,2,2,2,2,2,2,2,2,2],
    "model_type": "warpgbm",
    "objective": "multiclass",
    "n_estimators": 100
  }'
```

Response:

```json
{
  "artifact_id": "abc123-def456-ghi789",
  "model_artifact_joblib": "H4sIA...",
  "training_time_seconds": 0.0
}
```

```bash
# 2. Fast inference with cached artifact_id (<100ms)
curl -X POST https://warpgbm.ai/predict_from_artifact \
  -H "Content-Type: application/json" \
  -d '{
    "artifact_id": "abc123-def456-ghi789",
    "X": [[5,3.4,1.5,0.2], [6.7,3.1,4.4,1.4], [7.7,3.8,6.7,2.2]]
  }'

# Response: {"predictions": [0, 1, 2], "inference_time_seconds": 0.05}
# Perfect classification! ✨
```
**⚠️ Important:** WarpGBM uses quantile binning, which requires 60+ samples for proper training. With fewer samples, the model can't learn proper decision boundaries.
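Rather than hand-typing rows, you can send the full 150-sample Iris dataset and comfortably clear the 60-sample floor. A sketch using scikit-learn's bundled copy of the dataset:

```python
import requests
from sklearn.datasets import load_iris  # pip install scikit-learn

iris = load_iris()
resp = requests.post("https://warpgbm.ai/train", json={
    "X": iris.data.tolist(),    # 150 samples x 4 features
    "y": iris.target.tolist(),  # 3 balanced classes, 50 each
    "model_type": "warpgbm",
    "objective": "multiclass",
    "n_estimators": 100,
}, timeout=120).json()
print(resp["artifact_id"], resp["training_time_seconds"])
```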
## 🏠 Self-Hosting

### Local Development

```bash
# Clone repo
git clone https://github.com/jefferythewind/mcp-warpgbm.git
cd mcp-warpgbm

# Setup environment
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Run locally (GPU optional for dev)
uvicorn local_dev:app --host 0.0.0.0 --port 8000 --reload

# Test
curl http://localhost:8000/healthz
```
### Deploy to Modal (Production)

```bash
# Install Modal
pip install modal

# Authenticate
modal token new

# Deploy
modal deploy modal_app.py

# Service will be live at your Modal URL
```
### Deploy to Other Platforms

```bash
# Docker (requires GPU)
docker build -t warpgbm-mcp .
docker run --gpus all -p 8000:8000 warpgbm-mcp

# Fly.io, Railway, Render, etc.
# See their respective GPU deployment docs
```
## 🧪 Testing

```bash
# Install dev dependencies
pip install -r requirements-dev.txt

# Run all tests
./run_tests.sh

# Or use pytest directly
pytest tests/ -v

# Test specific functionality
pytest tests/test_train.py -v
pytest tests/test_integration.py -v
```
## 📦 Project Structure

```
mcp-warpgbm/
├── app/
│   ├── main.py               # FastAPI app + routes
│   ├── mcp_sse.py            # MCP Server-Sent Events
│   ├── model_registry.py     # Model backend registry
│   ├── models.py             # Pydantic schemas
│   ├── utils.py              # Serialization, caching
│   ├── x402.py               # Payment verification
│   └── feedback_storage.py   # Feedback persistence
├── .well-known/
│   ├── mcp.json              # MCP capability manifest
│   └── x402                  # X402 pricing manifest
├── docs/
│   ├── AGENT_GUIDE.md        # Comprehensive agent docs
│   ├── MODEL_SUPPORT.md      # Model parameter reference
│   └── WARPGBM_PYTHON_GUIDE.md
├── tests/
│   ├── test_train.py
│   ├── test_predict.py
│   ├── test_integration.py
│   └── conftest.py
├── examples/
│   ├── simple_train.py
│   └── compare_models.py
├── modal_app.py              # Modal deployment config
├── local_dev.py              # Local dev server
├── requirements.txt
└── README.md
```
## 💰 Pricing (X402)

Optional micropayments on Base network:

| Endpoint | Price | Description |
|---|---|---|
| `/train` | $0.01 | Train model on GPU, get artifacts |
| `/predict_from_artifact` | $0.001 | Batch predictions |
| `/predict_proba_from_artifact` | $0.001 | Probability predictions |
| `/feedback` | Free | Help us improve! |

**Note:** Payment is optional for demo/testing. See `/.well-known/x402` for details.
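Agents can check prices programmatically before deciding whether to attach a payment. The manifest path comes from the table above; its JSON shape is not specified in this README, so this sketch just fetches and prints it:

```python
import requests

# Fetch the X402 pricing manifest and inspect per-endpoint prices
pricing = requests.get("https://warpgbm.ai/.well-known/x402").json()
print(pricing)
```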
## 🔐 Security & Privacy

- ✅ **Stateless**: No training data or models persisted
- ✅ **Sandboxed**: Runs in temporary isolated directories
- ✅ **Size Limited**: Max 50 MB request payload
- ✅ **No Code Execution**: Only structured JSON parameters
- ✅ **Rate Limited**: Per-IP throttling to prevent abuse
- ✅ **Read-Only FS**: Modal deployment uses immutable filesystem
## 🌍 Available Models

### 🚀 WarpGBM (GPU)

- **Acceleration**: NVIDIA A10G GPUs
- **Speed**: 13× faster than LightGBM
- **Best For**: Time-series, financial modeling, temporal data
- **Special**: Era-aware splitting, invariant learning
- **Min Samples**: 60+ recommended

### ⚡ LightGBM (CPU)

- **Acceleration**: Highly optimized CPU
- **Speed**: 10-100× faster than sklearn
- **Best For**: General tabular data, large datasets
- **Special**: Categorical features, low memory
- **Min Samples**: 20+
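Switching backends is just the `model_type` field in the same `/train` call. The `"lightgbm"` identifier below is an assumption based on the backends listed above; confirm the exact values with `GET /models`:

```python
import requests

X = [[float(i), float(i % 7)] for i in range(40)]  # toy features, 40 rows
y = [i % 2 for i in range(40)]                     # toy labels, 2 classes

resp = requests.post("https://warpgbm.ai/train", json={
    "X": X,
    "y": y,
    "model_type": "lightgbm",   # CPU backend (identifier assumed; see /models)
    "objective": "multiclass",
}, timeout=120).json()
print(resp["artifact_id"])
```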
## 🗺️ Roadmap

- [x] Core training + inference endpoints
- [x] Smart artifact caching (5min TTL)
- [x] MCP Server-Sent Events integration
- [x] X402 payment verification
- [x] Modal deployment with GPU
- [x] Custom domain (warpgbm.ai)
- [x] Smithery marketplace listing
- [ ] ONNX export support
- [ ] Async job queue for large datasets
- [ ] S3/IPFS dataset URL support
- [ ] Python client library (`warpgbm-client`)
- [ ] Additional model backends (XGBoost, CatBoost)
## 💬 Feedback & Support
Help us make this service better for AI agents!
Submit feedback about:
- Missing features that would unlock new use cases
- Confusing documentation or error messages
- Performance issues or timeout problems
- Additional model types you'd like to see
```bash
# Via API
curl -X POST https://warpgbm.ai/feedback \
  -H "Content-Type: application/json" \
  -d '{
    "feedback_type": "feature_request",
    "message": "Add support for XGBoost backend",
    "severity": "medium"
  }'
```
Or via:
- GitHub Issues: mcp-warpgbm/issues
- GitHub Discussions: warpgbm/discussions
- Email: support@warpgbm.ai
## 📚 Learn More
- 🐍 WarpGBM Python Package - The core library (91+ ⭐)
- 🤖 Agent Guide - Complete usage guide for AI agents
- 📖 API Docs - Interactive OpenAPI documentation
- 🔌 Model Context Protocol - MCP specification
- 💰 X402 Specification - Payment protocol for agents
- ☁️ Modal Docs - Serverless GPU platform
## 📄 License
GPL-3.0 (same as WarpGBM core)
This ensures improvements to the MCP wrapper benefit the community, while allowing commercial use through the cloud service.
## 🙏 Credits
Built with:
- WarpGBM - GPU-accelerated GBDT library
- Modal - Serverless GPU infrastructure
- FastAPI - Modern Python web framework
- LightGBM - Microsoft's GBDT library
<div align="center">
Built with ❤️ for the open agent economy
⭐ Star on GitHub • 🚀 Try Live Service • 📖 Read the Docs
</div>