ByteBot MCP Server
Enables autonomous task execution and direct desktop computer control through ByteBot's dual-API architecture, supporting intelligent hybrid workflows with mouse/keyboard operations, screen capture, file I/O, and automatic intervention handling.
Production-grade Model Context Protocol (MCP) server for ByteBot's dual-API architecture, providing intelligent hybrid workflow orchestration for autonomous task execution and desktop computer control.
Overview
This MCP server integrates ByteBot's Agent API (task management) and Desktop API (computer control) into a unified interface for AI assistants like Claude. It enables:
- Autonomous Task Execution: Create and manage tasks for ByteBot to execute independently
- Direct Computer Control: Mouse, keyboard, screen capture, and file operations
- Hybrid Workflows: Intelligent orchestration with automatic monitoring and intervention handling
- Real-time Updates: Optional WebSocket support for live task status notifications
Features
Agent API Tools (Task Management)
- bytebot_create_task: Create new tasks with priority levels
- bytebot_list_tasks: List and filter tasks by status/priority
- bytebot_get_task: Get detailed task information with message history
- bytebot_get_in_progress_task: Check the currently running task
- bytebot_update_task: Update task status or priority
- bytebot_delete_task: Delete tasks
Desktop API Tools (Computer Control)
Mouse Operations:
- bytebot_move_mouse: Move cursor to coordinates
- bytebot_click: Click with left/right/middle button
- bytebot_drag: Drag from one position to another
- bytebot_scroll: Scroll in any direction
Keyboard Operations:
- bytebot_type_text: Type text strings
- bytebot_paste_text: Paste text (for special characters)
- bytebot_press_keys: Keyboard shortcuts (Ctrl+C, Alt+Tab, etc.)
Screen Operations:
- bytebot_screenshot: Capture screen as base64 PNG
- bytebot_cursor_position: Get current cursor position
File I/O:
- bytebot_read_file: Read file content (base64)
- bytebot_write_file: Write file content (base64)
System:
- bytebot_switch_application: Switch to an application
- bytebot_wait: Wait for a specified duration
Hybrid Orchestration Tools (Priority 1)
- bytebot_create_and_monitor_task: Create a task and wait for completion
- bytebot_monitor_task: Monitor an existing task until it reaches a terminal state
- bytebot_intervene_in_task: Provide help when a task needs intervention
- bytebot_execute_workflow: Multi-step workflow with automatic error recovery
Prerequisites
- Node.js: 20.x or higher
- ByteBot Instance: Running and accessible at configured endpoints
  - Agent API (default: http://localhost:9991)
  - Desktop API (default: http://localhost:9990)
Installation
# Clone or download this repository
cd bytebot-mcp-server
# Install dependencies
npm install
# Build TypeScript code
npm run build
Configuration
1. Create Environment File
Copy the example environment file and customize:
cp .env.example .env
2. Edit .env File
# ByteBot Agent API (Task Management)
BYTEBOT_AGENT_URL=http://localhost:9991
# ByteBot Desktop API (Computer Control)
BYTEBOT_DESKTOP_URL=http://localhost:9990
# WebSocket Configuration (Optional)
BYTEBOT_WS_URL=ws://localhost:9991
ENABLE_WEBSOCKET=false
# Server Configuration
MCP_SERVER_NAME=bytebot-mcp
# Timeouts (milliseconds)
REQUEST_TIMEOUT=30000
DESKTOP_ACTION_TIMEOUT=10000
# Retry Configuration
MAX_RETRIES=3
RETRY_DELAY=1000
# Monitoring Configuration
TASK_POLL_INTERVAL=2000
TASK_MONITOR_TIMEOUT=300000
# File Configuration
MAX_FILE_SIZE=10485760
# Logging
LOG_LEVEL=info
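The following is a minimal sketch of how these variables could be read into a typed configuration object with the defaults shown above. The names `ByteBotConfig`, `loadConfig`, and the field names are hypothetical and chosen for illustration; the server's actual loader may be structured differently.

```typescript
// Hypothetical typed view of the .env settings (field names are illustrative).
interface ByteBotConfig {
  agentUrl: string;
  desktopUrl: string;
  wsUrl: string;
  enableWebSocket: boolean;
  requestTimeoutMs: number;
  desktopActionTimeoutMs: number;
  maxRetries: number;
  retryDelayMs: number;
  taskPollIntervalMs: number;
  taskMonitorTimeoutMs: number;
  maxFileSizeBytes: number;
  logLevel: string;
}

type Env = Record<string, string | undefined>;

// Return the variable's value, or the fallback when it is unset or empty.
function strFromEnv(env: Env, key: string, fallback: string): string {
  const raw = env[key];
  return raw === undefined || raw.trim() === "" ? fallback : raw;
}

// Parse a non-negative integer, falling back when unset or malformed.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const value = Number(strFromEnv(env, key, String(fallback)));
  return Number.isInteger(value) && value >= 0 ? value : fallback;
}

function loadConfig(env: Env): ByteBotConfig {
  return {
    agentUrl: strFromEnv(env, "BYTEBOT_AGENT_URL", "http://localhost:9991"),
    desktopUrl: strFromEnv(env, "BYTEBOT_DESKTOP_URL", "http://localhost:9990"),
    wsUrl: strFromEnv(env, "BYTEBOT_WS_URL", "ws://localhost:9991"),
    enableWebSocket: strFromEnv(env, "ENABLE_WEBSOCKET", "false") === "true",
    requestTimeoutMs: intFromEnv(env, "REQUEST_TIMEOUT", 30000),
    desktopActionTimeoutMs: intFromEnv(env, "DESKTOP_ACTION_TIMEOUT", 10000),
    maxRetries: intFromEnv(env, "MAX_RETRIES", 3),
    retryDelayMs: intFromEnv(env, "RETRY_DELAY", 1000),
    taskPollIntervalMs: intFromEnv(env, "TASK_POLL_INTERVAL", 2000),
    taskMonitorTimeoutMs: intFromEnv(env, "TASK_MONITOR_TIMEOUT", 300000),
    maxFileSizeBytes: intFromEnv(env, "MAX_FILE_SIZE", 10485760),
    logLevel: strFromEnv(env, "LOG_LEVEL", "info"),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`. Falling back to the default on a malformed number is a design choice of this sketch; failing fast at startup is an equally reasonable alternative.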
3. Remote ByteBot Configuration
If ByteBot is running on a remote server:
BYTEBOT_AGENT_URL=http://your-server.com:9991
BYTEBOT_DESKTOP_URL=http://your-server.com:9990
BYTEBOT_WS_URL=ws://your-server.com:9991
MCP Client Setup
Claude Desktop
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"bytebot": {
"command": "node",
"args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"],
"env": {
"BYTEBOT_AGENT_URL": "http://localhost:9991",
"BYTEBOT_DESKTOP_URL": "http://localhost:9990"
}
}
}
}
Zed Editor
Add to your Zed settings:
{
"context_servers": {
"bytebot": {
"command": {
"path": "node",
"args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"]
},
"env": {
"BYTEBOT_AGENT_URL": "http://localhost:9991",
"BYTEBOT_DESKTOP_URL": "http://localhost:9990"
}
}
}
}
Continue.dev
Add to .continue/config.json:
{
"mcpServers": [
{
"name": "bytebot",
"command": "node",
"args": ["/absolute/path/to/bytebot-mcp-server/dist/index.js"],
"env": {
"BYTEBOT_AGENT_URL": "http://localhost:9991",
"BYTEBOT_DESKTOP_URL": "http://localhost:9990"
}
}
]
}
Usage Examples
Example 1: Basic Task Creation
User: Create a task for ByteBot to search Wikipedia for "quantum computing"
Claude uses: bytebot_create_task
{
"description": "Go to wikipedia.org and search for 'quantum computing'",
"priority": "MEDIUM"
}
Response:
{
"id": "task-123",
"status": "PENDING",
"priority": "MEDIUM",
"createdAt": "2024-01-15T10:30:00Z"
}
Example 2: Hybrid Workflow (Create → Monitor → Complete)
User: Create a task to log into example.com and wait for it to complete
Claude uses: bytebot_create_and_monitor_task
{
"description": "Navigate to example.com and log in with credentials from keychain",
"timeout": 60000,
"pollInterval": 2000
}
Response:
{
"taskId": "task-456",
"finalStatus": "COMPLETED",
"completedAt": "2024-01-15T10:31:45Z",
"messagesCount": 12,
"task": { ... full task details ... }
}
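The create-then-monitor pattern in this example can be sketched as a polling loop. This is an illustration under stated assumptions, not the server's implementation: `getTask` is a hypothetical injected function that fetches a task by ID, and the loop is assumed to stop on any status other than PENDING or IN_PROGRESS (Example 3 in this README reports NEEDS_HELP as a `finalStatus`, which suggests monitoring also stops there).

```typescript
interface TaskSnapshot {
  id: string;
  status: string;
}

// Assumption: only these two states mean "keep polling".
const ACTIVE_STATUSES = new Set(["PENDING", "IN_PROGRESS"]);

function shouldStopMonitoring(status: string): boolean {
  return !ACTIVE_STATUSES.has(status);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll until the task leaves the active states or the timeout elapses.
// Defaults mirror TASK_MONITOR_TIMEOUT and TASK_POLL_INTERVAL.
async function monitorTask(
  getTask: (id: string) => Promise<TaskSnapshot>, // hypothetical fetcher
  taskId: string,
  timeoutMs: number = 300000,
  pollIntervalMs: number = 2000,
): Promise<TaskSnapshot> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const task = await getTask(taskId);
    if (shouldStopMonitoring(task.status)) return task;
    // Give up if the next poll would land past the deadline.
    if (Date.now() + pollIntervalMs > deadline) {
      throw new Error(`Task ${taskId} still ${task.status} after ${timeoutMs} ms`);
    }
    await sleep(pollIntervalMs);
  }
}
```

Injecting `getTask` keeps the loop independent of any particular HTTP client and makes it easy to exercise with a stub.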
Example 3: Task Needs Intervention
User: Create a task to fill out a complex form
Claude uses: bytebot_create_and_monitor_task
{
"description": "Fill out the registration form at example.com/register"
}
Response (after monitoring):
{
"taskId": "task-789",
"finalStatus": "NEEDS_HELP",
"task": {
"id": "task-789",
"status": "NEEDS_HELP",
"messages": [
{
"role": "assistant",
"content": "I need the user's phone number to complete this form"
}
]
}
}
User: My phone number is 555-1234
Claude uses: bytebot_intervene_in_task
{
"taskId": "task-789",
"message": "User's phone number is 555-1234",
"action": "resume",
"continueMonitoring": true
}
Response:
{
"taskId": "task-789",
"status": "COMPLETED",
"intervention": "applied"
}
Example 4: Interactive Desktop Control
User: Take a screenshot and click at position (500, 300)
Claude uses: bytebot_screenshot
Response: { "screenshot": "iVBORw0KG..." }
Claude uses: bytebot_click
{
"x": 500,
"y": 300,
"button": "left"
}
Response: ✓ bytebot_click completed successfully
Example 5: Multi-Step Workflow
User: Execute a workflow to open Firefox, navigate to GitHub, and take a screenshot
Claude uses: bytebot_execute_workflow
{
"steps": [
{
"name": "Open Firefox",
"description": "Switch to Firefox browser application"
},
{
"name": "Navigate to GitHub",
"description": "Navigate to github.com in the browser"
},
{
"name": "Take Screenshot",
"description": "Capture a screenshot of the GitHub homepage"
}
],
"priority": "HIGH"
}
Response:
{
"steps": [
{ "name": "Open Firefox", "taskId": "task-001", "status": "COMPLETED" },
{ "name": "Navigate to GitHub", "taskId": "task-002", "status": "COMPLETED" },
{ "name": "Take Screenshot", "taskId": "task-003", "status": "COMPLETED" }
],
"overallStatus": "completed",
"totalInterventions": 0
}
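The response shape above suggests that each step becomes its own task and that steps run in order. A minimal sketch of that sequencing follows. It assumes a hypothetical `runStep` callback that creates a task for a step and resolves with its final state, and it assumes a `"failed"` value for `overallStatus`, which this README does not show. The automatic error recovery and intervention counting are deliberately left out.

```typescript
interface WorkflowStep {
  name: string;
  description: string;
}

interface StepResult {
  name: string;
  taskId: string;
  status: string;
}

interface WorkflowResult {
  steps: StepResult[];
  overallStatus: "completed" | "failed"; // "failed" is an assumed value
}

// Run steps one at a time and stop at the first step that does not complete.
async function executeWorkflow(
  steps: WorkflowStep[],
  runStep: (step: WorkflowStep) => Promise<{ taskId: string; status: string }>,
): Promise<WorkflowResult> {
  const results: StepResult[] = [];
  for (const step of steps) {
    const { taskId, status } = await runStep(step);
    results.push({ name: step.name, taskId, status });
    if (status !== "COMPLETED") {
      return { steps: results, overallStatus: "failed" };
    }
  }
  return { steps: results, overallStatus: "completed" };
}
```

Stopping at the first non-completed step is one possible policy; a real orchestrator might instead retry the step or request intervention before giving up.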
Example 6: File Operations
User: Read the contents of /home/user/data.txt
Claude uses: bytebot_read_file
{
"path": "/home/user/data.txt"
}
Response: { "content": "SGVsbG8gV29ybGQh..." } // Base64 encoded
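Because file content travels as base64, a client has to decode it before use and encode it before writing. The sketch below uses Node.js's `Buffer` for both directions and applies a size cap matching the `MAX_FILE_SIZE` default. The function names are hypothetical, and the sample string is a complete base64 value chosen for illustration rather than the truncated one in the response above.

```typescript
import { Buffer } from "node:buffer";

// Default from MAX_FILE_SIZE in .env (10 MB).
const DEFAULT_MAX_FILE_SIZE = 10485760;

// Decode a base64 `content` field and reject payloads above the size cap.
function decodeFileContent(
  contentBase64: string,
  maxBytes: number = DEFAULT_MAX_FILE_SIZE,
): Buffer {
  const bytes = Buffer.from(contentBase64, "base64");
  if (bytes.length > maxBytes) {
    throw new Error(`File size ${bytes.length} exceeds maximum allowed size ${maxBytes}`);
  }
  return bytes;
}

// Encode UTF-8 text as base64 for a write request.
function encodeFileContent(text: string): string {
  return Buffer.from(text, "utf8").toString("base64");
}

// Round trip with an illustrative payload.
const sample = encodeFileContent("Hello World!"); // "SGVsbG8gV29ybGQh"
console.log(decodeFileContent(sample).toString("utf8")); // prints "Hello World!"
```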
Troubleshooting
Error: "Cannot connect to ByteBot server"
Cause: ByteBot is not running or endpoint URL is incorrect
Solution:
- Verify ByteBot is running: curl http://localhost:9991/tasks
- Check that the .env file has the correct URLs
- Ensure no firewall is blocking connections
Error: "Request to ByteBot timed out"
Cause: Task took longer than configured timeout
Solution:
- Increase REQUEST_TIMEOUT in .env for Agent API calls
- Increase DESKTOP_ACTION_TIMEOUT for Desktop API calls
- Use bytebot_create_and_monitor_task with a custom timeout: { "description": "Long running task", "timeout": 600000 }
Error: "Task with ID xyz not found"
Cause: Task was deleted or ID is incorrect
Solution:
- List all tasks: bytebot_list_tasks
- Verify the task ID from the response
- Check if task was accidentally deleted
Warning: "Screenshot size is 8.5MB"
Cause: Screenshot is very large (high resolution display)
Solution:
- This is only a warning; the screenshot is still captured
- Consider reducing screen resolution if you capture screenshots frequently
- Screenshots larger than 5MB trigger this warning
Error: "Task must be in NEEDS_HELP state"
Cause: Attempting to intervene in task that doesn't need help
Solution:
- Check task status first: bytebot_get_task
- Only use bytebot_intervene_in_task when status is NEEDS_HELP
- Use bytebot_update_task to manually change status if needed
WebSocket Connection Failed
Cause: WebSocket URL incorrect or ByteBot doesn't support WebSocket
Solution:
- Set ENABLE_WEBSOCKET=false in .env to disable WebSocket
- The server will automatically fall back to HTTP polling
- WebSocket is optional; all features work without it
Error: "File size exceeds maximum allowed size"
Cause: Trying to upload/read file larger than 10MB
Solution:
- Increase MAX_FILE_SIZE in .env (in bytes)
- Split large files into smaller chunks
- Compress files before uploading
API Reference
Task Priority Levels
- LOW: Background tasks, non-urgent
- MEDIUM: Default priority (recommended)
- HIGH: Important tasks, process soon
- URGENT: Critical tasks, process immediately
Task Lifecycle States
- PENDING: Task created, waiting to start
- IN_PROGRESS: Task currently executing
- NEEDS_HELP: Task blocked, requires intervention
- NEEDS_REVIEW: Task complete but needs verification
- COMPLETED: Task finished successfully
- CANCELLED: Task cancelled by user
- FAILED: Task failed with error
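The lifecycle states above map naturally onto a TypeScript union type with small guard functions. The sketch below is illustrative: the helper names are hypothetical, `canIntervene` encodes the "Task must be in NEEDS_HELP state" rule from Troubleshooting, and treating COMPLETED, CANCELLED, and FAILED as the only finished states is an assumption.

```typescript
const TASK_STATUSES = [
  "PENDING",
  "IN_PROGRESS",
  "NEEDS_HELP",
  "NEEDS_REVIEW",
  "COMPLETED",
  "CANCELLED",
  "FAILED",
] as const;

type TaskStatus = (typeof TASK_STATUSES)[number];

// Narrow an untrusted value (for example, from an API response) to TaskStatus.
function isTaskStatus(value: unknown): value is TaskStatus {
  return typeof value === "string" && (TASK_STATUSES as readonly string[]).includes(value);
}

// Intervention is only accepted while the task is blocked.
function canIntervene(status: TaskStatus): boolean {
  return status === "NEEDS_HELP";
}

// Assumption: these three states can no longer change.
function isFinished(status: TaskStatus): boolean {
  return status === "COMPLETED" || status === "CANCELLED" || status === "FAILED";
}
```

Checking `canIntervene` before calling bytebot_intervene_in_task avoids a round trip that would otherwise end in the NEEDS_HELP error.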
Mouse Buttons
- left: Primary button (default)
- right: Context menu button
- middle: Scroll wheel click
Scroll Directions
- up: Scroll up
- down: Scroll down
- left: Scroll left
- right: Scroll right
Common Applications
- firefox: Mozilla Firefox
- chrome: Google Chrome
- safari: Safari (macOS)
- terminal: Terminal/Command Prompt
- vscode: Visual Studio Code
Architecture
┌─────────────────────────────────────────────┐
│             MCP Client (Claude)             │
└─────────────────┬───────────────────────────┘
                  │ stdio transport
┌─────────────────▼───────────────────────────┐
│             ByteBot MCP Server              │
│ ┌─────────────────────────────────────────┐ │
│ │    Agent Tools     │   Desktop Tools    │ │
│ │           Hybrid Orchestrator           │ │
│ └───────────┬─────────────────┬───────────┘ │
└─────────────┼─────────────────┼─────────────┘
              │                 │
       ┌──────▼──────┐   ┌──────▼──────┐
       │  Agent API  │   │ Desktop API │
       │ (port 9991) │   │ (port 9990) │
       └─────────────┘   └─────────────┘
              │                 │
       ┌──────▼─────────────────▼──────┐
       │       ByteBot Instance        │
       └───────────────────────────────┘
Development
Build
npm run build
Type Check
npm run type-check
Watch Mode
npm run dev
Environment Variables Reference
| Variable | Default | Description |
|---|---|---|
| BYTEBOT_AGENT_URL | http://localhost:9991 | ByteBot Agent API endpoint |
| BYTEBOT_DESKTOP_URL | http://localhost:9990 | ByteBot Desktop API endpoint |
| BYTEBOT_WS_URL | ws://localhost:9991 | WebSocket endpoint for real-time updates |
| ENABLE_WEBSOCKET | false | Enable WebSocket connections |
| MCP_SERVER_NAME | bytebot-mcp | Server identifier |
| REQUEST_TIMEOUT | 30000 | HTTP request timeout (ms) |
| DESKTOP_ACTION_TIMEOUT | 10000 | Desktop action timeout (ms) |
| MAX_RETRIES | 3 | Maximum retry attempts for failed requests |
| RETRY_DELAY | 1000 | Initial retry delay (ms) |
| TASK_POLL_INTERVAL | 2000 | Task status polling interval (ms) |
| TASK_MONITOR_TIMEOUT | 300000 | Maximum task monitoring duration (ms) |
| MAX_FILE_SIZE | 10485760 | Maximum file size in bytes (10MB) |
| LOG_LEVEL | info | Logging level (debug/info/warn/error) |
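RETRY_DELAY is described as the initial delay, which suggests the wait grows between attempts. The sketch below shows one common way MAX_RETRIES and RETRY_DELAY could interact. It assumes the delay doubles after each failure (exponential backoff), which this README does not confirm, and `retryDelays` and `withRetry` are hypothetical helper names.

```typescript
// Assumption: the delay doubles after every failed attempt.
function retryDelays(maxRetries: number = 3, initialDelayMs: number = 1000): number[] {
  return Array.from({ length: maxRetries }, (_, attempt) => initialDelayMs * 2 ** attempt);
}

// Run the operation once, then retry up to maxRetries times on failure.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxRetries: number = 3,
  initialDelayMs: number = 1000,
): Promise<T> {
  const delays = retryDelays(maxRetries, initialDelayMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // All retries used: surface the last error to the caller.
      if (attempt >= delays.length) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, delays[attempt]));
    }
  }
}
```

With the defaults, the waits under this assumption would be 1000 ms, 2000 ms, and 4000 ms, for at most four attempts in total.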
License
MIT
Support
For issues and questions:
- ByteBot Documentation: https://docs.bytebot.ai
- MCP Specification: https://modelcontextprotocol.io
- Report issues: Create an issue in this repository
Version History
1.0.0 (2024-01-15)
- Initial release
- Agent API integration (task management)
- Desktop API integration (computer control)
- Hybrid orchestration tools
- WebSocket support for real-time updates
- Comprehensive error handling and retry logic
- Full TypeScript implementation with strict typing