mac-use-mcp
Zero-dependency macOS desktop automation for AI agents. Screenshot, mouse, keyboard, clipboard, and window control via MCP. 18 tools, macOS 13+, one command: npx mac-use-mcp.

> [!WARNING]
> This tool has full control over mouse, keyboard, and screen. Use it in a sandboxed environment to protect your privacy and avoid accidental data loss by your agents. You are responsible for any actions performed through this tool.
Zero-native-dependency macOS desktop automation via MCP.
Give AI agents eyes and hands on macOS — click, type, screenshot, and inspect any application.
Use Cases
- Automated UI testing — click buttons, verify element states with `get_ui_elements`, validate screen content via `screenshot`
- Desktop workflow automation — launch apps with `open_application`, fill forms with `type_text`, navigate menus via `click_menu`
- Screenshot-based monitoring — capture screen regions periodically with `screenshot` for visual diffing or alerting
- Accessibility inspection — query UI element trees with `get_ui_elements` for QA and compliance checks
- AI agent computer use — give LLMs eyes and hands on macOS via `screenshot`, `click`, `type_text`, and more
Why mac-use-mcp?
- Just works — `npx mac-use-mcp` and grant two macOS permissions. No node-gyp, no Xcode tools, no build step.
- 18 tools, one server — screenshots, clicks, keystrokes, window management, accessibility inspection, and clipboard.
- macOS 13+ on Intel and Apple Silicon — no native addons, no architecture headaches.
Install
Requirements: macOS 13+ and Node.js 22+. The server communicates over stdio transport.
This package only works on macOS. It will refuse to install on other operating systems.
No build steps. No native dependencies. Just run:
npx mac-use-mcp
`npx` will prompt to install the package on first run. Use `npx -y mac-use-mcp` to skip the confirmation.
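Under the hood, the stdio transport mentioned above exchanges newline-delimited JSON-RPC 2.0 messages on the server's stdin/stdout. Your MCP client constructs these for you; as a rough sketch of the wire format, a `tools/call` request for the `screenshot` tool might look like the following (the `{ mode: "full" }` argument shape is illustrative, not taken from the package's schema):

```typescript
// Sketch: the shape of an MCP "tools/call" request as framed over stdio.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): string {
  const req: JsonRpcRequest = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
  // Each message is a single JSON object terminated by a newline.
  return JSON.stringify(req) + "\n";
}

console.log(buildToolCall(1, "screenshot", { mode: "full" }).trimEnd());
```

You never need to build these frames by hand; this only shows why any stdio-capable MCP client can drive the server.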
> [!TIP]
> Model selection matters. Desktop automation involves screenshot–action loops that add up in token usage. A fast model with solid reasoning, good vision, and reliable tool calling is recommended:

| Model | Provider |
|---|---|
| Gemini 3 Flash | Google |
| Claude Sonnet 4.6 | Anthropic |
| GPT-5 mini | OpenAI |
| MiniMax-M2.5 | MiniMax |
| Kimi K2.5 | Moonshot AI |
| Qwen3.5 | Alibaba |
| GLM-4.7 | Zhipu AI |
Permission Setup
mac-use-mcp requires two macOS permissions to function. Grant them once and you're set.
Accessibility
Required for mouse and keyboard control.
- Open System Settings > Privacy & Security > Accessibility
- Click the + button
- Add your MCP client application (e.g., Claude Desktop, your terminal emulator)
- Ensure the toggle is enabled
Screen Recording
Required for screenshots.
- Open System Settings > Privacy & Security > Screen Recording
- Click the + button
- Add your MCP client application
- Ensure the toggle is enabled
- Restart the application if prompted
Verify permissions
After granting both permissions and configuring your MCP client (see next section), use the check_permissions tool to confirm everything is working:
> check_permissions
✓ Accessibility: granted
✓ Screen Recording: granted
MCP Client Configuration
<details open> <summary><strong>Claude Code</strong></summary>
claude mcp add mac-use-mcp -- npx mac-use-mcp
</details>
<details> <summary><strong>Claude Desktop</strong></summary>
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"mac-use-mcp": {
"command": "npx",
"args": ["mac-use-mcp"]
}
}
}
</details>
<details> <summary><strong>OpenAI Codex</strong></summary>
Add to ~/.codex/config.toml:
[mcp_servers.mac-use]
command = "npx"
args = ["-y", "mac-use-mcp"]
Or via CLI:
codex mcp add mac-use -- npx -y mac-use-mcp
</details>
<details> <summary><strong>Google Antigravity</strong></summary>
Add to ~/.gemini/antigravity/mcp_config.json:
{
"mcpServers": {
"mac-use-mcp": {
"command": "npx",
"args": ["mac-use-mcp"]
}
}
}
</details>
<details> <summary><strong>Gemini CLI</strong></summary>
Add to ~/.gemini/settings.json:
{
"mcpServers": {
"mac-use-mcp": {
"command": "npx",
"args": ["mac-use-mcp"]
}
}
}
</details>
<details> <summary><strong>VS Code / Copilot</strong></summary>
Add to .vscode/mcp.json in your workspace (or open the Command Palette and run MCP: Open User Configuration for global setup):
{
"servers": {
"mac-use-mcp": {
"command": "npx",
"args": ["mac-use-mcp"]
}
}
}
</details>
<details> <summary><strong>Cursor</strong></summary>
Add to ~/.cursor/mcp.json (global) or .cursor/mcp.json (project-level):
{
"mcpServers": {
"mac-use-mcp": {
"command": "npx",
"args": ["mac-use-mcp"]
}
}
}
</details>
<details> <summary><strong>Windsurf</strong></summary>
Add to ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"mac-use-mcp": {
"command": "npx",
"args": ["mac-use-mcp"]
}
}
}
</details>
<details> <summary><strong>Cline</strong></summary>
Open Cline's MCP settings (in the Cline extension panel, click the MCP servers icon), then add:
{
"mcpServers": {
"mac-use-mcp": {
"command": "npx",
"args": ["mac-use-mcp"]
}
}
}
</details>
<details> <summary><strong>Kiro</strong></summary>
Add to ~/.aws/amazonq/mcp.json:
{
"mcpServers": {
"mac-use-mcp": {
"command": "npx",
"args": ["mac-use-mcp"]
}
}
}
</details>
Tools
This Node.js MCP server exposes 18 tools for mouse, keyboard, and screen control to any MCP-compatible client.
Screen
| Tool | Description |
|---|---|
| `screenshot` | Capture the screen, a region, or a window by title (PNG or JPEG) |
| `get_screen_info` | Get display count, resolution, origin, and scale factor for each display |
Input
| Tool | Description |
|---|---|
| `click` | Click at screen coordinates with button, click count, and modifier options |
| `move_mouse` | Move the cursor to a position |
| `scroll` | Scroll up, down, left, or right at a position |
| `drag` | Drag from one point to another over a configurable duration |
| `type_text` | Type text at the cursor position (supports Unicode, CJK, and emoji) |
| `press_key` | Press a key or key combination (e.g., "cmd+c", "Return") |
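Unicode support in `type_text` is worth calling out because naive string iteration splits emoji and other astral-plane characters into broken surrogate halves. A minimal illustration of the difference (this is not the package's internal code, just the underlying JavaScript behavior):

```typescript
// String.prototype.split("") breaks surrogate pairs;
// Array.from iterates by Unicode code point instead.
function toCodePoints(text: string): string[] {
  return Array.from(text);
}

const naive = "héllo 🙂".split("");    // "🙂" becomes two broken halves
const safe = toCodePoints("héllo 🙂"); // "🙂" stays one unit
console.log(naive.length, safe.length); // → 8 7
```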
Window & App
| Tool | Description |
|---|---|
| `list_windows` | List all visible windows with positions and sizes |
| `focus_window` | Activate an app and bring a specific window to the front |
| `open_application` | Launch an application by name |
| `click_menu` | Click a menu bar item by path (e.g., "File > Save As...") |
App names support fuzzy matching — "chrome" resolves to "Google Chrome", "code" to "Code", etc.
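The README does not document the matching algorithm, but the resolutions above are consistent with case-insensitive exact-then-substring matching. A plausible sketch (the `resolveApp` helper and app list are illustrative, not the package's actual implementation):

```typescript
// Hypothetical fuzzy matcher: prefer an exact case-insensitive match,
// then fall back to a case-insensitive substring match.
function resolveApp(query: string, installed: string[]): string | undefined {
  const q = query.toLowerCase();
  return (
    installed.find((name) => name.toLowerCase() === q) ??
    installed.find((name) => name.toLowerCase().includes(q))
  );
}

const apps = ["Google Chrome", "Code", "Safari"];
console.log(resolveApp("chrome", apps)); // "Google Chrome"
console.log(resolveApp("code", apps));   // "Code"
```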
Accessibility
| Tool | Description |
|---|---|
| `get_ui_elements` | Query UI elements via Accessibility API — find buttons, text fields, and other controls by role or title |
Clipboard
| Tool | Description |
|---|---|
| `clipboard_read` | Read the current system clipboard as plain text |
| `clipboard_write` | Write text to the system clipboard |
Utility
| Tool | Description |
|---|---|
| `wait` | Pause for a specified duration (in milliseconds, 0–10,000) |
| `check_permissions` | Verify Accessibility and Screen Recording access |
| `get_cursor_position` | Get current cursor coordinates |
Examples
Common workflow patterns using mac-use-mcp tools:
Screenshot a specific window
1. focus_window({ app: "Safari" })
2. screenshot({ mode: "window", window_title: "Safari" })
Click a button in a dialog
1. get_ui_elements({ app: "Finder", role: "AXButton" })
→ finds "OK" button at position (500, 300)
2. click({ x: 500, y: 300 })
Automate a menu action
1. open_application({ name: "TextEdit" })
2. click_menu({ app: "TextEdit", path: "Format > Make Plain Text" })
Copy text between apps
1. focus_window({ app: "Safari" })
2. press_key({ key: "cmd+a" }) # select all
3. press_key({ key: "cmd+c" }) # copy
4. focus_window({ app: "Notes" })
5. press_key({ key: "cmd+v" }) # paste
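Sequences like the one above can also be driven from code. A sketch assuming a generic `callTool(name, args)` helper supplied by whatever MCP client library you use (the helper itself is hypothetical; only the tool names and arguments come from this README):

```typescript
// Drive the copy-between-apps workflow through a generic tool-call helper.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<unknown>;

async function copyBetween(
  callTool: CallTool,
  fromApp: string,
  toApp: string
): Promise<void> {
  await callTool("focus_window", { app: fromApp });
  await callTool("press_key", { key: "cmd+a" }); // select all
  await callTool("press_key", { key: "cmd+c" }); // copy
  await callTool("focus_window", { app: toApp });
  await callTool("press_key", { key: "cmd+v" }); // paste
}

// Example with a recording stub standing in for a real MCP client:
const calls: string[] = [];
const stub: CallTool = async (name) => { calls.push(name); };
copyBetween(stub, "Safari", "Notes").then(() => console.log(calls.join(" → ")));
```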
How It Works
- Swift binary handles mouse input (CGEvent), screen capture (CGWindowListCreateImage), window enumeration (CGWindowListCopyWindowInfo), and UI element queries (Accessibility API)
- AppleScript handles keyboard input (System Events `key code`), window focus, and menu clicks
- Node.js MCP server orchestrates everything over stdio, translating MCP tool calls into system operations
- No native Node.js addons — the Swift binary is pre-compiled and ships with the npm package
- Serial execution queue prevents race conditions between system operations
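A serial execution queue like the one described can be as simple as chaining each operation onto a single tail promise, so a task starts only after the previous one settles. A minimal sketch under that assumption (not the package's actual code):

```typescript
// Minimal serial queue: every enqueued task chains onto the tail promise,
// so system operations never overlap even when submitted concurrently.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    const next = this.tail.then(task, task); // run even if the previous task failed
    this.tail = next.catch(() => undefined); // keep the chain alive on errors
    return next;
  }
}

// Usage: concurrent submissions still execute strictly in order.
const queue = new SerialQueue();
const order: number[] = [];
queue.enqueue(async () => { order.push(1); });
queue.enqueue(async () => { order.push(2); });
queue.enqueue(async () => { order.push(3); }).then(() => console.log(order));
```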
Known Limitations
- Screen Recording prompt on Sequoia: macOS 15 shows a monthly system prompt asking to reconfirm Screen Recording access. This is an OS-level behavior and cannot be suppressed.
- Secure input fields: Password fields and other secure text inputs block synthetic keyboard events. This is a macOS security feature.
- Keyboard input on macOS 26+: CGEvent keyboard synthesis is silently blocked. Keyboard input uses AppleScript (System Events `key code`) as a workaround, which may behave differently in some edge cases.
- System dialogs: Some system-level dialogs (e.g., FileVault unlock, Login Window) cannot be interacted with programmatically due to macOS security restrictions.
- Headless / CI: Requires a graphical session. Headless macOS environments (e.g., standard GitHub Actions runners) are not supported.
Troubleshooting
<details> <summary><strong>Permission prompts keep appearing</strong></summary>
Grant Accessibility and Screen Recording permissions to your terminal app in System Settings > Privacy & Security. A restart of the terminal may be required.
</details>
<details> <summary><strong>macOS Sequoia permission dialogs</strong></summary>
macOS 15 (Sequoia) introduced stricter permission prompts. Allow the prompts when they appear. The check_permissions tool can verify your current permission status.
</details>
<details> <summary><strong>Secure input fields</strong></summary>
Some password fields and secure text inputs block programmatic key events. This is a macOS security feature. Use clipboard_write + press_key("cmd+v") as a workaround.
</details>
<details> <summary><strong>Screen recording shows black screenshots</strong></summary>
Ensure Screen Recording permission is granted to your terminal app (not just Accessibility). Restart the terminal after granting.
</details>
Related Projects
- Playwright MCP — Browser automation via accessibility tree. Complements mac-use-mcp for web-only tasks.
- Peekaboo — macOS screen automation with ScreenCaptureKit. Requires macOS 15+ and a Swift build.
- awesome-mcp-servers — Curated list of MCP servers across the ecosystem.
Contributing
See CONTRIBUTING.md for development setup and guidelines.
Security
To report a vulnerability, see SECURITY.md.
Support
- Found a bug? Open an issue
- Have a feature idea? Open an issue
- Like the project? Give it a star — it helps others discover mac-use-mcp.
License
MIT © 2026 antbotlab
macOS is a trademark of Apple Inc., registered in the U.S. and other countries and regions.