hyprland-mcp

An MCP server for Hyprland desktop automation that allows AI assistants to see the screen, control mouse and keyboard, and manage windows using native Wayland tools. It integrates OCR for text-based interaction and supports complex multi-monitor setups with pixel-accurate coordinate mapping.

MCP server for Hyprland desktop automation. Gives AI assistants the ability to see the screen, control mouse and keyboard, manage windows, and interact with the desktop — all through Hyprland's native Wayland tools.

Built for Claude Code, but works with any MCP client.

What it does

  • Screenshots — Capture the full desktop, a specific monitor, window, or region. Images are automatically resized and JPEG-compressed to fit within MCP output limits. Every screenshot includes a coordinate mapping so the AI knows how to translate image positions to screen coordinates.
  • OCR — Find and click text on screen using Tesseract. click_text("Send") captures a screenshot, runs OCR, finds the text, and clicks it — all in one tool call. Auto-scopes to the active window for better accuracy.
  • Mouse — Move, click, scroll, and drag. Positioning uses Hyprland's native movecursor (pixel-accurate, no mouse acceleration issues).
  • Keyboard — Type text or send key combinations. Shortcuts can target specific windows without focusing them.
  • Window management — List, focus, close, move, resize, fullscreen, and float windows.
  • Workspaces & monitors — List workspaces, switch between them, query monitor layout and cursor position.
  • Clipboard — Read and write clipboard text.
  • App launching — Launch applications through Hyprland (detached, no shell expansion).

Requirements

  • Hyprland (Wayland compositor)
  • Python 3.10+
  • System tools: grim, wtype, ydotool, wl-clipboard, tesseract

The install script checks for all of these and offers to install any that are missing.

Installation

curl -sSL https://raw.githubusercontent.com/alderban107/hyprland-mcp/main/install.sh | bash

The install script handles everything automatically:

  1. Detects your package manager (pacman, apt, dnf, zypper, xbps, emerge, nix)
  2. Installs any missing system dependencies
  3. Installs hyprland-mcp via pipx
  4. Registers the MCP server with Claude Code

Restart Claude Code after installing.

Verify with claude mcp list — you should see hyprland: ✓ Connected.

<details> <summary>Manual installation</summary>

pipx install git+https://github.com/alderban107/hyprland-mcp.git
claude mcp add --transport stdio --scope user hyprland -- hyprland-mcp

Or from a local clone:

git clone https://github.com/alderban107/hyprland-mcp.git
cd hyprland-mcp
python3 -m venv .venv
.venv/bin/pip install -e .
claude mcp add --transport stdio --scope user hyprland -- /path/to/hyprland-mcp/.venv/bin/hyprland-mcp

</details>

Tools (27)

Screenshot & OCR

| Tool | Description |
| --- | --- |
| screenshot | Capture desktop, monitor, window, or region. Returns inline JPEG + coordinate mapping for translating image positions to screen coordinates. |
| screenshot_with_ocr | Screenshot + OCR in one call. Returns the image and all detected text. Auto-scopes to active window. |
| click_text | Find text on screen via OCR and click it. One tool call replaces screenshot → parse → click. Auto-scopes to active window. |
| find_text_on_screen | Find text on screen via OCR. Returns screen coordinates of all matches, ready for mouse_click. |
| type_into | Find a text input field by placeholder text, click it, type, and optionally press Enter. |
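
As a rough illustration of how an OCR match might become a click target, the sketch below takes word bounding boxes in image coordinates, finds the words containing a query, and returns the center of each match in screen coordinates. `WordBox`, `find_click_points`, and the `scale`/`origin` parameters are hypothetical names chosen for this example and are not taken from the project's source; multi-word phrases are not handled.

```python
from typing import NamedTuple


class WordBox(NamedTuple):
    """One OCR word and its bounding box in image pixels (hypothetical type)."""
    text: str
    left: int
    top: int
    width: int
    height: int


def find_click_points(words, query, scale=1.0, origin=(0, 0)):
    """Return the screen-space center of every word containing `query`.

    Matching is case-insensitive. `scale` and `origin` stand in for the
    values reported in a screenshot's coordinate mapping.
    """
    needle = query.casefold()
    if not needle:
        return []
    points = []
    for word in words:
        if needle in word.text.casefold():
            center_x = word.left + word.width / 2
            center_y = word.top + word.height / 2
            points.append((
                round(center_x * scale + origin[0]),
                round(center_y * scale + origin[1]),
            ))
    return points


words = [WordBox("Cancel", 40, 300, 60, 20), WordBox("Send", 120, 300, 44, 20)]
print(find_click_points(words, "send", scale=1.0, origin=(5447, 38)))  # [(5589, 348)]
```

Each returned point could then be passed to a tool such as mouse_click.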

Mouse

| Tool | Description |
| --- | --- |
| mouse_move | Move cursor to absolute coordinates (pixel-accurate via Hyprland's movecursor) |
| mouse_click | Click at position or current location (left/right/middle, single/double) |
| mouse_scroll | Scroll wheel up/down at position or current location |
| mouse_drag | Click-drag from one position to another |

Keyboard

| Tool | Description |
| --- | --- |
| type_text | Type text as keyboard input (via wtype) |
| key_press | Press a key combination like ctrl+c, alt+F4 (via Hyprland sendshortcut) |
| send_shortcut | Send a shortcut with explicit modifiers and key, optionally targeting a specific window |
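
A combination such as ctrl+c has to be split into modifiers and a key before it can be sent as a shortcut. The following is a minimal sketch of that parsing step only. `parse_key_combo` and the `MODIFIER_NAMES` table are assumptions made for illustration; the project's actual parsing rules and the way the result is handed to sendshortcut are not shown in this README.

```python
# Hypothetical mapping from user-facing modifier names to uppercase names.
MODIFIER_NAMES = {
    "ctrl": "CTRL",
    "control": "CTRL",
    "alt": "ALT",
    "shift": "SHIFT",
    "super": "SUPER",
}


def parse_key_combo(combo: str) -> tuple[list[str], str]:
    """Split 'ctrl+shift+F4' into (['CTRL', 'SHIFT'], 'F4').

    The last '+'-separated part is treated as the key; everything before
    it must be a known modifier.
    """
    parts = [part.strip() for part in combo.split("+")]
    if any(not part for part in parts):
        raise ValueError(f"malformed key combination: {combo!r}")
    *modifier_parts, key = parts
    modifiers = []
    for part in modifier_parts:
        name = MODIFIER_NAMES.get(part.lower())
        if name is None:
            raise ValueError(f"unknown modifier: {part!r}")
        modifiers.append(name)
    return modifiers, key


print(parse_key_combo("ctrl+c"))         # (['CTRL'], 'c')
print(parse_key_combo("ctrl+shift+F4"))  # (['CTRL', 'SHIFT'], 'F4')
```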

Window Management

| Tool | Description |
| --- | --- |
| list_windows | List all windows with class, title, size, position (filterable by workspace/monitor) |
| get_active_window | Get details about the currently focused window |
| focus_window | Focus a window by class or title selector |
| close_window | Close a window (WM_CLOSE; apps can show save dialogs) |
| move_window | Move a window to a pixel position or workspace |
| resize_window | Resize a window to exact pixel dimensions |
| toggle_fullscreen | Toggle fullscreen or maximize mode |
| toggle_floating | Toggle floating mode |

Workspace & Monitor

| Tool | Description |
| --- | --- |
| list_monitors | List connected monitors with resolution, position, refresh rate |
| list_workspaces | List active workspaces with window counts |
| switch_workspace | Switch to a workspace by name or number |
| get_cursor_position | Get current cursor position in absolute layout coordinates |

Clipboard & System

| Tool | Description |
| --- | --- |
| clipboard_read | Read current clipboard text |
| clipboard_write | Write text to clipboard |
| launch_app | Launch an application (detached, via hyprctl dispatch exec) |

How it works

Screenshot coordinate mapping

Multi-monitor setups and image scaling make coordinate translation tricky. Every screenshot call returns a coordinate mapping alongside the image:

Coordinate mapping: This 941x1030 image covers screen region
starting at absolute (5447, 38), native size 941x1030.
To convert image coordinates to absolute screen coordinates:
  screen_x = image_x * 1.00 + 5447
  screen_y = image_y * 1.00 + 38

This prevents the AI from using image pixel positions directly as screen coordinates — a common failure mode on multi-monitor setups where monitors have different positions in the layout.
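
The conversion in the mapping text above can be written as a small helper. This is a sketch only: `CoordinateMapping` and its field names are hypothetical and simply mirror the scale and offset values printed in the example output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoordinateMapping:
    """Hypothetical container for the values printed in the mapping text."""
    scale_x: float   # native pixels per image pixel, horizontally
    scale_y: float   # native pixels per image pixel, vertically
    origin_x: int    # absolute layout x of the image's top-left corner
    origin_y: int    # absolute layout y of the image's top-left corner

    def to_screen(self, image_x: float, image_y: float) -> tuple[int, int]:
        """Apply screen = image * scale + origin and round to whole pixels."""
        return (
            round(image_x * self.scale_x + self.origin_x),
            round(image_y * self.scale_y + self.origin_y),
        )


# Values taken from the example mapping above.
mapping = CoordinateMapping(scale_x=1.0, scale_y=1.0, origin_x=5447, origin_y=38)
print(mapping.to_screen(100, 200))  # (5547, 238)
```

A point at (100, 200) in the image therefore corresponds to (5547, 238) in the layout, which is what a click tool would need.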

OCR and dark themes

Tesseract OCR was designed for black text on white paper. Most desktop apps use dark themes, which tanks OCR accuracy. hyprland-mcp automatically detects dark-background screenshots and inverts them before running OCR, significantly improving text detection.

OCR tools auto-scope to the active window by default (use scope="full" to cover the entire desktop). A smaller capture area generally gives better OCR accuracy, and therefore more reliable coordinate mapping.
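
The dark-theme step can be pictured as "measure average brightness, invert if it is low". The sketch below works on a plain nested list of 8-bit grayscale values so that it has no dependencies. The function names and the threshold of 128 are assumptions for illustration; the project's own detection logic (which the project structure suggests lives in ocr.py) may differ.

```python
def mean_luminance(pixels: list[list[int]]) -> float:
    """Average of 8-bit grayscale values (0 = black, 255 = white)."""
    total = sum(sum(row) for row in pixels)
    count = sum(len(row) for row in pixels)
    return total / count if count else 0.0


def prepare_for_ocr(pixels: list[list[int]], dark_threshold: float = 128.0) -> list[list[int]]:
    """Return a copy of the image, inverted when it looks like a dark theme.

    The threshold is a hypothetical value, not one taken from the project.
    """
    if mean_luminance(pixels) >= dark_threshold:
        return [row[:] for row in pixels]
    return [[255 - value for value in row] for row in pixels]


# Mostly dark background with one light "glyph" pixel.
dark_ui = [[20, 20, 230], [20, 20, 20]]
print(prepare_for_ocr(dark_ui))  # [[235, 235, 25], [235, 235, 235]]
```

After inversion the glyph is dark on a light background, which is closer to the input Tesseract handles best.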

Mouse positioning

Mouse movement uses hyprctl dispatch movecursor — Hyprland's native IPC command that sets the cursor to exact pixel coordinates. No mouse acceleration, no relative movement, no coordinate drift. ydotool is only used for click and scroll events (which don't involve positioning).
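
For illustration, an absolute cursor move through hyprctl could be issued as below. This is a synchronous sketch with hypothetical function names; the project structure describes async wrappers in hyprctl.py, which are not reproduced here. Calling `move_cursor` requires a running Hyprland session.

```python
import subprocess


def movecursor_command(x: int, y: int) -> list[str]:
    """Build the argv for an absolute cursor move in layout coordinates."""
    return ["hyprctl", "dispatch", "movecursor", str(int(x)), str(int(y))]


def move_cursor(x: int, y: int) -> None:
    """Run the command without a shell; raises if hyprctl exits non-zero."""
    subprocess.run(movecursor_command(x, y), check=True)


print(movecursor_command(5547, 238))
# ['hyprctl', 'dispatch', 'movecursor', '5547', '238']
```

Passing the arguments as a list avoids building a shell string from coordinates.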

Screenshot sizing

Screenshots are automatically scaled to fit within MCP output limits. Defaults: maximum width 1024px, JPEG quality 60. A 2560x1440 desktop screenshot comes out at roughly 80-100KB, small enough for inline display in the conversation.

For reading fine text or UI details, use the region parameter to capture a smaller area at full resolution, or capture a specific window.
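
The relationship between the width limit and the scale factor in the coordinate mapping can be worked through in a few lines. This is a sketch under the stated default of 1024px; `fit_to_width` is a hypothetical helper, and the project's exact rounding may differ.

```python
def fit_to_width(native_width: int, native_height: int, max_width: int = 1024):
    """Return (image_width, image_height, scale).

    `scale` is the number of native pixels per image pixel, i.e. the factor
    used when converting image coordinates back to screen coordinates.
    """
    if native_width <= max_width:
        return native_width, native_height, 1.0
    scale = native_width / max_width
    return max_width, round(native_height / scale), scale


print(fit_to_width(2560, 1440))  # (1024, 576, 2.5)
print(fit_to_width(941, 1030))   # (941, 1030, 1.0)
```

A full 2560x1440 desktop is reduced by a factor of 2.5, while a 941-pixel-wide window capture stays at native resolution, which is why window or region captures are better for fine text.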

Project structure

hyprland_mcp/
  server.py       # FastMCP instance, all tool definitions, entry point
  hyprctl.py      # Async wrappers for hyprctl IPC (query, dispatch, batch)
  screenshot.py   # grim capture + Pillow resize/compress + coordinate mapping
  input.py        # Mouse (movecursor + ydotool) and keyboard (wtype + sendshortcut)
  clipboard.py    # wl-copy / wl-paste wrappers
  ocr.py          # Tesseract OCR with dark-theme preprocessing
  errors.py       # Exception hierarchy + tool availability checks

Safety

  • close_window sends WM_CLOSE — apps can show "save changes?" dialogs. There is no force-kill tool.
  • launch_app goes through hyprctl dispatch exec — detached from the MCP process, no shell expansion.
  • No file system access — the MCP can see the screen and interact with it, but cannot read or write files.
  • Missing system tools produce clear error messages listing what to install.
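
A tool availability check of the kind described in the last point might look like the sketch below. The binary-to-package table, `MissingToolsError`, and both function names are assumptions for illustration; package names also vary between distributions.

```python
import shutil

# Hypothetical mapping from required binary to the package that provides it.
REQUIRED_BINARIES = {
    "grim": "grim",
    "wtype": "wtype",
    "ydotool": "ydotool",
    "wl-copy": "wl-clipboard",
    "wl-paste": "wl-clipboard",
    "tesseract": "tesseract",
}


class MissingToolsError(RuntimeError):
    """Raised when one or more required system tools are not on PATH."""


def find_missing_packages(which=shutil.which) -> list[str]:
    """Return the sorted package names whose binaries cannot be found."""
    missing = {
        package
        for binary, package in REQUIRED_BINARIES.items()
        if which(binary) is None
    }
    return sorted(missing)


def require_tools(which=shutil.which) -> None:
    """Raise a single error that lists everything left to install."""
    missing = find_missing_packages(which)
    if missing:
        raise MissingToolsError(
            "Missing system tools; install: " + ", ".join(missing)
        )
```

The `which` parameter exists only so the lookup can be replaced in tests; by default the real PATH is searched.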

License

MIT
