kmux

A terminal emulator MCP server engineered specifically for LLMs, with a block-oriented design that organizes command input/output into recognizable blocks, plus semantic session management. It enables AI to use terminals efficiently for writing code, installing software, and executing commands without context overflow.


Terminal MCP server of the AI, for the AI, but by Creative Koalas.

What is kmux?

kmux (the name comes from "koala" + "tmux") is a terminal MCP server engineered for Large Language Models (LLMs). In other words, it's a terminal emulator for AI. Add it to your LLM, and the LLM can take advantage of terminals to do things like write code and install software.

Install & Usage

Currently, kmux only supports Zsh. Make sure you have Zsh installed and "findable" (i.e., in your $PATH) before using kmux.

To install kmux, run:

pip install -i https://test.pypi.org/simple/ kmux

To start kmux, run:

python -m kmux --root-password <your_root_password>

Or, if you don't want the LLM to have root privilege, just omit the password, like this:

python -m kmux

For Claude Code:

# Omit the --root-password argument if you don't want the AI to be root
claude mcp add kmux -- python -m kmux --root-password <your_root_password>

For Claude Desktop, add kmux to the mcpServers field of the configuration file, like this:

{
  "mcpServers": {
    ... (other MCP servers)
    "kmux": {
      "command": "python",
      "args": [
        "-m",
        "kmux",
        "--root-password", // Omit this if you don't want AI to be root
        "<your-root-password>" // Omit this if you don't want AI to be root
      ]
    }
  }
}

Visit this page if it's your first time adding an MCP server to Claude Desktop.

How does kmux differ from other terminal MCPs out there?

kmux is the first terminal tool specifically engineered and tailored for LLMs.

Other terminal MCP servers are "usable" for LLMs; kmux is "useful".

The "LLM user experience" of kmux is designed around the idea of "AI ergonomics", a concept proposed in 2023 by Trent Fellbootman, CEO of Creative Koalas.

Transformer-based LLMs and humans have different ways of perceiving and interacting with their environment; kmux is designed to make it natural for LLMs to use terminals.

Block-oriented design

While other terminal MCP servers just provide a way to read and write raw data from a terminal, kmux organizes a terminal's input and output into blocks. This block-oriented design is also what made Warp (probably the most popular terminal emulator today that doesn't ship with the OS) stand out in its initial release.

Consider the following commands and outputs:

$ ls
file1.txt
file2.txt
file3.txt
$ cat file1.txt
This is the content of file1.txt.
$ cat file2.txt
This is the content of file2.txt.
$ cat file3.txt
This is the content of file3.txt.

As a human, you can easily see that there are 4 commands and 4 respective outputs in the example above. However, separating those command/output pairs programmatically is not an easy feat, and currently, most, if not all, terminal MCP servers just treat everything in a terminal as a single, large chunk of data.

Such an approach raises a series of problems:

  1. There is no easy way to see the output of a specific (usually the current) command. If you want to avoid omitting anything useful, the most straightforward way is to include everything in the terminal output when the LLM reads from the terminal. That would blow up the LLM context, but unfortunately, this is what most terminal MCP servers do.
  2. It is hard to tell when a command has finished executing. A lot of terminal MCP servers just wait a certain amount of time before returning everything in the terminal when the LLM wants to execute a command and see its output.

If you used terminals before Warp, you may recall that these problems exist in terminals for humans too. However, they were ignored for a long time because:

  1. Humans can just scroll around the terminal to find the commands and their outputs. When we see a command prompt, we know that the last command is finished and the next one starts.
  2. Humans naturally multi-task. For simple commands, we just wait in front of the terminal until we see the next command prompt; for long-running commands, we just switch to something else and go back to check again later.

kmux solves those challenges with a block-recognition mechanism that programmatically and incrementally segments and recognizes command/output blocks as new content appears on the terminal.
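To make this concrete, here is a minimal sketch of what such segmentation might produce for the ls/cat transcript above. The Block class is purely illustrative and not kmux's actual data model.

from dataclasses import dataclass

@dataclass
class Block:
    """One command paired with the output it produced."""
    command: str
    output: str

# The earlier transcript, segmented into four command/output blocks.
blocks = [
    Block("ls", "file1.txt\nfile2.txt\nfile3.txt"),
    Block("cat file1.txt", "This is the content of file1.txt."),
    Block("cat file2.txt", "This is the content of file2.txt."),
    Block("cat file3.txt", "This is the content of file3.txt."),
]

# Reading only the last command's output now costs a few lines of context,
# not the whole scrollback.
print(blocks[-1].output)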

Such a mechanism solves the problems mentioned above:

  • With block recognition, the LLM can selectively read the output of any specific command (usually the currently running command or the last executed command); it reads only what it needs, instead of everything in the terminal. No more blowing up the model context.
  • Incrementally segmenting output blocks lets us know precisely when a command has finished executing. No more waiting for a fixed amount of time; the LLM can execute a command and get its output as soon as it's done.

How is block recognition implemented?

For those interested, here's an overview of how block recognition is implemented.

The block recognition mechanism largely relies on zsh hooks (hence the zsh dependency). Basically, zsh has certain hooks that get called when:

  • Command input begins
  • Command input ends
  • Command output begins
  • Command output ends

By utilizing those hooks, kmux injects markers into the terminal output to indicate the beginning and end of each command/output block. The markers are ANSI escape sequences: invisible to humans and LLMs, but enough to programmatically identify where each block starts and ends.

Since these hooks are shell-specific, and zsh provides the most straightforward hook interface while also being almost identical to bash (the most popular shell) in syntax, we choose to stick with zsh at present.
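As an illustration only, the mechanism can be sketched roughly as follows. The preexec and precmd hooks are real zsh hooks (run right before a command executes and right before the next prompt, respectively), but the specific marker sequences and parsing code below are assumptions made for this sketch, not kmux's actual implementation.

import re

# Hypothetical snippet a server could source into each zsh session.
# The OSC-style payloads ("kmux;output_start" etc.) are made up here.
ZSH_HOOKS = r"""
preexec() { printf '\033]1337;kmux;output_start\a'; }  # command is about to run
precmd()  { printf '\033]1337;kmux;output_end\a'; }    # previous command finished
"""

OUTPUT_START = "\x1b]1337;kmux;output_start\x07"
OUTPUT_END = "\x1b]1337;kmux;output_end\x07"
_BLOCK = re.compile(re.escape(OUTPUT_START) + r"(.*?)" + re.escape(OUTPUT_END), re.S)

def finished_outputs(raw: str) -> list[str]:
    """Return the output of every finished command found in the raw stream.

    The markers are in-band, so command completion is detected the moment
    the end marker arrives; no prompt heuristics or fixed timeouts needed.
    """
    return [m.group(1) for m in _BLOCK.finditer(raw)]

Because the markers travel in-band with the output, a start marker without a matching end marker also tells the server that a command is still running.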

Semantic session management

Another kmux feature designed for AI ergonomics is its semantic session management system. Basically, kmux supports attaching a label (title) and a description (summary) to each shell session. These can be used to mark what a shell session is for.

When the LLM lists the shell sessions, these labels and descriptions, along with the session ID and the command currently running in each session, are shown to the LLM, so it knows what is going on in each shell session without reading the full outputs.

This is useful because, while a human can just glance at the terminal screen and window title, the only way for an LLM to get the "metadata" of a terminal session is through an MCP tool. Having the session-listing tool return not only the session IDs but also this metadata makes it much easier for the LLM to quickly see what is going on in each session and decide what to do next.
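For illustration, a session listing with this metadata might look roughly like the sketch below; the field names are hypothetical and not kmux's actual schema.

from dataclasses import dataclass
from typing import Optional

@dataclass
class SessionInfo:
    """What the LLM sees per session when listing: metadata only, no scrollback."""
    session_id: str
    label: str                       # short title, e.g. "backend tests"
    description: str                 # one-line summary of the session's purpose
    running_command: Optional[str]   # None if the session is idle

sessions = [
    SessionInfo("s1", "dev server", "runs the web app locally for manual testing", "python app.py"),
    SessionInfo("s2", "tests", "pytest run for the parser module", "pytest tests/"),
    SessionInfo("s3", "scratch", "ad-hoc file inspection", None),
]

# A compact listing like this is enough to decide which session to act on.
for s in sessions:
    print(f"[{s.session_id}] {s.label}: {s.description} (running: {s.running_command})")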
