Better Browser MCP
Multi-agent local browser automation MCP server with per-agent WebSocket paths, configurable ports, and fixes for upstream issues like port collisions and recursion bugs.
<h1 align="center">Better Browser MCP</h1>
<p align="center"> Multi-agent local browser automation. Drop-in upgrade of <a href="https://github.com/browsermcp/mcp">@browsermcp/mcp</a> with configurable ports, per-agent WebSocket paths, and no more port-9009 fighting between agents. <br /> <a href="https://github.com/nbiish/betterbrowsermcp/issues">Issues</a> • <a href="#multi-agent-setup">Multi-agent setup</a> • <a href="#why">Why this exists</a> </p>
What's different from @browsermcp/mcp?
| | `@browsermcp/mcp@0.1.3` | `@nbiish/betterbrowsermcp@0.2.0` |
|---|---|---|
| Port collision behavior | Silently kills the other process (`lsof -ti:9009 \| xargs kill -9`) | Hard error, process exits with clear message |
| Multiple agents on one machine | Each MCP process fights for the same port | Each agent runs on its own port, no fighting |
| Agent identification | None | `BROWSER_MCP_AGENT_ID` env var, exposed in WS path |
| WebSocket path | `/` (any) | `/ws/<agent-id>` |
| Auth | None | Optional `BROWSER_MCP_AUTH_TOKEN` for shared-secret handshake |
| Recursion bug in `server.close()` | Yes (crashes on every reconnect) | Fixed with explicit `__origClose` binding |
| Workspace monorepo deps | Required `@repo/*` for build | Self-contained, builds from a single `npm install` |
| Bind address | Any | Defaults to `127.0.0.1` (localhost-only by default) |
All browser_navigate, browser_click, browser_snapshot, etc. tool names are identical to upstream — no client changes needed on the LLM side.
Quick start
Single agent (one process, one browser)
```shell
npx @nbiish/betterbrowsermcp@latest

# or, from this checkout:
npm install
npm run build
node dist/index.js
```
The server binds port 9009 and serves its WebSocket at `ws://127.0.0.1:9009/ws/default`, which is where the browser extension connects.
Multi-agent (one process per agent, all sharing one browser)
```shell
# Agent "hermes" on port 9009
BROWSER_MCP_AGENT_ID=hermes BROWSER_MCP_PORT=9009 \
  npx @nbiish/betterbrowsermcp@latest &

# Agent "omp" on port 9010
BROWSER_MCP_AGENT_ID=omp BROWSER_MCP_PORT=9010 \
  npx @nbiish/betterbrowsermcp@latest &

# Agent "codex" on port 9011
BROWSER_MCP_AGENT_ID=codex BROWSER_MCP_PORT=9011 \
  npx @nbiish/betterbrowsermcp@latest &
```
Each process binds its own port. They never fight. The browser extension connects to all three WebSocket endpoints and lets the user bind each tab to a specific agent.
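To make the mapping from environment variables to endpoints concrete, here is a minimal sketch of how the agent ID and port could be turned into a WebSocket URL. The names `AgentConfig`, `resolveConfig`, and `wsUrl` are hypothetical and are not taken from the project's `src/config.ts`; only the variable names and defaults come from this README.

```typescript
// Hypothetical sketch: illustrative names, not the project's actual config code.
interface AgentConfig {
  agentId: string;
  port: number;
  bind: string;
  wsPathPrefix: string;
}

// Pass process.env here when running under Node.
type Env = Record<string, string | undefined>;

function resolveConfig(env: Env): AgentConfig {
  const rawPort = env["BROWSER_MCP_PORT"] ?? "9009";
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid BROWSER_MCP_PORT: ${rawPort}`);
  }
  return {
    agentId: env["BROWSER_MCP_AGENT_ID"] ?? "default",
    port,
    bind: env["BROWSER_MCP_BIND"] ?? "127.0.0.1",
    wsPathPrefix: env["BROWSER_MCP_WS_PATH_PREFIX"] ?? "/ws",
  };
}

// Build the endpoint the browser extension would connect to.
function wsUrl(cfg: AgentConfig): string {
  return `ws://${cfg.bind}:${cfg.port}${cfg.wsPathPrefix}/${cfg.agentId}`;
}
```

With no variables set this yields `ws://127.0.0.1:9009/ws/default`. With `BROWSER_MCP_AGENT_ID=omp` and `BROWSER_MCP_PORT=9010` it yields `ws://127.0.0.1:9010/ws/omp`.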
Multi-agent setup
The user-facing flow:
- Start one MCP process per agent (different ports, different `BROWSER_MCP_AGENT_ID`).
- Configure the browser extension with the list of WS endpoints to monitor (e.g. `ws://127.0.0.1:9009/ws/hermes,ws://127.0.0.1:9010/ws/omp`).
- For each browser tab, the user clicks the extension icon and picks which agent controls it. The binding persists for that tab until changed or disconnected.
- Different tabs can be bound to different agents — the same browser serves many agents concurrently.
The MCP processes don't know about each other. The browser extension is the multiplexer that knows which agent controls which tab.
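The multiplexer role can be pictured as a small tab-to-agent registry inside the extension. The sketch below is an assumption about how such bookkeeping might look; the class `TabBindings` and its methods are hypothetical and do not describe the forked extension's actual code.

```typescript
// Hypothetical sketch of the extension-side bookkeeping: tabId -> agentId.
class TabBindings {
  private byTab = new Map<number, string>();

  // Bind (or rebind) a tab to an agent; a tab has at most one agent.
  bind(tabId: number, agentId: string): void {
    this.byTab.set(tabId, agentId);
  }

  unbind(tabId: number): void {
    this.byTab.delete(tabId);
  }

  agentFor(tabId: number): string | undefined {
    return this.byTab.get(tabId);
  }

  // All tabs currently controlled by one agent.
  tabsFor(agentId: string): number[] {
    const tabs: number[] = [];
    this.byTab.forEach((agent, tabId) => {
      if (agent === agentId) tabs.push(tabId);
    });
    return tabs;
  }

  // Drop every binding for an agent whose WebSocket has closed.
  disconnectAgent(agentId: string): void {
    this.byTab.forEach((agent, tabId) => {
      if (agent === agentId) this.byTab.delete(tabId);
    });
  }
}
```

Keeping this state in the extension rather than in the servers matches the design above: each MCP process only sees messages for tabs bound to its own agent ID.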
Multi-tab per agent (v0.3.0+)
A single agent can have multiple browser tabs bound to it, so a Hermes-style agent can drive Stripe in one tab and the inference provider dashboard in another, all from one MCP process. The LLM picks which tab to act on.
Tools for multi-tab work
| Tool | Purpose |
|---|---|
| `browser_list_tabs` | List all bound tabs with their tabId, label, URL, active marker |
| `browser_open_tab` | Open a new tab and bind it (optional `url`, `label`) |
| `browser_close_tab` | Close a bound tab |
| `browser_rename_tab` | Set a human-readable label on a tab |
| `browser_set_active_tab` | Switch which tab unspecific tool calls route to |
Every existing browser_* tool also accepts an optional tabId parameter. If omitted, the call routes to the agent's active tab.
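The routing rule (explicit `tabId` wins, otherwise the active tab) can be sketched as follows. `AgentTabs` and `BoundTab` are hypothetical names, and the choice to make the first bound tab active is an assumption made for the example, not documented behavior.

```typescript
// Hypothetical sketch of per-agent tab routing.
interface BoundTab {
  tabId: number;
  label?: string;
  url: string;
}

class AgentTabs {
  private tabs = new Map<number, BoundTab>();
  private activeTabId: number | undefined;

  add(tab: BoundTab): void {
    this.tabs.set(tab.tabId, tab);
    // Assumption: the first bound tab becomes the active one.
    if (this.activeTabId === undefined) this.activeTabId = tab.tabId;
  }

  setActive(tabId: number): void {
    if (!this.tabs.has(tabId)) throw new Error(`Tab ${tabId} is not bound to this agent`);
    this.activeTabId = tabId;
  }

  // Resolve the target of a tool call: explicit tabId first, else the active tab.
  resolve(tabId?: number): BoundTab {
    const id = tabId ?? this.activeTabId;
    if (id === undefined) throw new Error("No tab is bound to this agent");
    const tab = this.tabs.get(id);
    if (!tab) throw new Error(`Tab ${id} is not bound to this agent`);
    return tab;
  }
}
```

A tool handler would call `resolve(args.tabId)` once at the start and send the command to the returned tab, so single-tab clients that never pass `tabId` keep working unchanged.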
Example: driving Stripe + OpenAI console from one agent
```
LLM: "I need to set up a Stripe webhook. Let me check what tabs are bound."
  -> browser_list_tabs
  -> response:
       tabId=12345 label="Stripe dashboard" url=https://dashboard.stripe.com
       tabId=67890 label="OpenAI console" url=https://platform.openai.com ← ACTIVE

LLM: "I should focus on Stripe first."
  -> browser_set_active_tab(tabId=12345)

LLM: "Let me take a snapshot of the Stripe dashboard."
  -> browser_snapshot
  -> response: full ARIA tree, refs like e1, e2, ... for the Stripe dashboard

LLM: "Click 'Webhooks'."
  -> browser_click(element="Webhooks link in sidebar", ref="e14")

LLM: "Now switch to the OpenAI console and grab the API key."
  -> browser_set_active_tab(tabId=67890)
  -> browser_snapshot
  -> browser_click(element="API keys", ref="e7")
```
Multi-tab vs. multi-agent
- Multi-agent = multiple MCP processes, one per agent (e.g. Hermes + OMP + Codex), each with its own port and WS endpoint
- Multi-tab = within ONE agent's MCP process, multiple browser tabs are bound, with per-tab labels and an "active" tab for unspecific calls
The two compose: spawn N MCP processes (multi-agent), each connects to the same browser with M tabs (multi-tab), and the LLM picks which (agent, tab) pair to drive for each tool call.
Environment variables
| Var | Default | Description |
|---|---|---|
| `BROWSER_MCP_AGENT_ID` | `default` | Agent identifier. Used in the WS path (`/ws/<id>`) so the extension can route tab bindings. |
| `BROWSER_MCP_PORT` | `9009` | WebSocket port to bind. Use different ports for different agents. |
| `BROWSER_MCP_BIND` | `127.0.0.1` | Bind address. Never set to `0.0.0.0`, which exposes browser automation to the network. |
| `BROWSER_MCP_AUTH_TOKEN` | (unset) | Optional shared secret. If set, the extension must send `{type:"auth", token:"..."}` as its first WS message, else the connection is closed with 4401. |
| `BROWSER_MCP_WS_PATH_PREFIX` | `/ws` | Path prefix for the WS endpoint. Default `/ws` means the agent's endpoint is at `/ws/<agentId>`. |
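The `BROWSER_MCP_AUTH_TOKEN` handshake described in the table can be sketched as a check on the first WebSocket message. The function `checkFirstMessage` and the `AuthResult` type are hypothetical; only the message shape `{type:"auth", token:"..."}` and the close code 4401 come from this README.

```typescript
// Hypothetical sketch of the first-message check; not the project's src/ws.ts.
type AuthResult = { ok: true } | { ok: false; closeCode: 4401 };

function checkFirstMessage(raw: string, expectedToken: string | undefined): AuthResult {
  // Auth is optional: with no token configured, every connection is accepted
  // and the caller should treat the first message as a normal protocol message.
  if (expectedToken === undefined) return { ok: true };
  try {
    const msg: unknown = JSON.parse(raw);
    if (typeof msg === "object" && msg !== null) {
      const m = msg as { type?: unknown; token?: unknown };
      if (m.type === "auth" && m.token === expectedToken) return { ok: true };
    }
  } catch {
    // Not valid JSON: fall through to rejection.
  }
  return { ok: false, closeCode: 4401 };
}
```

On rejection the server would close the socket with the returned code. Codes in the 4000 to 4999 range are reserved for private application use by the WebSocket protocol. The plain `===` comparison keeps the sketch short; a constant-time comparison would be a reasonable hardening step.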
Browser extension
Better Browser MCP is server-side only. The browser extension that talks to it is a fork of the upstream @browsermcp extension with two changes:
- Configurable WS endpoints: instead of a hard-coded `ws://localhost:9009`, the extension popup lets the user add/remove WS endpoints to monitor. Each is identified by agent ID.
- Per-tab agent binding: when the user clicks the extension icon on a tab, they see a list of currently-connected agents (i.e. which WS endpoints are open and which tabs they're bound to). Picking one binds the current tab to that agent until changed or disconnected.
The forked extension is built separately and lives in nbiish/betterbrowsermcp-extension (forthcoming).
Until that's ready, you can patch the upstream extension to:
- Read WS endpoints from a config (instead of hardcoded `localhost:9009`)
- Show a tab-binding UI in the popup
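As a starting point for the first patch, the comma-separated endpoint list could be parsed into agent IDs and URLs along these lines. This is only a sketch: `parseEndpoints` and `AgentEndpoint` are hypothetical names, and it assumes the agent ID is the last path segment of each URL, as in `/ws/<agent-id>`.

```typescript
// Hypothetical sketch for a patched extension; not part of any published extension.
interface AgentEndpoint {
  agentId: string;
  url: string;
}

function parseEndpoints(list: string): AgentEndpoint[] {
  return list
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map((url) => {
      // ws:// or wss://, then host[:port], then a path whose last segment is the agent ID.
      const match = /^wss?:\/\/[^/]+\/(?:.*\/)?([^/?#]+)$/.exec(url);
      if (!match) throw new Error(`Not a valid agent endpoint: ${url}`);
      return { agentId: match[1], url };
    });
}
```

For the example list `ws://127.0.0.1:9009/ws/hermes,ws://127.0.0.1:9010/ws/omp` this produces two entries with agent IDs `hermes` and `omp`. A URL without a path, such as the upstream `ws://localhost:9009`, is rejected.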
Why this exists
The original @browsermcp/mcp@0.1.3 has three problems that cause constant pain in multi-agent setups: two design flaws and one outright bug.
1. killProcessOnPort on startup
Every time the server starts, it runs lsof -ti:9009 | xargs kill -9 before binding. This was meant to free the port from a stale previous instance, but in a multi-agent world it means every agent's MCP process murders every other agent's MCP process on startup. The result: keepalive failures every ~90s, ClosedResourceError on every tool call, weeks of debugging.
Better Browser MCP removed this. Port collision is now a hard error with a clear message: which port, which env var to change, and how to investigate (lsof -ti:<port> | xargs ps -p).
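A hard error of this kind might be produced by translating the listen failure into a message, as in the sketch below. The helper name `describeListenError` and the exact wording are assumptions; `EADDRINUSE` is the error code Node.js reports when a port is already bound.

```typescript
// Hypothetical sketch: turn a listen failure into an actionable message.
function describeListenError(err: { code?: string; message?: string }, port: number): string {
  if (err.code === "EADDRINUSE") {
    return [
      `Port ${port} is already in use.`,
      "Set BROWSER_MCP_PORT to a free port for this agent.",
      `To see which process holds it: lsof -ti:${port} | xargs ps -p`,
    ].join("\n");
  }
  return `Failed to bind port ${port}: ${err.message ?? "unknown error"}`;
}

// Possible wiring in Node (commented out so the sketch has no runtime dependencies):
// server.on("error", (err) => {
//   console.error(describeListenError(err, port));
//   process.exit(1);
// });
```

The point of the design is that the process reports the conflict and exits instead of killing whatever owns the port, so one agent starting up can never take down another.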
2. Single WebSocket per process, no agent awareness
The upstream server has a single Context object holding the one WebSocket. There's no concept of "I'm agent X, please route my tool calls to my tab". The result: in a multi-agent setup, only one agent can have a tab connected at a time, and the others fail with "No connection to browser extension".
Better Browser MCP gives each MCP process an explicit BROWSER_MCP_AGENT_ID. The WebSocket is served at /ws/<agentId>. The browser extension binds tabs to specific agent IDs. Each agent gets its own dedicated tab (or, from v0.3.0, several tabs).
3. Recursion bug in server.close()
The upstream dist/index.js has server.close = async () => { await server.close(); ... } — it calls itself recursively, blowing the stack on every reconnect with RangeError: Maximum call stack size exceeded.
Better Browser MCP fixes this with explicit __origClose binding.
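The general shape of such a fix is to capture the original method before reassigning it. The sketch below illustrates that pattern on a stand-in object; `Closable`, `overrideClose`, and `origClose` are illustrative names and the project's actual code may differ.

```typescript
// Hypothetical sketch of the capture-before-override pattern.
interface Closable {
  close: () => Promise<void>;
}

function overrideClose(server: Closable, cleanup: () => Promise<void> | void): void {
  // Capture the original method BEFORE reassigning; bind keeps `this` intact.
  const origClose = server.close.bind(server);

  server.close = async () => {
    // Calls the captured original. Writing `await server.close()` here would
    // call this override again and recurse until the stack overflows.
    await origClose();
    await cleanup();
  };
}
```

After `overrideClose(server, cleanup)`, a call to `server.close()` runs the original close once and then the cleanup, no matter how many times the server reconnects.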
Development
```shell
# Install deps
npm install

# Typecheck
npm run typecheck

# Build (ESM via tsup)
npm run build

# Test (manual)
BROWSER_MCP_AGENT_ID=hermes BROWSER_MCP_PORT=9099 \
  npm start

# in another shell:
curl http://127.0.0.1:9099/
# {"name":"Better Browser MCP (agent: hermes)","bind":"127.0.0.1","port":9099, ...}
```
Project structure
```
src/
  config.ts      env var resolution, WS URL helpers
  context.ts     per-process Context (one WebSocket = one tab)
  messaging.ts   WS message protocol (inlined from upstream)
  server.ts      MCP server, tool routing
  tools/         tool implementations (navigate, click, etc.)
  utils.ts       helpers (wait, port check)
  ws.ts          WebSocket server with auth handshake
  index.ts       entry point
  types.ts       Zod schemas (inlined from upstream's monorepo)
```
Patched bugs from upstream
- `server.close()` recursion (`src/server.ts`: in `createServerWithTools`): capture `originalClose` before override
- `killProcessOnPort` murder (removed entirely; `src/utils.ts`: only `isPortInUse` remains)
- Workspace monorepo deps (inlined into single-package repo)
Credits
Better Browser MCP is a fork of browsermcp/mcp with the multi-agent fixes needed for the Hermes + OMP + Codex multi-agent workflow. Originally adapted from Microsoft's Playwright MCP server.
By Nbiish — first repo, but probably not the last.