Better Browser MCP

Multi-agent local browser automation MCP server with per-agent WebSocket paths, configurable ports, and fixes for upstream issues like port collisions and recursion bugs.

<h1 align="center">Better Browser MCP</h1>

<p align="center"> Multi-agent local browser automation. Drop-in upgrade of <a href="https://github.com/browsermcp/mcp">@browsermcp/mcp</a> with configurable ports, per-agent WebSocket paths, and no more port-9009 fighting between agents. <br /> <a href="https://github.com/nbiish/betterbrowsermcp/issues">Issues</a> • <a href="#multi-agent-setup">Multi-agent setup</a> • <a href="#why">Why this exists</a> </p>


What's different from @browsermcp/mcp?

| | @browsermcp/mcp@0.1.3 | @nbiish/betterbrowsermcp@0.2.0 |
|---|---|---|
| Port collision behavior | Silently kills the other process (`lsof -ti:9009 \| xargs kill -9`) | Hard error; the process exits with a clear message |
| Multiple agents on one machine | Each MCP process fights for the same port | Each agent runs on its own port, so there is no fighting |
| Agent identification | None | `BROWSER_MCP_AGENT_ID` env var, exposed in the WS path |
| WebSocket path | `/` (any) | `/ws/<agent-id>` |
| Auth | None | Optional `BROWSER_MCP_AUTH_TOKEN` for a shared-secret handshake |
| Recursion bug in `server.close()` | Yes (crashes on every reconnect) | Fixed with an explicit `__origClose` binding |
| Workspace monorepo deps | Required `@repo/*` for build | Self-contained; builds from a single `npm install` |
| Bind address | Any | Defaults to `127.0.0.1` (localhost only) |

All tool names (browser_navigate, browser_click, browser_snapshot, and so on) are identical to upstream, so no client changes are needed on the LLM side.


Quick start

Single agent (one process, one browser)

```bash
npx @nbiish/betterbrowsermcp@latest
# or, from this checkout:
npm install
npm run build
node dist/index.js
```

The server binds port 9009 and serves the WebSocket at ws://127.0.0.1:9009/ws/default, which is where the browser extension connects.

Multi-agent (one process per agent, all sharing one browser)

```bash
# Agent "hermes" on port 9009
BROWSER_MCP_AGENT_ID=hermes BROWSER_MCP_PORT=9009 \
  npx @nbiish/betterbrowsermcp@latest &

# Agent "omp" on port 9010
BROWSER_MCP_AGENT_ID=omp BROWSER_MCP_PORT=9010 \
  npx @nbiish/betterbrowsermcp@latest &

# Agent "codex" on port 9011
BROWSER_MCP_AGENT_ID=codex BROWSER_MCP_PORT=9011 \
  npx @nbiish/betterbrowsermcp@latest &
```

Each process binds its own port. They never fight. The browser extension connects to all three WebSocket endpoints and lets the user bind each tab to a specific agent.
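As a rough illustration of how those environment variables could translate into per-agent endpoints, here is a minimal sketch. The names `resolveConfig` and `wsUrl` are hypothetical and not taken from the project's source; the bind address and `/ws` prefix are hard-coded to the documented defaults to keep the example short.

```typescript
// Hypothetical helper names; only the env var names and defaults come from this README.
interface AgentConfig {
  agentId: string;
  port: number;
  bind: string;
}

type Env = Record<string, string | undefined>;

function resolveConfig(env: Env): AgentConfig {
  const rawPort = env["BROWSER_MCP_PORT"] ?? "9009";
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid BROWSER_MCP_PORT: ${rawPort}`);
  }
  return {
    agentId: env["BROWSER_MCP_AGENT_ID"] ?? "default",
    port,
    bind: "127.0.0.1", // documented default, fixed here for brevity
  };
}

function wsUrl(config: AgentConfig): string {
  return `ws://${config.bind}:${config.port}/ws/${encodeURIComponent(config.agentId)}`;
}

// The three agents from the commands above, each with its own port.
const agentEnvs: Env[] = [
  { BROWSER_MCP_AGENT_ID: "hermes", BROWSER_MCP_PORT: "9009" },
  { BROWSER_MCP_AGENT_ID: "omp", BROWSER_MCP_PORT: "9010" },
  { BROWSER_MCP_AGENT_ID: "codex", BROWSER_MCP_PORT: "9011" },
];
const endpoints = agentEnvs.map((env) => wsUrl(resolveConfig(env)));
```

Here `endpoints` holds one WebSocket URL per agent, which is the list the extension would be configured to monitor.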


Multi-agent setup

The user-facing flow:

  1. Start one MCP process per agent (different ports, different BROWSER_MCP_AGENT_ID)
  2. Configure the browser extension with the list of WS endpoints to monitor (e.g. ws://127.0.0.1:9009/ws/hermes, ws://127.0.0.1:9010/ws/omp)
  3. For each browser tab, the user clicks the extension icon and picks which agent controls it. The binding persists for that tab until changed or disconnected.
  4. Different tabs can be bound to different agents — the same browser serves many agents concurrently.

The MCP processes don't know about each other. The browser extension is the multiplexer that knows which agent controls which tab.
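The multiplexer role described above amounts to a tab-to-agent lookup held by the extension. The sketch below shows one way that bookkeeping might look. It is an assumption for illustration only: the `TabBindings` class and its methods are hypothetical and do not describe the extension's actual code.

```typescript
// Hypothetical sketch of the extension's bookkeeping; names are illustrative.
class TabBindings {
  private readonly agentByTab = new Map<number, string>();

  // Bind a tab to an agent, replacing any earlier binding for that tab.
  bind(tabId: number, agentId: string): void {
    this.agentByTab.set(tabId, agentId);
  }

  // Which agent, if any, controls this tab?
  agentFor(tabId: number): string | undefined {
    return this.agentByTab.get(tabId);
  }

  // All tabs currently bound to one agent.
  tabsFor(agentId: string): number[] {
    return Array.from(this.agentByTab.entries())
      .filter(([, boundAgent]) => boundAgent === agentId)
      .map(([tabId]) => tabId);
  }

  // Drop every binding for an agent whose WebSocket has disconnected.
  unbindAgent(agentId: string): void {
    for (const tabId of this.tabsFor(agentId)) {
      this.agentByTab.delete(tabId);
    }
  }
}

const bindings = new TabBindings();
bindings.bind(101, "hermes");
bindings.bind(102, "omp");
bindings.bind(101, "codex"); // rebinding moves tab 101 from hermes to codex
```

Keying the map by tab ID reflects the rule that a tab has at most one controlling agent at a time, while an agent may appear as the value for several tabs.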

Multi-tab per agent (v0.3.0+)

A single agent can have multiple browser tabs bound to it, so a Hermes-style agent can drive Stripe in one tab and the inference provider dashboard in another, all from one MCP process. The LLM picks which tab to act on.

Tools for multi-tab work

| Tool | Purpose |
|---|---|
| `browser_list_tabs` | List all bound tabs with their tabId, label, URL, and active marker |
| `browser_open_tab` | Open a new tab and bind it (optional `url`, `label`) |
| `browser_close_tab` | Close a bound tab |
| `browser_rename_tab` | Set a human-readable label on a tab |
| `browser_set_active_tab` | Switch which tab receives tool calls that omit `tabId` |

Every existing browser_* tool also accepts an optional tabId parameter. If omitted, the call routes to the agent's active tab.
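The fallback rule can be sketched as a small registry that resolves an optional `tabId` to a bound tab. This is an illustrative sketch, not the project's implementation: `AgentTabs`, `BoundTab`, and the choice to make the first bound tab active by default are all assumptions.

```typescript
// Hypothetical sketch; class and method names are illustrative only.
interface BoundTab {
  tabId: number;
  label?: string;
  url: string;
}

class AgentTabs {
  private readonly tabs = new Map<number, BoundTab>();
  private activeTabId: number | undefined;

  add(tab: BoundTab): void {
    this.tabs.set(tab.tabId, tab);
    // Assumption: the first bound tab becomes the active one.
    if (this.activeTabId === undefined) this.activeTabId = tab.tabId;
  }

  setActive(tabId: number): void {
    if (!this.tabs.has(tabId)) {
      throw new Error(`Tab ${tabId} is not bound to this agent`);
    }
    this.activeTabId = tabId;
  }

  // Resolve the tab a tool call should run against.
  resolve(tabId?: number): BoundTab {
    const target = tabId ?? this.activeTabId;
    if (target === undefined) throw new Error("No tabs are bound to this agent");
    const tab = this.tabs.get(target);
    if (!tab) throw new Error(`Tab ${target} is not bound to this agent`);
    return tab;
  }
}

const tabs = new AgentTabs();
tabs.add({ tabId: 12345, label: "Stripe dashboard", url: "https://dashboard.stripe.com" });
tabs.add({ tabId: 67890, label: "OpenAI console", url: "https://platform.openai.com" });
tabs.setActive(67890);

const implicitTarget = tabs.resolve();      // no tabId: falls back to the active tab
const explicitTarget = tabs.resolve(12345); // an explicit tabId overrides the active tab
```

Rejecting unknown tab IDs with an error, rather than silently falling back to the active tab, keeps a mistyped `tabId` from acting on the wrong page.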

Example: driving Stripe + OpenAI console from one agent

```
LLM: "I need to set up a Stripe webhook. Let me check what tabs are bound."
  -> browser_list_tabs
  -> response:
      tabId=12345  label="Stripe dashboard"  url=https://dashboard.stripe.com
      tabId=67890  label="OpenAI console"    url=https://platform.openai.com  ← ACTIVE

LLM: "I should focus on Stripe first."
  -> browser_set_active_tab(tabId=12345)

LLM: "Let me take a snapshot of the Stripe dashboard."
  -> browser_snapshot
  -> response: full ARIA tree, refs like e1, e2, ... for the Stripe dashboard

LLM: "Click 'Webhooks'."
  -> browser_click(element="Webhooks link in sidebar", ref="e14")

LLM: "Now switch to the OpenAI console and grab the API key."
  -> browser_set_active_tab(tabId=67890)
  -> browser_snapshot
  -> browser_click(element="API keys", ref="e7")
```

Multi-tab vs. multi-agent

  • Multi-agent = multiple MCP processes, one per agent (e.g. Hermes + OMP + Codex), each with its own port and WS endpoint
  • Multi-tab = within ONE agent's MCP process, multiple browser tabs are bound, with per-tab labels and an "active" tab that receives calls omitting tabId

The two compose: spawn N MCP processes (multi-agent), each connected to the same browser and each with up to M bound tabs (multi-tab). Every agent's LLM then picks which of its own tabs to drive on each tool call.

Environment variables

| Var | Default | Description |
|---|---|---|
| `BROWSER_MCP_AGENT_ID` | `default` | Agent identifier. Used in the WS path (`/ws/<id>`) so the extension can route tab bindings. |
| `BROWSER_MCP_PORT` | `9009` | WebSocket port to bind. Use different ports for different agents. |
| `BROWSER_MCP_BIND` | `127.0.0.1` | Bind address. Never set to `0.0.0.0`: that exposes browser automation to the network. |
| `BROWSER_MCP_AUTH_TOKEN` | (unset) | Optional shared secret. If set, the extension must send `{type:"auth", token:"..."}` as its first WS message; otherwise the connection is closed with code 4401. |
| `BROWSER_MCP_WS_PATH_PREFIX` | `/ws` | Path prefix for the WS endpoint. With the default `/ws`, the agent's endpoint is `/ws/<agentId>`. |
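The shared-secret handshake for `BROWSER_MCP_AUTH_TOKEN` can be pictured as a check on the first WebSocket message. The sketch below is a guess at the shape of that check, not the server's actual code: `checkFirstMessage` and `AuthResult` are hypothetical names, and a production server would likely prefer a constant-time comparison over `===`.

```typescript
// Hypothetical sketch of the handshake check; names and result shape are assumptions.
type AuthResult = { ok: true } | { ok: false; closeCode: number };

const UNAUTHORIZED_CLOSE_CODE = 4401; // close code named in the table above

function checkFirstMessage(raw: string, expectedToken: string | undefined): AuthResult {
  // With BROWSER_MCP_AUTH_TOKEN unset, no handshake is required.
  if (expectedToken === undefined) return { ok: true };

  const rejected: AuthResult = { ok: false, closeCode: UNAUTHORIZED_CLOSE_CODE };

  let message: unknown;
  try {
    message = JSON.parse(raw);
  } catch {
    return rejected; // not JSON, so it cannot be an auth message
  }
  if (typeof message !== "object" || message === null) return rejected;

  const { type, token } = message as { type?: unknown; token?: unknown };
  return type === "auth" && token === expectedToken ? { ok: true } : rejected;
}

const accepted = checkFirstMessage('{"type":"auth","token":"s3cret"}', "s3cret");
const wrongToken = checkFirstMessage('{"type":"auth","token":"nope"}', "s3cret");
```

On a rejected result, the caller would close the socket with the returned code before processing any tool traffic.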

Browser extension

Better Browser MCP is server-side only. The browser extension that talks to it is a fork of the upstream @browsermcp extension with two changes:

  1. Configurable WS endpoints — instead of a hard-coded ws://localhost:9009, the extension popup lets the user add/remove WS endpoints to monitor. Each is identified by agent ID.
  2. Per-tab agent binding — when the user clicks the extension icon on a tab, they see a list of currently-connected agents (i.e. which WS endpoints are open and which tabs they're bound to). Picking one binds the current tab to that agent until changed or disconnected.

The forked extension is built separately and will live in nbiish/betterbrowsermcp-extension (forthcoming).

Until that's ready, you can patch the upstream extension to:

  • Read WS endpoints from a config (instead of hardcoded localhost:9009)
  • Show a tab-binding UI in the popup

Why this exists

The original @browsermcp/mcp@0.1.3 has three problems that cause constant pain in multi-agent setups: two design flaws and one outright bug.

1. killProcessOnPort on startup

Every time the server starts, it runs lsof -ti:9009 | xargs kill -9 before binding. This was meant to free the port from a stale previous instance, but in a multi-agent world it means every agent's MCP process murders every other agent's MCP process on startup. The result: keepalive failures every ~90s, ClosedResourceError on every tool call, weeks of debugging.

Better Browser MCP removed this. Port collision is now a hard error with a clear message: which port, which env var to change, and how to investigate (lsof -ti:<port> | xargs ps -p).
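A sketch of what such an error message might contain, assuming the server reacts to Node's `EADDRINUSE` error code when binding fails. The function name and the exact wording are hypothetical; only the three pieces of information (port, env var, `lsof` hint) come from the description above.

```typescript
// Hypothetical sketch; the message wording and function name are assumptions.
interface ListenError {
  code?: string;
  message: string;
}

function describeListenError(err: ListenError, port: number): string {
  if (err.code === "EADDRINUSE") {
    return [
      `Port ${port} is already in use.`,
      "Set BROWSER_MCP_PORT to a free port for this agent,",
      `or inspect the current owner with: lsof -ti:${port} | xargs ps -p`,
    ].join(" ");
  }
  // Any other bind failure is reported as-is.
  return `Failed to bind port ${port}: ${err.message}`;
}

const collisionMessage = describeListenError(
  { code: "EADDRINUSE", message: "listen EADDRINUSE" },
  9009,
);
```

In a Node server this would typically be called from the listener's `error` handler, followed by a non-zero exit, so that no other process is ever killed to free the port.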

2. Single WebSocket per process, no agent awareness

The upstream server has a single Context object holding the one WebSocket. There's no concept of "I'm agent X, please route my tool calls to my tab". The result: in a multi-agent setup, only one agent can have a tab connected at a time, and the others fail with "No connection to browser extension".

Better Browser MCP gives each MCP process an explicit BROWSER_MCP_AGENT_ID. The WebSocket is served at /ws/<agentId>, and the browser extension binds tabs to specific agent IDs, so each agent gets its own dedicated tab (or, from v0.3.0, its own set of tabs).
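To make the path scheme concrete, here is a minimal sketch of a check that accepts only the agent's own path. The function name is hypothetical, and what the real server does with a mismatched path is not stated in this README.

```typescript
// Hypothetical sketch; the function name is an assumption, the "/ws" default is documented.
function isAgentPath(requestUrl: string, agentId: string, prefix: string = "/ws"): boolean {
  // Ignore any query string when comparing paths.
  const pathname = requestUrl.split("?")[0];
  return pathname === `${prefix}/${encodeURIComponent(agentId)}`;
}

const ownPath = isAgentPath("/ws/hermes", "hermes");  // true
const otherAgent = isAgentPath("/ws/omp", "hermes");  // false
const upstreamRoot = isAgentPath("/", "hermes");      // false
```

Upstream accepts a connection on any path, so a check of this kind is what lets the extension tell agents apart by URL alone.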

3. Recursion bug in server.close()

The upstream dist/index.js has server.close = async () => { await server.close(); ... } — it calls itself recursively, blowing the stack on every reconnect with RangeError: Maximum call stack size exceeded.

Better Browser MCP fixes this with explicit __origClose binding.
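The pattern behind that fix can be shown on a stand-in object. This is a sketch of the general technique (capture the original method before overriding it), not a copy of the project's code: `Closable`, `patchClose`, and the variable name `originalClose` are illustrative.

```typescript
// Hypothetical minimal server shape, just enough to show the pattern.
interface Closable {
  close: () => Promise<void>;
}

// Buggy pattern, as described above: the override looks up server.close at
// call time, which by then is the override itself, so it recurses forever.
//   server.close = async () => { await server.close(); /* cleanup */ };

function patchClose(server: Closable, cleanup: () => Promise<void> | void): void {
  // Capture the original method, bound to the server, before overriding it.
  const originalClose = server.close.bind(server);
  server.close = async () => {
    await originalClose();
    await cleanup();
  };
}

const calls: string[] = [];
const server: Closable = {
  close: async () => {
    calls.push("original close");
  },
};
patchClose(server, () => {
  calls.push("cleanup");
});
```

Calling `server.close()` now runs the original close once, then the cleanup, with no self-reference in the override.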


Development

```bash
# Install deps
npm install

# Typecheck
npm run typecheck

# Build (ESM via tsup)
npm run build

# Test (manual)
BROWSER_MCP_AGENT_ID=hermes BROWSER_MCP_PORT=9099 \
  npm start
# in another shell:
curl http://127.0.0.1:9099/
# {"name":"Better Browser MCP (agent: hermes)","bind":"127.0.0.1","port":9099, ...}
```

Project structure

```
src/
  config.ts           env var resolution, WS URL helpers
  context.ts          per-process Context (one WebSocket = one tab)
  messaging.ts        WS message protocol (inlined from upstream)
  server.ts           MCP server, tool routing
  tools/              tool implementations (navigate, click, etc.)
  utils.ts            helpers (wait, port check)
  ws.ts               WebSocket server with auth handshake
  index.ts            entry point
  types.ts            Zod schemas (inlined from upstream's monorepo)
```

Patched bugs from upstream

  • server.close() recursion (src/server.ts: in createServerWithTools): capture originalClose before override
  • killProcessOnPort murder (removed entirely — src/utils.ts: only isPortInUse remains)
  • Workspace monorepo deps (inlined into single-package repo)

Credits

Better Browser MCP is a fork of browsermcp/mcp with the multi-agent fixes needed for the Hermes + OMP + Codex workflow. The upstream project was itself adapted from Microsoft's Playwright MCP server.

By Nbiish — first repo, but probably not the last.
