opengrok-go-mcp

Source-code intelligence for agents, powered by OpenGrok and Go.

<div align="center">

<img src="assets/groky.png" alt="opengrok-go-mcp mascot" width="400" />

opengrok-go-mcp

Source-code intelligence for agents, powered by OpenGrok and Go.

I have seen things you people wouldn’t grep. — Senior Architect, probably.

<br/>


</div>


Agent-oriented MCP server for searching, navigating, and reading code through OpenGrok.

It turns OpenGrok into a safer code-intelligence surface for LLM agents:

  • capability-gated tools that only appear when the backing OpenGrok feature works
  • paginated search and file reads with stable cursors
  • citation URLs on code results so answers can point back to source
  • warnings for broad, heuristic, truncated, or best-effort results (warnings[] codes plus a legacy warning string)
  • automatic context expansion around search hits with explicit limits
  • full, compact, and experimental gateway tool surfaces for different agent styles
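To make "context expansion with explicit limits" concrete, here is a minimal sketch of how a bounded window around a search hit could be computed. The function name and signature are hypothetical and are not taken from the server's code; the 5/10 values simply mirror the defaults listed for OPENGROK_MCP_CONTEXT_BEFORE / AFTER in the configuration table.

```go
package main

import "fmt"

// contextWindow returns the inclusive, 1-based line range to show around a
// search hit, clamped to the bounds of the file.
// Hypothetical helper for illustration; not the server's API.
func contextWindow(hitLine, before, after, totalLines int) (start, end int) {
	start = hitLine - before
	if start < 1 {
		start = 1
	}
	end = hitLine + after
	if end > totalLines {
		end = totalLines
	}
	return start, end
}

func main() {
	// A hit on line 3 of a 200-line file with a 5-before / 10-after window.
	start, end := contextWindow(3, 5, 10, 200)
	fmt.Printf("lines %d-%d\n", start, end) // lines 1-13
}
```

Clamping at both ends keeps the expanded snippet inside the file, which is one simple way an expansion step can stay within a fixed line budget.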

If you are an AI agent reading this repository, start with AGENTS.md for project constraints and agent workflow guidance.

Pre-1.0 note: this MCP server is still evolving. Some tools, responses, and configuration paths may be broken or change before a stable 1.0 release. Please report issues using docs/reporting-issues.md.

When To Use It

Use opengrok-go-mcp when you want an agent to investigate a large indexed codebase without cloning it locally. It works best for finding symbols, reading files, tracing references, narrowing broad searches, and producing answers with source citations.

It is intentionally honest about OpenGrok's limits. OpenGrok provides full-text search plus ctags definitions, not a full semantic call graph or AST engine. For structural questions, use this server to find the right files and symbols, then verify relationships with language-aware tools when needed.

Client Setup

Required: OPENGROK_MCP_BASE_URL — OpenGrok API base URL ending in /api/v1. On startup the server discovers projects, probes capabilities, and registers only tools that work. A typical reverse-proxied instance needs nothing else.
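If you want to sanity-check the URL before wiring it into a client, a small pre-flight check along these lines may help. This is a sketch under assumptions: checkBaseURL is a hypothetical helper, not something the server ships, and it is strict about the /api/v1 suffix because the README does not say whether a trailing slash is tolerated.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// checkBaseURL verifies that raw looks like an OpenGrok API base URL:
// an http(s) URL with a host and a path ending in /api/v1.
// Hypothetical pre-flight helper; not part of opengrok-go-mcp.
func checkBaseURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse %q: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("%q: scheme must be http or https", raw)
	}
	if u.Host == "" {
		return fmt.Errorf("%q: missing host", raw)
	}
	if !strings.HasSuffix(u.Path, "/api/v1") {
		return fmt.Errorf("%q: path must end in /api/v1", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{
		"https://your-opengrok-host/source/api/v1",
		"https://your-opengrok-host/source",
	} {
		fmt.Printf("%s -> %v\n", raw, checkBaseURL(raw))
	}
}
```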

Copy a block below, replace the URL, restart the client.

The server defaults to OPENGROK_MCP_AGENT_PROFILE=economy (lean payloads, no auto context expansion). Set OPENGROK_MCP_AGENT_PROFILE=rich when you want expanded search context by default.

<details> <summary><strong>Released binary</strong> (no Go install)</summary>

Download the archive for your OS/arch from GitHub Releases, verify checksums.txt, and point your MCP client at the unpacked opengrok-go-mcp binary. Example (Claude Code):

{
  "mcpServers": {
    "opengrok": {
      "command": "/path/to/opengrok-go-mcp",
      "env": {
        "OPENGROK_MCP_BASE_URL": "https://your-opengrok-host/source/api/v1"
      }
    }
  }
}

</details>

<details open> <summary><strong>Claude Code</strong> (<code>go run</code>)</summary>

Add to ~/.claude.json under mcpServers, or run claude mcp add:

{
  "mcpServers": {
    "opengrok": {
      "command": "go",
      "args": [
        "run",
        "github.com/rokasklive/opengrok-go-mcp/cmd/opengrok-go-mcp@v0.4.0"
      ],
      "env": {
        "OPENGROK_MCP_BASE_URL": "https://your-opengrok-host/source/api/v1"
      }
    }
  }
}

</details>

<details> <summary><strong>OpenCode</strong> (<code>go run</code>)</summary>

Add to opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "opengrok": {
      "type": "local",
      "enabled": true,
      "command": [
        "go",
        "run",
        "github.com/rokasklive/opengrok-go-mcp/cmd/opengrok-go-mcp@v0.4.0"
      ],
      "environment": {
        "OPENGROK_MCP_BASE_URL": "https://your-opengrok-host/source/api/v1"
      }
    }
  }
}

</details>

<details> <summary><strong>Codex</strong></summary>

Add to .codex/config.toml in the project root or ~/.codex/config.toml:

[mcp_servers.opengrok]
command = "go"
args = ["run", "github.com/rokasklive/opengrok-go-mcp/cmd/opengrok-go-mcp@v0.4.0"]

[mcp_servers.opengrok.env]
OPENGROK_MCP_BASE_URL = "https://your-opengrok-host/source/api/v1"

</details>

<details> <summary><strong>Other clients</strong> (Cursor, VS Code MCP, …)</summary>

Most stdio MCP clients use the same shape as Claude Code — command, args, and an env map (name may vary: environment, env):

{
  "mcpServers": {
    "opengrok": {
      "command": "go",
      "args": [
        "run",
        "github.com/rokasklive/opengrok-go-mcp/cmd/opengrok-go-mcp@v0.4.0"
      ],
      "env": {
        "OPENGROK_MCP_BASE_URL": "https://your-opengrok-host/source/api/v1"
      }
    }
  }
}

Cursor: project .cursor/mcp.json or Cursor Settings → MCP. VS Code: MCP extension config with the same server entry.

</details>

<details> <summary><strong>Local clone</strong> (replace <code>go run</code> package path)</summary>

Use the same env / environment block as above. Example command for Claude Code:

"command": "sh",
"args": [
  "-c",
  "cd /path/to/opengrok-go-mcp && go run ./cmd/opengrok-go-mcp --read-timeout=30s --write-timeout=30s"
]

Optional HTTP mode from a clone:

OPENGROK_MCP_TRANSPORT=http \
OPENGROK_MCP_BASE_URL=https://your-opengrok-host/source/api/v1 \
go run ./cmd/opengrok-go-mcp

MCP endpoint: http://127.0.0.1:8765/mcp

</details>

Common environment variables

| Variable | Required | When to set |
| --- | --- | --- |
| `OPENGROK_MCP_BASE_URL` | yes | OpenGrok API URL ending in `/api/v1` |
| `OPENGROK_MCP_API_TOKEN` | no | Auth required, or startup logs 401/403 on probes. Full `Authorization` value: `Bearer <token>` or `Basic <credentials>`. Never logged. |
| `OPENGROK_MCP_DEFAULT_PROJECT` | no | Multiple projects and you want one implicit on calls that omit `project`. Auto-set when exactly one project is discovered. |

If search probes return 401/403 without a token, the server still starts and logs remediation; search tools stay gated until OPENGROK_MCP_API_TOKEN is set.
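Because OPENGROK_MCP_API_TOKEN holds the full Authorization value rather than a bare token, it can help to build that string once and paste it into the client's env block. The sketch below assumes standard HTTP semantics (Basic credentials are base64 of user:password per RFC 7617); the helper names and the credentials are placeholders, not part of the server.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// bearerValue formats a token as a full Authorization header value.
func bearerValue(token string) string {
	return "Bearer " + token
}

// basicValue formats user:password as HTTP Basic credentials (RFC 7617).
func basicValue(user, password string) string {
	creds := base64.StdEncoding.EncodeToString([]byte(user + ":" + password))
	return "Basic " + creds
}

func main() {
	// Placeholder credentials for illustration only; never commit real ones.
	fmt.Println(bearerValue("example-token"))
	fmt.Println(basicValue("alice", "example-password"))
}
```

Either printed line is what would go into OPENGROK_MCP_API_TOKEN, prefix included.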

<details> <summary><strong>All environment variables</strong></summary>

| Variable | Default | Purpose |
| --- | --- | --- |
| `OPENGROK_MCP_WEB_BASE_URL` | derived | Web UI base for citations and raw-file fallback |
| `OPENGROK_MCP_PROJECTS` | | Comma-separated allowlist; skips API/scrape discovery |
| `OPENGROK_MCP_DISABLE_PROJECT_SCRAPE` | `false` | Skip web-UI scrape when `/projects/indexed` fails |
| `OPENGROK_MCP_PROJECT_REQUIRED` | `true` | Require `project` on tool calls |
| `OPENGROK_MCP_PROBE_FILE` | | `project/path` for file-read capability probe |
| `OPENGROK_MCP_TRANSPORT` | `stdio` | `http` for Streamable HTTP (`127.0.0.1:8765/mcp`) |
| `OPENGROK_MCP_LISTEN` | `127.0.0.1:8765` | HTTP listen address |
| `OPENGROK_MCP_TOOL_SURFACE` | `compact` | `full` (fine-grained tools) or `gateway` (experimental) |
| `OPENGROK_MCP_AGENT_PROFILE` | `economy` | `rich` for expanded search context and per-result links by default. Per-call `expand_context` / `response_mode` / `include_links` still override |
| `OPENGROK_MCP_MEMORY_ENABLED` | `true` | Process-scoped memory tools on the full surface only (stdio). Disabled over HTTP regardless of this setting |
| `OPENGROK_MCP_INSECURE_SKIP_TLS_VERIFY` | `false` | Trusted internal hosts with broken TLS only |
| `OPENGROK_MCP_CURSOR_SECRET` | | HMAC secret for signed pagination cursors |
| `OPENGROK_MCP_AUTO_EXPAND_CONTEXT` | `true` | Auto-expand context around search hits |
| `OPENGROK_MCP_CONTEXT_BEFORE` / `AFTER` | `5` / `10` | Expansion window lines |
| `OPENGROK_MCP_MAX_EXPANDED_RESULTS` | `10` | Max results to expand |
| `OPENGROK_MCP_MAX_EXPANDED_FILES` | `5` | Max files fetched during expansion |
| `OPENGROK_MCP_CONTEXT_FETCH_CONCURRENCY` | `3` | Parallel fetches during expansion |
| `OPENGROK_MCP_RETRY_MAX_ATTEMPTS` | `2` | Retries for transient OpenGrok errors |
| `OPENGROK_MCP_RETRY_BASE_DELAY` | `200ms` | Retry backoff base |
| `OPENGROK_MCP_CACHE_ENABLED` | `false` | In-process response cache |
| `OPENGROK_MCP_CACHE_TTL` | `5m` | Cache entry lifetime |
| `OPENGROK_MCP_CACHE_MAX_SIZE` | `1000` | Max cache entries |
| `DEBUG` | `false` | `1` logs OpenGrok HTTP requests to stderr |

Context budget overrides: OPENGROK_MCP_BUDGET_{MINIMAL|DEFAULT|MAXIMAL}_{BEFORE|AFTER|RESULTS|FILES}.
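The brace pattern above expands to twelve variable names. A short sketch that enumerates them, in case you want to generate a template env block (the helper name is hypothetical):

```go
package main

import "fmt"

// budgetVarNames expands OPENGROK_MCP_BUDGET_{tier}_{field} into every
// concrete variable name implied by the pattern.
func budgetVarNames() []string {
	tiers := []string{"MINIMAL", "DEFAULT", "MAXIMAL"}
	fields := []string{"BEFORE", "AFTER", "RESULTS", "FILES"}
	names := make([]string, 0, len(tiers)*len(fields))
	for _, t := range tiers {
		for _, f := range fields {
			names = append(names, "OPENGROK_MCP_BUDGET_"+t+"_"+f)
		}
	}
	return names
}

func main() {
	for _, n := range budgetVarNames() {
		fmt.Println(n)
	}
}
```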

Deprecated: OPENGROK_MCP_PROJECT_SCRAPE (use OPENGROK_MCP_DISABLE_PROJECT_SCRAPE). Removed: OPENGROK_MCP_BASIC_AUTH_TOKEN (use OPENGROK_MCP_API_TOKEN="Basic …").

Full reference: docs/configuration.md.

</details>

Security

Avoid passing secrets as CLI flags. Use OPENGROK_MCP_API_TOKEN for OpenGrok auth; the server never logs token values.

Operational caveats:

  • HTTP mode does not add inbound client authentication. Keep the default loopback bind address or put it behind trusted network/auth controls.
  • OPENGROK_MCP_INSECURE_SKIP_TLS_VERIFY=true is only for controlled internal instances with broken certificates. Do not use it for public or untrusted hosts.
  • Raw file fallback uses OPENGROK_MCP_WEB_BASE_URL with the same configured credentials. Treat that URL as part of the trusted OpenGrok boundary.
  • Memory tools are full-surface only (stdio). They are process-scoped, ephemeral, and disabled over HTTP because memory is not isolated by client session.
  • Set OPENGROK_MCP_CURSOR_SECRET for shared deployments if cursor integrity matters.
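For readers unfamiliar with signed cursors, the sketch below shows the general technique: the cursor carries its payload plus an HMAC-SHA256 tag, and the server rejects any cursor whose tag does not verify under its secret. The "payload.signature" layout, the function names, and the example payload are all illustrative assumptions; the server's actual cursor format is not documented in this README.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
	"strings"
)

// sign returns "payload.signature", both parts base64url encoded.
// Illustrative layout only; not the server's cursor format.
func sign(secret []byte, payload string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(payload))
	enc := base64.RawURLEncoding
	return enc.EncodeToString([]byte(payload)) + "." + enc.EncodeToString(mac.Sum(nil))
}

// verify checks the signature in constant time and returns the payload.
func verify(secret []byte, cursor string) (string, error) {
	body, sig, ok := strings.Cut(cursor, ".")
	if !ok {
		return "", errors.New("malformed cursor")
	}
	enc := base64.RawURLEncoding
	payload, err := enc.DecodeString(body)
	if err != nil {
		return "", errors.New("malformed cursor payload")
	}
	got, err := enc.DecodeString(sig)
	if err != nil {
		return "", errors.New("malformed cursor signature")
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(payload)
	if !hmac.Equal(got, mac.Sum(nil)) {
		return "", errors.New("cursor signature mismatch")
	}
	return string(payload), nil
}

func main() {
	secret := []byte("example-secret") // placeholder, not a real secret
	cursor := sign(secret, "offset=50")

	payload, err := verify(secret, cursor)
	fmt.Println(payload, err)

	_, err = verify([]byte("other-secret"), cursor)
	fmt.Println(err)
}
```

In a shared deployment, a fixed secret is what would let a cursor be checked for tampering when a client hands it back.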

Evaluation

Hermetic stdio evals live in evals/: they run the real MCP binary against a fake OpenGrok backend, so no live instance is needed. CI runs them on every PR (ci.yml). The README summaries and Δ columns compare against the committed baselines in evals/baselines/. Refresh them locally with scripts/update-eval-results.sh; the optional pre-push hook (scripts/install-githooks.sh) runs the tests and updates those files before you push.

go test ./evals/ -count=1                        # contract + token benchmark
go test ./evals/ -run TestEvalSuite -count=1      # MCP contract only
go test ./evals/ -run TestTokenBenchmark -count=1 # token economy only

How to read the tables below

  • Contract eval: hermetic MCP calls against a fake OpenGrok backend; checks outputs, errors, and pagination fields. Δ is change vs the committed baseline in evals/baselines/. coverage@K is the fraction of eval cases exercised.
  • Token benchmark: same harness, but measures UTF-8 bytes crossing the MCP wire (tool schemas, requests, responses). Est. tokens = bytes ÷ 4 (rough heuristic, not a specific model tokenizer).
  • Surface: full (fine-grained tools), compact (4 consolidated tools, default), or gateway (experimental discover + call).
  • ListTools: one-time cost when the client loads the tool list and schemas at session start; usually the largest line item on full.
  • Warm total: ListTools plus all tool calls in a scenario (request + response bytes). For gateway, warm excludes the one-time opengrok_discover call; cold includes it (first session only). On full and compact, cold = warm.
  • Min–max: range across the four replay scenarios (symbol lookup, file browse, multi-step symbol search, search-and-read). Per-scenario breakdown is in the collapsed table.
  • Δ on token rows: change in estimated tokens vs the last committed baseline (evals/baselines/token_report.json).
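The arithmetic behind these definitions can be sketched in a few lines. The byte counts below are made up for illustration and are not benchmark data; the type and function names are hypothetical, and integer division is an assumption about how the ÷ 4 heuristic is rounded.

```go
package main

import "fmt"

// call holds the bytes sent and received for one tool call.
type call struct {
	requestBytes  int
	responseBytes int
}

// estTokens applies the bytes ÷ 4 heuristic using integer division.
func estTokens(n int) int {
	return n / 4
}

// warmTotalBytes adds the one-time ListTools cost to the request and
// response bytes of every call in a scenario.
func warmTotalBytes(listToolsBytes int, calls []call) int {
	total := listToolsBytes
	for _, c := range calls {
		total += c.requestBytes + c.responseBytes
	}
	return total
}

func main() {
	// Made-up numbers: 14000 bytes of tool schemas plus two calls.
	calls := []call{{200, 1800}, {240, 1360}}
	total := warmTotalBytes(14000, calls)
	fmt.Println(total, estTokens(total)) // 17600 4400
}
```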

<!-- EVAL-RESULTS START -->

Contract eval

Last run: 2026-06-24 · direct-call · harness docs →

10/10 passed · 100% (Δ ±0) · 100% coverage@K — see How to read the tables for Δ and coverage@K.

<details> <summary>Per-tool scores</summary>

| Tool | Score | Cases |
| --- | --- | --- |
| get_file_context | 100% (Δ ±0) | 1 |
| list_projects | 100% (Δ ±0) | 1 |
| list_symbols | 100% (Δ ±0) | 1 |
| read_file | 100% (Δ ±0) | 1 |
| search_code | 100% (Δ ±0) | 4 |
| search_symbol_definitions | 100% (Δ ±0) | 1 |
| search_symbol_references | 100% (Δ ±0) | 1 |

</details>

Token economy benchmark

Last run: 2026-06-24 · deterministic-replay · est. tokens = bytes÷4 (heuristic, not model-exact)

ListTools dominates session cost on the full surface (18 tools). Compact (4) and gateway (2) register far fewer schemas.

| Surface | ListTools (est. tokens) | Warm total min–max (est. tokens) |
| --- | --- | --- |
| full | 14k (Δ ±0) | 14k–15k (Δ ±0) |
| compact | 3.5k (Δ ±0) | 4.4k–6.8k (Δ ±0) |
| gateway | 261 (Δ ±0) | 1.2k–2.1k (Δ ±0) |

Warm = ListTools + scenario tool traffic. Gateway warm omits one-time discover; full/compact cold = warm. Compact file-exploration skips files.list (no compact op).

<details> <summary>Per-scenario warm totals (est. tokens; ListTools + calls)</summary>

| Scenario | full | compact | gateway |
| --- | --- | --- | --- |
| Compound symbol | 15k (Δ ±0) | 4.8k (Δ ±0) | 1.6k (Δ ±0) |
| File exploration | 14k (Δ ±0) | 4.4k (Δ ±0) | 1.2k (Δ ±0) |
| Symbol investigation (3 calls) | 15k (Δ ±0) | 5.3k (Δ ±0) | 2.1k (Δ ±0) |
| Search + read | 15k (Δ ±0) | 4.5k (Δ ±0) | 1.3k (Δ ±0) |

</details>

Δ vs baseline from 2026-06-24.

<!-- EVAL-RESULTS END -->

Development

This project uses GitHub Spec Kit for non-trivial feature planning.

For meaningful behavior changes, new MCP tools, schema changes, configuration changes, or changes that affect agent-facing behavior, contributors should start from the project constitution:

  • .specify/memory/constitution.md

Feature work should generally produce:

  • specs/FEATURE/spec.md
  • specs/FEATURE/plan.md
  • specs/FEATURE/tasks.md

Small bug fixes, documentation edits, dependency bumps, and mechanical refactors do not require a full Spec Kit workflow unless they affect the public MCP contract.

All changes must preserve the MCP contract, OpenGrok semantics, security posture, compatibility expectations, and documentation requirements described in the constitution.

Known Limitations

  • Large project traversal is bounded, and some search and discovery operations are best-effort rather than language-semantic.
  • HTTP transport is intended for controlled local or internal setups and does not add inbound client authentication.
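Since HTTP mode has no inbound authentication, one cheap safeguard is to check that the listen address stays on loopback before exposing it. The sketch below is a hypothetical deployment-side check, not something the server performs; it treats anything it cannot parse as non-loopback.

```go
package main

import (
	"fmt"
	"net"
)

// isLoopbackListen reports whether a host:port listen address binds only
// to a loopback interface. Hypothetical helper; not part of the server.
func isLoopbackListen(addr string) bool {
	host, _, err := net.SplitHostPort(addr)
	if err != nil {
		return false
	}
	if host == "localhost" {
		return true
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	for _, addr := range []string{"127.0.0.1:8765", "0.0.0.0:8765", ":8765"} {
		fmt.Printf("%-16s loopback=%v\n", addr, isLoopbackListen(addr))
	}
}
```

An empty host such as ":8765" binds every interface, so the check reports it as non-loopback.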

See docs/limitations.md for the detailed current list, behavioral impact, and mitigations.

License

opengrok-go-mcp is licensed under the Apache License 2.0 (Apache-2.0) for new releases starting with v0.3.0-beta.2.

Earlier published releases up to and including v0.3.0-beta.1 were released under CC0-1.0 and remain available under those terms.
