journal-rag

Hybrid retrieval MCP server for searching team markdown journals using BM25 and local vector embeddings, with tools for search, browse, and regex lookup.


Source-control-friendly hybrid retrieval over team markdown journals. Heading-chunked BM25 + local vector embeddings fused via Reciprocal Rank Fusion (RRF), with regex as an escape hatch. Index built on startup with an optional gitignored JSON cache.

Embeddings run locally via @huggingface/transformers (default model: Xenova/all-MiniLM-L6-v2). No API keys are needed, and after the one-time model download on first run, no external calls are made.

Each consuming repo commits journal-rag.config.json and markdown under docs/journal/ (or other configured folders). This package is the shared engine.
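
To illustrate what "heading-chunked" can mean in practice, here is a minimal sketch that splits a markdown file into one chunk per heading. It is not the package's actual chunker: the Chunk shape and the chunkByHeading name are hypothetical, and the handling of fenced code blocks and pre-heading text is an assumption.

```typescript
// Hypothetical chunk shape; the real index format is not documented here.
interface Chunk {
  heading: string; // heading text, or "" for text before the first heading
  level: number;   // 1-6 for "#".."######", 0 for the preamble
  body: string;    // text under the heading, trimmed
}

function chunkByHeading(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk = { heading: "", level: 0, body: "" };
  let lines: string[] = [];
  let inFence = false;

  // Close the chunk being built; skip an empty preamble.
  const flush = (): void => {
    current.body = lines.join("\n").trim();
    if (current.heading !== "" || current.body !== "") chunks.push(current);
  };

  for (const line of markdown.split(/\r?\n/)) {
    // Toggle on fence lines so "# comment" inside code is not a heading.
    if (/^\s{0,3}(```|~~~)/.test(line)) inFence = !inFence;
    const m = inFence ? null : /^(#{1,6})\s+(.+?)\s*$/.exec(line);
    if (m) {
      flush();
      current = { heading: m[2], level: m[1].length, body: "" };
      lines = [];
    } else {
      lines.push(line);
    }
  }
  flush();
  return chunks;
}
```

Each chunk can then be indexed as its own BM25 document and embedded separately, so a hit points at a section rather than a whole file.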

Per-repo config

Create journal-rag.config.json at the repo root:

{
  "sources": ["docs/journal"],
  "cachePath": ".journal-rag/index.json",
  "embeddingModel": "Xenova/all-MiniLM-L6-v2"
}

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| sources | yes | (none) | Directories containing markdown journals |
| cachePath | no | .journal-rag/index.json | BM25 chunk index cache path |
| embeddingModel | no | Xenova/all-MiniLM-L6-v2 | Hugging Face model ID for local embeddings |

The vector cache (vectors.json) is stored in the same directory as cachePath.
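
The defaults and the vectors.json location described above can be sketched as follows. This is an illustration only: parseConfig, vectorCachePath, and the JournalRagConfig type are hypothetical names, and rejecting an empty sources array is an assumption rather than documented behavior.

```typescript
import * as path from "node:path";

// Hypothetical type mirroring the documented fields.
interface JournalRagConfig {
  sources: string[];
  cachePath: string;
  embeddingModel: string;
}

// Defaults taken from the field table.
const DEFAULTS = {
  cachePath: ".journal-rag/index.json",
  embeddingModel: "Xenova/all-MiniLM-L6-v2",
};

function parseConfig(jsonText: string): JournalRagConfig {
  const raw = JSON.parse(jsonText) as Partial<JournalRagConfig>;
  const sources = raw.sources;
  // Assumption: "sources" must be a non-empty array of strings.
  if (
    !Array.isArray(sources) ||
    sources.length === 0 ||
    !sources.every((s) => typeof s === "string")
  ) {
    throw new Error('journal-rag.config.json: "sources" must be a non-empty array of strings');
  }
  return {
    sources,
    cachePath: raw.cachePath ?? DEFAULTS.cachePath,
    embeddingModel: raw.embeddingModel ?? DEFAULTS.embeddingModel,
  };
}

// The vector cache sits next to the BM25 cache file.
function vectorCachePath(config: JournalRagConfig): string {
  return path.join(path.dirname(config.cachePath), "vectors.json");
}
```

With only sources set, this yields the default cachePath and a vector cache at .journal-rag/vectors.json, which the .gitignore entry below covers.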

Add to .gitignore:

.journal-rag/

Build & install (once per machine)

cd c:/repos/journal-rag
npm install          # runs prepare → build
npm link             # puts journal + journal-mcp on your PATH

npm link registers two global commands:

| Command | What it runs |
| --- | --- |
| journal | CLI (search, list, get, …) |
| journal-mcp | MCP stdio server (for editor config) |

Re-run npm run build (or npm link again) after pulling server changes. Alternative to link: npm install -g . from this repo (same effect).

CLI (any teammate)

From a repo root with config:

journal search "HttpFacade singleton"        # hybrid BM25 + vector (default)
journal search "HttpFacade singleton" --bm25 # BM25-only (no embedding)
journal list --filter dialog
journal get docs/journal/2026-04-21_vapp-http-facade-and-singleton-sweep.md
journal index --rebuild

After npm link in this repo, journal search "..." works globally.

Set JOURNAL_RAG_WORKSPACE to an absolute repo root only when you must run the CLI from a subdirectory.

The first run downloads the embedding model (~80 MB) to the Hugging Face cache directory. Subsequent runs load from cache.

MCP tools

| Tool | Purpose |
| --- | --- |
| search_journal | Hybrid BM25 + vector search with RRF fusion (query, k). Falls back to BM25-only if vector index is unavailable. |
| get_entry | Full file by path or filename |
| list_entries | Browse metadata (filter optional) |
| search_regex | Exact / path / symbol lookup |

Editor setup

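Use the stdio transport: the editor spawns the server as a child process, either through the journal-mcp command or by running dist/server.js with Node.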

Put MCP config in the workspace, not your user profile

The server resolves journal-rag.config.json by walking up from its working directory. That file lives at each consuming repo's root (next to docs/journal/), not in journal-rag itself.

If you add the server to a global or user-level editor profile, the spawn cwd is usually wrong (home directory, editor install directory, whichever folder was last opened, etc.), and the server cannot find the config. Hardcoding "cwd": "C:/repos/my-repo" does not help either: it breaks the moment you open a second repo workspace.

Do this instead: commit workspace-level MCP config inside each repo that has journals. Teammates run npm link once (see above) so that journal-mcp is on PATH, which keeps machine-specific paths out of the committed JSON.
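
The walk-up lookup explains why cwd matters so much. A minimal sketch of it, under stated assumptions: findWorkspaceRoot is a hypothetical name, the exists parameter is injected only so the logic can be exercised without touching disk, and giving JOURNAL_RAG_WORKSPACE precedence over the walk-up is an assumption about ordering, not confirmed behavior.

```typescript
import { existsSync } from "node:fs";
import * as path from "node:path";

const CONFIG_NAME = "journal-rag.config.json";

function findWorkspaceRoot(
  startDir: string,
  env: Record<string, string | undefined> = process.env,
  exists: (p: string) => boolean = existsSync,
): string | undefined {
  // Assumption: an absolute JOURNAL_RAG_WORKSPACE wins over the walk-up.
  const override = env.JOURNAL_RAG_WORKSPACE;
  if (override && path.isAbsolute(override)) return override;

  let dir = path.resolve(startDir);
  for (;;) {
    if (exists(path.join(dir, CONFIG_NAME))) return dir;
    const parent = path.dirname(dir);
    if (parent === dir) return undefined; // reached the filesystem root
    dir = parent;
  }
}
```

Started from a home or editor install directory, a lookup like this never passes through the repo, so it returns nothing; started from anywhere inside the repo, it finds the root.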

Cursor

.cursor/mcp.json at the repo root (e.g. my-repo/.cursor/mcp.json) — safe to commit:

{
  "mcpServers": {
    "journal": {
      "command": "journal-mcp",
      "cwd": "${workspaceFolder}",
      "env": {
        "JOURNAL_RAG_WORKSPACE": "${workspaceFolder}"
      }
    }
  }
}

${workspaceFolder} resolves to the repo you opened. journal-mcp comes from npm link in the journal-rag repo.

VS Code (Copilot agent mode)

Same idea: .vscode/mcp.json in the repo, not User settings:

{
  "servers": {
    "journal": {
      "type": "stdio",
      "command": "journal-mcp",
      "cwd": "${workspaceFolder}"
    }
  }
}

JetBrains AI Assistant / Junie

Configure MCP at project scope (.idea / project settings), not the IDE default profile. Open the repo as the project root. Command: journal-mcp (after npm link).

If journal-mcp is not found

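Ensure npm's global bin directory is on your PATH. On npm 8 and earlier, npm bin -g prints it; that command was removed in npm 9, where npm prefix -g prints the global prefix instead (the bin directory is the prefix itself on Windows and <prefix>/bin elsewhere). On Windows that is usually %APPDATA%\npm. Then re-run npm link from journal-rag. Fallback for a single machine only: "command": "node", "args": ["<absolute-path>/journal-rag/dist/server.js"].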

Fallback

If an editor cannot set cwd per workspace, set the JOURNAL_RAG_WORKSPACE environment variable to the absolute path of the consuming repo root in that workspace's MCP config.

Design notes

  • Corpus is small (~tens of files); BM25 over heading chunks matches how journals are written.
  • Vector embeddings (local, via Transformers.js) add semantic recall for paraphrased or conceptual queries.
  • Reciprocal Rank Fusion (RRF, k=60) merges BM25 and vector rankings without needing score normalization.
  • Index caches are optional and gitignored; markdown in git is the source of truth.
  • Vector cache is incremental — only new/changed chunks are re-embedded on rebuild.
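
For readers unfamiliar with RRF, the fusion step can be sketched as below. This follows the standard formula (each list contributes 1 / (k + rank) per document, with rank starting at 1); the function name and the use of plain string ids are illustrative assumptions, not the package's actual code.

```typescript
// Fuse several ranked lists of ids (best first) into one ranking.
// Only ranks are used, so BM25 scores and cosine similarities never
// need to be put on a common scale.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores, ([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Example: "b" is ranked high by both lists, so it comes out on top.
const fused = reciprocalRankFusion([
  ["a", "b", "c"], // hypothetical BM25 order
  ["b", "d", "a"], // hypothetical vector order
]);
console.log(fused.map((r) => r.id).join(",")); // prints "b,a,d,c"
```

Passing only the BM25 list leaves its order unchanged, which is one way a BM25-only fallback could reuse the same code path.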
