LODA MCP Server

Provides token-efficient document search and retrieval for LLMs by returning the most relevant document sections within a specified token budget. It uses section-aware parsing and Bloom filter elimination to give fast access to large documents without external services such as a vector database.

LLM-Optimized Document Access - A Model Context Protocol server for token-efficient document search in Claude Desktop and Claude Code.


What is LODA?

LODA (LLM-Optimized Document Access) is a search strategy designed specifically for how LLMs consume documents. Instead of returning raw matches or arbitrary chunks, LODA understands document structure and returns the most relevant sections within your token budget.

The Problem

When LLMs work with large documents, they face a fundamental challenge:

| Traditional Approach | Problem |
|---|---|
| Load entire document | Exceeds context limits |
| Keyword search | No relevance ranking, returns too much |
| RAG/Vector search | Requires infrastructure, 200-500ms latency |
| Chunk-based retrieval | Arbitrary boundaries break coherence |

We also found a "gap zone" at 25-35% document positions, where traditional smart retrieval performed worse than brute-force loading because its overhead exceeded the cost of simply loading the text.

The Solution

LODA combines lightweight techniques to achieve vector search quality at grep-like speeds:

┌─────────────────┐     ┌──────────────────────┐     ┌─────────────────┐
│  Large Document │────▶│  LODA Search Engine  │────▶│ Relevant Sections│
│   (5000+ lines) │     │  • Bloom Filters     │     │  within budget   │
│                 │     │  • Token Budget      │     │  (~200 tokens)   │
│                 │     │  • Relevance Scoring │     │                  │
└─────────────────┘     │  • Smart Caching     │     └─────────────────┘
                        └──────────────────────┘

Results:

  • 70-95% token savings compared to loading the full document
  • 1-5ms search latency (cached) vs 200-500ms for RAG
  • Zero external dependencies - no vector database needed

Quick Start

1. Installation

git clone https://github.com/patrickkarle/loda-mcp-server.git
cd loda-mcp-server
npm install

2. Configure Claude Desktop

Find your config file:

  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

Add this to the file:

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/full/path/to/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

3. Configure Claude Code

Add to your project's .claude/settings.json or global ~/.claude/settings.json:

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/full/path/to/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

4. Use It!

Ask Claude:

"Use loda_search to find the authentication section in api-docs.md"

"Search architecture.md for deployment instructions with a 500 token budget"


How LODA Works

1. Bloom Filter Elimination

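Before scoring, LODA uses Bloom filters to eliminate sections that definitely don't contain your search terms. A Bloom filter can report false positives but never false negatives, so no matching section is lost. Each membership check is O(1) per section, and this step typically eliminates 80%+ of sections.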
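
The elimination step can be illustrated with a minimal sketch. This is not the code in bloom_filter.js: the names BloomFilter, mightContain, tokenize, buildFilter, and eliminate are hypothetical, as are the filter size (1024 bits), the three hash functions, and the choice to keep a section when it may contain at least one query term.

```javascript
// Minimal Bloom filter sketch (hypothetical; not LODA's actual implementation).
class BloomFilter {
  constructor(bitCount = 1024, hashCount = 3) {
    this.bitCount = bitCount;
    this.hashCount = hashCount;
    this.bits = new Uint8Array(Math.ceil(bitCount / 8));
  }

  // FNV-1a style hash; the seed varies the result to simulate several hashes.
  hash(term, seed) {
    let h = (0x811c9dc5 ^ seed) >>> 0;
    for (let i = 0; i < term.length; i++) {
      h ^= term.charCodeAt(i);
      h = Math.imul(h, 0x01000193) >>> 0;
    }
    return h % this.bitCount;
  }

  add(term) {
    for (let seed = 0; seed < this.hashCount; seed++) {
      const bit = this.hash(term, seed);
      this.bits[bit >> 3] |= 1 << (bit & 7);
    }
  }

  // false => the term is definitely absent; true => the term may be present.
  mightContain(term) {
    for (let seed = 0; seed < this.hashCount; seed++) {
      const bit = this.hash(term, seed);
      if ((this.bits[bit >> 3] & (1 << (bit & 7))) === 0) return false;
    }
    return true;
  }
}

// Lowercase alphanumeric words (an assumed tokenization).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build one filter per section from its text.
function buildFilter(sectionText) {
  const filter = new BloomFilter();
  for (const word of tokenize(sectionText)) filter.add(word);
  return filter;
}

// Keep only sections whose filter may contain at least one query term.
function eliminate(sections, query) {
  const terms = tokenize(query);
  return sections.filter((s) => terms.some((t) => s.filter.mightContain(t)));
}
```

Because the filter is built once per section and cached, each later query only pays for a few bit lookups per section before any scoring happens.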

2. Section-Aware Parsing

LODA respects your document's structure. It understands markdown headings and returns complete logical sections, not arbitrary text chunks.
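
As a rough illustration of section-aware parsing, the sketch below splits markdown text on ATX headings ("#" to "######") and records each section's level and 1-based line range. The function name parseSections and the exact output fields are assumptions modeled on the example response in the API Reference, not LODA's actual parser. Text before the first heading is ignored in this sketch.

```javascript
// Split markdown into heading-delimited sections (hypothetical sketch).
function parseSections(markdown) {
  const lines = markdown.split(/\r?\n/);
  const sections = [];
  let current = null;
  let inFence = false;

  lines.forEach((line, index) => {
    // Track fenced code blocks so '#' lines inside them are not headings.
    if (/^\s*(```|~~~)/.test(line)) inFence = !inFence;
    const match = inFence ? null : /^(#{1,6})\s+(.*\S)\s*$/.exec(line);

    if (match) {
      if (current) sections.push(current);
      current = {
        id: `section-${sections.length + 1}`,
        header: match[2],
        level: match[1].length,
        lineRange: [index + 1, index + 1], // 1-based, inclusive
        content: "",
      };
    } else if (current) {
      current.content += line + "\n";
    }
    // Extend the open section to cover the current line.
    if (current) current.lineRange[1] = index + 1;
  });

  if (current) sections.push(current);
  return sections;
}
```

Returning whole sections with their line ranges is what lets a caller follow up with a line-based read of exactly the same region.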

3. Relevance Scoring

Each candidate section is scored based on:

  • Query term presence in content (0.8 base score)
  • Header match bonus (+0.2 for header matches)
  • Multi-term coverage (all terms weighted equally)
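
One way to read these weights is sketched below. The exact formula is an assumption: the README gives the 0.8 and 0.2 weights but not how they combine with multi-term coverage, so this sketch scales each weight by the fraction of query terms found. The names tokenize and scoreSection are hypothetical.

```javascript
// Lowercase alphanumeric words (an assumed tokenization).
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Score = 0.8 * (share of terms in content) + 0.2 * (share of terms in header).
// This combination is an assumption, not LODA's documented formula.
function scoreSection(section, query) {
  const terms = [...new Set(tokenize(query))];
  if (terms.length === 0) return 0;

  const contentWords = new Set(tokenize(section.content));
  const headerWords = new Set(tokenize(section.header));

  // Every term carries the same weight.
  const contentHits = terms.filter((t) => contentWords.has(t)).length;
  const headerHits = terms.filter((t) => headerWords.has(t)).length;

  const score =
    0.8 * (contentHits / terms.length) + 0.2 * (headerHits / terms.length);
  return Math.round(score * 100) / 100; // two decimals, avoids float noise
}
```

Under this reading, a section that contains every term in both its header and its body scores 1.0, and a section with no matching term scores 0.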

4. Token Budget Selection

You specify a token budget, and LODA returns the best-scoring sections that fit:

// "I need info about auth, but only have 500 tokens of context"
{
  query: "authentication",
  contextBudget: 500
}
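
A greedy selection that respects the budget might look like the following sketch. It assumes sections arrive with a score and either a tokenEstimate or raw content, uses the roughly 4 characters per token heuristic mentioned under Architecture, and always returns the top section even when it alone exceeds the budget (as the EXCEEDED status describes). The names estimateTokens and selectWithinBudget are hypothetical, and skipping an oversized section to try smaller ones is an assumption.

```javascript
// Rough token estimate: about 4 characters per token.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Pick the highest-scoring sections that fit the budget (hypothetical sketch).
function selectWithinBudget(scored, contextBudget = null, maxSections = 5) {
  const ranked = [...scored].sort((a, b) => b.score - a.score);
  const selected = [];
  let totalTokens = 0;

  for (const section of ranked) {
    if (selected.length >= maxSections) break;
    const cost = section.tokenEstimate ?? estimateTokens(section.content);
    const fits = contextBudget === null || totalTokens + cost <= contextBudget;

    // The best section is always returned, even if it alone is over budget.
    if (fits || selected.length === 0) {
      selected.push(section);
      totalTokens += cost;
    }
    // Sections that do not fit are skipped; smaller ones may still fit later.
  }
  return { selected, totalTokens };
}
```

With a 500-token budget and sections estimated at 88, 66, and 400 tokens, this sketch returns the first two (154 tokens) and skips the third.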

5. Aggressive Caching

Document structures and Bloom filters are cached with a time-to-live (TTL, 60s by default). Repeated searches on the same document are 10x+ faster: the Performance table reports 1-5ms cached versus 20-50ms cold.
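
The caching idea can be sketched as a small TTL cache with least-recently-used eviction, matching the "TTL + LRU" note under Architecture. The class name TtlLruCache, the maxEntries limit of 50, and the injectable clock are assumptions made for illustration; only the 60s default TTL comes from this README.

```javascript
// TTL + LRU cache sketch (hypothetical; not LODA's loda_index.js).
class TtlLruCache {
  constructor({ ttlMs = 60000, maxEntries = 50, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now; // injectable clock, useful in tests
    this.entries = new Map(); // Map preserves insertion order
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt >= this.ttlMs) {
      this.entries.delete(key); // expired
      return undefined;
    }
    // Re-insert to mark the entry as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, { value, storedAt: this.now() });
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}
```

A caller would key the cache by document path and store the parsed sections together with their Bloom filters, so a repeated search skips both parsing and filter construction.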


API Reference

loda_search

The main search tool.

Parameters:

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| documentPath | string | Yes | - | Path to document (relative to staging or absolute) |
| query | string | Yes | - | Search keywords or phrase |
| contextBudget | number | No | null | Maximum tokens to return (null = unlimited) |
| maxSections | number | No | 5 | Maximum sections to return |

Example Request:

{
  "documentPath": "api-docs.md",
  "query": "authentication oauth",
  "contextBudget": 500,
  "maxSections": 3
}

Example Response:

{
  "query": "authentication oauth",
  "documentPath": "/path/to/api-docs.md",
  "sections": [
    {
      "id": "section-5",
      "header": "OAuth 2.0 Authentication",
      "level": 3,
      "score": 1.0,
      "lineRange": [27, 41],
      "tokenEstimate": 88
    },
    {
      "id": "section-4",
      "header": "API Key Authentication",
      "level": 3,
      "score": 0.8,
      "lineRange": [15, 26],
      "tokenEstimate": 66
    }
  ],
  "metadata": {
    "totalSections": 21,
    "candidatesAfterBloom": 5,
    "scoredAboveZero": 3,
    "returnedSections": 2,
    "totalTokens": 154,
    "budgetStatus": "SAFE",
    "truncated": false,
    "cacheHit": true
  }
}

Budget Status Values

| Status | Meaning |
|---|---|
| UNLIMITED | No budget was specified |
| SAFE | Total tokens under 80% of budget |
| WARNING | Total tokens between 80-100% of budget |
| EXCEEDED | Over budget (first section always returned) |
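
The status values above can be derived from two numbers, as in this sketch. The function name budgetStatus is hypothetical, and treating exactly 80% and exactly 100% as WARNING is an assumption about how the boundaries are handled.

```javascript
// Classify token usage against the budget (hypothetical sketch).
function budgetStatus(totalTokens, contextBudget = null) {
  if (contextBudget === null) return "UNLIMITED";
  if (totalTokens > contextBudget) return "EXCEEDED";
  if (totalTokens / contextBudget >= 0.8) return "WARNING";
  return "SAFE";
}
```

For the example response above, 154 tokens against a budget of 500 is about 31% of the budget, which yields SAFE.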

Other Tools

| Tool | Description |
|---|---|
| list_document_sections | Get hierarchical structure of document |
| read_section | Read specific section by ID with context |
| read_lines | Read specific line range |
| search_content | Basic regex search (no LODA optimization) |

Staging Directory

By default, LODA looks for documents in the staging/ subdirectory:

loda-mcp-server/
├── staging/              ← Put documents here
│   ├── api-docs.md
│   ├── architecture.md
│   └── user-guide.md
└── document_access_mcp_server.js

You can also use absolute paths to search any document on your system.


HTTP Mode (Development/Testing)

For testing without Claude, run the server in HTTP mode:

node document_access_mcp_server.js --mode=http --port=49400

Then test with curl:

# Health check
curl http://localhost:49400/health

# List tools
curl http://localhost:49400/tools

# Search
curl -X POST http://localhost:49400/tools/loda_search \
  -H "Content-Type: application/json" \
  -d '{"documentPath": "api-docs.md", "query": "authentication"}'
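
The same search request can be issued from Node.js. The sketch below mirrors the curl command and assumes the server is running in HTTP mode on port 49400, that Node.js 18+ is available for the global fetch, and that the endpoint answers with a JSON body. The helper names buildSearchRequest and lodaSearch are hypothetical.

```javascript
// Build the URL and fetch options for a loda_search call (hypothetical helper).
function buildSearchRequest(baseUrl, args) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/tools/loda_search`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(args),
    },
  };
}

// Send the request; assumes the server replies with JSON.
async function lodaSearch(baseUrl, args) {
  const { url, options } = buildSearchRequest(baseUrl, args);
  const response = await fetch(url, options); // global fetch: Node.js 18+
  if (!response.ok) {
    throw new Error(`loda_search failed: HTTP ${response.status}`);
  }
  return response.json();
}

// Example usage (requires a running server):
// lodaSearch("http://localhost:49400", {
//   documentPath: "api-docs.md",
//   query: "authentication",
// }).then((result) => console.log(result));
```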

Performance

| Metric | Target | Achieved |
|---|---|---|
| Search latency (cached) | <10ms | 1-5ms |
| Search latency (cold) | <100ms | 20-50ms |
| Token savings | >70% | 70-95% |
| Bloom filter effectiveness | >80% | ~85% |
| Cache hit rate | >80% | ~90% |

Testing

# Run all LODA tests
npm test

# Run specific component tests
npm test -- tests/loda_search_handler.test.js

# Run with coverage
npm test -- --coverage

Test Results: 46/46 Passing

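| Component | Tests | Status |
|---|---|---|
| token_estimator | 6 | Passing |
| relevance_scorer | 8 | Passing |
| budget_manager | 6 | Passing |
| bloom_filter | 10 | Passing |
| loda_index | 8 | Passing |
| loda_search_handler | 8 | Passing |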

Architecture

loda/
├── token_estimator.js      # Pure token estimation (~4 chars/token)
├── relevance_scorer.js     # Section relevance scoring
├── budget_manager.js       # Token budget selection
├── loda_index.js           # Cached document structure (TTL + LRU)
├── bloom_filter.js         # O(1) section elimination
├── loda_search_handler.js  # Main orchestrator
└── index.js                # Module entry

document_access_mcp_server.js  # MCP server with 5 tools

Research & Development

This project was built using the Continuum Development Process (CDP), a 13-phase methodology that emphasizes traceability and quality gates.

Why We Built This

We tried several approaches before arriving at LODA:

| Approach | Why It Failed |
|---|---|
| Semantic Chunking | Arbitrary boundaries split logical units |
| RAG + Vector Search | Too much infrastructure for single-doc access |
| JIT-Steg Retrieval | "Gap zone" at 25-35% where overhead exceeded brute-force |
| Simple Grep | No relevance ranking, no token awareness |

LODA combines the best of each: section awareness, fast elimination, budget control, and zero external dependencies.


Configuration Examples

Claude Desktop (Windows)

%APPDATA%\Claude\claude_desktop_config.json:

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["C:/Users/YourName/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

Claude Desktop (macOS/Linux)

~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or ~/.config/Claude/claude_desktop_config.json (Linux):

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/home/yourname/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

Claude Code (Project-level)

.claude/settings.json:

{
  "mcpServers": {
    "loda": {
      "command": "node",
      "args": ["/path/to/loda-mcp-server/document_access_mcp_server.js", "--mode=stdio"]
    }
  }
}

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Write tests for new functionality
  4. Submit a PR with documentation

License

MIT License - see LICENSE for details.



Made with 🧠 for LLMs that need to read documents efficiently.
