codetex-mcp

A commit-aware code context manager for LLMs. Indexes Git repositories into a multi-tier knowledge hierarchy — repo overviews, file summaries, and symbol details — stored in SQLite with vector search. Serves context to LLM clients via the Model Context Protocol (MCP) or a local CLI.

What It Does

codetex builds a structured, searchable index of your codebase that LLMs can query on demand:

- Tier 1 (Repo Overview): purpose, architecture, directory structure, key technologies, entry points
- Tier 2 (File Summaries): per-file purpose, public interfaces, dependencies, roles
- Tier 3 (Symbol Details): function/class signatures, parameters, return types, call relationships

Summaries are generated by an LLM (Anthropic Claude). Embeddings are computed locally with sentence-transformers for semantic search. Everything is stored in a single SQLite database with sqlite-vec for vector queries.
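To illustrate the idea behind vector search, here is a minimal sketch in plain Python. It is not how codetex does it (the index lives in sqlite-vec); the `top_k` helper and the toy 3-dimensional vectors are assumptions made for illustration, whereas real embeddings from all-MiniLM-L6-v2 have 384 dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the k keys whose vectors are most similar to the query."""
    ranked = sorted(
        index,
        key=lambda name: cosine_similarity(query, index[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings" standing in for real model output.
index = {
    "src/auth/login.py": [0.9, 0.1, 0.0],
    "src/db/models.py": [0.1, 0.9, 0.0],
    "src/auth/tokens.py": [0.8, 0.2, 0.1],
}
print(top_k([1.0, 0.0, 0.0], index))
```

A brute-force scan like this is fine for a handful of vectors; a vector extension such as sqlite-vec exists so the same nearest-neighbor query can run inside the database.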

Incremental sync means only changed files are re-analyzed when you update your code.

Requirements

Python 3.12+
Git
An Anthropic API key (for indexing)

Installation

# With pip
pip install codetex-mcp

# With uv (recommended)
uv tool install codetex-mcp

Quick Start

1. Set your Anthropic API key

# Via environment variable
export ANTHROPIC_API_KEY=sk-ant-...

# Or via config
codetex config set llm.api_key sk-ant-...

2. Add a repository

# Local repo
codetex add /path/to/your/project

# Remote repo (clones to ~/.codetex/repos/)
codetex add https://github.com/user/repo.git

3. Index it

# Preview what indexing will cost (no API calls)
codetex index my-project --dry-run

# Build the full index
codetex index my-project

4. Query your codebase

# Repo overview (Tier 1)
codetex context my-project

# File summary (Tier 2)
codetex context my-project --file src/auth/login.py

# Symbol detail (Tier 3)
codetex context my-project --symbol authenticate_user

# Semantic search
codetex context my-project --query "how is authentication implemented?"

5. Keep it up to date

# Incremental sync — only re-analyzes changed files
codetex sync my-project

MCP Server Setup

The MCP server lets LLM clients such as Claude Code, Cursor, and Windsurf query your indexed codebases directly.

Claude Code

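Add to your Claude Code MCP settings (~/.claude/claude_desktop_config.json; the exact file location varies by client and version, so check your client's MCP documentation if this path does not exist):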

{
  "mcpServers": {
    "codetex": {
      "command": "codetex",
      "args": ["serve"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

If you installed with uv tool, use the full path:

{
  "mcpServers": {
    "codetex": {
      "command": "/path/to/codetex",
      "args": ["serve"],
      "env": {
        "ANTHROPIC_API_KEY": "sk-ant-..."
      }
    }
  }
}

Find the path with `which codetex`, or look in the executables directory printed by `uv tool dir --bin`.
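If you would rather generate the snippet than edit it by hand, something like the following could work. This is a sketch, not part of codetex: `build_mcp_config` is a hypothetical helper, and the API key is left as the same placeholder used above.

```python
import json
import shutil


def build_mcp_config(command: str, api_key: str = "sk-ant-...") -> dict:
    """Build the mcpServers entry shown above for a given executable path."""
    return {
        "mcpServers": {
            "codetex": {
                "command": command,
                "args": ["serve"],
                "env": {"ANTHROPIC_API_KEY": api_key},
            }
        }
    }


# shutil.which returns None when codetex is not on PATH; fall back to the bare name.
resolved = shutil.which("codetex") or "codetex"
print(json.dumps(build_mcp_config(resolved), indent=2))
```

Paste the printed JSON into your client's MCP settings file, replacing the placeholder key with your own.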

Other MCP Clients

Any client that supports MCP stdio transport can use codetex. The server command is:

codetex serve

Available MCP Tools

Once connected, the LLM has access to 7 tools:

| Tool | Description |
| --- | --- |
| `get_repo_overview` | Tier 1 repo overview (architecture, technologies, entry points) |
| `get_file_context` | Tier 2 file summary with symbol list |
| `get_symbol_detail` | Tier 3 full symbol detail (signature, params, relationships) |
| `search_context` | Semantic search across all indexed context |
| `get_repo_status` | Index status (staleness, file/symbol counts, last indexed) |
| `sync_repo` | Trigger incremental sync from within the LLM session |
| `list_repos` | List all registered repositories |
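On the wire, an MCP client invokes one of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds that message; it does not start the server or perform the initialization handshake a real client needs first. The argument names (`repo_name`, `query`) are hypothetical, since the tool schemas are not listed here.

```python
import json
from itertools import count

_request_ids = count(1)


def tools_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Argument names are assumptions; inspect the server's tool schemas for the real ones.
print(tools_call_request("search_context", {"repo_name": "my-project", "query": "error handling"}))
```

In practice an MCP client library handles this framing for you; the point is only that each row in the table corresponds to a `name` a client can call.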

CLI Reference

`codetex add <target>`

codetex add .                                    # Current directory
codetex add /path/to/repo                        # Local path
codetex add https://github.com/user/repo.git     # Remote (clones locally)
codetex add git@github.com:user/repo.git         # SSH remote

`codetex index <repo-name>`

Build a full index for a registered repository.

codetex index my-project                # Full index
codetex index my-project --dry-run      # Preview (files, symbols, estimated LLM calls/tokens)
codetex index my-project --path src/    # Index only files under src/

`codetex sync <repo-name>`

Incremental sync to the current HEAD. Only files changed since the last indexed commit are re-analyzed.

codetex sync my-project                 # Sync changes
codetex sync my-project --dry-run       # Preview what would change
codetex sync my-project --path src/     # Sync only changes under src/
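One plausible way to find the files changed since the last indexed commit is to parse the output of `git diff --name-status <indexed-commit> HEAD`. The sketch below assumes that approach, which this README does not confirm; `parse_name_status` and `should_rebuild_overview` are hypothetical names. The threshold check mirrors the documented `tier1_rebuild_threshold` setting.

```python
def parse_name_status(output: str) -> dict[str, list[str]]:
    """Group paths from `git diff --name-status` output by required action."""
    changes: dict[str, list[str]] = {"reanalyze": [], "remove": []}
    for line in output.splitlines():
        if not line.strip():
            continue
        status, *paths = line.split("\t")
        if status.startswith("D"):
            changes["remove"].append(paths[0])
        elif status.startswith("R"):
            # Renames list the old path first, then the new path.
            changes["remove"].append(paths[0])
            changes["reanalyze"].append(paths[1])
        else:
            # Added, modified, copied: analyze the resulting path.
            changes["reanalyze"].append(paths[-1])
    return changes


def should_rebuild_overview(changed: int, total: int, threshold: float = 0.10) -> bool:
    """Rebuild the Tier 1 overview when at least `threshold` of files changed."""
    return total > 0 and changed / total >= threshold


sample = "M\tsrc/main.py\nD\tsrc/old.py\nR100\tsrc/a.py\tsrc/b.py\n"
print(parse_name_status(sample))
print(should_rebuild_overview(changed=3, total=20))
```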

`codetex context <repo-name>`

Query indexed context at any tier.

codetex context my-project                              # Tier 1: repo overview
codetex context my-project --file src/main.py           # Tier 2: file summary
codetex context my-project --symbol MyClass             # Tier 3: symbol detail
codetex context my-project --query "error handling"     # Semantic search

`codetex status <repo-name>`

Show index status: indexed commit, current HEAD, staleness, file/symbol counts, token usage.

`codetex list`

List all registered repositories with their index status.

`codetex config show`

Display the current configuration.

`codetex config set <key> <value>`

Update a configuration value.

codetex config set llm.api_key sk-ant-...
codetex config set llm.model claude-sonnet-4-5-20250929
codetex config set indexing.max_file_size_kb 1024
codetex config set indexing.max_concurrent_llm_calls 10

Configuration

Configuration is loaded in layers (last wins):

1. Defaults: sensible out-of-the-box values
2. TOML file: ~/.codetex/config.toml
3. Environment variables: override everything

Config file

# ~/.codetex/config.toml

[storage]
data_dir = "~/.codetex"                  # Base directory for DB and cloned repos

[llm]
provider = "anthropic"                   # LLM provider (currently: anthropic)
model = "claude-sonnet-4-5-20250929"     # Model used for summarization
api_key = "sk-ant-..."                   # Anthropic API key

[indexing]
max_file_size_kb = 512                   # Skip files larger than this
max_concurrent_llm_calls = 5             # Parallel LLM requests during indexing
tier1_rebuild_threshold = 0.10           # Rebuild repo overview if >=10% of files changed on sync

[embedding]
model = "all-MiniLM-L6-v2"              # Sentence-transformers model for embeddings

Environment variables

| Variable | Maps to | Example |
| --- | --- | --- |
| `ANTHROPIC_API_KEY` | `llm.api_key` | `sk-ant-...` |
| `CODETEX_DATA_DIR` | `storage.data_dir` | `/custom/path` |
| `CODETEX_LLM_PROVIDER` | `llm.provider` | `anthropic` |
| `CODETEX_LLM_MODEL` | `llm.model` | `claude-sonnet-4-5-20250929` |
| `CODETEX_MAX_FILE_SIZE_KB` | `indexing.max_file_size_kb` | `1024` |
| `CODETEX_MAX_CONCURRENT_LLM` | `indexing.max_concurrent_llm_calls` | `10` |
| `CODETEX_TIER1_THRESHOLD` | `indexing.tier1_rebuild_threshold` | `0.15` |
| `CODETEX_EMBEDDING_MODEL` | `embedding.model` | `all-MiniLM-L6-v2` |

File Exclusion

Files are filtered through multiple stages:

1. Default excludes: node_modules/, __pycache__/, .git/, dist/, build/, .venv/, *.lock, *.min.js, *.pyc, *.so, etc.
2. .gitignore: standard gitignore rules from your repo
3. .codetexignore: same syntax as .gitignore, placed in your repo root. Use !pattern to un-ignore files
4. File size: files exceeding max_file_size_kb are skipped
5. Binary detection: files with null bytes in the first 8 KB are skipped
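The last two stages are simple enough to sketch. The code below is an approximation, not the project's filter: `is_binary` and `should_skip` are hypothetical names, and treating 1 KB as 1024 bytes is an assumption.

```python
import tempfile
from pathlib import Path


def is_binary(path: Path, probe_bytes: int = 8 * 1024) -> bool:
    """Treat a file as binary if its first 8 KB contain a null byte."""
    with path.open("rb") as handle:
        return b"\x00" in handle.read(probe_bytes)


def should_skip(path: Path, max_file_size_kb: int = 512) -> bool:
    """Apply the cheap size check first, then the binary probe."""
    if path.stat().st_size > max_file_size_kb * 1024:
        return True
    return is_binary(path)


with tempfile.TemporaryDirectory() as tmp:
    text_file = Path(tmp) / "main.py"
    text_file.write_text("print('hello')\n")
    blob = Path(tmp) / "image.bin"
    blob.write_bytes(b"\x89PNG\x00\x00")
    print(should_skip(text_file), should_skip(blob))
```

Reading only the first 8 KB keeps the probe cheap even for large files, at the cost of missing a null byte that first appears later in the file.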

Language Support

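| Language | Tree-sitter (full AST) | Fallback (regex) |
| --- | --- | --- |
| Python | Yes | Yes |
| JavaScript | Yes | Yes |
| TypeScript | Yes | Yes |
| Go | Yes | Yes |
| Rust | Yes | Yes |
| Java | Yes | Yes |
| Ruby | Yes | Yes |
| C/C++ | Yes | Yes |
| All others | No | Yes |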

Tree-sitter grammars for all 8 languages are installed automatically. For other languages, the fallback parser uses regex patterns to extract functions, classes, and imports.
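To give a sense of what a regex fallback can and cannot do, here is a small sketch for Python-like source. The patterns and the `extract_symbols` function are illustrative assumptions, not the patterns codetex ships.

```python
import re

# Hypothetical patterns for a Python-like language; anchored per line.
PATTERNS = {
    "function": re.compile(r"^[ \t]*(?:async\s+)?def\s+(\w+)", re.MULTILINE),
    "class": re.compile(r"^[ \t]*class\s+(\w+)", re.MULTILINE),
    "import": re.compile(
        r"^[ \t]*(?:from\s+([\w.]+)\s+import|import\s+([\w.]+))", re.MULTILINE
    ),
}


def extract_symbols(source: str) -> dict[str, list[str]]:
    """Collect function, class, and imported module names with regexes only."""
    symbols: dict[str, list[str]] = {}
    for kind, pattern in PATTERNS.items():
        names = []
        for match in pattern.finditer(source):
            # The import pattern has two alternative groups; take whichever matched.
            names.append(next(group for group in match.groups() if group))
        symbols[kind] = names
    return symbols


source = '''
import os
from pathlib import Path

class Indexer:
    async def run(self):
        pass

def main():
    pass
'''
print(extract_symbols(source))
```

Unlike a full AST, this approach has no notion of scope: it cannot tell that `run` is a method of `Indexer`, and it would also match definitions that appear inside strings or comments.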

Architecture

CLI (Typer) ──┐
              ├──▶ Core Services (Indexer, Syncer, ContextStore, SearchEngine)
MCP (FastMCP)─┘         │              │              │
                    Analysis        LLM Provider    Embeddings
                 (tree-sitter +    (Anthropic)    (sentence-transformers)
                  regex fallback)       │              │
                         └──────────────┴──────────────┘
                                        │
                                   SQLite + sqlite-vec

- Two entry points (CLI and MCP server) share the same core service layer
- No DI framework: services are wired via a create_app() factory
- All core services are async; the CLI bridges with asyncio.run()
- Embeddings are local: no external API calls for vector search (model auto-downloads on first run, ~90 MB)
- Single SQLite database: 6 main tables + 2 vector tables (384-dimensional embeddings)
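The factory-plus-bridge pattern described above might look roughly like this. Only `create_app()` and `asyncio.run()` are named in this README; the `App` container, the `ContextStore` stand-in, its `get_repo_overview` method, and `cli_context` are hypothetical.

```python
import asyncio
from dataclasses import dataclass


class ContextStore:
    """Stand-in for the real storage-backed service."""

    async def get_repo_overview(self, repo_name: str) -> str:
        await asyncio.sleep(0)  # placeholder for real async I/O
        return f"overview of {repo_name}"


@dataclass
class App:
    context_store: ContextStore


def create_app() -> App:
    """Wire services by hand; both entry points would call this factory."""
    return App(context_store=ContextStore())


def cli_context(repo_name: str) -> str:
    """Synchronous CLI command bridging into the async service layer."""
    app = create_app()
    return asyncio.run(app.context_store.get_repo_overview(repo_name))


print(cli_context("my-project"))
```

Keeping the services async means an MCP server, which already runs inside an event loop, can await them directly, while the CLI pays only for a one-line bridge.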

Development

git clone https://github.com/mrosata/codetex-mcp.git
cd codetex-mcp

# Install dependencies (including dev)
uv sync

# Run tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=codetex_mcp

# Lint and format
uv run ruff check src/ tests/
uv run ruff format src/ tests/

# Type check
uv run mypy src/

Releasing

Releases are automated via GitHub Actions and python-semantic-release. Version bumps are driven by conventional commit messages on main.

Commit message format

| Prefix | Effect | Example |
| --- | --- | --- |
| `fix: ...` | Patch bump (0.1.0 → 0.1.1) | `fix: handle missing gitignore` |
| `feat: ...` | Minor bump (0.1.0 → 0.2.0) | `feat: add Ruby tree-sitter support` |
| `feat!: ...` | Major bump (0.1.0 → 1.0.0) | `feat!: redesign context API` |
| `docs:`, `chore:`, `ci:`, `test:`, `refactor:` | No release | `docs: update README` |

A BREAKING CHANGE: line in the commit body also triggers a major bump.
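The rules in the table can be expressed as a small function. This sketch encodes only what the table and the sentence above state; python-semantic-release's real commit parser is configurable and handles more cases. `bump_for` and `release_bump` are hypothetical names.

```python
import re

# Matches "type: ", "type(scope): ", and the breaking variants with "!".
HEADER = re.compile(r"^(?P<type>\w+)(?:\([^)]*\))?(?P<breaking>!)?:\s")
LEVELS = {"none": 0, "patch": 1, "minor": 2, "major": 3}


def bump_for(message: str) -> str:
    """Classify one commit message as none, patch, minor, or major."""
    header, _, body = message.partition("\n")
    match = HEADER.match(header)
    if match is None:
        return "none"  # not a conventional commit
    if match["breaking"] or re.search(r"^BREAKING CHANGE: ", body, re.MULTILINE):
        return "major"
    return {"feat": "minor", "fix": "patch"}.get(match["type"], "none")


def release_bump(messages: list[str]) -> str:
    """The release level is the highest bump among all commits since the last tag."""
    return max((bump_for(m) for m in messages), key=LEVELS.__getitem__, default="none")


commits = [
    "docs: update README",
    "fix: handle missing gitignore",
    "feat: add Ruby tree-sitter support",
]
print(release_bump(commits))
```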

How it works

1. Push or merge a PR to main
2. CI runs lint, type check, and tests
3. The release workflow analyzes commits since the last tag
4. If a version bump is needed, it:
   - Updates the version in pyproject.toml
   - Creates a git tag (e.g., v0.2.0)
   - Publishes a GitHub Release with a changelog
   - Builds and publishes the package to PyPI

Manual release (not recommended)

If you need to release without the automation:

uv build
uv publish

License

MIT
