codecortex
Persistent codebase knowledge layer for AI agents. Pre-digests codebases into structured knowledge (symbols, dependency graphs, co-change patterns, architectural decisions) and serves via MCP. 28 languages, 14 tools, ~85% token reduction.
CodeCortex
Persistent codebase knowledge layer for AI agents. Your AI shouldn't re-learn your codebase every session.
<a href="https://glama.ai/mcp/servers/@rushikeshmore/codecortex"> <img width="380" height="200" src="https://glama.ai/mcp/servers/@rushikeshmore/codecortex/badge" alt="codecortex MCP server" /> </a>
The Problem
Every AI coding session starts from scratch. When context compacts or a new session begins, the AI re-scans the entire codebase. Same files, same tokens, same wasted time. It's like hiring a new developer every session who has to re-learn everything before writing a single line.
The data backs this up:
- AI agents increase defect risk by 30% on unfamiliar code (CodeScene + Lund University, 2025)
- Code churn grew 2.5x in the AI era (GitClear, 211M lines analyzed)
- Nobody combines structural + semantic + temporal + decision knowledge in one portable tool
The Solution
CodeCortex pre-digests codebases into layered knowledge files and serves them to any AI agent via MCP. Instead of re-understanding your codebase every session, the AI starts with knowledge.
Hybrid extraction: tree-sitter native N-API for structure (symbols, imports, calls across 28 languages) + host LLM for semantics (what modules do, why they're built that way). Zero extra API keys.
Quick Start
# Install
npm install -g codecortex-ai
# Initialize knowledge for your project
cd /path/to/your-project
codecortex init
# Start MCP server (for AI agent access)
codecortex serve
# Check knowledge freshness
codecortex status
Connect to Claude Code
Add to your MCP config:
{
  "mcpServers": {
    "codecortex": {
      "command": "codecortex",
      "args": ["serve"],
      "cwd": "/path/to/your-project"
    }
  }
}
What Gets Generated
All knowledge lives in .codecortex/ as flat files in your repo:
.codecortex/
  cortex.yaml        # project manifest
  constitution.md    # project overview for agents
  overview.md        # module map + entry points
  graph.json         # dependency graph (imports, calls, modules)
  symbols.json       # full symbol index (functions, classes, types...)
  temporal.json      # git coupling, hotspots, bug history
  modules/*.md       # per-module deep analysis
  decisions/*.md     # architectural decision records
  sessions/*.md      # session change logs
  patterns.md        # coding patterns and conventions
Six Knowledge Layers
| Layer | What | File |
|---|---|---|
| 1. Structural | Modules, deps, symbols, entry points | graph.json + symbols.json |
| 2. Semantic | What each module does, data flow, gotchas | modules/*.md |
| 3. Temporal | Git behavioral fingerprint - coupling, hotspots, bug history | temporal.json |
| 4. Decisions | Why things are built this way | decisions/*.md |
| 5. Patterns | How code is written here | patterns.md |
| 6. Sessions | What changed between sessions | sessions/*.md |
The Temporal Layer
This is the killer differentiator. The temporal layer tells agents "if you touch file X, you MUST also touch file Y" even when there's no import between them. This comes from git co-change analysis, not static code analysis.
Example from a real codebase:
- routes.ts and worker.ts co-changed in 9/12 commits (75%) with zero imports between them
- Without this knowledge, an AI editing one file would produce a bug 75% of the time
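The co-change figure above can be reproduced with a very small calculation. The following is a minimal sketch, not CodeCortex's actual implementation: the `Commit` shape, the `coChange` function, and the choice of denominator (commits touching either file) are all assumptions made for illustration, and the history is synthetic.

```typescript
// Hypothetical shape: a commit is just the list of file paths it touched.
type Commit = string[];

interface Coupling {
  both: number;   // commits touching both files
  either: number; // commits touching at least one of the two files
  ratio: number;  // both / either (0 when neither file was ever touched)
}

// Count how often two files change together across a commit history.
function coChange(commits: Commit[], a: string, b: string): Coupling {
  let both = 0;
  let either = 0;
  for (const files of commits) {
    const hasA = files.indexOf(a) !== -1;
    const hasB = files.indexOf(b) !== -1;
    if (hasA || hasB) either += 1;
    if (hasA && hasB) both += 1;
  }
  return { both, either, ratio: either === 0 ? 0 : both / either };
}

// Synthetic history: 9 commits touch both files, 3 touch only routes.ts.
const history: Commit[] = [];
for (let i = 0; i < 9; i++) history.push(["routes.ts", "worker.ts"]);
for (let i = 0; i < 3; i++) history.push(["routes.ts"]);
history.push(["README.md"]); // unrelated commit, ignored by the ratio

const coupling = coChange(history, "routes.ts", "worker.ts");
console.log(coupling.ratio); // 0.75
```

Nothing in this calculation looks at imports, which is why it can surface a relationship that static analysis would not report.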
MCP Tools (14)
Read Tools (9)
| Tool | Description |
|---|---|
| get_project_overview | Constitution + overview + graph summary |
| get_module_context | Module doc by name, includes temporal signals |
| get_session_briefing | Changes since last session |
| search_knowledge | Keyword search across all knowledge |
| get_decision_history | Decision records filtered by topic |
| get_dependency_graph | Import/export graph, filterable |
| lookup_symbol | Symbol by name/file/kind |
| get_change_coupling | What files must I also edit if I touch X? |
| get_hotspots | Files ranked by risk (churn x coupling) |
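The "churn x coupling" ranking behind get_hotspots can be illustrated with a short sketch. The field names, the definitions of churn and coupling, and the sample numbers below are hypothetical; the actual temporal.json schema and scoring may differ.

```typescript
// Hypothetical per-file stats; the real temporal.json schema may differ.
interface FileStats {
  path: string;
  churn: number;    // assumed: number of commits that touched the file
  coupling: number; // assumed: number of files it frequently co-changes with
}

interface Hotspot extends FileStats {
  risk: number; // churn multiplied by coupling
}

// Score each file and sort from highest to lowest risk.
function rankHotspots(stats: FileStats[]): Hotspot[] {
  return stats
    .map((s): Hotspot => ({
      path: s.path,
      churn: s.churn,
      coupling: s.coupling,
      risk: s.churn * s.coupling,
    }))
    .sort((x, y) => y.risk - x.risk);
}

const ranked = rankHotspots([
  { path: "src/routes.ts", churn: 12, coupling: 3 },
  { path: "src/worker.ts", churn: 9, coupling: 5 },
  { path: "src/util.ts", churn: 20, coupling: 0 },
]);

for (const h of ranked) console.log(h.path, h.risk);
```

A plain product scores a frequently edited file with no coupling as zero, so src/util.ts ends up last here despite the highest churn. Whether that is desirable depends on how the real score is defined.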
Write Tools (5)
| Tool | Description |
|---|---|
| analyze_module | Returns source files + structured prompt for LLM analysis |
| save_module_analysis | Persists LLM analysis to modules/*.md |
| record_decision | Saves architectural decision to decisions/*.md |
| update_patterns | Merges coding pattern into patterns.md |
| report_feedback | Agent reports incorrect knowledge for next analysis |
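MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for get_change_coupling. The envelope follows the MCP specification, but the `toolCall` helper and the `file` argument name are assumptions: the tool's real input schema is not documented here and should be read from the server's `tools/list` response.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and arguments in a request object.
function toolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// "file" is a guessed argument name, used here only for illustration.
const request = toolCall("get_change_coupling", { file: "src/routes.ts" });

// Over the stdio transport, each message is sent as a single line of JSON.
console.log(JSON.stringify(request));
```

In practice an MCP client such as Claude Code builds these messages itself. The sketch only shows what travels over the wire.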
CLI Commands
| Command | Description |
|---|---|
| codecortex init | Discover project + extract symbols + analyze git history |
| codecortex serve | Start MCP server (stdio transport) |
| codecortex update | Re-extract changed files, update affected modules |
| codecortex status | Show knowledge freshness, stale modules, symbol counts |
Token Efficiency
CodeCortex uses a three-tier memory model to minimize token usage:
Session start (HOT only): ~4,300 tokens
Working on a module (+WARM): ~5,000 tokens
Need coding patterns (+COLD): ~5,900 tokens
vs. raw scan of entire codebase: ~37,800 tokens
Roughly 84-89% token reduction, or about 6-9x fewer tokens than a raw scan, depending on how many tiers are loaded.
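The tier figures above can be checked with a little arithmetic. This is a minimal sketch that uses only the approximate token counts listed above; the tier labels and the `savings` function are illustrative names, not part of CodeCortex.

```typescript
const RAW_SCAN = 37800; // approximate tokens for a raw scan of the codebase

const tiers = [
  { name: "HOT", tokens: 4300 },
  { name: "HOT+WARM", tokens: 5000 },
  { name: "HOT+WARM+COLD", tokens: 5900 },
];

// Percentage reduction and efficiency factor, each rounded to one decimal place.
function savings(tokens: number, baseline: number) {
  return {
    reductionPct: Math.round((1 - tokens / baseline) * 1000) / 10,
    factor: Math.round((baseline / tokens) * 10) / 10,
  };
}

for (const t of tiers) {
  const s = savings(t.tokens, RAW_SCAN);
  console.log(`${t.name}: ${s.reductionPct}% fewer tokens, ${s.factor}x`);
}
```

With these inputs the reduction works out to 88.6%, 86.8%, and 84.4% (8.8x, 7.6x, and 6.4x) as more tiers are loaded.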
Supported Languages (28)
| Category | Languages |
|---|---|
| Web | TypeScript, TSX, JavaScript, Liquid |
| Systems | C, C++, Objective-C, Rust, Zig, Go |
| JVM | Java, Kotlin, Scala |
| .NET | C# |
| Mobile | Swift, Dart |
| Scripting | Python, Ruby, PHP, Lua, Bash, Elixir |
| Functional | OCaml, Elm, Emacs Lisp |
| Other | Solidity, Vue, CodeQL |
Tech Stack
- TypeScript ESM, Node.js 20+
- tree-sitter (native N-API) + 28 language grammar packages
- @modelcontextprotocol/sdk - MCP server
- commander - CLI
- simple-git - git integration
- yaml, zod, glob
License
MIT