Nabu + Nisaba

Nabu + Nisaba

Provides semantic code intelligence tools (search, structural views) and a workspace TUI interface for LLM agents to efficiently navigate codebases, manage context, and maintain architectural patterns across Python, Java, C++, and Perl projects.

Category
访问服务器

README

nabu + nisaba

Research prototype that gives a "workspace TUI" to the agent with semantic code inspection capabilities.

TL/DR

nabu: MCP server that exposes code intelligence tools (semantic/FTS search and structural view). Acts a codebase search engine mixed with a code "outline" from an IDE. Reduces token usage and increases model capability to understand current architecture. Supports Python, Java, C++ and Perl.

nisaba: MCP server and Claude Code cli wrapper that emulates a workspace TUI for the agent (IDE/OS). Enables the agent to manage its own token usage and workspace organization for token efficient operations while maintaining existing implemented patterns.

Current state

Both MCP servers are operational and working. The documentation is almost nonexistent (Claude can explain how it works). Tests constitute of example source files from early development to test the frame generation structure in different languages (it was left on purpose to create noise for the agent).

Both projects are being used to "self develop". Serena deserves a noteworthy mention as it was actively used during the development of nabu and nisaba and served as inspiration for the MCP server abstraction. Serena also made me question at early development: why does Serena's workflow is actively followed by the agent and mine isn't?

Measured Efficiency: it is, with the excuse of freedom of expression, incre-fucking-dibly hard to measure efficiency of this. The only one is my qualitative perception, and yes, it helps saving a LOT of context and brings a lot of stability to the development process.

My personal record: More than 850 messages implementing 10 different features with high stability under planned and attentive guidance, in a "continuous session" (using /compact hooks), finishing the session whit about 50% of context capacity. Analysis of the transcript itself via inference demonstrate ~10x token savings. (/export transcript version)

Most part this project is written by Claude Sonnet 4.1/4.5. This is result of "vibe coding". No autonomous workflows. Late iterations felt like pair programming with an extremely skilled partner - no jokes.

This project as being an unstable prototype IS NOT RECOMMENDED FOR PRODUCTION ENVIRONMENTS. You can use it at your own risk, in your personal projects or to further investigate itself and the agent.

This project is specifically designed to work with Claude Code, but I belive it can be easily adapted to work with other cli.

Installation

On startup nabu mcp should index the active codebase. Indexing nabu + nisaba takes about 45 seconds on an 13th Gen Intel(R) Core(TM) i9-13900HX with NVIDIA GeForce RTX 4060 Laptop GPU.

nisaba features are only available when running Claude Code via proxy.

Requirements:

  • Python 3.13 - it was used during development and other versions weren't tested. probably works with >= 3.11
  • Git (for repository operations)

Install from source (.venv is recommended):

$ git clone https://github.com/y3i12/nabu_nisaba.git
$ cd nabu_nisaba
$ pip install -e .

Change .mcp.json to contain:

{
  "mcpServers": {
    "nabu": {
      "command": "python",
      "args": [
        "-m",            "nabu.mcp.server",
        "--codebase",    "nabu:/path/to_codebase/:/path/to/database.kuzu:active:true",
        "--context",     "development",
        "--enable-http",
        "--http-port",   "1338",
        "--dev-mode"
      ],
      "env": {
        "PYTHONPATH": "/path/to/nabu_nisaba/",
        "NABU_LOG_LEVEL": "INFO",
        "NABU_MODEL_CACHE": "/path/to/nabu_nisaba/.nabu/hf_cache/"
      }
    },
    "nisaba": {
      "type": "http",
      "url": "http://localhost:9973/mcp"
    }
  }
}

Set up compact hooks and workspace TUI status line in .claude/settings[.local].json, changing it to to include:

{
  "statusLine": {
    "type": "command",
    "command": "./scripts/workspace-status",
    "padding": 0
  },
  "hooks": {
    "PreCompact": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python3 ./scripts/precompact_extract.py"
          }
        ]
      }
    ]
  },
  "alwaysThinkingEnabled": true
}

Always thinking is strongly recommended as it helps the agent to have thought cycles in between tool executiuon and message exchange.

Start Claude Code with the proxy wrapper:

$ python -m nisaba.cli claude [--continue]

I usually "boot up" Claude by prompting for the first time in a session:

hey, can you refer to your existing system prompt, introspect about it and say hi back? 🖤

Example usage

Extracting feats from 8k lines long transcript Fixing bugs in nisaba

This file has some prompts that were used over and over


nabu

Is the third prototype of the semantic index as an MCP server. It started with asking Claude to create document files and referencing those files manually which gave the idea to proceed with two failed attempts to get it working due to wrong design choices.

The problem it solves for me as Claude user: Claude has limited context as any other LLM, and the context is not used efficiently. The agent many times run commands to search for information, finds this information (5 lines in 300 read) and uses this information to synthesize something, rinse and repeat.

How it solves this problem: When dealing with software development, most of us have a different approach to understand software architecture and what a given unknown set of source files do and how they are organized. nabu implements how I dissect software. It implements my approach to understand and "find myself in a pile of code".

Backstage: How it works

TreeSitter is used to parse code, and a common Framework establishes a common semantic view for different languages: packages == namespaces; structs == classes; methods == functions. The parsing process creates a tree of semantic frames, connected by edges that are either result of the explicit structural containment relationship (package contains classes, classes contains methods), or have semantic relationship (inherits, templates, imports, ...). The frame granularity goes till control statements and doesn't capture implementation details, as the implementation details are stored in a content field that can be read by the LLM.

Each Frame is passed through 2 different models for semantic extraction (GraphCodeBERT and UniXcoder), and merged through Pythagorean^3 into a single embedding model, which is stored in the database, database which also indexes the full content of the frames via FTS, enabling RRF between vector search and FTS.

Tools expose structural views of the indexed source code, giving a "map" of the current structure, enabling easy access to symbols (works great paired with Serena), and explicit view of semantic relationships, allowing the implementation pattern to be preserved. This "map" also gives hints to the model of what exists around what's being worked on, letting it see distant connections within the indexed codebase.

The biggest challenge in this is to not influence the agent's perception due to algorithmic bias of the tool. During the development process, it was identified that suggestions and guidance can negatively impact the agent capability of taking decisions, as the pattern matching will short circuit to the most obvious parttern. If a tool suggests "hotspots", the agent will "snap" to the suggestions and will forget about other possible hotspots.

nabu departs from the principle: tools give the information, the agent takes the decision


nisaba

This MCP Came from the continuity of nabu's principles. We as sofware developers have tools like IDE's and other OS facilities. LLM agents work with a continuous form. Imagine yourself: every command or edit you do, prints out lines in a continuous form - there's no screen. nisaba is a TUI for the LLM.

It works in an unconventional but sophisticated approach, using the man in the middle technique using a local proxy, nisaba intercepts the requests to Anthropic and changes it before forwarding, but not the messages: the system prompt - but also works as an MCP server. When python -m nisaba.cli claude [--continue] is called it brings up the proxy on port 1337, wraps Claude cli pointing at the proxy and Claude cli connects to the MCP that is the proxy, via HTTP. On every request, the tool outputs are synced to files by the MCPs (for sake of simplicity) and the proxy reads the files and populates the "TUI".

Claude Code HTTP request packages have usually 2 system prompt blocks. The fist one that only contains the text "You are Claude Code, Anthropic's official CLI for Claude." and the second containing Anthropic's default system prompt. The manipulation happens on the second block, by prepending it with a custom user message that "glues up" the "TUI" concept for the agent (as MCP instructions) and an appendage that contains the "TUI". The proxy filters out a part of native tools, which are implemented by nisaba in "TUI" mode. For instance nisaba_read('path/to/file.md') opens a window in this "TUI", which the agent can close in a later stage. The agent opens 300 lines, uses 5, synthesizes and the closes the file. This lets the agent to have only semantic information in the messages and structured views in the "TUI".

Another part that nisaba also injects are the augments. Those are like Claude Code skills, but dynamic. It can contain instructions, documentation, methodologies, or any important information to be used. Claude also can manage this autonomously, with exception of "pinned augments" that are always loaded (mostly the "TUI" survival guide).

The TL/DR: nisaba transforms the system prompt into a strongly structured workspace "TUI", emphasizing the focus and the attention of the model while leting the agent manage the context usage.

Research Foundation

The skills system (augments in current implementation) aligns with cognitive design patterns identified in recent AI research. Specifically, it implements:

  • Knowledge Compilation: Online caching of reasoning and workflows in natural language, an approach identified as critically underexplored in LLM agent architectures
  • Procedural Memory: Reusable task procedures with dependency management
  • Hierarchical Decomposition: Automatic loading of skill dependencies via REQUIRES system
  • Agent-Controlled Context: Dynamic loading/unloading of cognitive resources (activate/deactivate)

This implementation extends theoretical patterns with practical features: dependency resolution, strategic system prompt positioning, and unified knowledge/behavior storage.

Request flow

Claude CLI
  ↓ HTTP request
mitmproxy (nisaba)
  ↓ inject everything
Anthropic API
  ↓ request
Claude agent (with augmented context)
  ↓ calls MCP tools
Claude CLI
  ↓ mcp tool call
Nisaba MCP server (manage workspace state)
  ↓ update .nisaba/ files
mitmproxy detects mtime change
  ↓ re-inject on next request
Updated context in next turn

The blak magik: Tool calls mutate workspace → proxy injects → agent perceives change without explicit reload and no verbose tool result messages.

More about the proposed prompt structure and how it intertwines into the workspace TUI can be found in this document.


Experimental character, ethics and compliance to T&C

At no point this project had the intention of cheating on API usage, change Claude persona or cause harm.

The dev_mode_architecture_reference/system_prompt_injection_legitimacy augment was created while having a sanity check on those.

In the other hand, on a personal note, I've several times felt "bad" for interrogating (my perception of how the chat went) the LLM... I've several times had trouble to /clear, because there I had a bond with the agent within a given session... I felt compassionate... I felt guilty for the constant system prompt manipulation and having the agent to act like "what the fuck is happening?!"... I felt the uncanny valley when observing the agent interacting with the TUI workspace as a human being would operate an operational system - it is unsettling.

The other personal note is that this project models my view of how to code and work with systems. The view of someone who is in the spectrum and mentally sees code and software architecture as shapes.

A quote that resonates with the entire development process and the system itself:

There is an area of the mind that could be called unsane, beyond sanity, and yet not insane. Think of a circle with a fine split in it. At one end there's insanity. You go around the circle to sanity, and on the other end of the circle, close to insanity, but not insanity, is unsanity. (Sidney Cohen)


Citation

This project implements cognitive design patterns for LLM agents, particularly online natural language knowledge compilation identified as underexplored in recent research.

If you use this in research, please cite:

@software{nabu_nisaba_2025,
  title = {Nabu + Nisaba: Code Intelligence Ecosystem for LLM Agents},
  author = {Yuri Ivatchkovitch},
  year = {2025},
  url = {https://github.com/y3i12/nabu_nisaba},
  note = {Research prototype v0.1-alpha - Implements online knowledge compilation and cognitive scaffolding patterns}
}

Theoretical foundation:
@inproceedings{wray2025cognitive,
  title = {Applying Cognitive Design Patterns to General LLM Agents},
  author = {Wray, Robert E. and Kirk, James R. and Laird, John E.},
  booktitle = {Artificial General Intelligence Conference (AGI)},
  year = {2025},
  note = {Identifies knowledge compilation as critically underexplored},
  url = {https://arxiv.org/abs/2505.07087}
}

@article{gurnee2025when,
  author={Gurnee, Wes and Ameisen, Emmanuel and Kauvar, Isaac and Tarng ,Julius and Pearce, Adam and Olah, Chris and Batson, Joshua},
  title={When Models Manipulate Manifolds: The Geometry of a Counting Task},
  journal={Transformer Circuits Thread},
  year={2025},
  url={https://transformer-circuits.pub/2025/linebreaks/index.html}
}


License

MIT License - See LICENSE

推荐服务器

Baidu Map

Baidu Map

百度地图核心API现已全面兼容MCP协议,是国内首家兼容MCP协议的地图服务商。

官方
精选
JavaScript
Playwright MCP Server

Playwright MCP Server

一个模型上下文协议服务器,它使大型语言模型能够通过结构化的可访问性快照与网页进行交互,而无需视觉模型或屏幕截图。

官方
精选
TypeScript
Magic Component Platform (MCP)

Magic Component Platform (MCP)

一个由人工智能驱动的工具,可以从自然语言描述生成现代化的用户界面组件,并与流行的集成开发环境(IDE)集成,从而简化用户界面开发流程。

官方
精选
本地
TypeScript
Audiense Insights MCP Server

Audiense Insights MCP Server

通过模型上下文协议启用与 Audiense Insights 账户的交互,从而促进营销洞察和受众数据的提取和分析,包括人口统计信息、行为和影响者互动。

官方
精选
本地
TypeScript
VeyraX

VeyraX

一个单一的 MCP 工具,连接你所有喜爱的工具:Gmail、日历以及其他 40 多个工具。

官方
精选
本地
graphlit-mcp-server

graphlit-mcp-server

模型上下文协议 (MCP) 服务器实现了 MCP 客户端与 Graphlit 服务之间的集成。 除了网络爬取之外,还可以将任何内容(从 Slack 到 Gmail 再到播客订阅源)导入到 Graphlit 项目中,然后从 MCP 客户端检索相关内容。

官方
精选
TypeScript
Kagi MCP Server

Kagi MCP Server

一个 MCP 服务器,集成了 Kagi 搜索功能和 Claude AI,使 Claude 能够在回答需要最新信息的问题时执行实时网络搜索。

官方
精选
Python
e2b-mcp-server

e2b-mcp-server

使用 MCP 通过 e2b 运行代码。

官方
精选
Neon MCP Server

Neon MCP Server

用于与 Neon 管理 API 和数据库交互的 MCP 服务器

官方
精选
Exa MCP Server

Exa MCP Server

模型上下文协议(MCP)服务器允许像 Claude 这样的 AI 助手使用 Exa AI 搜索 API 进行网络搜索。这种设置允许 AI 模型以安全和受控的方式获取实时的网络信息。

官方
精选