Ariadne

License: MIT · Python 3.10+

Ariadne's thread — a way out of the microservice maze.

Cross-service API dependency graph and semantic code navigation for microservice architectures. MCP stdio server for AI coding assistants (Claude Code, Cursor, Windsurf), with a CLI twin for scripting. Read-only static analysis on SQLite + TF-IDF + embeddings.


Who is this for

  • AI coding assistants (Claude Code, Cursor, Windsurf) — a structured cross-service dependency view sized for the context window, in place of raw grep output.
  • Backend engineers tracing a feature across 4+ services — GraphQL, REST, Kafka, and frontend calls resolved in one query.
  • Platform teams and reviewers doing cross-service impact analysis: surface the full call chain a change in one service touches before it ships.
  • Onboarding engineers mapping an unfamiliar microservice topology from a single business term.

Why

Ariadne indexes only the contract layer — GraphQL mutations, REST endpoints, Kafka topics, frontend queries — nothing else. That narrowness is what makes results fit an AI context window.

Approach | Problem Ariadne solves
grep / rg across repos | Drowns in DTOs, tests, configs
IDE "Find Usages" | Stops at service boundaries
Service mesh dashboards | Needs production traffic; no feature mapping
Full AST / call-graph tools | Slow to build; too much detail

Example

You ask Claude "where does createOrder live across the stack?" Claude calls query_chains mid-conversation and gets back:

Top Cluster #1  [confidence: 0.91]
  Services: gateway, orders-svc, billing-svc, web
  - [web]          Frontend Mutation: createOrder
  - [gateway]      GraphQL Mutation:  createOrder
  - [orders-svc]   HTTP POST /orders: createOrder
  - [orders-svc]   Kafka Topic:       order-created
  - [billing-svc]  Kafka Listener:    order-created → chargeCustomer

Claude then summarises: "createOrder is a GraphQL mutation in gateway, forwarded to orders-svc via POST /orders, which publishes an order-created Kafka event that billing-svc consumes to charge the customer."

~500 tokens round-trip. The equivalent grep -r createOrder across four repos would return 40+ matches across DTOs, tests, and configs at ~2000 tokens, with the contract layer buried.


Golden path

The intended workflow when an AI assistant drives Ariadne via the MCP server.

1. query_chains(hint="createOrder")
     → ranked clusters across services. Start here for cross-service context.

2. expand_node(name="order-created")
     → one-hop neighbours of a known node. Within 10 min of a matching
       query_chains, this auto-logs positive feedback — the expand IS the signal.

3. Read the files the returned clusters / neighbours point at.

4. log_feedback(hint, accepted=False, ...)
     → manual thumbs-down only. Positive feedback is captured in step 2.

On stale_warning, call rescan() and retry. See FAQ.
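
The stale_warning rule can be wrapped in a small client-side helper. The following is a minimal sketch, not Ariadne's own code: it assumes the tool response is a dict, and that `query_chains` and `rescan` are callables bound to the MCP tools of the same name. The helper name `query_with_rescan` and both bindings are hypothetical.

```python
from typing import Any, Callable


def query_with_rescan(
    query_chains: Callable[..., dict[str, Any]],
    rescan: Callable[[], dict[str, Any]],
    hint: str,
    top_n: int = 3,
) -> dict[str, Any]:
    """Query once; if the response carries stale_warning, rescan and retry once."""
    response = query_chains(hint=hint, top_n=top_n)
    if response.get("stale_warning"):
        rescan()  # refresh the index in place, then ask again
        response = query_chains(hint=hint, top_n=top_n)
    return response
```

Retrying only once keeps the helper from looping if the warning persists after a rescan.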


Quick start

Three commands, then restart Claude Code.

pip install mcp onnxruntime tokenizers huggingface_hub
cp ariadne.config.example.json ariadne.config.json   # edit repos inside
python3 main.py install ariadne.config.json ~/your-workspace

install is idempotent — re-run it after pulling new code, or let the assistant call rescan when it sees a stale_warning. See --help for flags (--no-scan, --force, --snippet, --marker).


Tools

What the assistant sees once install is done and Claude Code is restarted:

Tool | Args | Purpose
query_chains | hint, top_n (default 3) | Business term → cross-service clusters
expand_node | name (partial match supported) | One-hop neighbours of a known node
rescan | (none) | Refresh the index in place when a response has a stale_warning; git-hash incremental, returns {nodes, duration_ms}
ariadne_help | (none) | Setup guide + runtime config diagnostics (missing DB, empty index, stale scan)
log_feedback | hint, accepted, node_ids, ... | Manual thumbs-down (positive feedback is implicit; see Feedback boost under Architecture)

Configuration

Config format

{
  "repos": [
    {
      "name": "gateway",
      "path": "../gateway",
      "scanners": ["graphql"]
    },
    {
      "name": "orders-svc",
      "path": "../orders-svc",
      "scanners": [
        "http",
        "kafka",
        {
          "type": "backend_clients",
          "client_target_map": { "billing": "billing-svc", "user": "user-svc" }
        }
      ]
    },
    {
      "name": "web",
      "path": "../web",
      "scanners": [
        "frontend_graphql",
        {
          "type": "frontend_rest",
          "base_class_service": { "OrdersApiService": "orders-svc" }
        }
      ]
    }
  ]
}

Paths are resolved relative to the config file. Each repo lists one or more scanners — either by name (string) or as an object with extra options.
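
To make the two rules above concrete (paths relative to the config file, scanners as either a string or an object), here is a minimal sketch of a loader. It is an illustration under assumptions, not Ariadne's actual loader: the function name `load_repos` and the normalized `{"type", "options"}` shape are hypothetical.

```python
import json
from pathlib import Path


def load_repos(config_path: str) -> list[dict]:
    """Read the config; return repos with absolute paths and normalized scanners."""
    config_file = Path(config_path).resolve()
    config = json.loads(config_file.read_text(encoding="utf-8"))
    repos = []
    for repo in config["repos"]:
        scanners = []
        for entry in repo["scanners"]:
            if isinstance(entry, str):
                # Bare name: a scanner with no extra options.
                scanners.append({"type": entry, "options": {}})
            else:
                # Object form: "type" picks the scanner, the rest are options.
                options = {k: v for k, v in entry.items() if k != "type"}
                scanners.append({"type": entry["type"], "options": options})
        repos.append({
            "name": repo["name"],
            # Relative paths are anchored at the config file's directory.
            "path": str((config_file.parent / repo["path"]).resolve()),
            "scanners": scanners,
        })
    return repos
```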

Available scanners

Scanner | Looks for
graphql | .graphql / .gql SDL → Query / Mutation / Subscription / Type
http | Spring @RestController (Java/Kotlin) → HTTP endpoints
kafka | Spring application.yaml topics + @KafkaListener + producers
backend_clients | Spring RestClient / RestTemplate outbound calls in *Client.*
frontend_graphql | TypeScript gql`` template literals → frontend Query/Mutation
frontend_rest | axios/fetch calls in TS/TSX files, excluding tests/mocks/types
cube | cube.js cube(...) definitions

Custom scanners

Any language or framework not covered above can be added without touching Ariadne's source code. Implement scanner.BaseScanner, put the module somewhere Python can import it, and reference the class by dotted path in ariadne.config.json:

{
  "name": "my-go-service",
  "path": "../my-go-service",
  "scanners": [
    {
      "type": "my_scanners.go_scanner:GoRouteScanner",
      "route_file": "cmd/server/routes.go"
    }
  ]
}

"type" is "module.path:ClassName". Every other key is passed to __init__.

# my_scanners/go_scanner.py
from scanner import BaseScanner

class GoRouteScanner(BaseScanner):
    def __init__(self, route_file: str = "routes.go"):
        self.route_file = route_file

    def scan(self, repo_path: str, service: str) -> list[dict]:
        # parse repo_path/self.route_file, return node dicts
        return [{"id": f"{service}::http::GET::/ping", "type": "http_endpoint",
                 "raw_name": "ping", "service": service,
                 "source_file": self.route_file,
                 "method": "GET", "path": "/ping", "fields": []}]

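For readers wondering how a "module.path:ClassName" entry could be turned into a scanner instance, the sketch below shows one common way to do it with importlib. It is an assumption about the mechanism, not a copy of Ariadne's loader; `load_custom_scanner` is a hypothetical name.

```python
import importlib
from typing import Any


def load_custom_scanner(entry: dict[str, Any]) -> Any:
    """Instantiate the class named by entry["type"] ("module.path:ClassName")."""
    module_path, _, class_name = entry["type"].partition(":")
    if not module_path or not class_name:
        raise ValueError(f"expected 'module.path:ClassName', got {entry['type']!r}")
    cls = getattr(importlib.import_module(module_path), class_name)
    # Every key other than "type" becomes a keyword argument to __init__.
    kwargs = {k: v for k, v in entry.items() if k != "type"}
    return cls(**kwargs)
```

With the config entry shown earlier, this would import `my_scanners.go_scanner` and call `GoRouteScanner(route_file="cmd/server/routes.go")`, provided the module is on the Python import path.
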
FAQ

Does Ariadne require a running cluster, server, or network? No. Pure static analysis. Source → local SQLite (ariadne.db, embeddings.db, feedback.db). No network calls, no uploads.

How does it know when to re-scan? If the oldest scan is >7 days old, MCP responses include a stale_warning field (CLI prints the same warning to stderr). From an AI conversation, call rescan(); from the shell, python3 main.py scan --config <path>.

Results feel generic at first — will they improve? Yes. expand_node follow-ups implicitly log positive feedback; the boost rerank step (confidence + 0.15 * boost) promotes clusters that have been useful for similar hints. Day-one results are pure lexical ranking; after a few weeks they reflect your team's navigation patterns. Count-based, not a learned model.

Can I use it without an AI assistant — just as a CLI? Yes. python3 main.py scan / query / expand / stats — zero deps beyond Python 3.10. MCP is still the recommended path.


Architecture

ariadne/
├── scanner/       # per-framework extractors → node dicts
├── normalizer/    # camelCase/snake/kebab → tokens
├── scoring/       # IDF-Jaccard engine + bge-small embedder
├── store/         # SQLite: ariadne.db / embeddings.db / feedback.db
├── query/         # query / expand entry points
├── mcp_server.py  # MCP stdio server
├── main.py        # CLI
└── tests/         # pytest suite

Scoring

The math is information retrieval, not graph theory. Node names are tokenized (createOrder → ["create", "order"]) and compared with IDF-weighted Jaccard:

idf_jaccard(A, B) = Σ idf(t)  (t ∈ A ∩ B)  /  Σ idf(t)  (t ∈ A ∪ B)
idf(t)           = log(N / df(t))

Rare tokens dominate; high-frequency domain words (task, id, service) self-dampen, so no stopword list is needed.
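
The two formulas above can be sketched in a few lines of Python. This is an illustrative implementation under assumptions: the tokenizer regex, the function names, and treating each node's token set as one "document" for df(t) are all choices made for this sketch, not details confirmed by the source.

```python
import math
import re


def tokenize(name: str) -> list[str]:
    """Split camelCase, snake_case and kebab-case names into lowercase tokens."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def build_idf(documents: list[set[str]]) -> dict[str, float]:
    """idf(t) = log(N / df(t)), where df(t) counts the documents containing t."""
    n = len(documents)
    df: dict[str, int] = {}
    for doc in documents:
        for token in doc:
            df[token] = df.get(token, 0) + 1
    return {token: math.log(n / count) for token, count in df.items()}


def idf_jaccard(a: set[str], b: set[str], idf: dict[str, float]) -> float:
    """Sum of idf over the intersection divided by sum of idf over the union."""
    # Tokens missing from the idf table contribute zero weight (an assumption).
    union_weight = sum(idf.get(t, 0.0) for t in a | b)
    if union_weight == 0.0:
        return 0.0
    return sum(idf.get(t, 0.0) for t in a & b) / union_weight
```

A token that appears in every node gets idf = log(1) = 0 and drops out of both sums, which is the self-dampening effect described above.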

base  = idf_jaccard(name) * 0.55 + idf_jaccard(fields) * 0.45
score = min(base * role_mult * service_mult, 1.0)

role_mult    = 1.3   for complementary pairs
                     (GraphQL Mutation ↔ Kafka topic ↔ HTTP POST,
                      GraphQL Query ↔ Cube Query ↔ HTTP GET)
service_mult = 1.25  cross-service / 0.8 same-service
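
As a worked instance of the score formula, the sketch below combines the two similarities with the stated weights and multipliers. The function name `edge_score` is hypothetical, and the multiplier for non-complementary pairs is not given in the text, so 1.0 is assumed.

```python
NAME_WEIGHT = 0.55
FIELDS_WEIGHT = 0.45
ROLE_MULT_COMPLEMENTARY = 1.3
SERVICE_MULT_CROSS = 1.25
SERVICE_MULT_SAME = 0.8


def edge_score(name_sim: float, fields_sim: float,
               complementary: bool, cross_service: bool) -> float:
    """score = min(base * role_mult * service_mult, 1.0)."""
    base = name_sim * NAME_WEIGHT + fields_sim * FIELDS_WEIGHT
    # 1.0 for non-complementary pairs is an assumption of this sketch.
    role_mult = ROLE_MULT_COMPLEMENTARY if complementary else 1.0
    service_mult = SERVICE_MULT_CROSS if cross_service else SERVICE_MULT_SAME
    return min(base * role_mult * service_mult, 1.0)
```

For name_sim = 0.5 and fields_sim = 0.2, base is 0.365; a complementary cross-service pair scores 0.365 × 1.3 × 1.25 ≈ 0.593, while a non-complementary same-service pair scores 0.365 × 0.8 = 0.292.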

Clustering

Two-stage (anchor selection, then neighbour expansion), O(anchors × neighbours), independent of repo count. Merging, trimming, and confidence scoring follow.

  1. Tokenize the hint, score against all nodes, keep the top 30 anchors with score ≥ 0.15.
  2. For each anchor, pull its edges from the DB (single IN query) and keep the top 12 neighbours with edge_score ≥ 0.25.
  3. Merge anchor neighbourhoods that overlap by ≥ 25%.
  4. Per cluster, take top 2 nodes per (service, type), capped at 12.
  5. Confidence = mean edge score · 0.6 + type diversity · 0.2 + service diversity · 0.2.
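
Steps 3 to 5 can be sketched as follows. This is a rough illustration under several assumptions: overlap is measured as |A ∩ B| / min(|A|, |B|), merging is a single greedy pass, each node dict carries a `score` key, and the two diversity terms are supplied already normalized to [0, 1]. None of these details are specified in the text above.

```python
from collections import defaultdict


def merge_overlapping(neighbourhoods: list[set[str]], threshold: float = 0.25) -> list[set[str]]:
    """Step 3: greedily merge node-id sets whose overlap ratio reaches the threshold."""
    clusters: list[set[str]] = []
    for hood in neighbourhoods:
        merged = set(hood)
        remaining = []
        for cluster in clusters:
            smaller = min(len(cluster), len(merged))
            if smaller and len(cluster & merged) / smaller >= threshold:
                merged |= cluster  # absorb the overlapping cluster
            else:
                remaining.append(cluster)
        clusters = remaining + [merged]
    return clusters


def trim_cluster(nodes: list[dict], per_group: int = 2, cap: int = 12) -> list[dict]:
    """Step 4: keep the top nodes per (service, type), then cap the cluster size."""
    ranked = sorted(nodes, key=lambda n: n["score"], reverse=True)
    taken: dict[tuple[str, str], int] = defaultdict(int)
    kept = []
    for node in ranked:
        key = (node["service"], node["type"])
        if taken[key] < per_group:
            taken[key] += 1
            kept.append(node)
    return kept[:cap]


def confidence(mean_edge_score: float, type_diversity: float, service_diversity: float) -> float:
    """Step 5: weighted blend of edge quality and the two diversity terms."""
    return mean_edge_score * 0.6 + type_diversity * 0.2 + service_diversity * 0.2
```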

Embeddings

TF-IDF is the primary recall channel. bge-small-en-v1.5 (ONNX int8 quantized) is used for two narrow jobs:

  • Recall fallback: when token overlap is weak, find synonyms (e.g. assignHomework ↔ assignStudentsToTask) and add them to the anchor set.
  • Reranking: build top_n × 2 clusters first, then re-sort by 0.6 · confidence + 0.4 · max_cos(hint, cluster_nodes) and truncate to top_n.

The ONNX model is ~34 MB (int8 quantized) and runs on CPU via onnxruntime. Cold start ~0.3s. Vectors cached in embeddings.db; only the query hint is embedded at query time.
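
The reranking rule can be illustrated without the ONNX model by working on precomputed vectors. The sketch below assumes each cluster is a dict with `confidence` and `node_vectors` keys; both key names and the helper names are hypothetical, and the vectors would in practice come from the cached embeddings.

```python
import math


def cosine(u: list[float], v: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rerank(clusters: list[dict], hint_vec: list[float], top_n: int = 3) -> list[dict]:
    """Re-sort by 0.6 * confidence + 0.4 * max cosine(hint, node), keep top_n."""
    def blended(cluster: dict) -> float:
        max_cos = max((cosine(hint_vec, v) for v in cluster["node_vectors"]), default=0.0)
        return 0.6 * cluster["confidence"] + 0.4 * max_cos

    return sorted(clusters, key=blended, reverse=True)[:top_n]
```

The caller is expected to pass in top_n × 2 candidate clusters, so the embedding term can promote a cluster that lexical confidence alone would have cut.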

Feedback boost

A final rerank step that adapts ranking to your team's vocabulary — no model training, no uploads. feedback.db is local per developer.

Every query_chains call caches returned clusters for 10 minutes. A follow-up expand_node(name) that substring-matches a node in a pending cluster auto-writes an accepted=True row — the expand IS the signal. log_feedback(hint, accepted, ...) is the manual escape hatch for thumbs-down.

On the next query() for the same hint:

final_score = confidence + 0.15 * sum(prior_accepted_count per node in cluster)

Weight (0.15) and decay window (90 days) are intentionally conservative — lexical confidence still dominates. Disable with export ARIADNE_FEEDBACK_BOOST=0.
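
The boost formula and its off switch can be sketched as below. The function name `final_score` mirrors the formula but is otherwise hypothetical, and `accepted_counts` is assumed to hold per-node accepted counts for the same hint, already restricted to the 90-day window.

```python
import os

BOOST_WEIGHT = 0.15


def final_score(confidence: float, node_ids: list[str],
                accepted_counts: dict[str, int]) -> float:
    """confidence + 0.15 * sum of prior accepted counts for the cluster's nodes."""
    if os.environ.get("ARIADNE_FEEDBACK_BOOST") == "0":
        return confidence  # boost disabled via the environment
    boost = sum(accepted_counts.get(node_id, 0) for node_id in node_ids)
    return confidence + BOOST_WEIGHT * boost
```

For a cluster with confidence 0.7 whose nodes have two prior acceptances in total, the final score is 0.7 + 0.15 × 2 = 1.0.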


Tests

python3 tests/test_semantic_hint.py
python3 tests/test_feedback_boost.py
python3 tests/test_implicit_feedback.py
python3 tests/test_onnx_embedder.py

A pre-commit hook at hooks/pre-commit runs test_semantic_hint.py — enable once per clone with:

ln -sf ../../hooks/pre-commit .git/hooks/pre-commit

Roadmap

  • More Kafka sources beyond application.yaml + @KafkaListener + KafkaTemplate.send
  • TF-IDF weight tuning for very high-frequency domain tokens
  • Stronger feedback signal: decay tuning, per-service weighting, cross-hint generalisation (current boost is count-based within the same hint)
  • Watch mode: hook into git post-commit / file events to auto-trigger rescan instead of waiting for a stale_warning
  • expand_node product polish: clearer trigger conditions, smaller input surface, output that points at the next step
  • Parameter pass across all tools: task-oriented names over implementation names; unify verb prefixes for naming consistency

Non-goals

  • LLM as the primary judge (slow, costly, non-reproducible)
  • Visualization / graph database backend
  • Full AST call-graph extraction

License

MIT — see LICENSE.
