genai-lab

Exposes tools for semantic search and RAG-based Q&A from a local knowledge base, along with resources and prompt templates.

GenAI Lab

GenAI Lab implements a TypeScript-based GenAI agent workflow with semantic search, Retrieval-Augmented Generation (RAG), MCP tools/resources/prompts, and LangGraph-based orchestration.

The project focuses on the technical mechanics behind controlled GenAI systems: retrieval, grounding, structured planning, graph-based routing, bounded retries, and MCP tool exposure.

Technical Capabilities

  • Embedding-based semantic search over a local knowledge base
  • Retrieval-Augmented Generation with source-grounded responses
  • Source citation support using note IDs
  • LangGraph workflow orchestration with explicit shared state
  • Structured LLM planning with constrained execution steps
  • Retrieval-query extraction before semantic search
  • Conditional graph routing based on planner output and retrieval state
  • Decomposed RAG flow: retrieval node followed by answer/draft generation nodes
  • Bounded retry path for low/no retrieval results
  • Safe fallback behavior for unsupported or ungrounded requests
  • MCP server exposing tools, resources, and prompt templates
  • AI SDK MCP client flow for LLM-driven MCP tool selection

System Architecture

Agent workflow

User request
  → planner node
  → conditional routing
  → retrieve context when needed
  → answer question OR draft message
  → retry/fallback when needed
  → final response

MCP flow

LLM client
  → discovers MCP tools
  → chooses search_notes / answer_from_notes
  → MCP server executes backend logic
  → result returns to LLM
  → final response

Setup

Install dependencies:

pnpm install

This project requires an OpenAI API key.

Create .env.local:

OPENAI_API_KEY=your_openai_api_key

Supported Example Flows

| Flow | Path | What it demonstrates | Example command |
| --- | --- | --- | --- |
| Grounded Q&A flow | planner → retrieve_notes → answer_question → final_response | Answers using retrieved context and source citations | pnpm agent "Can you explain why retrieval should happen before generation?" |
| Retrieval-augmented drafting flow | planner → retrieve_notes → draft_message → final_response | Retrieves relevant context before generating a draft | pnpm agent "Use my notes about token-heavy conversations to write a short team update" |
| Direct drafting flow | planner → draft_message → final_response | Generates a draft without retrieval when no knowledge lookup is needed | pnpm agent "Draft a Slack message saying QA signoff is pending" |
| Retrieval-only flow | planner → retrieve_notes → final_response | Searches notes and returns grounded matching context | pnpm agent "Search saved notes for deterministic backend functions" |
| Fallback flow | planner → final_response | Avoids unsupported answers when no tool/knowledge path applies | pnpm agent "Tell me something funny" |
| Standalone semantic search | query → embedding → similarity ranking → notes | Runs vector similarity search directly | pnpm semantic-search "backend-defined tools and input schemas" |
| Standalone RAG flow | question → retrieve context → generate answer → cite source | Runs retrieval and answer generation without LangGraph | pnpm rag "Why do long chats become more expensive?" |
| MCP tool-selection flow | LLM → MCP tools → selected tool → tool result → final answer | Lets the LLM choose MCP-exposed tools | pnpm mcp:llm "Explain how teams reduce LLM cost in long conversations" |

Quick Demo

pnpm agent "Can you explain why retrieval should happen before generation?"

pnpm agent "Use my notes about token-heavy conversations to write a short team update"

pnpm agent "Draft a Slack message saying QA signoff is pending"

pnpm agent "Search saved notes for deterministic backend functions"

pnpm agent "Tell me something funny"

pnpm semantic-search "backend-defined tools and input schemas"

pnpm rag "Why do long chats become more expensive?"

pnpm mcp:llm "Explain how teams reduce LLM cost in long conversations"

Tech Stack

  • TypeScript
  • Vercel AI SDK
  • OpenAI models via @ai-sdk/openai
  • LangGraph
  • Model Context Protocol TypeScript SDK
  • Zod
  • pnpm

Project Structure

lib/
  agent/
    agent-state.ts
    mini-agent.ts
  embedding.ts
  knowledge-base.ts
  retrieve-context.ts
  semantic-search.ts
  rag-answer.ts
  rag-types.ts
  vector-utils.ts

mcp/
  server.ts
  client-test.ts
  llm-client-test.ts

scripts/
  agent.ts
  rag.ts
  semantic-search.ts

Implementation Notes

1. Semantic search

The project embeds notes and queries, then ranks notes by vector similarity.

query
→ embedding
→ similarity search
→ ranked notes
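
The ranking step can be sketched as follows. This is a minimal illustration, not the code in lib/semantic-search.ts: it assumes cosine similarity (the README only says "vector similarity"), and the `Note` shape, `cosineSimilarity`, `rankNotes`, and the toy 3-dimensional vectors are all hypothetical stand-ins for real embeddings.

```typescript
// Hypothetical shapes; the real types in lib/ may differ.
interface Note {
  id: string;
  text: string;
  embedding: number[];
}

interface RankedNote {
  id: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every note against the query embedding and keeps the topK highest.
function rankNotes(queryEmbedding: number[], notes: Note[], topK = 3): RankedNote[] {
  return notes
    .map((note) => ({ id: note.id, score: cosineSimilarity(queryEmbedding, note.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy vectors for illustration only; a real embedding has hundreds of dimensions or more.
const toyNotes: Note[] = [
  { id: "note-1", text: "RAG basics", embedding: [1, 0, 0] },
  { id: "note-2", text: "Token cost in chat apps", embedding: [0, 1, 0] },
  { id: "note-3", text: "Workflow routing", embedding: [0.7, 0.7, 0] },
];

// note-1 ranks first, then note-3; note-2 is cut by topK = 2.
const ranked = rankNotes([0.9, 0.1, 0], toyNotes, 2);
console.log(ranked);
```

With a knowledge base this small, a linear scan over all notes is enough; an index only becomes worthwhile with a much larger collection.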

2. RAG

The RAG flow retrieves relevant context before generating an answer.

question
→ retrieve context
→ generate grounded answer
→ include source citation
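
One way to picture the grounding and citation steps is the sketch below. It makes no model call: `buildGroundedPrompt` and `extractCitations` are hypothetical helpers, the bracketed `[note-id]` citation format is an assumption, and the note text and sample answer are made up for illustration.

```typescript
// Hypothetical shape for a retrieved note; the real type in lib/rag-types.ts may differ.
interface RetrievedNote {
  id: string;
  text: string;
  score: number;
}

// Builds a prompt that limits the model to the supplied notes and asks for note-ID citations.
function buildGroundedPrompt(question: string, context: RetrievedNote[]): string {
  const contextBlock = context.map((note) => `[${note.id}] ${note.text}`).join("\n");
  return [
    "Answer the question using only the notes below.",
    "Cite the note ID in square brackets for every claim you make.",
    "If the notes do not contain the answer, say that you do not know.",
    "",
    "Notes:",
    contextBlock,
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Returns the cited note IDs that actually exist in the retrieved context,
// so a citation to an unknown note is dropped rather than trusted.
function extractCitations(answer: string, context: RetrievedNote[]): string[] {
  const knownIds = context.map((note) => note.id);
  const cited: string[] = [];
  const pattern = /\[([^\]]+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const id = match[1];
    if (knownIds.indexOf(id) !== -1 && cited.indexOf(id) === -1) {
      cited.push(id);
    }
  }
  return cited;
}

// Made-up note content for illustration.
const retrieved: RetrievedNote[] = [
  { id: "note-2", text: "Long chats resend prior messages, so token usage grows with history.", score: 0.82 },
];
const prompt = buildGroundedPrompt("Why do long chats become more expensive?", retrieved);

// A hand-written stand-in for a model reply; no model is called in this sketch.
const sampleAnswer = "Each turn resends earlier messages, so cost grows with history [note-2].";
console.log(prompt);
console.log(extractCitations(sampleAnswer, retrieved));
```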

3. LangGraph orchestration

The agent workflow uses LangGraph to keep explicit state across nodes.

Example plans:

retrieve_notes → answer_question → final_response
retrieve_notes → draft_message → final_response
draft_message → final_response
final_response
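
The idea of running a plan over explicit shared state can be sketched without any library. This is not LangGraph code and not the code in lib/agent/: `AgentState`, `runPlan`, and the stub nodes are hypothetical, and each node simply returns the fields it changes so that the runner can merge them.

```typescript
type StepName = "retrieve_notes" | "answer_question" | "draft_message" | "final_response";

// Hypothetical shared state; every node reads from it and returns a partial update.
interface AgentState {
  request: string;
  context: string[];
  output: string;
  trace: StepName[];
}

type NodeFn = (state: AgentState) => Partial<AgentState>;

// Stub nodes standing in for real retrieval and generation.
const nodes: Record<StepName, NodeFn> = {
  retrieve_notes: () => ({ context: ["[note-1] Retrieve before generating."] }),
  answer_question: (s) => ({ output: `Answer based on ${s.context.length} note(s)` }),
  draft_message: (s) => ({ output: `Draft for: ${s.request}` }),
  // With no earlier output, the final node falls back to a fixed refusal.
  final_response: (s) => ({ output: s.output || "I can't help with that from the saved notes." }),
};

// Runs the plan steps in order, merging each node's update into the state
// and recording which steps ran.
function runPlan(request: string, plan: StepName[]): AgentState {
  let state: AgentState = { request, context: [], output: "", trace: [] };
  for (const step of plan) {
    const update = nodes[step](state);
    state = { ...state, ...update, trace: [...state.trace, step] };
  }
  return state;
}

const answered = runPlan("Why retrieve first?", ["retrieve_notes", "answer_question", "final_response"]);
const fellBack = runPlan("Tell me something funny", ["final_response"]);
console.log(answered.output, "|", fellBack.output);
```

Keeping the trace in state makes it easy to see afterwards which path a request took.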

4. Retrieval-query extraction

The planner extracts a focused retrieval query instead of sending the full user request to semantic search.

Example:

User request:
Use my notes about token-heavy conversations to write a short team update

Retrieval query:
token-heavy conversations
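
A planner that returns both the steps and the retrieval query needs its output checked before anything executes. The sketch below shows one possible shape. The step names come from the plans listed above, but the `Plan` type, the `retrievalQuery` field name, and the rule that malformed output collapses to a fallback-only plan are assumptions. The repo lists Zod in its tech stack, which would be a natural fit for this; a hand-written check is used here only to keep the example dependency-free.

```typescript
const ALLOWED_STEPS = ["retrieve_notes", "answer_question", "draft_message", "final_response"] as const;
type Step = (typeof ALLOWED_STEPS)[number];

// Hypothetical plan shape.
interface Plan {
  steps: Step[];
  retrievalQuery?: string;
}

function isStep(value: unknown): value is Step {
  return typeof value === "string" && (ALLOWED_STEPS as readonly string[]).indexOf(value) !== -1;
}

// Validates untrusted planner output. Anything malformed becomes a fallback-only plan.
function parsePlan(raw: unknown): Plan {
  const fallback: Plan = { steps: ["final_response"] };
  if (typeof raw !== "object" || raw === null) return fallback;
  const candidate = raw as { steps?: unknown; retrievalQuery?: unknown };
  if (!Array.isArray(candidate.steps) || candidate.steps.length === 0) return fallback;
  if (!candidate.steps.every(isStep)) return fallback;
  const steps = candidate.steps as Step[];
  if (steps[steps.length - 1] !== "final_response") return fallback;
  // A plan that retrieves must carry a non-empty focused query.
  const needsQuery = steps.indexOf("retrieve_notes") !== -1;
  const query = typeof candidate.retrievalQuery === "string" ? candidate.retrievalQuery.trim() : "";
  if (needsQuery && query === "") return fallback;
  return needsQuery ? { steps, retrievalQuery: query } : { steps };
}

const draftPlan = parsePlan({
  steps: ["retrieve_notes", "draft_message", "final_response"],
  retrievalQuery: "token-heavy conversations",
});
const rejected = parsePlan({ steps: ["delete_everything", "final_response"] });
console.log(draftPlan, rejected);
```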

5. Decomposed RAG

RAG is split into retrieval and generation steps inside the agent workflow.

retrieve context
→ answer or draft from retrieved context
→ final response

This keeps retrieval, generation, routing, and fallback behavior visible and independently controllable.

6. Conditional routing

The graph routes based on the current state.

if context found and answer requested → answer_question
if context found and draft requested → draft_message
if no context → retry or fallback
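
These rules amount to a pure function from state to the next node name. The sketch below is one reading of them: `RoutingState`, its fields, the `retry_retrieval` node name, and `MAX_RETRIES` are assumptions and are not taken from lib/agent/. In LangGraph, a function of roughly this shape is what a conditional edge would call to pick the next node.

```typescript
type Intent = "answer" | "draft" | "none";
type NextNode = "answer_question" | "draft_message" | "retry_retrieval" | "final_response";

// Hypothetical slice of agent state that routing depends on.
interface RoutingState {
  intent: Intent;
  contextCount: number; // number of notes retrieval returned
  retryCount: number; // retries already used
}

const MAX_RETRIES = 1;

// Decides where to go after the retrieval node has run.
function routeAfterRetrieval(state: RoutingState): NextNode {
  if (state.contextCount > 0) {
    if (state.intent === "answer") return "answer_question";
    if (state.intent === "draft") return "draft_message";
    return "final_response"; // retrieval-only request: return the matches as-is
  }
  // No context: retry while budget remains, otherwise fall back.
  return state.retryCount < MAX_RETRIES ? "retry_retrieval" : "final_response";
}

console.log(routeAfterRetrieval({ intent: "answer", contextCount: 2, retryCount: 0 }));
console.log(routeAfterRetrieval({ intent: "answer", contextCount: 0, retryCount: 1 }));
```

Because the function only reads state and has no side effects, each branch can be unit-tested without calling a model.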

7. Bounded retry

When retrieval fails, the agent retries exactly once, using a broader query and a lower score threshold. If the retry also fails, it stops and falls back.

focused query fails
→ retry with broader query
→ succeed or stop safely
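
A minimal sketch of the single-retry pattern, under some assumptions: the `Retriever` signature, the threshold values, and the word-overlap scorer are all invented for illustration, and the retriever is synchronous only to keep the example short. A real embedding call would be async.

```typescript
interface Hit {
  id: string;
  score: number;
}

// Hypothetical retriever signature: returns hits scoring at or above minScore.
type Retriever = (query: string, minScore: number) => Hit[];

interface RetrievalAttempt {
  query: string;
  minScore: number;
}

// Tries the focused attempt first, then at most one broader attempt. Never loops.
function retrieveWithOneRetry(
  retrieve: Retriever,
  focused: RetrievalAttempt,
  broader: RetrievalAttempt,
): { hits: Hit[]; attempts: number } {
  const first = retrieve(focused.query, focused.minScore);
  if (first.length > 0) return { hits: first, attempts: 1 };
  const second = retrieve(broader.query, broader.minScore);
  return { hits: second, attempts: 2 }; // an empty result here means: stop and fall back
}

// Toy scorer: the fraction of query words found in the note text. Not an embedding model.
const fakeIndex = [{ id: "note-2", text: "token cost in chat apps" }];
const fakeRetrieve: Retriever = (query, minScore) => {
  const words = query.toLowerCase().split(/\s+/);
  return fakeIndex
    .map((note) => {
      const matched = words.filter((w) => note.text.indexOf(w) !== -1).length;
      return { id: note.id, score: matched / words.length };
    })
    .filter((hit) => hit.score >= minScore);
};

const recovered = retrieveWithOneRetry(
  fakeRetrieve,
  { query: "token-heavy conversations", minScore: 0.5 },
  { query: "token usage conversations", minScore: 0.3 },
);
console.log(recovered);
```

In this toy run the broader query matches only one of its three words, so it succeeds only because the threshold was lowered as well.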

8. Safe fallback

Unsupported or unrelated requests return a bounded fallback rather than an answer generated without supporting context.

Example:

pnpm agent "Tell me something funny"

Expected behavior:

returns a bounded fallback instead of answering from unsupported context

MCP Interface

The MCP server exposes:

| MCP Primitive | Name | Purpose |
| --- | --- | --- |
| Tool | search_notes | Search saved notes |
| Tool | answer_from_notes | Answer questions from notes |
| Resource | notes://all | Read local knowledge base |
| Prompt | rag_answer_prompt | Prompt template for grounded answers |

The LangGraph agent and MCP examples are intentionally kept as separate flows in this repo.

  • The LangGraph flow demonstrates controlled agent orchestration with state, planning, routing, retrieval, drafting, retries, and fallback behavior.
  • The MCP flow demonstrates how the same search/RAG capabilities can be exposed to external MCP clients as tools, resources, and prompts.

In a larger application, these patterns can be combined by having LangGraph nodes call MCP tools for external capabilities such as GitHub, Jira, Slack, or Confluence.

Knowledge Base

The local knowledge base contains notes about:

  • RAG basics
  • token cost in chat apps
  • workflow routing
  • tool calling
  • semantic search

The knowledge base is intentionally small so retrieval, ranking, grounding, and routing behavior are easy to inspect.

Design Scope

Current scope:

  • local knowledge base
  • embedding-based semantic search
  • RAG with source citations
  • CLI scripts instead of a web UI
  • MCP over stdio
  • controlled LangGraph workflow

The focus is on the core mechanics of retrieval, grounding, planning, routing, MCP tool exposure, and bounded agent behavior.

Next Improvements

  • Improve CLI output formatting
  • Add clearer docs for LangGraph state and routing
  • Add sample output snapshots
  • Add basic tests for retrieval and RAG behavior

TODO

Short Term

  • Add more knowledge-base examples
  • Add an eval script for expected retrieval results
  • Add safer handling for low-confidence retrieval
  • Add structured output for final agent responses

Future Extensions

  • Database-backed vector search
  • Chunking for longer documents
  • Richer source metadata
  • External integrations
  • Guardrails and approval workflows
  • Observability logs
  • Eval suite
