Gemini CLI RAG MCP

This project builds a standalone RAG service that transforms the static gemini-cli documentation into a dynamic, queryable tool. The service exposes this knowledge over the Model Context Protocol (MCP), making it accessible to any integrated client, so environments such as gemini-cli, VS Code, or Cursor can give developers instant, accurate answers in natural language, directly within their workflow. This accelerates learning and lets you leverage the tool's full potential intuitively.

Project Overview

This project implements a RAG pipeline consisting of three main components:

  1. Data Extraction and Processing: Python scripts that extract content from all markdown files in the gemini-cli/docs directory and sub-directories, process it, and create a vector store.
  2. MCP Server: A Python-based MCP server that exposes the vector store as a queryable tool.
  3. MCP clients (Gemini CLI, VS Code, Claude Code, Windsurf, Cursor, etc.): Any MCP-capable client, including the official Gemini CLI, can connect to the server to answer questions about its documentation.

Features

  • RAG-based Q&A: Ask questions about the Gemini CLI in natural language and get answers based on its official documentation.
  • Local Vector Store: The entire documentation is stored and indexed locally using SKLearnVectorStore.
  • Extensible: The MCP server can be easily extended with new tools and data sources.

System Architecture

The system is composed of the following parts:

  1. extract.py: This script walks through the gemini-cli/docs directory, finds all .md files, and concatenates their content into a single gemini_cli_docs.txt file.
  2. create_vectorstore.py: This script loads the gemini_cli_docs.txt file, splits it into chunks, and creates a gemini_cli_vectorstore.parquet file using HuggingFaceEmbeddings and SKLearnVectorStore.
  3. gemini_cli_mcp.py: This script runs a FastMCP server that loads the vector store and exposes two endpoints:
    • gemini_cli_query_tool(query: str): A tool that takes a user query, retrieves relevant documents from the vector store, and returns them.
    • docs://gemini-cli/full: A resource that returns the entire content of the gemini_cli_docs.txt file.
  4. gemini-cli/: The official Gemini CLI, which can be configured to use the MCP server.

Getting Started

Prerequisites

  • Python 3.13
  • Node.js 18+
  • An existing gemini-cli installation. If you don't have it, you can clone the official repository:
    git clone https://github.com/google-gemini/gemini-cli.git
    

Installation

  1. Clone the repository:

    git clone https://github.com/your-username/gemini-cli-rag-mcp.git
    cd gemini-cli-rag-mcp
    
  2. Install Python dependencies:

    pip install -r requirements.txt
    
  3. Prepare the documentation data: Run the extract.py script to gather all the markdown documentation into a single file.

    python extract.py
    
  4. Create the vector store: Run the create_vectorstore.py script to create the vector store from the documentation file.

    python create_vectorstore.py
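
As a quick sanity check, you can load the persisted store and run a similarity search. This is a minimal sketch, assuming the default file and model names used by the scripts; import paths may differ slightly across langchain versions:

    # sanity_check.py -- hypothetical helper, not part of the repository.
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import SKLearnVectorStore

    # Load the persisted vector store created by create_vectorstore.py.
    store = SKLearnVectorStore(
        embedding=HuggingFaceEmbeddings(model_name="BAAI/bge-large-en-v1.5"),
        persist_path="gemini_cli_vectorstore.parquet",
        serializer="parquet",
    )

    # Retrieve the three chunks most similar to a test query.
    for doc in store.similarity_search("How do I configure MCP servers?", k=3):
        print(doc.page_content[:200], "\n---")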
    

Usage

Before running with Docker, try running the MCP server in dev mode and test it:

mcp dev gemini_cli_mcp.py

In the Command field, enter python; in the Arguments field, enter gemini_cli_mcp.py; then press Connect.

1. Run the MCP Service with Docker

The most efficient way to run the MCP server is with Docker Compose. This starts a container in the background and keeps it ready for Gemini CLI to connect to.

docker-compose up -d

The container keeps running, but the Python MCP script itself is only executed on demand by Gemini CLI.
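
docker-compose up -d expects a docker-compose.yml at the project root. If you need to write one yourself, here is a minimal sketch consistent with the configuration in the next step; the image details are assumptions, but the container name must match the one referenced from settings.json:

    # Hypothetical minimal docker-compose.yml; the real file in the repo may differ.
    services:
      gemini-cli-mcp:
        build: .                                  # assumes a Dockerfile installing requirements.txt
        container_name: gemini-cli-mcp-container  # referenced by `docker exec` in settings.json
        command: tail -f /dev/null                # keep the container alive; the MCP script runs on demand
        working_dir: /app
        volumes:
          - ./:/app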

2. Configure Gemini CLI

To make Gemini CLI aware of your local MCP server, you need to create a configuration file.

  • In the project's .gemini directory (or your user-level ~/.gemini directory), add the following content to the settings.json file:

    {
      "mcpServers": {
        "local_rag_server": {
          "command": "docker",
          "args": [
            "exec",
            "-i",
            "gemini-cli-mcp-container",
            "python",
            "gemini_cli_mcp.py"
          ]
        }
      }
    }
    

This configuration tells Gemini CLI how to launch your MCP server using docker exec. Note: to use it in VS Code, open Settings, search for 'mcp', and click on settings.json. Then switch to Agent mode and ask Copilot to implement the gemini-cli-mcp server (give the JSON above as context).

3. Ask Questions

After restarting your terminal for the changes to take effect, simply run gemini. It will automatically discover the local_rag_server and use its tools when needed.

Example:

How do I customize my gemini-cli?

or something more specific:

My gemini cli is not showing an interactive prompt when I run it on my build server, it just exits. I have a CI_TOKEN environment variable set. Why is this happening and how can I fix it?

How It Works

Data Extraction and Vectorization

The extract.py script recursively finds all markdown files in the gemini-cli/docs directory. It reads their content and combines it into a single text file, gemini_cli_docs.txt.
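
At its core, the extraction step is just a recursive walk and concatenation. A minimal sketch of that logic (the exact formatting of the combined file is an assumption):

    import os

    DOCS_DIR = "gemini-cli/docs"
    OUTPUT_FILE = "gemini_cli_docs.txt"

    # Walk the docs tree, read every .md file, and concatenate the contents.
    with open(OUTPUT_FILE, "w", encoding="utf-8") as out:
        for root, _dirs, files in os.walk(DOCS_DIR):
            for name in sorted(files):
                if name.endswith(".md"):
                    path = os.path.join(root, name)
                    with open(path, encoding="utf-8") as f:
                        out.write(f"\n\n--- {path} ---\n\n")
                        out.write(f.read())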

The create_vectorstore.py script then processes this text file in four steps (a condensed sketch follows the list):

  1. Loads the document.
  2. Splits it into smaller, overlapping chunks using RecursiveCharacterTextSplitter.
  3. Uses HuggingFaceEmbeddings (with the BAAI/bge-large-en-v1.5 model) to create embeddings for each chunk.
  4. Stores these embeddings in a SKLearnVectorStore, which is persisted to gemini_cli_vectorstore.parquet.
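
A condensed sketch of that pipeline; the chunk sizes are illustrative rather than the script's actual values, and import paths vary slightly across langchain versions:

    from langchain_community.document_loaders import TextLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import SKLearnVectorStore

    # 1. Load the concatenated documentation file.
    docs = TextLoader("gemini_cli_docs.txt", encoding="utf-8").load()

    # 2. Split into overlapping chunks (sizes here are illustrative).
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=200
    ).split_documents(docs)

    # 3 & 4. Embed each chunk with the BGE model and persist the store as parquet.
    store = SKLearnVectorStore.from_documents(
        chunks,
        embedding=HuggingFaceEmbeddings(model_name="BAAI/bge-large-en-v1.5"),
        persist_path="gemini_cli_vectorstore.parquet",
        serializer="parquet",
    )
    store.persist()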

MCP Server

The gemini_cli_mcp.py script creates a FastMCP server. This server defines a tool, gemini_cli_query_tool, which can be called by Gemini CLI, VS Code, Cursor, or any other MCP client. When the tool is invoked, it does the following (a minimal sketch appears after the list):

  1. Loads the persisted SKLearnVectorStore.
  2. Uses the vector store as a retriever to find the most relevant document chunks for the given query.
  3. Returns the content of these chunks to the client.
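
A minimal sketch of that wiring, using the FastMCP API from the mcp Python SDK (the retriever's k value is an assumption):

    from mcp.server.fastmcp import FastMCP
    from langchain_community.embeddings import HuggingFaceEmbeddings
    from langchain_community.vectorstores import SKLearnVectorStore

    mcp = FastMCP("gemini-cli-docs")

    @mcp.tool()
    def gemini_cli_query_tool(query: str) -> str:
        """Return the documentation chunks most relevant to the query."""
        store = SKLearnVectorStore(
            embedding=HuggingFaceEmbeddings(model_name="BAAI/bge-large-en-v1.5"),
            persist_path="gemini_cli_vectorstore.parquet",
            serializer="parquet",
        )
        docs = store.as_retriever(search_kwargs={"k": 3}).invoke(query)
        return "\n\n".join(doc.page_content for doc in docs)

    @mcp.resource("docs://gemini-cli/full")
    def full_docs() -> str:
        """Return the entire concatenated documentation file."""
        with open("gemini_cli_docs.txt", encoding="utf-8") as f:
            return f.read()

    if __name__ == "__main__":
        mcp.run(transport="stdio")  # stdio, as spawned via `docker exec -i`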

Gemini CLI Integration

The Gemini CLI is designed to be extensible through MCP servers. The CLI discovers available tools by connecting to servers defined in the mcpServers object in a settings.json file (either in the project's .gemini directory or in the user's home ~/.gemini directory).

Gemini CLI supports three transport mechanisms for communication:

  • Stdio Transport: Spawns a subprocess and communicates with it over stdin and stdout. This is the method used in this project, with the command property in settings.json.
  • SSE Transport: Connects to a Server-Sent Events (SSE) endpoint, defined with a url property.
  • Streamable HTTP Transport: Uses HTTP streaming for communication, configured with an httpUrl property.

By using the docker exec command, we are leveraging the stdio transport to create a direct communication channel with the Python script inside the container.
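
For reference, a hypothetical settings.json using the remote transports instead of a spawned command (server names and URLs are placeholders):

    {
      "mcpServers": {
        "sse_server": { "url": "http://localhost:8000/sse" },
        "http_server": { "httpUrl": "http://localhost:8000/mcp" }
      }
    }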

Scripts

  • extract.py: Extracts documentation from markdown files.
  • create_vectorstore.py: Creates the vector store.
  • gemini_cli_mcp.py: Runs the MCP server.

Dependencies

Python

The main Python dependencies are listed in requirements.txt:

  • langchain: For text splitting, vector stores, and embeddings.
  • tiktoken: For token counting.
  • sentence-transformers: For the embedding model.
  • scikit-learn: For the vector store.
  • mcp: For the MCP server.
  • fastapi: For the MCP server.

Node.js

The project relies on the gemini-cli package and its dependencies. See gemini-cli/package.json for more details.
