Model Context Protocol (MCP)
Below is a working pattern for SSE-based (Server-Sent Events) MCP (Model Context Protocol) clients and servers using the Gemini LLM, along with explanations and considerations.

**Core Idea:** This pattern uses SSE for real-time, unidirectional (server-to-client) streaming of LLM-generated content, while MCP provides a structured way to manage conversation flow and metadata. The server calls the Gemini LLM to generate responses and streams them to the client via SSE; the client renders the content as it arrives.

**Components:**

1. **MCP Client (Frontend - e.g., web browser, mobile app):**
   * **Initiates the conversation:** Sends an initial message (e.g., a user query) to the server via a standard HTTP request (POST or GET). This request includes MCP metadata (e.g., conversation ID, message ID, user ID).
   * **Establishes the SSE connection:** After the initial request, the client opens an SSE connection to a dedicated server endpoint for receiving the streaming response for that conversation.
   * **Receives SSE events:** Listens for `message` events on the SSE stream. Each event carries a chunk of the LLM-generated response plus MCP metadata.
   * **Reconstructs and displays the response:** As events arrive, the client appends the data to a display area, giving a real-time streaming experience.
   * **Handles errors and completion:** Listens for dedicated SSE events (e.g., `error`, `done`) to surface errors or detect that the LLM response is complete.
   * **Manages conversation state:** The client may need to store the conversation ID and other metadata to maintain context for subsequent requests.
   * **Sends subsequent messages:** After receiving a complete response, the client can send new messages to continue the conversation. These are sent as standard HTTP requests, and a new SSE stream is established for each response.

2. **MCP Server (Backend - e.g., Node.js, Python/Flask, Java/Spring Boot):**
   * **Receives the initial request:** Handles the HTTP request containing the user's query and MCP metadata.
   * **Validates the request:** Checks the request and its MCP metadata.
   * **Interacts with the Gemini LLM:** Sends the user's query to the Gemini API, using its streaming capabilities where available.
   * **Generates SSE events:** As Gemini produces text, the server emits SSE `message` events, each containing a chunk of generated text and relevant MCP metadata (e.g., conversation ID, message ID, chunk ID); an example event frame is sketched after this list.
   * **Manages SSE connections:** Maintains the set of active SSE connections, each associated with a conversation ID.
   * **Sends SSE events to the client:** Pushes events to the appropriate client connection.
   * **Handles errors:** If an error occurs during LLM generation, the server sends an `error` event to the client via SSE.
   * **Sends a completion event:** When the LLM response is complete, the server sends a `done` event via SSE.
   * **Manages conversation state:** Stores the conversation history and other metadata, which is essential for maintaining context across turns. A database (e.g., PostgreSQL, MongoDB) is typically used for this.
   * **Implements MCP:** Handles message formatting, routing, and error handling according to the protocol.
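To make the event flow above concrete, here is a minimal sketch of how one streamed chunk might be framed as a single SSE event. The field names mirror the message format described below; the values and the framing itself are illustrative, not a prescribed MCP wire format.

```python
import json

# One illustrative MCP chunk, framed as a single SSE event.
# Field names follow the message format discussed in this pattern;
# the concrete values are placeholders.
event = {
    "conversation_id": "conv-123",
    "message_id": "msg-7",
    "user_id": "user-42",
    "message_type": "llm_response",
    "chunk_id": 3,
    "payload": "Hello, wor",
}

# An SSE event is a "data:" line followed by a blank line.
sse_frame = f"data: {json.dumps(event)}\n\n"
print(sse_frame)
```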
**Gemini LLM Integration:**

* **Streaming API:** Use Gemini's streaming API where available. This lets the server receive the response in chunks, which can be forwarded to the client over SSE as soon as they arrive.
* **Prompt engineering:** Carefully design prompts to guide the LLM's responses and keep them appropriate for the conversation.
* **Rate limiting:** Implement rate limiting to prevent abuse of the LLM API.
* **Error handling:** Handle errors from the LLM API gracefully; on failure, send an `error` event to the client via SSE.

**MCP Considerations:**

* **Message format:** Define a clear format for MCP messages with fields for:
  * Conversation ID
  * Message ID
  * User ID
  * Message type (e.g., `user_message`, `llm_response`, `error`, `done`)
  * Payload (the actual text of the message)
  * Chunk ID (for SSE streaming)
* **Routing:** Implement a routing mechanism to direct messages to the correct conversation.
* **Error handling:** Define a standard way to report errors, including error codes and messages.
* **Security:** Implement security measures to protect against unauthorized access and data breaches.

**Example (Conceptual - Python/Flask):**

```python
from flask import Flask, request, Response, stream_with_context
import google.generativeai as genai
import os
import json

app = Flask(__name__)

# Configure the Gemini API (the key is read from the environment).
GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY")
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel('gemini-pro')  # or 'gemini-pro-vision'

# In-memory conversation store (replace with a database in production).
conversations = {}


def generate_llm_response_stream(conversation_id, user_message):
    """Streams a Gemini response for one conversation turn as SSE events."""
    history = conversations.setdefault(conversation_id, [])
    try:
        # Start the chat from the stored history, then send the new message.
        chat = model.start_chat(history=history)
        response = chat.send_message(user_message, stream=True)

        reply_parts = []
        for chunk in response:
            llm_text = chunk.text
            reply_parts.append(llm_text)
            mcp_message = {
                "conversation_id": conversation_id,
                "message_type": "llm_response",
                "payload": llm_text,
            }
            yield f"data: {json.dumps(mcp_message)}\n\n"

        # Persist the turn once the full response has streamed.
        history.append({"role": "user", "parts": [user_message]})
        history.append({"role": "model", "parts": ["".join(reply_parts)]})

        mcp_done_message = {
            "conversation_id": conversation_id,
            "message_type": "done",
        }
        yield f"data: {json.dumps(mcp_done_message)}\n\n"
    except Exception as e:
        mcp_error_message = {
            "conversation_id": conversation_id,
            "message_type": "error",
            "payload": str(e),
        }
        yield f"data: {json.dumps(mcp_error_message)}\n\n"


@app.route('/chat', methods=['POST'])
def chat_handler():
    """Handles a chat turn and streams the response back as SSE."""
    data = request.get_json()
    conversation_id = data.get('conversation_id')
    user_message = data.get('message')

    if not conversation_id or not user_message:
        return "Missing conversation_id or message", 400

    return Response(
        stream_with_context(generate_llm_response_stream(conversation_id, user_message)),
        mimetype='text/event-stream',
    )


if __name__ == '__main__':
    app.run(debug=True, port=5000)
```
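To exercise the Flask sketch above from the command line before wiring up a browser client, a minimal test client could look like this (assuming the server runs on port 5000 and the `requests` package is installed; the conversation ID and message are placeholders):

```python
import json
import requests

# Post one user message and consume the SSE stream from the response body.
resp = requests.post(
    "http://localhost:5000/chat",
    json={"conversation_id": "demo-1", "message": "Hello!"},
    stream=True,
)

for line in resp.iter_lines(decode_unicode=True):
    if not line or not line.startswith("data: "):
        continue  # skip the blank separator lines between SSE events
    event = json.loads(line[len("data: "):])
    if event["message_type"] == "llm_response":
        print(event["payload"], end="", flush=True)
    elif event["message_type"] == "done":
        print()
        break
    elif event["message_type"] == "error":
        print(f"\nError: {event['payload']}")
        break
```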
**Client-Side Example (JavaScript):**

The browser client below posts the message and reads the SSE-formatted stream directly from the POST response body; the `EventSource` API only issues GET requests, so it cannot be used against the single POST endpoint in the Flask example above.

```javascript
const conversationId = crypto.randomUUID();  // unique ID per conversation
const messageInput = document.getElementById('messageInput');
const chatOutput = document.getElementById('chatOutput');
const sendButton = document.getElementById('sendButton');

sendButton.addEventListener('click', async () => {
  const message = messageInput.value;
  messageInput.value = '';

  try {
    // 1. Send the message; the server streams SSE events on this response.
    const response = await fetch('/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ conversation_id: conversationId, message: message })
    });

    // 2. Read and parse the SSE stream chunk by chunk.
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });

      // SSE events are separated by a blank line.
      const events = buffer.split('\n\n');
      buffer = events.pop();  // keep any partial event for the next read
      for (const event of events) {
        if (!event.startsWith('data: ')) continue;
        const data = JSON.parse(event.slice('data: '.length));
        console.log('Received SSE event:', data);

        if (data.message_type === 'llm_response') {
          chatOutput.textContent += data.payload;
        } else if (data.message_type === 'done') {
          console.log('LLM response complete.');
        } else if (data.message_type === 'error') {
          console.error('Error from server:', data.payload);
          chatOutput.textContent += `Error: ${data.payload}`;
        }
      }
    }
  } catch (error) {
    console.error('Error sending message:', error);
  }
});
```

**Key Improvements and Considerations:**

* **Error handling:** Robust error handling is crucial. The server should catch exceptions during LLM generation and send error events to the client; the client should display them to the user.
* **Conversation history:** The server *must* maintain a conversation history to provide context for subsequent requests, typically in a database. The example above uses an in-memory store, which is not suitable for production.
* **Security:** Implement authentication and authorization to protect against unauthorized access.
* **Scalability:** For high-traffic applications, consider a message queue (e.g., RabbitMQ, Kafka) to decouple the server from the LLM API and improve scalability and resilience.
* **Rate limiting:** Implement rate limiting to prevent abuse of the LLM API.
* **Prompt engineering:** Experiment with different prompts to optimize the LLM's responses.
* **Token management:** Be mindful of input and output token limits; truncate or summarize the conversation history when necessary.
* **User interface:** Provide a clear, intuitive experience, including loading indicators, error messages, and visible conversation history.
* **Metadata:** Include timestamps, user IDs, and message IDs in MCP messages to aid debugging and analysis.
* **Chunking strategy:** Experiment with chunk sizes; smaller chunks make the UI more responsive but increase overhead.
* **Cancellation:** Let the user cancel a long-running LLM request by sending a cancellation signal to the server, which then terminates the LLM call (a hedged server-side sketch follows at the end of this section).
* **Context management:** Consider a more sophisticated strategy such as retrieval-augmented generation (RAG) to give the LLM access to external knowledge sources.

This pattern provides a solid foundation for building SSE-based MCP clients and servers with Gemini. Adapt the code and configuration to your specific needs and environment.
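As a concrete illustration of the cancellation point above, here is a hedged sketch of one way the Flask example could expose a cancel signal. The route, flag store, and helper names are illustrative assumptions, not part of MCP or of the earlier example:

```python
import threading
from flask import Flask

app = Flask(__name__)  # stand-in; in practice this is the same app as the server example

cancel_flags = {}       # conversation_id -> threading.Event
flags_lock = threading.Lock()


def cancel_requested(conversation_id):
    """Return True if a cancel signal has been received for this conversation."""
    with flags_lock:
        event = cancel_flags.get(conversation_id)
    return event is not None and event.is_set()


@app.route('/chat/<conversation_id>/cancel', methods=['POST'])
def cancel_handler(conversation_id):
    """The client calls this endpoint to abort an in-flight streaming response."""
    with flags_lock:
        cancel_flags.setdefault(conversation_id, threading.Event()).set()
    return "", 204

# Inside generate_llm_response_stream's chunk loop, the server would then add:
#     if cancel_requested(conversation_id):
#         break   # stop streaming; optionally emit a final "done" event
```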
README
Model Context Protocol (MCP)
A working demo of MCP integrated with Google Gemini.
🚀 Quick Start
1. Clone the repository
git clone https://github.com/drkhan107/mcp_gemini.git
cd mcp_gemini
2. Set environment variables
Create a .env file in the root directory and add your Google API key:
GOOGLE_API_KEY="your_api_key_here"
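The server presumably reads this key at startup. A hypothetical sketch of that loading step (assuming python-dotenv is used, which this README does not confirm; the actual sse_server.py may differ):

```python
import os
from dotenv import load_dotenv  # assumed dependency; check requirements.txt

load_dotenv()  # reads GOOGLE_API_KEY from the .env file in the project root
api_key = os.environ["GOOGLE_API_KEY"]
```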
3. 📦 Install dependencies
Install all required packages from requirements.txt:
pip install -r requirements.txt
4. 🖥️ Run the MCP server
Start the MCP server:
python sse_server.py
✅ This starts the MCP server on the configured port (default: http://localhost:8080/sse).
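To quickly verify that the server is reachable before starting a client, you can probe the default endpoint above (a minimal check sketch; it only confirms the URL answers and reports its content type):

```python
import requests

# Connect to the SSE endpoint, print the status and content type, then close.
resp = requests.get("http://localhost:8080/sse", stream=True, timeout=5)
print(resp.status_code, resp.headers.get("content-type"))
resp.close()
```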
5. 🧠 Start the MCP client (optional)
Once the server is running, start the SSE client with the server URL:
python ssc_client.py http://localhost:8080/sse
6. 🧠 Start the FastAPI server (for the GUI)
Run the following command (to change the port or other settings, edit fastapp.py):
python fastapp.py
7. Start the Streamlit app
- Make sure the MCP server is running (http://localhost:8080/sse)
- Make sure FastAPI is running.
Run the following command:
streamlit run app.py
This starts the Streamlit app on port 8501.
8. Browser
Open your browser at localhost:8501 and click "Connect to MCP Server".
✅ Done! You now have a working Model Context Protocol demo using Gemini.