Model Context Protocol (MCP)
Below is a working pattern for SSE-based (Server-Sent Events) MCP (Model Context Protocol) clients and servers using the Gemini LLM, along with explanations and considerations.

**Core Idea:** This pattern uses SSE for real-time, unidirectional (server-to-client) streaming of LLM-generated content, while MCP provides a structured way to manage conversation flow and metadata. The server calls the Gemini LLM to generate responses and streams them to the client via SSE; the client renders the content as it arrives.

**Components:**

1. **MCP Client (Frontend - e.g., web browser, mobile app):**
   * **Initiates the conversation:** Sends an initial message (e.g., a user query) to the server via a standard HTTP request (POST or GET). This request includes MCP metadata (e.g., conversation ID, message ID, user ID).
   * **Establishes the SSE connection:** After the initial request, the client opens an SSE connection to a dedicated server endpoint for receiving the streaming response for that conversation.
   * **Receives SSE events:** Listens for `message` events on the SSE stream. Each event carries a chunk of the LLM-generated response plus MCP metadata.
   * **Reconstructs and displays the response:** As events arrive, the client appends the data to a display area, giving a real-time streaming experience.
   * **Handles errors and completion:** Listens for dedicated SSE events (e.g., `error`, `done`) to surface errors or detect that the LLM response is complete.
   * **Manages conversation state:** The client may need to store the conversation ID and other metadata to maintain context for subsequent requests.
   * **Sends subsequent messages:** After receiving a complete response, the client can send new messages to continue the conversation. These are sent as standard HTTP requests, and a new SSE stream is established for each response.

2. **MCP Server (Backend - e.g., Node.js, Python/Flask, Java/Spring Boot):**
   * **Receives the initial request:** Handles the HTTP request containing the user's query and MCP metadata.
   * **Validates the request:** Checks the request and its MCP metadata.
   * **Interacts with the Gemini LLM:** Sends the user's query to the Gemini API, using its streaming capabilities where available.
   * **Generates SSE events:** As Gemini produces text, the server emits SSE `message` events, each containing a chunk of generated text and relevant MCP metadata (e.g., conversation ID, message ID, chunk ID); an example event frame is sketched after this list.
   * **Manages SSE connections:** Maintains the set of active SSE connections, each associated with a conversation ID.
   * **Sends SSE events to the client:** Pushes events to the appropriate client connection.
   * **Handles errors:** If an error occurs during LLM generation, the server sends an `error` event to the client via SSE.
   * **Sends a completion event:** When the LLM response is complete, the server sends a `done` event via SSE.
   * **Manages conversation state:** Stores the conversation history and other metadata, which is essential for maintaining context across turns. A database (e.g., PostgreSQL, MongoDB) is typically used for this.
   * **Implements MCP:** Handles message formatting, routing, and error handling according to the protocol.
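To make the event flow above concrete, here is a minimal sketch of how one streamed chunk might be framed as a single SSE event. The field names mirror the message format described below; the values and the framing itself are illustrative, not a prescribed MCP wire format.

```python
import json

# One illustrative MCP chunk, framed as a single SSE event.
# Field names follow the message format discussed in this pattern;
# the concrete values are placeholders.
event = {
    "conversation_id": "conv-123",
    "message_id": "msg-7",
    "user_id": "user-42",
    "message_type": "llm_response",
    "chunk_id": 3,
    "payload": "Hello, wor",
}

# An SSE event is a "data:" line followed by a blank line.
sse_frame = f"data: {json.dumps(event)}\n\n"
print(sse_frame)
```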
**Gemini LLM Integration:**

* **Streaming API:** Use Gemini's streaming API where available. This lets the server receive the response in chunks, which can be forwarded to the client over SSE as soon as they arrive.
* **Prompt engineering:** Carefully design prompts to guide the LLM's responses and keep them appropriate for the conversation.
* **Rate limiting:** Implement rate limiting to prevent abuse of the LLM API.
* **Error handling:** Handle errors from the LLM API gracefully; on failure, send an `error` event to the client via SSE.

**MCP Considerations:**

* **Message format:** Define a clear format for MCP messages with fields for:
  * Conversation ID
  * Message ID
  * User ID
  * Message type (e.g., `user_message`, `llm_response`, `error`, `done`)
  * Payload (the actual text of the message)
  * Chunk ID (for SSE streaming)
* **Routing:** Implement a routing mechanism to direct messages to the correct conversation.
* **Error handling:** Define a standard way to report errors, including error codes and messages.
* **Security:** Implement security measures to protect against unauthorized access and data breaches.

**Example (Conceptual - Python/Flask):**

```python
from flask import Flask, request, Response, stream_with_context
import google.generativeai as genai
import os
import json

app = Flask(__name__)

# Configure the Gemini API (the key is read from the environment).
GOOGLE_API_KEY = os.environ.get("GOOGLE_API_KEY")
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel('gemini-pro')  # or 'gemini-pro-vision'

# In-memory conversation store (replace with a database in production).
conversations = {}


def generate_llm_response_stream(conversation_id, user_message):
    """Streams a Gemini response for one conversation turn as SSE events."""
    history = conversations.setdefault(conversation_id, [])
    try:
        # Start the chat from the stored history, then send the new message.
        chat = model.start_chat(history=history)
        response = chat.send_message(user_message, stream=True)

        reply_parts = []
        for chunk in response:
            llm_text = chunk.text
            reply_parts.append(llm_text)
            mcp_message = {
                "conversation_id": conversation_id,
                "message_type": "llm_response",
                "payload": llm_text,
            }
            yield f"data: {json.dumps(mcp_message)}\n\n"

        # Persist the turn once the full response has streamed.
        history.append({"role": "user", "parts": [user_message]})
        history.append({"role": "model", "parts": ["".join(reply_parts)]})

        mcp_done_message = {
            "conversation_id": conversation_id,
            "message_type": "done",
        }
        yield f"data: {json.dumps(mcp_done_message)}\n\n"
    except Exception as e:
        mcp_error_message = {
            "conversation_id": conversation_id,
            "message_type": "error",
            "payload": str(e),
        }
        yield f"data: {json.dumps(mcp_error_message)}\n\n"


@app.route('/chat', methods=['POST'])
def chat_handler():
    """Handles a chat turn and streams the response back as SSE."""
    data = request.get_json()
    conversation_id = data.get('conversation_id')
    user_message = data.get('message')

    if not conversation_id or not user_message:
        return "Missing conversation_id or message", 400

    return Response(
        stream_with_context(generate_llm_response_stream(conversation_id, user_message)),
        mimetype='text/event-stream',
    )


if __name__ == '__main__':
    app.run(debug=True, port=5000)
```
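To exercise the Flask sketch above from the command line before wiring up a browser client, a minimal test client could look like this (assuming the server runs on port 5000 and the `requests` package is installed; the conversation ID and message are placeholders):

```python
import json
import requests

# Post one user message and consume the SSE stream from the response body.
resp = requests.post(
    "http://localhost:5000/chat",
    json={"conversation_id": "demo-1", "message": "Hello!"},
    stream=True,
)

for line in resp.iter_lines(decode_unicode=True):
    if not line or not line.startswith("data: "):
        continue  # skip the blank separator lines between SSE events
    event = json.loads(line[len("data: "):])
    if event["message_type"] == "llm_response":
        print(event["payload"], end="", flush=True)
    elif event["message_type"] == "done":
        print()
        break
    elif event["message_type"] == "error":
        print(f"\nError: {event['payload']}")
        break
```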
**Client-Side Example (JavaScript):**

The browser client below posts the message and reads the SSE-formatted stream directly from the POST response body; the `EventSource` API only issues GET requests, so it cannot be used against the single POST endpoint in the Flask example above.

```javascript
const conversationId = crypto.randomUUID();  // unique ID per conversation
const messageInput = document.getElementById('messageInput');
const chatOutput = document.getElementById('chatOutput');
const sendButton = document.getElementById('sendButton');

sendButton.addEventListener('click', async () => {
  const message = messageInput.value;
  messageInput.value = '';

  try {
    // 1. Send the message; the server streams SSE events on this response.
    const response = await fetch('/chat', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ conversation_id: conversationId, message: message })
    });

    // 2. Read and parse the SSE stream chunk by chunk.
    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';

    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });

      // SSE events are separated by a blank line.
      const events = buffer.split('\n\n');
      buffer = events.pop();  // keep any partial event for the next read
      for (const event of events) {
        if (!event.startsWith('data: ')) continue;
        const data = JSON.parse(event.slice('data: '.length));
        console.log('Received SSE event:', data);

        if (data.message_type === 'llm_response') {
          chatOutput.textContent += data.payload;
        } else if (data.message_type === 'done') {
          console.log('LLM response complete.');
        } else if (data.message_type === 'error') {
          console.error('Error from server:', data.payload);
          chatOutput.textContent += `Error: ${data.payload}`;
        }
      }
    }
  } catch (error) {
    console.error('Error sending message:', error);
  }
});
```

**Key Improvements and Considerations:**

* **Error handling:** Robust error handling is crucial. The server should catch exceptions during LLM generation and send error events to the client; the client should display them to the user.
* **Conversation history:** The server *must* maintain a conversation history to provide context for subsequent requests, typically in a database. The example above uses an in-memory store, which is not suitable for production.
* **Security:** Implement authentication and authorization to protect against unauthorized access.
* **Scalability:** For high-traffic applications, consider a message queue (e.g., RabbitMQ, Kafka) to decouple the server from the LLM API and improve scalability and resilience.
* **Rate limiting:** Implement rate limiting to prevent abuse of the LLM API.
* **Prompt engineering:** Experiment with different prompts to optimize the LLM's responses.
* **Token management:** Be mindful of input and output token limits; truncate or summarize the conversation history when necessary.
* **User interface:** Provide a clear, intuitive experience, including loading indicators, error messages, and visible conversation history.
* **Metadata:** Include timestamps, user IDs, and message IDs in MCP messages to aid debugging and analysis.
* **Chunking strategy:** Experiment with chunk sizes; smaller chunks make the UI more responsive but increase overhead.
* **Cancellation:** Let the user cancel a long-running LLM request by sending a cancellation signal to the server, which then terminates the LLM call (a hedged server-side sketch follows at the end of this section).
* **Context management:** Consider a more sophisticated strategy such as retrieval-augmented generation (RAG) to give the LLM access to external knowledge sources.

This pattern provides a solid foundation for building SSE-based MCP clients and servers with Gemini. Adapt the code and configuration to your specific needs and environment.
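As a concrete illustration of the cancellation point above, here is a hedged sketch of one way the Flask example could expose a cancel signal. The route, flag store, and helper names are illustrative assumptions, not part of MCP or of the earlier example:

```python
import threading
from flask import Flask

app = Flask(__name__)  # stand-in; in practice this is the same app as the server example

cancel_flags = {}       # conversation_id -> threading.Event
flags_lock = threading.Lock()


def cancel_requested(conversation_id):
    """Return True if a cancel signal has been received for this conversation."""
    with flags_lock:
        event = cancel_flags.get(conversation_id)
    return event is not None and event.is_set()


@app.route('/chat/<conversation_id>/cancel', methods=['POST'])
def cancel_handler(conversation_id):
    """The client calls this endpoint to abort an in-flight streaming response."""
    with flags_lock:
        cancel_flags.setdefault(conversation_id, threading.Event()).set()
    return "", 204

# Inside generate_llm_response_stream's chunk loop, the server would then add:
#     if cancel_requested(conversation_id):
#         break   # stop streaming; optionally emit a final "done" event
```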
README
Model Context Protocol (MCP)
A working demo of MCP integrated with Google Gemini.
🚀 Quick Start
1. Clone the repository
git clone https://github.com/drkhan107/mcp_gemini.git
cd mcp_gemini
2. Set environment variables
Create a .env file in the root directory and add your Google API key:
GOOGLE_API_KEY="your_api_key_here"
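The server presumably reads this key at startup. A hypothetical sketch of that loading step (assuming python-dotenv is used, which this README does not confirm; the actual sse_server.py may differ):

```python
import os
from dotenv import load_dotenv  # assumed dependency; check requirements.txt

load_dotenv()  # reads GOOGLE_API_KEY from the .env file in the project root
api_key = os.environ["GOOGLE_API_KEY"]
```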
3. 📦 Install dependencies
Install all required packages from requirements.txt:
pip install -r requirements.txt
4. 🖥️ Run the MCP server
Start the MCP server:
python sse_server.py
✅ This starts the MCP server on the configured port (default: http://localhost:8080/sse).
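To quickly verify that the server is reachable before starting a client, you can probe the default endpoint above (a minimal check sketch; it only confirms the URL answers and reports its content type):

```python
import requests

# Connect to the SSE endpoint, print the status and content type, then close.
resp = requests.get("http://localhost:8080/sse", stream=True, timeout=5)
print(resp.status_code, resp.headers.get("content-type"))
resp.close()
```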
5. 🧠 Start the MCP client (optional)
Once the server is running, start the SSE client with the server URL:
python ssc_client.py http://localhost:8080/sse
6. 🧠 Start the FastAPI server (for the GUI)
Run the following command (to change the port or other settings, edit fastapp.py):
python fastapp.py
7. Start the Streamlit app
- Make sure the MCP server is running (http://localhost:8080/sse)
- Make sure FastAPI is running.
Run the following command:
streamlit run app.py
This starts the Streamlit app on port 8501.
8. Browser
Open your browser at localhost:8501 and click "Connect to MCP Server".
✅ Done! You now have a working Model Context Protocol demo using Gemini.