🧠 SuiAgentic
SuiAgentic is a FastAPI-based application for document embedding and semantic retrieval, powered by the Qdrant vector database. It enables you to convert documents (from URLs or local files) into embeddings, store them efficiently, and retrieve relevant content using natural language queries. It is designed to support AI-enhanced tools like Cursor, Copilot, Claude, and other MCP-compatible clients.
💡 Why SuiAgentic?
Many organizations need to integrate context from internal documents (e.g., PRDs, design specs, wikis) into the tools used by developers and knowledge workers. However, consolidating documents from various sources into a centralized, searchable knowledge base is complex, and the tooling for it is fragmented.
SuiAgentic solves this by providing a centralized context server that ingests, chunks, embeds, and indexes your content—making it available via a simple REST API and web interface. It also supports being used as an MCP server for AI agents.
🚀 Key Features
- Document Embedding: Extracts content from URLs (with or without authentication), splits it into chunks, generates embeddings, and stores them in Qdrant.
- Semantic Search: Query your knowledge base with natural language and retrieve relevant chunks or documents.
- Web UI: Easy-to-use web interface for embedding and searching.
- REST API: Fully accessible via HTTP endpoints for automation or integration.
- MCP Server Ready: Use it with MCP-compatible clients like Cursor, Copilot, Claude, etc.
- Authentication Support: Supports Basic Auth and Bearer Token for protected documents.
⚙️ Quick Start
- Clone the Repository
git clone https://github.com/AnhQuan2004/mcp_agent.git
cd mcp_agent
- Set up Python Environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
- Install Dependencies
pip install -r requirements.txt
- Create a .env file (or copy the provided .env.example)
QDRANT_URL=localhost
QDRANT_PORT=6333
QDRANT_COLLECTION_NAME=documents
- Start Qdrant (Vector DB)
Using Docker:
docker run -p 6333:6333 qdrant/qdrant
Or using the helper script:
./runqdrant.sh
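Note that the plain docker run command keeps Qdrant's data only for the container's lifetime; to persist embeddings across restarts, mount a local volume (Qdrant stores its data under /qdrant/storage):
docker run -p 6333:6333 -v $(pwd)/qdrant_storage:/qdrant/storage qdrant/qdrant
Once Qdrant is up, a quick connectivity check using the values from .env (assuming qdrant-client is installed via requirements.txt):
python -c "from qdrant_client import QdrantClient; print(QdrantClient(host='localhost', port=6333).get_collections())"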
- Run the Agentic App
uvicorn app.main:app --reload
# or:
python run.py
Visit http://localhost:8000
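FastAPI also serves interactive API documentation at http://localhost:8000/docs by default, which is a convenient way to explore the endpoints described below.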
🌐 Web Interface & API
Web UI:
- / — Home
- /embed — Embed documents via UI
- /retrieve — Semantic search UI
🔍 POST /retrieve
Example request body:
{
"query": "What is the architecture of Sui?",
"top_k": 5,
"group_by_doc": true
}
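Here query is the natural-language question, top_k caps the number of returned results, and group_by_doc groups matching chunks by their source document. Below is a minimal sketch of calling the endpoint from a script; the requests library and the exact response shape are assumptions, so check the interactive docs for the authoritative schema.
# sketch: query a running SuiAgentic instance with the request body shown above
import requests

resp = requests.post(
    "http://localhost:8000/retrieve",
    json={
        "query": "What is the architecture of Sui?",
        "top_k": 5,
        "group_by_doc": True,
    },
)
resp.raise_for_status()
print(resp.json())  # inspect the returned chunks / grouped documents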
🌍 Embedding from URLs
Public URLs:
- Just provide the URL via the API or UI — no auth needed.
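For scripted ingestion, the sketch below posts a URL to the embedding endpoint. The /embed path, payload field names, and auth fields are assumptions inferred from the feature list (Basic Auth / Bearer Token support), so verify them against the web UI or interactive docs.
# sketch: embed a document from a URL (endpoint path and field names are assumptions)
import requests

payload = {
    "url": "https://example.com/design-spec",
    # For protected documents the app supports Basic Auth or a Bearer token;
    # hypothetical field names:
    # "auth_type": "bearer",
    # "token": "<YOUR_TOKEN>",
}
resp = requests.post("http://localhost:8000/embed", json=payload)
resp.raise_for_status()
print(resp.json())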
🤖 Using as an MCP Server
To use SuiAgentic as an MCP server, add the following to your MCP client configuration:
{
"mcpServers": {
"suiAgentic": {
"url": "http://localhost:8000/mcp"
}
}
}
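For Cursor, this snippet typically lives in .cursor/mcp.json at the project root (or ~/.cursor/mcp.json globally); other MCP-compatible clients have their own configuration files for registering servers.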
Document Upload Tools
This directory contains tools to bulk upload documents to your SuiAgentic Qdrant database.
Available Tools
- upload_folder.py - A simple script to upload PDF files from a folder
- upload_documents.py - An advanced script to upload PDF, DOCX, and TXT files with more options
Prerequisites
- Python 3.8+
- SuiAgentic application installed and configured
- Qdrant server running locally or accessible via network
- Required dependencies installed (PyPDF2, python-docx)
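If PyPDF2 and python-docx are not already pulled in by requirements.txt, they can be installed directly:
pip install PyPDF2 python-docx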
Basic Usage
Upload PDF Files from a Folder
# Upload all PDFs from a folder
python upload_folder.py /path/to/pdf/folder
# Upload with a prefix (useful for categorizing documents)
python upload_folder.py /path/to/pdf/folder --prefix "Research Papers"
Advanced Document Upload
# Upload all supported documents from a folder and subfolders
python upload_documents.py /path/to/documents --recursive
# Add metadata tags to all documents
python upload_documents.py /path/to/documents --tag category=research --tag project=alpha
# Specify collection name (if not using default)
python upload_documents.py /path/to/documents --collection my_collection
# Complete example with all options
python upload_documents.py /path/to/documents --recursive --prefix "Project X" --tag department=marketing --tag status=final
What These Tools Do
- Find supported documents in the specified folder
- Extract text content from each document
- Split text into manageable chunks
- Generate 3072-dimensional embeddings for each chunk
- Store chunks and embeddings in Qdrant
- Track metadata for each document
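Conceptually, each script runs a pipeline like the sketch below. The chunk size and the embed() helper are placeholders; the real scripts use the application's own chunking and embedding code, which produces the 3072-dimensional vectors.
# sketch of the upload pipeline: extract text, chunk, embed, upsert into Qdrant
import uuid
from PyPDF2 import PdfReader
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct

def embed(text: str) -> list[float]:
    # placeholder: swap in the app's real embedding call (3072-dim vectors)
    return [0.0] * 3072

client = QdrantClient(host="localhost", port=6333)

reader = PdfReader("/path/to/pdf/folder/example.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)

chunk_size = 1000  # assumed chunk size, in characters
chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

points = [
    PointStruct(
        id=str(uuid.uuid4()),
        vector=embed(chunk),
        payload={"document": "example.pdf", "chunk_index": i, "text": chunk},
    )
    for i, chunk in enumerate(chunks)
]
client.upsert(collection_name="documents", points=points)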
Command-line Arguments
upload_folder.py
- folder - Path to the folder containing PDF files
- --prefix - Prefix to add to document names
upload_documents.py
- folder - Path to the folder containing documents
- --prefix - Prefix to add to document names
- --recursive - Search for files recursively in subfolders
- --collection - Name of the Qdrant collection to use
- --tag - Add metadata tags to documents (can be used multiple times: --tag key=value)
Examples
Organize documents by project
python upload_documents.py /path/to/projects/project1 --recursive --prefix "Project 1" --tag project=alpha
python upload_documents.py /path/to/projects/project2 --recursive --prefix "Project 2" --tag project=beta
Categorize documents
python upload_documents.py /path/to/contracts --prefix "Legal" --tag department=legal --tag confidential=true
python upload_documents.py /path/to/manuals --prefix "Technical" --tag department=engineering
Troubleshooting
- If you encounter memory errors with large documents, try breaking them into smaller files
- For large collections of documents, consider processing in smaller batches
- Check the log output for any errors during processing
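As the points above suggest, large collections can be processed in smaller batches; one simple approach is to run the script once per subfolder (shell sketch, paths assumed):
for d in /path/to/documents/*/; do
  python upload_documents.py "$d" --tag batch="$(basename "$d")"
done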
🪪 License
Licensed under the Apache License 2.0.