
SourceSync.ai MCP Server
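A Model Context Protocol server that lets AI models interact with SourceSync.ai's knowledge management platform to manage documents, ingest content from a variety of sources, and run semantic searches.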
Tools
validateApiKey
Validates the API key by attempting to list namespaces. Returns the list of namespaces if successful.
createNamespace
Creates a new namespace with the provided configuration. Requires a name, file storage configuration, vector storage configuration, and embedding model configuration.
listNamespaces
Lists all namespaces available for the current API key and optional tenant ID.
getNamespace
Retrieves a specific namespace by its ID.
updateNamespace
Updates an existing namespace with the provided configuration parameters.
deleteNamespace
Permanently deletes a namespace by its ID.
ingestText
Ingests raw text content into the namespace. Supports optional metadata and chunk configuration.
ingestFile
Ingests a file into the namespace. Supports various file formats with automatic parsing.
ingestUrls
Ingests content from a list of URLs. Supports scraping options and metadata.
ingestSitemap
Ingests content from a website using its sitemap.xml. Supports path filtering and link limits.
ingestWebsite
Crawls and ingests content from a website recursively. Supports depth control and path filtering.
ingestConnector
Ingests all documents in the connector that are in backlog or failed status. There is no need to provide document IDs or file IDs for the ingestion: the IDs are already in the backlog once the documents have been picked through the picker. If they are not, the user has to go through the authorization flow again, where they will be asked to pick the documents again.
getIngestJobRunStatus
Checks the status of a previously submitted ingestion job.
fetchDocuments
Fetches documents from the namespace based on filter criteria. Supports pagination and including specific document properties.
updateDocuments
Updates metadata for documents that match the specified filter criteria.
deleteDocuments
Permanently deletes documents that match the specified filter criteria.
resyncDocuments
Reprocesses documents that match the specified filter criteria. Useful for updating after schema changes.
semanticSearch
Performs semantic search across the namespace to find relevant content based on meaning rather than exact keyword matches.
hybridSearch
Performs a combined keyword and semantic search, balancing between exact matches and semantic similarity. Requires hybridConfig with weights for both search types.
createConnection
Creates a new connection to a specific source. The connector parameter should be a valid SourceSync connector enum value. The optional clientRedirectUrl parameter specifies a custom redirect URL for the connection. The call returns an authorization URL to which you can redirect the user, who is then asked to pick the documents they want to ingest.
listConnections
Lists all connections for the current namespace, optionally filtered by connector type.
getConnection
Retrieves details for a specific connection by its ID.
updateConnection
Updates a connection to a specific source. The connector parameter should be a valid SourceSync connector enum value. The optional clientRedirectUrl parameter specifies a custom redirect URL for the connection. The call returns an authorization URL to which you can redirect the user, who is then asked to pick the documents they want to ingest. This is useful when you want to point the connection at a different source, change the clientRedirectUrl, or pick a different or new set of documents.
revokeConnection
Revokes access for a specific connection, removing the integration with the external service.
fetchUrlContent
Fetches the content of a URL. Particularly useful for fetching parsed text file URLs.
README
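SourceSync.ai MCP Server
A Model Context Protocol (MCP) server implementation for the SourceSync.ai API. This server allows AI models to interact with SourceSync.ai's knowledge management platform through a standardized interface.
Features
- Manage namespaces for organizing knowledge
- Ingest content from a variety of sources (text, URLs, websites, external services)
- Retrieve, update, and manage documents stored in the knowledge base
- Run semantic and hybrid searches against the knowledge base
- Access document content directly from parsed text URLs
- Manage connections to external services
- Default configuration support for seamless AI integration
Installation
Running with npx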
# Install and run with your API key
env SOURCESYNC_API_KEY=your_api_key npx -y sourcesyncai-mcp
Installing via Smithery
To install sourcesyncai-mcp for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install @pbteja1998/sourcesyncai-mcp --client claude
Manual Installation
# Clone the repository
git clone https://github.com/yourusername/sourcesyncai-mcp.git
cd sourcesyncai-mcp
# Install dependencies
npm install
# Build the project
npm run build
# Run the server
node dist/index.js
Running on Cursor
To configure SourceSync.ai MCP in Cursor:
- Open Cursor Settings
- Go to Features > MCP Servers
- Click + Add New MCP Server
- Enter the following:
  - Name: sourcesyncai-mcp (or a name of your choice)
  - Type: command
  - Command: env SOURCESYNC_API_KEY=your-api-key npx -y sourcesyncai-mcp
Once added, you can use the SourceSync.ai tools with Cursor's AI features by describing your knowledge management needs.
Running on Windsurf
Add this to your ./codeium/windsurf/model_config.json:
{
  "mcpServers": {
    "sourcesyncai-mcp": {
      "command": "npx",
      "args": ["-y", "sourcesyncai-mcp"],
      "env": {
        "SOURCESYNC_API_KEY": "your_api_key",
        "SOURCESYNC_NAMESPACE_ID": "your_namespace_id",
        "SOURCESYNC_TENANT_ID": "your_tenant_id"
      }
    }
  }
}
Running on Claude Desktop
To use this MCP server with Claude Desktop:
- Locate the Claude Desktop configuration file:
  - macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  - Windows: %APPDATA%\Claude\claude_desktop_config.json
  - Linux: ~/.config/Claude/claude_desktop_config.json
- Edit the configuration file to add the SourceSync.ai MCP server:
{
  "mcpServers": {
    "sourcesyncai-mcp": {
      "command": "npx",
      "args": ["-y", "sourcesyncai-mcp"],
      "env": {
        "SOURCESYNC_API_KEY": "your_api_key",
        "SOURCESYNC_NAMESPACE_ID": "your_namespace_id",
        "SOURCESYNC_TENANT_ID": "your_tenant_id"
      }
    }
  }
}
- Save the configuration file and restart Claude Desktop
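When the configuration file already lists other MCP servers, the new entry should be merged in rather than replacing the whole file. The following is a minimal sketch of that merge, assuming the file has the `mcpServers` shape shown above; the helper names `buildSourceSyncEntry` and `addSourceSyncServer` are hypothetical and not part of this project.

```typescript
// Shape of one entry under "mcpServers" in the client configuration file.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Only the part of the configuration this sketch cares about; other keys pass through.
interface DesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper: builds the entry documented above, omitting unset optional variables.
function buildSourceSyncEntry(apiKey: string, namespaceId?: string, tenantId?: string): McpServerEntry {
  const env: Record<string, string> = { SOURCESYNC_API_KEY: apiKey };
  if (namespaceId) env["SOURCESYNC_NAMESPACE_ID"] = namespaceId;
  if (tenantId) env["SOURCESYNC_TENANT_ID"] = tenantId;
  return { command: "npx", args: ["-y", "sourcesyncai-mcp"], env };
}

// Hypothetical helper: returns a new config with the entry added and existing servers kept.
function addSourceSyncServer(config: DesktopConfig, entry: McpServerEntry): DesktopConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), "sourcesyncai-mcp": entry },
  };
}

const existingConfig: DesktopConfig = {
  mcpServers: { other: { command: "node", args: ["other.js"], env: {} } },
};
const updatedConfig = addSourceSyncServer(
  existingConfig,
  buildSourceSyncEntry("your_api_key", "your_namespace_id", "your_tenant_id"),
);
console.log(JSON.stringify(updatedConfig, null, 2));
```

The merge returns a new object instead of mutating the input, so the original configuration can still be written back unchanged if something goes wrong.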
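Configuration
Environment Variables
Required
- SOURCESYNC_API_KEY: Your SourceSync.ai API key
Optional
- SOURCESYNC_NAMESPACE_ID: Default namespace ID to use for operations
- SOURCESYNC_TENANT_ID: Your tenant ID
Configuration Example
Basic configuration with default values: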
export SOURCESYNC_API_KEY=your_api_key
export SOURCESYNC_TENANT_ID=your_tenant_id
export SOURCESYNC_NAMESPACE_ID=your_namespace_id
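The split between required and optional variables can be checked before the server is started. This is a rough sketch of such a check, not the server's actual startup code; `loadSettings` and `SourceSyncSettings` are hypothetical names, and the environment is passed in as a plain object so the function can be tried without Node-specific types.

```typescript
// Hypothetical settings object derived from the three documented variables.
interface SourceSyncSettings {
  apiKey: string;
  namespaceId?: string;
  tenantId?: string;
}

// Fails fast when the required key is missing; optional values are copied only when set.
function loadSettings(env: Record<string, string | undefined>): SourceSyncSettings {
  const apiKey = env["SOURCESYNC_API_KEY"];
  if (!apiKey) {
    throw new Error("SOURCESYNC_API_KEY is required");
  }
  const settings: SourceSyncSettings = { apiKey };
  const namespaceId = env["SOURCESYNC_NAMESPACE_ID"];
  const tenantId = env["SOURCESYNC_TENANT_ID"];
  if (namespaceId) settings.namespaceId = namespaceId;
  if (tenantId) settings.tenantId = tenantId;
  return settings;
}

const exampleSettings = loadSettings({
  SOURCESYNC_API_KEY: "your_api_key",
  SOURCESYNC_TENANT_ID: "your_tenant_id",
});
console.log(exampleSettings);
```

In a Node process the same function could be called with `process.env`.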
Available Tools
Authentication
- validate_api_key: Validates the SourceSync.ai API key
{
  "name": "validate_api_key",
  "arguments": {}
}
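Each example in this section is the `name` and `arguments` pair of a tool invocation. An MCP client sends that pair as the `params` of a JSON-RPC 2.0 `tools/call` request. The sketch below shows the wrapping under that assumption; `toToolsCallRequest` and the id counter are illustrative names, and a real client library would normally build this envelope for you.

```typescript
// The name/arguments pair used by every example in this section.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// JSON-RPC 2.0 envelope for an MCP tool invocation.
interface ToolsCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: ToolCall;
}

// Illustrative request id counter; any unique id per request works.
let nextRequestId = 1;

// Wraps a tool call in the tools/call envelope.
function toToolsCallRequest(call: ToolCall): ToolsCallRequest {
  return { jsonrpc: "2.0", id: nextRequestId++, method: "tools/call", params: call };
}

const validateRequest = toToolsCallRequest({ name: "validate_api_key", arguments: {} });
console.log(JSON.stringify(validateRequest));
```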
Namespaces
- create_namespace: Creates a new namespace
- list_namespaces: Lists all namespaces
- get_namespace: Gets details of a specific namespace
- update_namespace: Updates a namespace
- delete_namespace: Deletes a namespace
{
  "name": "create_namespace",
  "arguments": {
    "name": "my-namespace",
    "fileStorageConfig": {
      "provider": "S3_COMPATIBLE",
      "config": {
        "endpoint": "s3.amazonaws.com",
        "accessKey": "your_access_key",
        "secretKey": "your_secret_key",
        "bucket": "your_bucket",
        "region": "us-east-1"
      }
    },
    "vectorStorageConfig": {
      "provider": "PINECONE",
      "config": {
        "apiKey": "your_pinecone_api_key",
        "environment": "your_environment",
        "index": "your_index"
      }
    },
    "embeddingModelConfig": {
      "provider": "OPENAI",
      "config": {
        "apiKey": "your_openai_api_key",
        "model": "text-embedding-3-small"
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "list_namespaces",
  "arguments": {
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "get_namespace",
  "arguments": {
    "namespaceId": "namespace_XXX",
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "update_namespace",
  "arguments": {
    "namespaceId": "namespace_XXX",
    "tenantId": "tenant_XXX",
    "name": "updated-namespace-name"
  }
}
{
  "name": "delete_namespace",
  "arguments": {
    "namespaceId": "namespace_XXX",
    "tenantId": "tenant_XXX"
  }
}
Data Ingestion
- ingest_text: Ingests text content
- ingest_urls: Ingests content from URLs
- ingest_sitemap: Ingests content from a sitemap
- ingest_website: Ingests content from a website
- ingest_notion: Ingests content from Notion
- ingest_google_drive: Ingests content from Google Drive
- ingest_dropbox: Ingests content from Dropbox
- ingest_onedrive: Ingests content from OneDrive
- ingest_box: Ingests content from Box
- get_ingest_job_run_status: Gets the status of an ingest job run
{
  "name": "ingest_text",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestConfig": {
      "source": "TEXT",
      "config": {
        "name": "example-document",
        "text": "This is an example document for ingestion.",
        "metadata": {
          "category": "example",
          "author": "AI Assistant"
        }
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "ingest_urls",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestConfig": {
      "source": "URLS",
      "config": {
        "urls": ["https://example.com/page1", "https://example.com/page2"],
        "metadata": {
          "source": "web",
          "category": "documentation"
        }
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "ingest_sitemap",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestConfig": {
      "source": "SITEMAP",
      "config": {
        "url": "https://example.com/sitemap.xml",
        "metadata": {
          "source": "sitemap",
          "website": "example.com"
        }
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "ingest_website",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestConfig": {
      "source": "WEBSITE",
      "config": {
        "url": "https://example.com",
        "maxDepth": 3,
        "maxPages": 100,
        "metadata": {
          "source": "website",
          "domain": "example.com"
        }
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "ingest_notion",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestConfig": {
      "source": "NOTION",
      "config": {
        "connectionId": "your_notion_connection_id",
        "metadata": {
          "source": "notion",
          "workspace": "My Workspace"
        }
      }
    },
    "tenantId": "your_tenant_id"
  }
}
{
  "name": "ingest_google_drive",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestConfig": {
      "source": "GOOGLE_DRIVE",
      "config": {
        "connectionId": "connection_XXX",
        "metadata": {
          "source": "google_drive",
          "owner": "user@example.com"
        }
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "ingest_dropbox",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestConfig": {
      "source": "DROPBOX",
      "config": {
        "connectionId": "connection_XXX",
        "metadata": {
          "source": "dropbox",
          "account": "user@example.com"
        }
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "ingest_onedrive",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestConfig": {
      "source": "ONEDRIVE",
      "config": {
        "connectionId": "connection_XXX",
        "metadata": {
          "source": "onedrive",
          "account": "user@example.com"
        }
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "ingest_box",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestConfig": {
      "source": "BOX",
      "config": {
        "connectionId": "connection_XXX",
        "metadata": {
          "source": "box",
          "owner": "user@example.com"
        }
      }
    },
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "get_ingest_job_run_status",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "ingestJobRunId": "ingest_job_run_XXX",
    "tenantId": "tenant_XXX"
  }
}
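Ingestion runs as a job, so a client typically submits content and then checks get_ingest_job_run_status until the job has finished. The sketch below builds the two payloads shown above and polls with a caller-supplied transport. Everything beyond the payload shapes is an assumption: the `CallTool` transport, the `status` field on the response, and the status strings are hypothetical, so the caller decides what counts as finished.

```typescript
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Hypothetical transport: sends a tool call and resolves with a status string.
type CallTool = (call: ToolCall) => Promise<{ status: string }>;

// Builds the ingest_text payload in the shape documented above.
function ingestTextCall(
  namespaceId: string,
  name: string,
  text: string,
  metadata: Record<string, string> = {},
): ToolCall {
  return {
    name: "ingest_text",
    arguments: { namespaceId, ingestConfig: { source: "TEXT", config: { name, text, metadata } } },
  };
}

// Builds the get_ingest_job_run_status payload.
function statusCall(namespaceId: string, ingestJobRunId: string): ToolCall {
  return { name: "get_ingest_job_run_status", arguments: { namespaceId, ingestJobRunId } };
}

// Polls until isFinished(status) is true or the attempt limit is reached.
async function waitForIngest(
  callTool: CallTool,
  namespaceId: string,
  ingestJobRunId: string,
  isFinished: (status: string) => boolean,
  maxAttempts = 10,
  delayMs = 1000,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const { status } = await callTool(statusCall(namespaceId, ingestJobRunId));
    if (isFinished(status)) return status;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`Ingest job ${ingestJobRunId} did not finish after ${maxAttempts} checks`);
}

console.log(JSON.stringify(ingestTextCall("your_namespace_id", "example-document", "Hello")));
```

Bounding the number of attempts keeps a stuck job from blocking the caller indefinitely.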
Documents
- getDocuments: Retrieves documents with optional filters
- updateDocuments: Updates document metadata
- deleteDocuments: Deletes documents
- resyncDocuments: Resyncs documents
- fetchUrlContent: Fetches text content from a document URL
{
  "name": "getDocuments",
  "arguments": {
    "namespaceId": "namespace_XXX",
    "tenantId": "tenant_XXX",
    "filterConfig": {
      "documentTypes": ["PDF"]
    },
    "includeConfig": {
      "parsedTextFileUrl": true
    }
  }
}
{
  "name": "updateDocuments",
  "arguments": {
    "namespaceId": "namespace_XXX",
    "tenantId": "tenant_XXX",
    "documentIds": ["doc_XXX", "doc_YYY"],
    "filterConfig": {
      "documentIds": ["doc_XXX", "doc_YYY"]
    },
    "data": {
      "metadata": {
        "status": "reviewed",
        "category": "technical"
      }
    }
  }
}
{
  "name": "deleteDocuments",
  "arguments": {
    "namespaceId": "namespace_XXX",
    "tenantId": "tenant_XXX",
    "documentIds": ["doc_XXX", "doc_YYY"],
    "filterConfig": {
      "documentIds": ["doc_XXX", "doc_YYY"]
    }
  }
}
{
  "name": "resyncDocuments",
  "arguments": {
    "namespaceId": "namespace_XXX",
    "tenantId": "tenant_XXX",
    "documentIds": ["doc_XXX", "doc_YYY"],
    "filterConfig": {
      "documentIds": ["doc_XXX", "doc_YYY"]
    }
  }
}
{
  "name": "fetchUrlContent",
  "arguments": {
    "url": "https://api.sourcesync.ai/v1/documents/doc_XXX/content?format=text",
    "apiKey": "your_api_key",
    "tenantId": "tenant_XXX"
  }
}
Search
- semantic_search: Performs a semantic search
- hybrid_search: Performs a hybrid search (semantic + keyword)
{
  "name": "semantic_search",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "query": "example document",
    "topK": 5,
    "tenantId": "tenant_XXX"
  }
}
{
  "name": "hybrid_search",
  "arguments": {
    "namespaceId": "your_namespace_id",
    "query": "example document",
    "topK": 5,
    "tenantId": "tenant_XXX",
    "hybridConfig": {
      "semanticWeight": 0.7,
      "keywordWeight": 0.3
    }
  }
}
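The hybrid_search example uses weights of 0.7 and 0.3. A small helper can derive the keyword weight from the semantic weight so the two stay consistent. This sketch assumes the weights are meant to lie between 0 and 1 and sum to 1, which the example suggests but the text above does not state; `splitWeights` and `hybridSearchCall` are hypothetical names.

```typescript
interface HybridConfig {
  semanticWeight: number;
  keywordWeight: number;
}

interface HybridSearchCall {
  name: "hybrid_search";
  arguments: {
    namespaceId: string;
    query: string;
    topK: number;
    hybridConfig: HybridConfig;
  };
}

// Assumption: weights lie in [0, 1] and sum to 1. Rounding avoids float noise such as 0.30000000000000004.
function splitWeights(semanticWeight: number): HybridConfig {
  if (!(semanticWeight >= 0 && semanticWeight <= 1)) {
    throw new RangeError("semanticWeight must be between 0 and 1");
  }
  const keywordWeight = Math.round((1 - semanticWeight) * 1000) / 1000;
  return { semanticWeight, keywordWeight };
}

// Builds the hybrid_search payload in the shape documented above.
function hybridSearchCall(namespaceId: string, query: string, semanticWeight: number, topK = 5): HybridSearchCall {
  return {
    name: "hybrid_search",
    arguments: { namespaceId, query, topK, hybridConfig: splitWeights(semanticWeight) },
  };
}

console.log(JSON.stringify(hybridSearchCall("your_namespace_id", "example document", 0.7)));
```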
Connections
- create_connection: Creates a new connection to an external service
- list_connections: Lists all connections
- get_connection: Gets details of a specific connection
- update_connection: Updates a connection
- revoke_connection: Revokes a connection
{
  "name": "create_connection",
  "arguments": {
    "tenantId": "tenant_XXX",
    "namespaceId": "namespace_XXX",
    "name": "My Connection",
    "connector": "GOOGLE_DRIVE",
    "clientRedirectUrl": "https://your-app.com/callback"
  }
}
{
  "name": "list_connections",
  "arguments": {
    "tenantId": "tenant_XXX",
    "namespaceId": "namespace_XXX"
  }
}
{
  "name": "get_connection",
  "arguments": {
    "tenantId": "tenant_XXX",
    "namespaceId": "namespace_XXX",
    "connectionId": "connection_XXX"
  }
}
{
  "name": "update_connection",
  "arguments": {
    "tenantId": "tenant_XXX",
    "namespaceId": "namespace_XXX",
    "connectionId": "connection_XXX",
    "name": "Updated Connection Name",
    "clientRedirectUrl": "https://your-app.com/updated-callback"
  }
}
{
  "name": "revoke_connection",
  "arguments": {
    "tenantId": "tenant_XXX",
    "namespaceId": "namespace_XXX",
    "connectionId": "connection_XXX"
  }
}
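Once a connection exists and the user has picked documents, the matching ingest tool is called with that connection's ID. In the ingestion examples the tool name is the lowercase form of the `source` value (GOOGLE_DRIVE pairs with ingest_google_drive), so the payload can be derived from the connector. The sketch assumes the connector enum values are the same strings as those `source` values; only GOOGLE_DRIVE appears as a connector in this document. `ingestCallForConnection` is a hypothetical helper.

```typescript
// Assumed connector values, taken from the "source" fields in the ingestion examples.
type Connector = "NOTION" | "GOOGLE_DRIVE" | "DROPBOX" | "ONEDRIVE" | "BOX";

interface ConnectorIngestCall {
  name: string;
  arguments: {
    namespaceId: string;
    ingestConfig: {
      source: Connector;
      config: { connectionId: string; metadata: Record<string, string> };
    };
    tenantId?: string;
  };
}

// Derives the ingest tool name and payload for a connection of the given connector type.
function ingestCallForConnection(
  connector: Connector,
  namespaceId: string,
  connectionId: string,
  metadata: Record<string, string> = {},
  tenantId?: string,
): ConnectorIngestCall {
  const call: ConnectorIngestCall = {
    name: `ingest_${connector.toLowerCase()}`,
    arguments: { namespaceId, ingestConfig: { source: connector, config: { connectionId, metadata } } },
  };
  if (tenantId) call.arguments.tenantId = tenantId;
  return call;
}

console.log(JSON.stringify(ingestCallForConnection("GOOGLE_DRIVE", "namespace_XXX", "connection_XXX")));
```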
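Example Prompts
Here are some example prompts you can use with Claude or Cursor once the MCP server is configured:
- "Search my SourceSync knowledge base for information about machine learning."
- "Ingest this article into my SourceSync knowledge base: [URL]"
- "Create a new namespace in SourceSync for my project documentation."
- "List all documents in my SourceSync namespace."
- "Get the text content of document [document_id] from my SourceSync namespace."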
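Troubleshooting
Connection Issues
If you run into problems connecting to the SourceSync.ai MCP server:
- Verify paths: Make sure all paths in your configuration are absolute, not relative.
- Check permissions: Make sure the server file has execute permission (chmod +x dist/index.js).
- Enable developer mode: In Claude Desktop, enable developer mode and check the MCP log files.
- Test the server: Run the server directly from the command line:
  node /path/to/sourcesyncai-mcp/dist/index.js
- Restart the AI client: After making changes, fully restart Claude Desktop or Cursor.
- Check environment variables: Make sure all required environment variables are set correctly.
Debug Logging
For verbose logging, add the DEBUG environment variable.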
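Development
Project Structure
- src/index.ts: Main entry point and server setup
- src/schemas.ts: Schema definitions for all tools
- src/sourcesync.ts: Client for interacting with the SourceSync.ai API
- src/sourcesync.types.ts: TypeScript type definitions
Building and Testing
# Build the project
npm run build
# Run tests
npm test
License
MIT
Links
Document content retrieval workflow:
- First, use getDocuments with includeConfig.parsedTextFileUrl: true to get documents along with their content URLs
- Extract the URL from the document response
- Use fetchUrlContent to retrieve the actual content: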
{
  "name": "fetchUrlContent",
  "arguments": {
    "url": "https://example.com"
  }
}
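The three workflow steps can be chained in client code. The sketch below builds the getDocuments request and turns a document list into fetchUrlContent calls. The response shape is an assumption: a `parsedTextFileUrl` field on each returned document is inferred from the includeConfig option, and `DocumentSummary`, `getDocumentsCall`, and `fetchContentCalls` are hypothetical names.

```typescript
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Assumed shape of one document in the getDocuments response.
interface DocumentSummary {
  id: string;
  parsedTextFileUrl?: string | null;
}

// Step 1: request documents together with their parsed-text URLs.
function getDocumentsCall(namespaceId: string, tenantId?: string): ToolCall {
  return {
    name: "getDocuments",
    arguments: {
      namespaceId,
      ...(tenantId ? { tenantId } : {}),
      includeConfig: { parsedTextFileUrl: true },
    },
  };
}

// Steps 2 and 3: extract each URL and build one fetchUrlContent call per document that has one.
function fetchContentCalls(documents: DocumentSummary[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const doc of documents) {
    if (doc.parsedTextFileUrl) {
      calls.push({ name: "fetchUrlContent", arguments: { url: doc.parsedTextFileUrl } });
    }
  }
  return calls;
}

const sampleDocuments: DocumentSummary[] = [
  { id: "doc_XXX", parsedTextFileUrl: "https://example.com" },
  { id: "doc_YYY" },
];
console.log(JSON.stringify(getDocumentsCall("namespace_XXX")));
console.log(JSON.stringify(fetchContentCalls(sampleDocuments)));
```

Documents without a parsed-text URL are skipped rather than treated as errors, since a document may not have been parsed yet.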