Salesforce Metadata-Aware RAG MCP
Enables AI copilots to understand and query Salesforce org configurations through intelligent metadata chunking and semantic search. Provides access to Apex classes, custom objects, flows, layouts, and other metadata with hybrid vector and keyword search capabilities.
A Model Context Protocol (MCP) server that provides advanced RAG capabilities for Salesforce metadata and code, enabling AI copilots to understand your Salesforce org configuration through intelligent chunking and vector search.
Features
Core Salesforce Integration
- Metadata API Integration: Access layouts, flows, custom objects, profiles, and permission sets
- Tooling API Integration: Retrieve Apex classes, triggers, and validation rules
- REST API Integration: Object schema descriptions and SOQL execution
- Rate Limiting: Built-in API quota management and retry logic
- Incremental Sync: Efficient updates for large orgs
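The retry logic mentioned above is not spelled out in this README. One common approach is exponential backoff, sketched below under that assumption. The names `withRetry`, `backoffDelayMs`, and `RetryOptions` are hypothetical and are not taken from the project's rateLimiter.ts.

```typescript
// Illustrative sketch only: retry an async call with exponential backoff.
interface RetryOptions {
  maxAttempts: number; // total tries, including the first one
  baseDelayMs: number; // delay used after the first failure
}

// Delay before the next try, given the 1-based number of the attempt that failed.
function backoffDelayMs(failedAttempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (failedAttempt - 1);
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

async function withRetry<T>(
  fn: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Only wait if another attempt is still allowed.
      if (attempt < options.maxAttempts) {
        await sleep(backoffDelayMs(attempt, options.baseDelayMs));
      }
    }
  }
  throw lastError;
}
```

A caller would wrap a single API request, for example `withRetry(() => fetchMetadata(), { maxAttempts: 3, baseDelayMs: 500 })`, where `fetchMetadata` stands in for whatever client call is being protected.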
Advanced RAG Capabilities
- Intelligent Chunking: Metadata-aware chunking system that splits Apex classes by methods, objects by fields, etc.
- Vector Indexing: PostgreSQL + pgvector for semantic similarity search
- Keyword Search: Full-text search with BM25 ranking
- Symbol Search: Exact matching for Salesforce objects, fields, and code symbols
- Hybrid Search: Combined vector + keyword search with intelligent reranking
MCP Integration
- Direct Claude Code Integration: Real-time Salesforce org exploration
- Structured Metadata Access: Type-aware retrieval and processing
- Symbol Extraction: Automatic discovery of relationships and dependencies
Available MCP Tools
- sf_metadata_list - List metadata components of specified types
- sf_tooling_getApexClasses - Retrieve all Apex classes from the org
- sf_describe_object - Describe a Salesforce object schema
- rag_status - Get system status and API usage stats
Development Setup
Prerequisites
- Node.js 18+ and npm
- Docker and Docker Compose (for PostgreSQL + pgvector)
- Salesforce org access (sandbox recommended for testing)
- Connected App or Username/Password authentication
- Python 3.8+ with sentence-transformers (optional, for production embeddings)
Installation
- Clone and install dependencies:
npm install
- Configure Salesforce credentials in .env:
# Copy the example file
cp .env.example .env
# Edit .env with your Salesforce credentials
SF_USERNAME="your_username@company.com"
SF_PASSWORD="your_password"
SF_SECURITY_TOKEN="your_security_token"
SF_LOGIN_URL="https://test.salesforce.com" # Use for sandbox
- Start PostgreSQL with pgvector:
docker compose up -d postgres
- Build the project:
npm run build
Running the Server
MCP Server (for Claude Code):
npm run dev
Vector Integration Testing:
# Test chunking system
node dist/test-chunking.js
# Test full vector integration
node dist/test-vector-integration.js
# Test with live Salesforce data
node dist/test-mcp-chunking.js
Type checking:
npm run typecheck
MCP Integration
To integrate with Claude Desktop or Claude Code, add the matching configuration below to your MCP settings:
For Claude Desktop (add to claude_desktop_config.json):
{
"mcpServers": {
"salesforce-rag": {
"command": "node",
"args": ["/path/to/sfdxrag/dist/index.js"],
"cwd": "/path/to/sfdxrag",
"env": {
"NODE_ENV": "production",
"SF_LOGIN_URL": "https://your-org.my.salesforce.com/",
"SF_USERNAME": "your_username@company.com",
"SF_PASSWORD": "your_password",
"SF_SECURITY_TOKEN": "your_security_token",
"DOTENV_SILENT": "true",
"LOG_LEVEL": "error"
}
}
}
}
For Claude Code (add to .mcp.json in your workspace):
{
"mcpServers": {
"salesforce-rag": {
"command": "npm",
"args": ["run", "dev"],
"cwd": "/path/to/sfdxrag",
"env": {
"SF_LOGIN_URL": "https://your-org.my.salesforce.com/",
"SF_USERNAME": "your_username@company.com",
"SF_PASSWORD": "your_password",
"SF_SECURITY_TOKEN": "your_security_token",
"DOTENV_SILENT": "true",
"LOG_LEVEL": "error"
}
}
}
}
Adding to Claude Code MCP:
- Create or update .mcp.json in your workspace root:
# Navigate to your project directory
cd /path/to/your-project
# Create .mcp.json with the salesforce-rag server configuration above
# Update the "cwd" path to point to your sfdxrag installation directory
- Alternative: Use Claude Code MCP command:
claude mcp add salesforce-rag --env SF_LOGIN_URL=https://test.salesforce.com --env SF_USERNAME=your_salesforce_username --env SF_PASSWORD=your_salesforce_password --env SF_SECURITY_TOKEN=your_security_token --env NODE_ENV=development --env LOG_LEVEL=info -- npm run dev --cwd="/path to sfdxrag/"
After setting up:
- Restart Claude Code/Desktop to reload MCP configuration
- Test MCP tools:
  - sf_describe_object with {"objectName": "Account"}
  - sf_metadata_list with {"types": ["ApexClass", "Layout"]}
  - rag_status to check system health
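For reference, the test calls above correspond to MCP `tools/call` requests, which are JSON-RPC 2.0 messages. The sketch below builds those payloads. The helper `buildToolCall` is hypothetical, and in practice the MCP client (Claude Code or Claude Desktop) constructs these messages for you.

```typescript
// Illustrative sketch: the JSON-RPC 2.0 shape of an MCP tools/call request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper; the tool names and arguments come from this README.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const describeAccount = buildToolCall(1, "sf_describe_object", { objectName: "Account" });
const listMetadata = buildToolCall(2, "sf_metadata_list", { types: ["ApexClass", "Layout"] });
const ragStatus = buildToolCall(3, "rag_status");

console.log(JSON.stringify(describeAccount, null, 2));
```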
Project Structure
src/
├── salesforce/ # Salesforce API clients
│ ├── connection.ts # Authentication layer
│ ├── metadataClient.ts # Metadata API wrapper
│ ├── toolingClient.ts # Tooling API wrapper
│ └── restClient.ts # REST API wrapper
├── chunking/ # Metadata chunking system
│ ├── types.ts # Core interfaces and types
│ ├── base.ts # Base chunker implementation
│ ├── apexChunker.ts # Apex class method-level chunking
│ ├── customObjectChunker.ts # Object field-level chunking
│ ├── factory.ts # Chunker selection factory
│ └── processor.ts # Main processing pipeline
├── vector/ # Vector storage and search
│ ├── embedding.ts # Embedding model interface and implementations
│ └── store.ts # PostgreSQL + pgvector client
├── utils/ # Utilities
│ ├── logger.ts # Winston logging setup
│ ├── errorHandler.ts # Global error handling
│ ├── rateLimiter.ts # API rate limiting
│ └── packageGenerator.ts # package.xml generation
├── config/ # Configuration management
│ └── index.ts # Environment config loader
├── mcp/ # MCP server implementation
│ └── server.ts # MCP tool handlers
└── index.ts # Main entry point
Environment Variables
Required for Salesforce connectivity:
- SF_USERNAME - Salesforce username
- SF_PASSWORD - Salesforce password
- SF_SECURITY_TOKEN - Salesforce security token
- SF_LOGIN_URL - Login URL (https://login.salesforce.com or https://test.salesforce.com)
Optional configuration:
- NODE_ENV - Environment (development/production)
- LOG_LEVEL - Logging level (debug/info/warn/error)
- PORT - Server port (default: 3000)
- DB_HOST - PostgreSQL host (default: localhost)
- DB_PORT - PostgreSQL port (default: 5433)
- DB_NAME - Database name (default: sfdxrag)
- DB_USER - Database user (default: postgres)
- DB_PASSWORD - Database password (default: postgres)
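As a rough illustration of how these variables and defaults might be consumed, the sketch below reads them from a plain key/value object. The function names `loadDbConfig` and `missingSalesforceVars` are hypothetical; the project's actual loader in config/index.ts may look different.

```typescript
// Illustrative sketch: apply the documented defaults and check required variables.
type Env = Record<string, string | undefined>;

interface DbConfig {
  host: string;
  port: number;
  name: string;
  user: string;
  password: string;
}

function loadDbConfig(env: Env): DbConfig {
  const port = Number(env.DB_PORT ?? "5433");
  // Reject non-numeric, fractional, or non-positive ports.
  if (Math.floor(port) !== port || port <= 0) {
    throw new Error(`Invalid DB_PORT: ${env.DB_PORT}`);
  }
  return {
    host: env.DB_HOST ?? "localhost",
    port,
    name: env.DB_NAME ?? "sfdxrag",
    user: env.DB_USER ?? "postgres",
    password: env.DB_PASSWORD ?? "postgres",
  };
}

// Returns the names of required Salesforce variables that are unset or empty.
function missingSalesforceVars(env: Env): string[] {
  const required = ["SF_USERNAME", "SF_PASSWORD", "SF_SECURITY_TOKEN", "SF_LOGIN_URL"];
  return required.filter((key) => !env[key]);
}

// In a Node.js process you would typically pass process.env here.
const dbConfig = loadDbConfig({ DB_HOST: "db.internal" });
console.log(dbConfig.host, dbConfig.port);
```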
Architecture
Data Flow
- Metadata Extraction: Retrieve Salesforce metadata via API clients
- Intelligent Chunking: Process metadata using type-specific chunkers
- Vector Indexing: Generate embeddings and store in PostgreSQL + pgvector
- Search & Retrieval: Multi-modal search (vector + keyword + symbol)
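The indexing and retrieval steps above can be pictured as two parameterized SQL statements against pgvector. This is a sketch under assumptions: the table name `metadata_chunks` and its columns are invented for illustration, and only the `vector` text literal format and the `<=>` cosine-distance operator come from pgvector itself. The `{ text, values }` shape matches the query config object accepted by node-postgres.

```typescript
// Illustrative sketch: build pgvector insert and similarity-search queries.
interface SqlQuery {
  text: string;
  values: unknown[];
}

// pgvector accepts vectors as text literals such as [0.1,0.2,0.3].
function toVectorLiteral(embedding: number[]): string {
  if (embedding.length === 0 || embedding.some((x) => !isFinite(x))) {
    throw new Error("Embedding must be a non-empty array of finite numbers");
  }
  return `[${embedding.join(",")}]`;
}

// Hypothetical table: metadata_chunks(id text, content text, embedding vector).
function buildInsertChunk(id: string, content: string, embedding: number[]): SqlQuery {
  return {
    text: "INSERT INTO metadata_chunks (id, content, embedding) VALUES ($1, $2, $3::vector)",
    values: [id, content, toVectorLiteral(embedding)],
  };
}

// <=> is cosine distance in pgvector, so 1 - distance gives cosine similarity.
function buildSimilaritySearch(queryEmbedding: number[], limit: number): SqlQuery {
  return {
    text:
      "SELECT id, content, 1 - (embedding <=> $1::vector) AS similarity " +
      "FROM metadata_chunks ORDER BY embedding <=> $1::vector LIMIT $2",
    values: [toVectorLiteral(queryEmbedding), limit],
  };
}

const search = buildSimilaritySearch([0.1, 0.2], 5);
console.log(search.text, search.values);
```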
Chunking System
The system includes specialized chunkers for different metadata types:
- ApexChunker: Splits classes by methods, preserving signatures and docblocks
- CustomObjectChunker: Splits objects by fields, validation rules, and metadata
- GenericChunker: Fallback for unsupported types
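To make the type-specific dispatch concrete, here is a minimal sketch of a chunker factory with a field-level chunker for custom objects and a generic fallback. All interface and function names are hypothetical rather than copied from chunking/factory.ts, and the Apex method-level chunker is left out because it needs real parsing.

```typescript
// Illustrative sketch: pick a chunker by metadata type, with a generic fallback.
interface FieldDef {
  name: string;
  dataType: string;
}

interface MetadataItem {
  type: string;
  name: string;
  body: string;
  fields?: FieldDef[];
}

interface Chunk {
  id: string;
  kind: string;
  text: string;
}

interface Chunker {
  chunk(item: MetadataItem): Chunk[];
}

// One chunk for the object itself, then one chunk per field.
const customObjectChunker: Chunker = {
  chunk(item: MetadataItem): Chunk[] {
    const header: Chunk = { id: item.name, kind: "object", text: `Object ${item.name}` };
    const fieldChunks = (item.fields ?? []).map((f): Chunk => ({
      id: `${item.name}.${f.name}`,
      kind: "field",
      text: `${item.name}.${f.name} (${f.dataType})`,
    }));
    return [header, ...fieldChunks];
  },
};

// Fallback: keep the whole item as a single chunk.
const genericChunker: Chunker = {
  chunk(item: MetadataItem): Chunk[] {
    return [{ id: item.name, kind: item.type, text: item.body }];
  },
};

function selectChunker(type: string): Chunker {
  switch (type) {
    case "CustomObject":
      return customObjectChunker;
    default:
      return genericChunker;
  }
}

const objectChunks = selectChunker("CustomObject").chunk({
  type: "CustomObject",
  name: "Invoice__c",
  body: "",
  fields: [
    { name: "Amount__c", dataType: "Currency" },
    { name: "Status__c", dataType: "Picklist" },
  ],
});
console.log(objectChunks.map((c) => c.id));
```

Keeping each field in its own chunk means a search for a single field name can return a small, focused result instead of the whole object definition.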
Vector Search
- Vector Search: Semantic similarity using sentence transformers
- Keyword Search: Full-text search with BM25 ranking
- Symbol Search: Exact matching for Salesforce symbols (objects, fields, classes)
- Hybrid Search: Combined search with intelligent reranking (70% vector, 30% keyword)
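The 70/30 weighting can be illustrated with a small scoring function. The README only states the weights, so the min-max normalization step below is an assumption made so that cosine similarities and BM25 scores land on a comparable scale. The names `hybridRank` and `normalize` are hypothetical.

```typescript
// Illustrative sketch: combine vector and keyword scores with a 70/30 weighting.
interface ScoredHit {
  id: string;
  score: number;
}

// Min-max normalize scores to [0, 1]; a single distinct score maps to 1.
function normalize(hits: ScoredHit[]): Record<string, number> {
  const out: Record<string, number> = {};
  if (hits.length === 0) return out;
  const scores = hits.map((h) => h.score);
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  for (const h of hits) {
    out[h.id] = max === min ? 1 : (h.score - min) / (max - min);
  }
  return out;
}

function hybridRank(
  vectorHits: ScoredHit[],
  keywordHits: ScoredHit[],
  vectorWeight = 0.7,
): ScoredHit[] {
  const v = normalize(vectorHits);
  const k = normalize(keywordHits);
  const ids = Object.keys({ ...v, ...k });
  return ids
    .map((id) => ({
      id,
      // A hit missing from one result list contributes 0 for that component.
      score: vectorWeight * (v[id] ?? 0) + (1 - vectorWeight) * (k[id] ?? 0),
    }))
    .sort((a, b) => b.score - a.score);
}

const ranked = hybridRank(
  [{ id: "a", score: 0.9 }, { id: "b", score: 0.5 }],
  [{ id: "b", score: 12 }, { id: "c", score: 3 }],
);
console.log(ranked);
```

Note that min-max normalization pushes the lowest hit in each list to 0, which may or may not be desirable; other schemes such as reciprocal rank fusion are also common.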
Testing
Current Test Coverage
✅ Chunking System: Apex classes split into method-level chunks with symbol extraction
✅ Vector Storage: PostgreSQL + pgvector integration with batch operations
✅ Search Functions: Vector, keyword, symbol, and hybrid search working
✅ MCP Integration: Live Salesforce data retrieval and processing
✅ Symbol Detection: Automatic discovery of custom objects and dependencies
Example Results
From Apex class analysis:
- Method-level chunking with separate chunks for class declaration and each method
- Symbol extraction working for custom objects, standard objects, and system calls
- Search functionality verified across all modes: semantic, keyword, symbol, hybrid
Production Deployment
For production use:
- Configure real embedding models using SentenceTransformerEmbedding
- Set up persistent PostgreSQL instance with appropriate resource allocation
- Configure proper authentication and security for multi-tenant access
- Implement monitoring and performance optimization for large metadata volumes