MCP BigQuery Server
Production-ready Model Context Protocol (MCP) server for BigQuery with natural language query support. Translate questions to SQL, execute securely, and deliver results via stdio (local) or HTTP (remote) for GitHub Copilot, Power BI, and web applications.
Features
- Natural Language to SQL: Ask questions in plain English, get BigQuery results
- Secure Query Execution: Read-only, parameterized queries with dataset whitelisting
- Multiple Transports: Run locally (stdio) or as a remote HTTP server
- Enterprise Ready: JWT auth, RBAC, rate limiting, caching, observability
- Power BI Integration: REST API with JSON and CSV endpoints
- Azure Native: Key Vault secrets, Managed Identity, Container Apps deployment
- High Performance: Query caching, connection pooling, concurrent request handling
Architecture
┌─────────────────┐
│ GitHub Copilot  │
│  or Web Client  │
└────────┬────────┘
         │
    ┌────▼────┐
    │   MCP   │ (stdio or HTTP)
    │ Server  │
    └────┬────┘
         │
    ┌────▼──────────────────────────────┐
    │ Services                          │
    │ • NL2SQL (Azure OpenAI)           │
    │ • SQL Validator & Guardrails      │
    │ • Query Cache (Memory + Redis)    │
    │ • Schema Cache                    │
    └────┬──────────────────────────────┘
         │
    ┌────▼────────┐         ┌──────────────┐
    │  BigQuery   │◄────────┤  Key Vault   │
    │   Client    │         │  (Secrets)   │
    └─────────────┘         └──────────────┘
Quick Start
Prerequisites
- Node.js 20+
- Google Cloud Platform project with BigQuery enabled
- BigQuery service account JSON key
- Azure OpenAI deployment (for NL→SQL)
- (Optional) Azure Key Vault for secrets
- (Optional) Redis for distributed caching
Local Development
- Install dependencies:
npm install
- Configure environment:
cp .env.example .env
# Edit .env with your configuration
- Build:
npm run build
- Run in stdio mode (for GitHub Copilot):
npm run start:stdio
- Run in HTTP mode (for remote access):
npm run start:http
Running with GitHub Copilot
- Add to your MCP settings (~/.config/Code/User/globalStorage/github.copilot-chat/mcp.json):
{
"mcpServers": {
"bigquery": {
"command": "node",
"args": ["/path/to/mcp-bigquery-server/dist/index.js", "stdio"],
"env": {
"GCP_PROJECT_ID": "your-project-id",
"GCP_SA_KEY_JSON": "{...}",
"ALLOWED_DATASETS": "dataset1,dataset2",
"AZURE_OPENAI_ENDPOINT": "https://your-openai.openai.azure.com/",
"AZURE_OPENAI_API_KEY": "your-key"
}
}
}
}
- Restart VS Code
- Ask questions in Copilot Chat:
@workspace Ask the BigQuery server: What were total sales by region last quarter?
Configuration
Required Environment Variables
# BigQuery
GCP_PROJECT_ID=your-gcp-project-id
GCP_SA_KEY_JSON='{"type":"service_account",...}'
ALLOWED_DATASETS=dataset1,dataset2,dataset3
# Azure OpenAI (for NL→SQL)
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-api-key
AZURE_OPENAI_DEPLOYMENT=gpt-4o-mini
# JWT Authentication (HTTP mode)
JWT_ISSUER=https://login.microsoftonline.com/tenant-id/v2.0
JWT_AUDIENCE=api://your-app-id
JWKS_URI=https://login.microsoftonline.com/tenant-id/discovery/v2.0/keys
Optional Environment Variables
# Azure Key Vault (recommended for production)
AZURE_KEY_VAULT_URI=https://your-vault.vault.azure.net/
USE_MANAGED_IDENTITY=true
# Redis (for distributed caching)
REDIS_URL=redis://localhost:6379
# Limits
MAX_ROWS_DEFAULT=10000
MAX_ROWS_ABSOLUTE=100000
QUERY_TIMEOUT_MS=30000
CACHE_TTL_SEC=3600
# Observability
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
ENABLE_TRACING=true
ENABLE_METRICS=true
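As a rough illustration of how the limit variables above might be read and applied, the sketch below parses them from an environment map and clamps a caller-supplied maxRows. The names LimitsConfig, loadLimits, and effectiveMaxRows are hypothetical and are not taken from src/config.ts. The fallback values reuse the example values shown above, which is an assumption about the real defaults (only the 1 hour cache TTL is stated as a default elsewhere in this README).

```typescript
// Hypothetical config shape; field names are illustrative, not the project's API.
interface LimitsConfig {
  allowedDatasets: string[];
  maxRowsDefault: number;
  maxRowsAbsolute: number;
  queryTimeoutMs: number;
  cacheTtlSec: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive integer, falling back to a default when the variable is unset.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadLimits(env: Env): LimitsConfig {
  // ALLOWED_DATASETS is a comma-separated list; blanks are ignored.
  const allowedDatasets = (env["ALLOWED_DATASETS"] ?? "")
    .split(",")
    .map((d) => d.trim())
    .filter((d) => d.length > 0);
  if (allowedDatasets.length === 0) {
    throw new Error("ALLOWED_DATASETS must list at least one dataset");
  }
  // Fallbacks below are assumed defaults copied from the example values.
  const maxRowsAbsolute = intFromEnv(env, "MAX_ROWS_ABSOLUTE", 100000);
  const maxRowsDefault = Math.min(
    intFromEnv(env, "MAX_ROWS_DEFAULT", 10000),
    maxRowsAbsolute,
  );
  return {
    allowedDatasets,
    maxRowsDefault,
    maxRowsAbsolute,
    queryTimeoutMs: intFromEnv(env, "QUERY_TIMEOUT_MS", 30000),
    cacheTtlSec: intFromEnv(env, "CACHE_TTL_SEC", 3600),
  };
}

// Clamp a caller-supplied maxRows into the range [1, maxRowsAbsolute].
function effectiveMaxRows(requested: number | undefined, cfg: LimitsConfig): number {
  if (requested === undefined || !Number.isFinite(requested)) return cfg.maxRowsDefault;
  return Math.max(1, Math.min(Math.floor(requested), cfg.maxRowsAbsolute));
}
```

In a Node.js process, loadLimits(process.env) would be the natural call site; taking the map as an argument keeps the function easy to test.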
Azure Key Vault Setup
1. Create Key Vault
az keyvault create \
--name kv-mcp-bigquery \
--resource-group your-rg \
--location eastus
2. Upload BigQuery Service Account
az keyvault secret set \
--vault-name kv-mcp-bigquery \
--name bigquery-service-account \
--file service-account.json
3. Upload Azure OpenAI Key
az keyvault secret set \
--vault-name kv-mcp-bigquery \
--name azure-openai-key \
--value "your-api-key"
4. Grant Access
# For managed identity (production)
az keyvault set-policy \
--name kv-mcp-bigquery \
--object-id <managed-identity-principal-id> \
--secret-permissions get list
# For service principal (dev)
az keyvault set-policy \
--name kv-mcp-bigquery \
--spn <client-id> \
--secret-permissions get list
API Usage
Natural Language Query (POST /api/query)
curl -X POST https://your-server.azurecontainerapps.io/api/query \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"question": "What were total orders and average price by region last quarter?",
"maxRows": 100
}'
Response:
{
"success": true,
"data": {
"sql": "SELECT region, COUNT(*) as total_orders, AVG(price) as avg_price...",
"explanation": "This query calculates total orders and average price by region...",
"confidence": 0.95,
"rows": [...],
"schema": {...},
"metadata": {...}
}
}
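For callers written in TypeScript, the request and response above could be modelled roughly as follows. This is a sketch, not the server's published client: the type and function names are hypothetical, and rows, schema, and metadata are typed as unknown because their contents are elided in the sample response.

```typescript
// Request body fields as used in the curl example.
interface NlQueryRequest {
  question: string;
  maxRows?: number;
}

// Response fields as shown in the sample; elided parts stay loosely typed.
interface NlQueryResponse {
  success: boolean;
  data: {
    sql: string;
    explanation: string;
    confidence: number;
    rows: unknown[];
    schema: unknown;
    metadata: unknown;
  };
}

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the URL and fetch-style options for POST /api/query.
function buildNlQueryRequest(
  baseUrl: string,
  token: string,
  req: NlQueryRequest,
): PreparedRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/query`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(req),
    },
  };
}

// Minimal runtime check before trusting a parsed JSON payload.
function isNlQueryResponse(value: unknown): value is NlQueryResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { success?: unknown; data?: unknown };
  if (typeof v.success !== "boolean") return false;
  if (typeof v.data !== "object" || v.data === null) return false;
  const d = v.data as { sql?: unknown; rows?: unknown };
  return typeof d.sql === "string" && Array.isArray(d.rows);
}
```

The prepared url and init can be passed straight to fetch(url, init) on Node.js 20+, with the parsed JSON then checked by isNlQueryResponse.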
Direct SQL Query (POST /api/sql)
curl -X POST https://your-server.azurecontainerapps.io/api/sql \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"sql": "SELECT region, SUM(amount) as total FROM `project.dataset.orders` WHERE date >= @start_date GROUP BY region LIMIT 10",
"params": {"start_date": "2024-01-01"}
}'
CSV Export (POST /api/query.csv)
curl -X POST https://your-server.azurecontainerapps.io/api/query.csv \
-H "Authorization: Bearer $JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"question": "Top 100 customers by revenue"}' \
--output results.csv
Power BI Integration
Option 1: Native BigQuery Connector (Recommended)
Use Power BI's built-in BigQuery connector for live DirectQuery at scale:
- Get Data → Google BigQuery
- Enter GCP project ID and dataset
- Authenticate with service account
- Select tables and configure DirectQuery
Pros: Native integration, full Power BI optimization, live data
Cons: No natural language queries
Option 2: Web API Connector (for NL→SQL)
Use when you need natural language queries or pre/post processing:
- Get Data → Web
- Set URL: https://your-server.azurecontainerapps.io/api/query.csv
- Set Method: POST
- Add Header: Authorization: Bearer <token>
- Set Body:
{
"question": "Monthly sales trends for last 12 months",
"maxRows": 10000
}
Pros: Natural language queries, custom processing
Cons: Snapshot data (not live), token management
Deployment
Deploy to Azure Container Apps
- Create resource group:
az group create --name rg-mcp-bigquery --location eastus
- Deploy Key Vault:
az deployment group create \
--resource-group rg-mcp-bigquery \
--template-file bicep/keyvault.bicep \
--parameters adminObjectId=$USER_OBJECT_ID
- Upload secrets (see Key Vault Setup above)
- Build and push image:
docker build -t yourregistry.azurecr.io/mcp-bigquery-server:latest .
docker push yourregistry.azurecr.io/mcp-bigquery-server:latest
- Deploy Container App:
az deployment group create \
--resource-group rg-mcp-bigquery \
--template-file bicep/main.bicep \
--parameters bicep/main.parameters.json
GitHub Actions CI/CD
The included workflow (.github/workflows/ci-cd.yml) automates:
- Linting, testing, and type checking
- Docker image build and security scanning
- Push to Azure Container Registry
- Deployment to Azure Container Apps
Configure secrets in GitHub:
- ACR_USERNAME, ACR_PASSWORD
- AZURE_CREDENTIALS
- AZURE_RG
Security
Query Safety Guardrails
- ✅ Read-only: Only SELECT queries allowed
- ✅ Parameterized: Values are passed as bound query parameters rather than concatenated into SQL, which guards against SQL injection
- ✅ Dataset whitelist: Only approved datasets accessible
- ✅ Row limits: Enforced maximum rows per query
- ✅ Timeouts: Query execution time limits
- ✅ No DDL/DML: Blocks INSERT, UPDATE, DELETE, DROP, etc.
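The read-only and dataset checks listed above can be sketched as a single validation function. This is a simplified, regex-based illustration and not the project's actual validator, which may well use a real SQL parser. The function names are hypothetical, and the sketch is deliberately conservative: it rejects unqualified table names, so queries using CTE names or FROM UNNEST(...) would be refused.

```typescript
interface ValidationResult {
  ok: boolean;
  reason?: string;
}

// Matches quoted strings, backtick identifiers, and comments in one pass so
// that comment markers inside strings (and vice versa) are handled correctly.
const LITERAL_OR_COMMENT =
  /`[^`]*`|'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"|--[^\n]*|#[^\n]*|\/\*[\s\S]*?\*\//g;

const FORBIDDEN =
  /\b(INSERT|UPDATE|DELETE|DROP|MERGE|CREATE|ALTER|TRUNCATE|GRANT|REVOKE)\b/i;

// Blank out strings and comments, but keep backtick identifiers for later checks.
function stripLiteralsAndComments(sql: string): string {
  return sql.replace(LITERAL_OR_COMMENT, (m) => (m.charAt(0) === "`" ? m : " "));
}

function validateReadOnlySql(
  sql: string,
  allowedDatasets: readonly string[],
): ValidationResult {
  const cleaned = stripLiteralsAndComments(sql).trim();

  if (!/^(SELECT|WITH)\b/i.test(cleaned)) {
    return { ok: false, reason: "only SELECT queries are allowed" };
  }
  if (/;\s*\S/.test(cleaned)) {
    return { ok: false, reason: "multiple statements are not allowed" };
  }
  // Ignore backtick identifiers here so a table named `create` is not flagged.
  const forbidden = FORBIDDEN.exec(cleaned.replace(/`[^`]*`/g, " "));
  if (forbidden !== null) {
    return { ok: false, reason: `forbidden keyword: ${forbidden[0].toUpperCase()}` };
  }

  // Every table after FROM/JOIN must be dataset-qualified and allowlisted.
  const refPattern = /\b(?:FROM|JOIN)\s+(`[^`]+`|[A-Za-z_][\w.-]*)/gi;
  let match: RegExpExecArray | null;
  while ((match = refPattern.exec(cleaned)) !== null) {
    const ref = (match[1] ?? "").replace(/`/g, "");
    const parts = ref.split(".");
    if (parts.length < 2) {
      return { ok: false, reason: `table "${ref}" must be qualified with a dataset` };
    }
    const dataset = parts[parts.length - 2] ?? "";
    if (allowedDatasets.indexOf(dataset) === -1) {
      return { ok: false, reason: `dataset "${dataset}" is not allowed` };
    }
  }
  return { ok: true };
}
```

Checks like this work best as one layer among several: granting the service account read-only BigQuery roles enforces the same policy even if a query slips past text-level validation.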
Authentication & Authorization
- JWT bearer tokens (Azure Entra ID compatible)
- Role-based access control (viewer, analyst, admin)
- Per-dataset access rules
- Rate limiting per user/IP
- Request audit logging
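One way the role and per-dataset checks above might fit together is sketched below. The three role names come from this README; the strict ordering between them (viewer < analyst < admin), the shape of the per-dataset rules, and all function names are assumptions made for illustration.

```typescript
// Role names come from the README; the ordering between them is an assumption.
type Role = "viewer" | "analyst" | "admin";

const ROLE_RANK: Record<Role, number> = { viewer: 0, analyst: 1, admin: 2 };

function isRole(value: string): value is Role {
  return value === "viewer" || value === "analyst" || value === "admin";
}

// Pick the most privileged recognised role from a token's roles claim.
function highestRole(claimRoles: readonly string[]): Role | undefined {
  let best: Role | undefined;
  for (const r of claimRoles) {
    if (isRole(r) && (best === undefined || ROLE_RANK[r] > ROLE_RANK[best])) {
      best = r;
    }
  }
  return best;
}

function hasAtLeast(claimRoles: readonly string[], required: Role): boolean {
  const role = highestRole(claimRoles);
  return role !== undefined && ROLE_RANK[role] >= ROLE_RANK[required];
}

// Hypothetical per-dataset rule table: dataset name -> minimum role.
// Datasets without a rule are denied by default.
function canQueryDataset(
  claimRoles: readonly string[],
  dataset: string,
  rules: Record<string, Role>,
): boolean {
  const required = rules[dataset];
  return required !== undefined && hasAtLeast(claimRoles, required);
}
```

Under these assumptions, the natural language endpoint would be gated with hasAtLeast(roles, "analyst"), matching the troubleshooting note that NL queries need the analyst or admin role.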
Secrets Management
- Azure Key Vault for production secrets
- Environment variables for local dev only
- Managed Identity (no credentials in code)
- Secret caching with TTL
- PII redaction in logs
See SECURITY.md for full threat model and compliance.
Performance
Caching Strategy
- Query Cache: In-memory + optional Redis
  - Keyed by normalized SQL + parameters
  - Configurable TTL (default: 1 hour)
  - Reduces BigQuery costs and latency
- Schema Cache: In-memory with refresh
  - Cached dataset/table schemas
  - TTL-based invalidation
  - Warms on startup
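To make "keyed by normalized SQL + parameters" concrete, here is a minimal in-memory sketch. The normalization rules (trim, drop a trailing semicolon, collapse whitespace outside quoted text) and the names normalizeSql, cacheKey, and TtlCache are assumptions; the actual service may normalize differently and adds Redis on top.

```typescript
type QueryParams = Record<string, string | number | boolean | null>;

// Matches quoted text (kept verbatim) or a run of whitespace (collapsed).
const QUOTED_OR_SPACE = /'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"|`[^`]*`|\s+/g;

function normalizeSql(sql: string): string {
  return sql
    .trim()
    .replace(/\s*;$/, "")
    .replace(QUOTED_OR_SPACE, (m) => (/^\s/.test(m) ? " " : m));
}

// Parameters are sorted by name so that key order does not affect the cache key.
function cacheKey(sql: string, params: QueryParams): string {
  const sortedParams = Object.keys(params)
    .sort()
    .map((k) => [k, params[k]]);
  return `${normalizeSql(sql)}\u0000${JSON.stringify(sortedParams)}`;
}

interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // lazily evict expired entries on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Whitespace inside string literals is left untouched on purpose: collapsing it would give two semantically different queries the same key and could serve the wrong cached result.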
Scaling Configuration
Container Apps auto-scaling:
- Min replicas: 1 (dev), 2 (prod)
- Max replicas: 10+
- Scale rule: 50 concurrent requests per replica
- KEDA for advanced metrics
Expected performance:
- Throughput: 1000+ queries/second (cached)
- Latency: <100ms (cached), <2s (uncached)
- Concurrency: 10 concurrent BigQuery queries per instance
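The per-instance cap on concurrent BigQuery queries could be enforced with a small promise-based limiter along these lines. This is a sketch under the assumption that the limit is applied in-process; ConcurrencyLimiter is a hypothetical name, not a class from this repository.

```typescript
class ConcurrencyLimiter {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly limit: number;

  constructor(limit: number) {
    if (!Number.isInteger(limit) || limit < 1) {
      throw new Error("limit must be a positive integer");
    }
    this.limit = limit;
  }

  // Run the task once a slot is free; at most `limit` tasks run at a time.
  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // Queue up; the finishing task hands its slot over directly.
      await new Promise<void>((resolve) => {
        this.waiting.push(resolve);
      });
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next !== undefined) {
        next(); // slot is transferred, so `active` stays unchanged
      } else {
        this.active--;
      }
    }
  }
}
```

With the figure quoted above, a single shared new ConcurrencyLimiter(10) would wrap each BigQuery call, for example limiter.run(() => runQuery(sql)), where runQuery stands in for whatever function actually executes the query.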
Development
Project Structure
mcp-bigquery/
├── src/
│ ├── mcp/ # MCP server, tools, resources
│ ├── api/ # Express HTTP API
│ ├── services/ # BigQuery, NL2SQL, caching
│ ├── config.ts # Configuration loader
│ ├── logger.ts # Structured logging
│ ├── telemetry.ts # OpenTelemetry
│ └── index.ts # Entry point
├── test/ # Unit and integration tests
├── loadtest/ # k6 load tests
├── bicep/ # Azure infrastructure
├── .github/workflows/ # CI/CD
└── Dockerfile # Container image
Available Scripts
npm run dev # Watch mode build
npm run build # Production build
npm run start:stdio # Run stdio mode
npm run start:http # Run HTTP mode
npm run test # Run tests
npm run test:coverage # Coverage report
npm run lint # Lint code
npm run format # Format code
npm run loadtest # Run k6 load test
Adding New Tools
- Create tool file in src/mcp/tools/
- Implement input schema (zod) and handler
- Register in src/mcp/server.ts
- Add tests in test/
Example:
export const myTool = {
name: 'my_tool',
description: 'Does something useful',
inputSchema: {...},
};
export async function myToolHandler(input: MyInput) {
// Implementation
}
Troubleshooting
Common Issues
"Azure OpenAI API key not configured"
- Ensure AZURE_OPENAI_API_KEY is set or available in Key Vault
- Check Key Vault permissions
"Dataset not allowed"
- Add dataset to the ALLOWED_DATASETS environment variable
- Verify dataset exists in BigQuery
"Authentication failed"
- Verify JWT token is valid and not expired
- Check JWT_ISSUER, JWT_AUDIENCE, and JWKS_URI configuration
- Ensure user has required roles (analyst/admin for NL queries)
"Query timeout"
- Increase QUERY_TIMEOUT_MS
- Optimize the query, or partition or cluster the underlying tables in BigQuery
- Check BigQuery quotas
Logs
View logs in Azure:
az containerapp logs show \
--name mcp-bigquery-server-prod \
--resource-group rg-mcp-bigquery \
--follow
Health Checks
- /healthz: Basic health (BigQuery connectivity)
- /readyz: Full readiness (all services initialized)
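The difference between the two probes can be illustrated with a small pure function. The individual readiness flags and the 200/503 status mapping are assumptions made for this sketch; the README only states that /healthz covers BigQuery connectivity and /readyz covers all services.

```typescript
// Hypothetical readiness flags; the real service list may differ.
interface ServiceStatus {
  bigQueryReachable: boolean;
  schemaCacheWarm: boolean;
  nl2sqlConfigured: boolean;
}

interface ProbeResult {
  status: 200 | 503 | 404;
  body: { ok: boolean; failing: string[] };
}

function probe(path: string, s: ServiceStatus): ProbeResult {
  let checks: Array<[string, boolean]>;
  if (path === "/healthz") {
    // Basic health: only BigQuery connectivity.
    checks = [["bigquery", s.bigQueryReachable]];
  } else if (path === "/readyz") {
    // Full readiness: every service must be initialized.
    checks = [
      ["bigquery", s.bigQueryReachable],
      ["schemaCache", s.schemaCacheWarm],
      ["nl2sql", s.nl2sqlConfigured],
    ];
  } else {
    return { status: 404, body: { ok: false, failing: [] } };
  }
  const failing = checks.filter(([, ok]) => !ok).map(([name]) => name);
  const ok = failing.length === 0;
  return { status: ok ? 200 : 503, body: { ok, failing } };
}
```

Returning the names of failing checks in the body makes it easier to tell from a probe response which dependency is holding up readiness.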
License
MIT
Support
For issues and questions:
- GitHub Issues: yourorg/mcp-bigquery-server
- Documentation: docs/