saft-mcp
An MCP server that enables AI assistants to parse, validate, and analyze Portuguese SAF-T tax files, providing tools for querying invoices, customers, products, VAT breakdowns, and anomaly detection.
<div align="center">
SAF-T MCP Server
Parse and analyze Portuguese SAF-T tax files with AI assistants
Getting Started · Available Tools · Configuration
Versao em Portugues · 13 tools · 152 tests
</div>
A Model Context Protocol (MCP) server that enables AI assistants like Claude, Cursor, and Windsurf to load, validate, and analyze Portuguese SAF-T (Standard Audit File for Tax Purposes) XML files. Load a SAF-T file and immediately query invoices, get revenue summaries, VAT breakdowns, and validate compliance with Portuguese tax rules.
What is SAF-T PT?
SAF-T PT is a mandatory XML file that all Portuguese companies must be able to export from their accounting/billing software. It contains the company's invoices, payments, customers, products, tax entries, and more. This MCP server turns that XML into a queryable data source for AI assistants.
Quick Start
Prerequisites
- Python 3.11+ and uv (recommended) or pip
- A SAF-T PT XML file exported from any Portuguese billing/accounting software (PHC, Sage, Primavera, etc.)
1. Install
pip install saft-mcp
Or from source:
git clone https://github.com/bybloom-ai/saft-mcp.git
cd saft-mcp
uv sync
2. Add to your AI assistant
<details open> <summary><strong>Claude Code</strong></summary>
claude mcp add saft-mcp -- /path/to/saft-mcp/.venv/bin/python -m saft_mcp
</details>
<details> <summary><strong>Claude Desktop</strong></summary>
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "saft-mcp": {
      "command": "/path/to/saft-mcp/.venv/bin/python",
      "args": ["-m", "saft_mcp"]
    }
  }
}
</details>
<details> <summary><strong>Cursor / VS Code / Other MCP clients</strong></summary>
Add to your MCP client configuration:
{
  "mcpServers": {
    "saft-mcp": {
      "command": "/path/to/saft-mcp/.venv/bin/python",
      "args": ["-m", "saft_mcp"]
    }
  }
}
</details>
3. Start using it
Ask your AI assistant:
"Load my SAF-T file at ~/Documents/saft_2025.xml and give me a revenue summary"
The server will parse the file, extract all invoices and tax data, and make it available for querying through natural conversation.
Available Tools
saft_load
Load and parse a SAF-T PT XML file. This must be called first before using any other tool.
| Parameter | Type | Description |
|---|---|---|
| `file_path` | string | Path to the SAF-T XML file |
Returns company name, NIF, fiscal period, SAF-T version, and record counts (customers, products, invoices, payments).
Handles Windows-1252 and UTF-8 encodings, BOM stripping, and automatic namespace detection. Files under 50 MB are parsed with a full DOM; larger files are meant to go through a streaming parser, which the Roadmap still lists as unfinished.
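As a rough illustration of the encoding handling described above, the sketch below strips a UTF-8 BOM and otherwise falls back to the encoding named in the XML declaration. `decode_saft_bytes` is a hypothetical helper written for this example and is not part of the server's API; the real parser may simply hand the raw bytes to lxml.

```python
import codecs
import re

# Matches the encoding pseudo-attribute of an XML declaration, e.g.
# <?xml version="1.0" encoding="Windows-1252"?>
_DECL_ENCODING = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def decode_saft_bytes(raw: bytes) -> str:
    """Decode SAF-T bytes to text (hypothetical helper, assumption only)."""
    # A UTF-8 byte order mark settles the encoding; drop it and decode.
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    # Otherwise trust the XML declaration, defaulting to UTF-8.
    match = _DECL_ENCODING.search(raw[:200])
    encoding = match.group(1).decode("ascii") if match else "utf-8"
    return raw.decode(encoding)


sample = '<?xml version="1.0" encoding="Windows-1252"?><a>Versão</a>'.encode("cp1252")
text = decode_saft_bytes(sample)
```

Python accepts `Windows-1252` as an alias for its `cp1252` codec, so the declared name can be passed to `bytes.decode` directly.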
saft_validate
Validate the loaded file against the official XSD schema and Portuguese business rules.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `rules` | list[string] | all | Specific rules to check |
Available rules:
| Rule | What it checks |
|---|---|
| `xsd` | XML structure against SAF-T PT 1.04_01 XSD schema |
| `numbering` | Sequential invoice numbering within each series |
| `nif` | NIF (tax ID) mod-11 check digit validation |
| `tax_codes` | Tax percentages match known Portuguese VAT rates |
| `atcud` | ATCUD unique document codes are present and well-formed |
| `hash_chain` | Hash continuity across invoice sequences |
| `control_totals` | Calculated totals match declared control totals |
Returns results with severity (error/warning), location, and fix suggestions.
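The `nif` rule refers to the standard mod-11 check digit used by Portuguese tax IDs. A minimal sketch of that check follows; the function name `nif_check_digit_ok` is an assumption made for this example, and the server's own validator may apply additional rules (for instance on the leading digit).

```python
def nif_check_digit_ok(nif: str) -> bool:
    """Return True if a 9-digit NIF has a valid mod-11 check digit."""
    if len(nif) != 9 or not (nif.isascii() and nif.isdigit()):
        return False
    digits = [int(ch) for ch in nif]
    # Weights 9 down to 2 apply to the first eight digits.
    total = sum(d * w for d, w in zip(digits[:8], range(9, 1, -1)))
    remainder = total % 11
    # Remainders 0 and 1 map to check digit 0; otherwise it is 11 - remainder.
    expected = 0 if remainder < 2 else 11 - remainder
    return digits[8] == expected


valid = nif_check_digit_ok("123456789")
invalid = nif_check_digit_ok("123456780")
```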
saft_summary
Generate an executive summary of the loaded file. No parameters needed.
Returns:
- Revenue totals (gross, credit notes, net)
- Invoice and credit note counts
- VAT breakdown by rate
- Top 10 customers by revenue
- Document type distribution (FT, FR, NC, ND, FS)
saft_query_invoices
Search and filter invoices with full pagination.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `date_from` | string | - | Start date (YYYY-MM-DD) |
| `date_to` | string | - | End date (YYYY-MM-DD) |
| `customer_nif` | string | - | Filter by tax ID (partial match) |
| `customer_name` | string | - | Filter by name (case-insensitive, partial) |
| `doc_type` | string | - | FT, FR, NC, ND, or FS |
| `min_amount` | number | - | Minimum gross total |
| `max_amount` | number | - | Maximum gross total |
| `status` | string | - | N (normal), A (cancelled), F (invoiced) |
| `limit` | integer | 50 | Results per page (max 500) |
| `offset` | integer | 0 | Pagination offset |
Returns matching invoices with document number, date, type, customer, amounts, status, and line count.
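To show how `limit` and `offset` combine, here is a small sketch that walks every page of a paginated query. `fetch_page` is a hypothetical stand-in for whatever your MCP client uses to call `saft_query_invoices`; the assumption that each call returns a plain list of rows is mine and not taken from the server's actual response format.

```python
from typing import Any, Callable, Iterator

Page = list[dict[str, Any]]


def iter_all_pages(
    fetch_page: Callable[[int, int], Page], limit: int = 50
) -> Iterator[dict[str, Any]]:
    """Yield rows from successive pages until a short page is returned."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        # A page shorter than the limit means there is nothing left.
        if len(page) < limit:
            break
        offset += limit


# Fake data source standing in for the tool call (illustration only).
rows = [{"n": i} for i in range(120)]
offsets_seen: list[int] = []


def fake_fetch(limit: int, offset: int) -> Page:
    offsets_seen.append(offset)
    return rows[offset:offset + limit]


collected = list(iter_all_pages(fake_fetch, limit=50))
```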
saft_tax_summary
Generate a VAT analysis grouped by rate, month, or document type.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `date_from` | string | - | Start date (YYYY-MM-DD) |
| `date_to` | string | - | End date (YYYY-MM-DD) |
| `group_by` | string | `rate` | Group by rate, month, or doc_type |
Returns taxable base, VAT amount, and gross total per group, plus overall totals.
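The grouping by rate can be pictured with the sketch below, which sums taxable base, VAT, and gross total per rate using `Decimal`. The input shape (pairs of base amount and rate) and the per-line half-up rounding to cents are assumptions for illustration; the server may round at a different level.

```python
from collections import defaultdict
from decimal import ROUND_HALF_UP, Decimal

CENT = Decimal("0.01")


def vat_by_rate(lines: list[tuple[Decimal, Decimal]]) -> dict[Decimal, dict[str, Decimal]]:
    """Group (taxable_base, rate_percent) pairs into per-rate totals."""
    totals: dict[Decimal, dict[str, Decimal]] = defaultdict(
        lambda: {"base": Decimal("0"), "vat": Decimal("0"), "gross": Decimal("0")}
    )
    for base, rate in lines:
        # Assumed policy: round VAT to cents on every line.
        vat = (base * rate / Decimal("100")).quantize(CENT, rounding=ROUND_HALF_UP)
        group = totals[rate]
        group["base"] += base
        group["vat"] += vat
        group["gross"] += base + vat
    return dict(totals)


summary = vat_by_rate([
    (Decimal("100.00"), Decimal("23")),
    (Decimal("50.00"), Decimal("23")),
    (Decimal("200.00"), Decimal("6")),
])
```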
saft_query_customers
Search and filter customer master data with revenue enrichment.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `name` | string | - | Company name (case-insensitive, partial) |
| `nif` | string | - | Tax ID (partial match) |
| `city` | string | - | Billing city (case-insensitive, partial) |
| `country` | string | - | Country code (exact, e.g. "PT", "ES") |
| `limit` | integer | 50 | Results per page (max 500) |
| `offset` | integer | 0 | Pagination offset |
Returns customers with invoice count and total revenue per customer.
saft_query_products
Search and filter the product catalog with sales statistics.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `description` | string | - | Product description (case-insensitive, partial) |
| `code` | string | - | Product code (partial match) |
| `product_type` | string | - | P (product), S (service), O (other), I (taxes, fees, and parafiscal charges), E (excise duties) |
| `group` | string | - | Product group (case-insensitive, partial) |
| `limit` | integer | 50 | Results per page (max 500) |
| `offset` | integer | 0 | Pagination offset |
Returns products with times sold, total quantity, and total revenue.
saft_get_invoice
Get full detail for a single invoice including all line items.
| Parameter | Type | Description |
|---|---|---|
| `invoice_no` | string | Exact invoice number (e.g. "FR 2025A15/90") |
Returns complete invoice with header, document totals, special regimes, and all lines with product, quantity, price, tax, exemptions, and references.
saft_anomaly_detect
Detect suspicious patterns and irregularities in the loaded file.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `checks` | list[string] | all | Specific checks to run |
Available checks:
| Check | What it detects |
|---|---|
| `duplicate_invoices` | Same customer + amount + date combinations |
| `numbering_gaps` | Missing sequential numbers within each series |
| `weekend_invoices` | Invoices issued on Saturdays or Sundays |
| `unusual_amounts` | Invoice amounts > 3 standard deviations from the mean |
| `cancelled_ratio` | High cancellation rates per series |
| `zero_amount` | Invoices with zero gross total |
Returns anomalies with type, severity, description, and affected documents.
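The `unusual_amounts` check can be approximated as follows. This is a sketch under stated assumptions: it uses the sample standard deviation and converts to float for the statistics, neither of which is confirmed by the source, and `unusual_amounts` here is a standalone illustrative function rather than the server's implementation.

```python
from decimal import Decimal
from statistics import mean, stdev


def unusual_amounts(amounts: dict[str, Decimal], threshold: float = 3.0) -> list[str]:
    """Return document numbers whose amount lies beyond threshold * sigma."""
    values = [float(a) for a in amounts.values()]
    if len(values) < 2:
        return []  # a deviation is undefined for fewer than two invoices
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation (assumption)
    if sigma == 0:
        return []  # all amounts identical, nothing stands out
    return [
        doc for doc, amount in amounts.items()
        if abs(float(amount) - mu) > threshold * sigma
    ]


# Twenty ordinary invoices and one much larger one (made-up data).
invoices = {f"FT A/{i}": Decimal("100.00") for i in range(1, 21)}
invoices["FT A/21"] = Decimal("10000.00")
flagged = unusual_amounts(invoices)
```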
saft_compare
Compare the loaded SAF-T file against a second file (e.g. month-over-month, year-over-year).
| Parameter | Type | Default | Description |
|---|---|---|---|
| `file_path` | string | - | Path to the second SAF-T XML file |
| `metrics` | list[string] | all | Metrics to compare |
Available metrics: revenue, customers, products, doc_types, vat.
Returns period labels and a changes dict with before/after/delta per metric. Includes top new/lost customers, top movers, and percentage changes.
saft_aging
Compute accounts receivable aging from invoices and payments.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `reference_date` | string | today | Date to age from (YYYY-MM-DD) |
| `buckets` | list[int] | [30,60,90,120] | Aging bucket boundaries in days |
Returns per-customer aging with amounts in each bucket, sorted by total outstanding. Uses FIFO allocation of payments against invoices.
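A minimal sketch of FIFO allocation and bucketing for a single customer is shown below. The function names, the input shape (invoice date and amount pairs plus one payment total), and the bucket labels are assumptions chosen for this example; the tool's actual output format is not documented here.

```python
from bisect import bisect_left
from datetime import date
from decimal import Decimal


def bucket_labels(buckets: list[int]) -> list[str]:
    """Turn [30, 60] into ["0-30", "31-60", "60+"]."""
    labels, lower = [], 0
    for upper in buckets:
        labels.append(f"{lower}-{upper}")
        lower = upper + 1
    labels.append(f"{buckets[-1]}+")
    return labels


def age_customer(
    invoices: list[tuple[date, Decimal]],
    payments_total: Decimal,
    reference_date: date,
    buckets: tuple[int, ...] = (30, 60, 90, 120),
) -> dict[str, Decimal]:
    """Apply payments to the oldest invoices first, then bucket what remains."""
    bounds = list(buckets)
    labels = bucket_labels(bounds)
    aged = {label: Decimal("0") for label in labels}
    remaining = payments_total
    for issued, amount in sorted(invoices):  # oldest invoice first (FIFO)
        applied = min(amount, remaining)
        remaining -= applied
        outstanding = amount - applied
        if outstanding > 0:
            days = (reference_date - issued).days
            aged[labels[bisect_left(bounds, days)]] += outstanding
    return aged


aging = age_customer(
    [
        (date(2025, 1, 10), Decimal("100")),
        (date(2025, 2, 1), Decimal("200")),
        (date(2025, 3, 25), Decimal("50")),
    ],
    payments_total=Decimal("150"),
    reference_date=date(2025, 3, 31),
)
```

In this made-up example the payment of 150 clears the January invoice and 50 of the February one, leaving 150 outstanding at 58 days and 50 outstanding at 6 days.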
saft_export
Export data to CSV files for use in spreadsheets or other tools.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `export_type` | string | - | invoices, customers, products, tax_summary, or anomalies |
| `file_path` | string | - | Output CSV file path |
| `filters` | dict | - | Optional filters (same as corresponding query tool) |
Returns file path, row count, and column names.
saft_stats
Generate a statistical overview of invoicing data.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `date_from` | string | - | Start date (YYYY-MM-DD) |
| `date_to` | string | - | End date (YYYY-MM-DD) |
Returns invoice statistics (mean, median, std deviation), daily/weekly/monthly distributions, customer concentration (Pareto analysis), and top/bottom invoices.
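Customer concentration boils down to the share of revenue held by the largest customers. The sketch below computes that share for the top `n`; `top_n_share` and its input mapping are hypothetical names for this example, and the rounding to one decimal place is an assumption.

```python
from decimal import Decimal


def top_n_share(revenue_by_customer: dict[str, Decimal], n: int) -> Decimal:
    """Percentage of total revenue contributed by the n largest customers."""
    total = sum(revenue_by_customer.values(), Decimal("0"))
    if total == 0:
        return Decimal("0")
    top = sorted(revenue_by_customer.values(), reverse=True)[:n]
    return (sum(top, Decimal("0")) / total * 100).quantize(Decimal("0.1"))


share = top_n_share(
    {"A": Decimal("600"), "B": Decimal("250"), "C": Decimal("100"), "D": Decimal("50")},
    n=2,
)
```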
Typical Workflow
1. saft_load -> Parse the XML file
2. saft_validate -> Check compliance (XSD + business rules)
3. saft_summary -> Get the big picture (revenue, top customers, VAT)
4. saft_query_invoices -> Drill into specific invoices
5. saft_get_invoice -> Full detail for a single invoice
6. saft_tax_summary -> VAT analysis by rate, month, or doc type
7. saft_anomaly_detect -> Flag suspicious patterns
8. saft_stats -> Statistical distributions and trends
9. saft_compare -> Diff against another SAF-T file
10. saft_export -> Export results to CSV
Example questions you can ask after loading a file:
- "How much revenue did the company make this year?"
- "Show me all credit notes above 500 euros"
- "What's the monthly VAT breakdown?"
- "Are there any validation errors in this file?"
- "List invoices for customer XPTO in Q3"
- "What percentage of revenue comes from the top 5 customers?"
- "Are there any suspicious patterns or anomalies?"
- "Compare this file against last month's SAF-T"
- "What's the accounts receivable aging?"
- "Export all invoices to CSV"
Configuration
All settings are configurable via environment variables with the SAFT_MCP_ prefix:
| Variable | Default | Description |
|---|---|---|
| `SAFT_MCP_STREAMING_THRESHOLD_BYTES` | 52428800 (50 MB) | Files above this use streaming parser |
| `SAFT_MCP_MAX_FILE_SIZE_BYTES` | 524288000 (500 MB) | Maximum file size accepted |
| `SAFT_MCP_SESSION_TIMEOUT_SECONDS` | 1800 (30 min) | Session expiry after inactivity |
| `SAFT_MCP_MAX_CONCURRENT_SESSIONS` | 5 | Maximum simultaneous loaded files |
| `SAFT_MCP_DEFAULT_QUERY_LIMIT` | 50 | Default results per page |
| `SAFT_MCP_MAX_QUERY_LIMIT` | 500 | Maximum results per page |
| `SAFT_MCP_LOG_LEVEL` | INFO | Logging level |
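To make the prefix convention concrete, the sketch below reads these variables with the documented defaults using only the standard library. The project structure notes that the real `config.py` uses pydantic-settings, so treat `Settings` and `load_settings` here as an illustrative stand-in rather than the project's code.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping, Optional

PREFIX = "SAFT_MCP_"


@dataclass(frozen=True)
class Settings:
    # Defaults copied from the table above.
    streaming_threshold_bytes: int = 52_428_800
    max_file_size_bytes: int = 524_288_000
    session_timeout_seconds: int = 1800
    max_concurrent_sessions: int = 5
    default_query_limit: int = 50
    max_query_limit: int = 500
    log_level: str = "INFO"


def load_settings(environ: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings, overriding defaults from SAFT_MCP_* variables."""
    env = os.environ if environ is None else environ
    overrides = {}
    for field in fields(Settings):
        raw = env.get(PREFIX + field.name.upper())
        if raw is not None:
            # Coerce the string to the type of the default value.
            overrides[field.name] = type(field.default)(raw)
    return Settings(**overrides)


settings = load_settings({"SAFT_MCP_MAX_QUERY_LIMIT": "200"})
```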
Architecture
AI Assistant (Claude, Cursor, etc.)
        |
        |  MCP Protocol (stdio)
        v
+------------------------------------------------+
|                saft-mcp server                 |
|                                                |
|   server.py               FastMCP entry point  |
|   state.py                Session management   |
|                                                |
|   parser/                                      |
|     detector.py           Namespace detection  |
|     encoding.py           Charset handling     |
|     full_parser.py        DOM parse (< 50 MB)  |
|     models.py             Pydantic data models |
|                                                |
|   tools/                                       |
|     load.py               saft_load            |
|     validate.py           saft_validate        |
|     summary.py            saft_summary         |
|     query_invoices.py     saft_query_invoices  |
|     query_customers.py    saft_query_customers |
|     query_products.py     saft_query_products  |
|     get_invoice.py        saft_get_invoice     |
|     tax_summary.py        saft_tax_summary     |
|     anomaly_detect.py     saft_anomaly_detect  |
|     compare.py            saft_compare         |
|     aging.py              saft_aging           |
|     export.py             saft_export          |
|     stats.py              saft_stats           |
|                                                |
|   validators/                                  |
|     xsd_validator.py      XSD 1.04_01          |
|     business_rules.py     Numbering, totals    |
|     nif.py                NIF mod-11           |
|     hash_chain.py         Hash continuity      |
|                                                |
|   schemas/                                     |
|     saftpt1.04_01.xsd     Official XSD         |
+------------------------------------------------+
Key design decisions:
- All monetary values use `Decimal` to avoid floating-point rounding in tax calculations
- lxml for XML parsing, with automatic XSD 1.1 feature stripping (the official Portuguese XSD uses `xs:assert` and `xs:all` with unbounded children, which lxml's XSD 1.0 engine cannot handle natively)
- Pydantic v2 models validated against real PHC Corporate exports
- Namespace auto-detection by scanning the first 4 KB of the file (never hardcoded)
- Windows-1252 encoding handled natively via the XML declaration
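The namespace auto-detection mentioned above can be sketched as a scan of the first 4 KB for a default `xmlns` declaration. `detect_namespace` is a hypothetical name and the regular expression is an assumption about how such a scan might work, not the contents of `detector.py`.

```python
import re
from typing import Optional

# Default namespace only: "xmlns=" and not prefixed forms such as "xmlns:xsi=".
_DEFAULT_NS = re.compile(rb'\bxmlns\s*=\s*["\']([^"\']+)["\']')


def detect_namespace(head: bytes) -> Optional[str]:
    """Return the default XML namespace found in the first 4 KB, if any."""
    match = _DEFAULT_NS.search(head[:4096])
    if match is None:
        return None
    return match.group(1).decode("ascii", errors="replace")


head = (
    b'<?xml version="1.0" encoding="Windows-1252"?>\n'
    b'<AuditFile xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    b'xmlns="urn:OECD:StandardAuditFile-Tax:PT_1.04_01">'
)
namespace = detect_namespace(head)
```

Reading the namespace from the file keeps element lookups working if an export declares a different version URN than expected.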
Development
# Install with dev dependencies
uv sync --extra dev
# Run tests (152 tests)
pytest
# Lint
ruff check src/ tests/
# Format
ruff format src/ tests/
# Type check
mypy src/
Project structure
saft-mcp/
  src/saft_mcp/           # Source code
    server.py             # FastMCP entry point, tool registration
    config.py             # Settings (pydantic-settings, env vars)
    state.py              # Session store, parsed file state
    exceptions.py         # SaftError hierarchy
    parser/               # XML parsing (encoding, detection, models)
    tools/                # One file per MCP tool
    validators/           # XSD, business rules, NIF, hash chain
    schemas/              # Official XSD file
  tests/                  # Mirrors src/ structure
  pyproject.toml          # Project config (hatch build, ruff, mypy, pytest)
Roadmap
- [x] `saft_query_customers` -- search and filter customer master data
- [x] `saft_query_products` -- search and filter product catalog
- [x] `saft_get_invoice` -- full invoice detail with line items
- [x] `saft_anomaly_detect` -- flag duplicate invoices, numbering gaps, unusual amounts
- [x] `saft_compare` -- diff two SAF-T files (e.g. month-over-month)
- [x] `saft_aging` -- accounts receivable aging analysis
- [x] `saft_export` -- export data to CSV
- [x] `saft_stats` -- statistical overview and distributions
- [ ] Streaming parser for large files (>= 50 MB)
- [ ] Accounting SAF-T support (journal entries, general ledger, trial balance)
- [ ] `saft_trial_balance` -- generate trial balance from accounting data
- [ ] `saft_ies_prepare` -- pre-fill IES annual tax return fields
- [ ] `saft_cross_check` -- cross-reference invoicing vs accounting SAF-T
- [x] PyPI package (`pip install saft-mcp`)
- [x] GitHub Actions CI (pytest + ruff + mypy)
Supported SAF-T versions
- SAF-T PT 1.04_01 (current Portuguese standard)
Tested with real exports from PHC Corporate. Should work with SAF-T files from any compliant Portuguese software (Sage, Primavera, PHC, Moloni, InvoiceXpress, etc.).
License
MIT
Built by bybloom.ai, a business unit of Bloomidea