saft-mcp

An MCP server that enables AI assistants to parse, validate, and analyze Portuguese SAF-T tax files, providing tools for querying invoices, customers, products, VAT breakdowns, and anomaly detection.

<div align="center">

SAF-T MCP Server

Parse and analyze Portuguese SAF-T tax files with AI assistants

PyPI · License: MIT · CI · Python 3.11+ · SAF-T PT 1.04_01

Getting Started · Available Tools · Configuration

Portuguese version (Versão em Português) · 13 tools · 152 tests

</div>


A Model Context Protocol (MCP) server that enables AI assistants like Claude, Cursor, and Windsurf to load, validate, and analyze Portuguese SAF-T (Standard Audit File for Tax Purposes) XML files. Load a SAF-T file and immediately query invoices, get revenue summaries, VAT breakdowns, and validate compliance with Portuguese tax rules.

What is SAF-T PT?

SAF-T PT is a mandatory XML file that all Portuguese companies must be able to export from their accounting/billing software. It contains the company's invoices, payments, customers, products, tax entries, and more. This MCP server turns that XML into a queryable data source for AI assistants.


Quick Start

Prerequisites

  • Python 3.11+ and uv (recommended) or pip
  • A SAF-T PT XML file exported from any Portuguese billing/accounting software (PHC, Sage, Primavera, etc.)

1. Install

pip install saft-mcp

Or from source:

git clone https://github.com/bybloom-ai/saft-mcp.git
cd saft-mcp
uv sync

2. Add to your AI assistant

<details open> <summary><strong>Claude Code</strong></summary>

claude mcp add saft-mcp -- /path/to/saft-mcp/.venv/bin/python -m saft_mcp

</details>

<details> <summary><strong>Claude Desktop</strong></summary>

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "saft-mcp": {
      "command": "/path/to/saft-mcp/.venv/bin/python",
      "args": ["-m", "saft_mcp"]
    }
  }
}

</details>

<details> <summary><strong>Cursor / VS Code / Other MCP clients</strong></summary>

Add to your MCP client configuration:

{
  "mcpServers": {
    "saft-mcp": {
      "command": "/path/to/saft-mcp/.venv/bin/python",
      "args": ["-m", "saft_mcp"]
    }
  }
}

</details>

3. Start using it

Ask your AI assistant:

"Load my SAF-T file at ~/Documents/saft_2025.xml and give me a revenue summary"

The server will parse the file, extract all invoices and tax data, and make it available for querying through natural conversation.


Available Tools

saft_load

Load and parse a SAF-T PT XML file. Call this before any other tool.

| Parameter | Type | Description |
| --- | --- | --- |
| file_path | string | Path to the SAF-T XML file |

Returns company name, NIF, fiscal period, SAF-T version, and record counts (customers, products, invoices, payments).

Handles Windows-1252 and UTF-8 encodings, BOM stripping, and automatic namespace detection. Files under 50 MB are parsed with full DOM; larger files are meant to go through a streaming parser, which the Roadmap still lists as unfinished.
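As a rough illustration of the encoding handling described above, the sketch below strips a UTF-8 BOM and reads the charset from the XML declaration. The function name `sniff_encoding` and its return shape are hypothetical, chosen for this example; they are not taken from the saft-mcp source.

```python
import codecs
import re

# Matches encoding="..." inside the XML declaration at the start of the file.
_DECL_RE = re.compile(rb'^<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']')


def sniff_encoding(raw: bytes, default: str = "utf-8") -> tuple[bytes, str]:
    """Return (bytes without a UTF-8 BOM, declared encoding in lowercase)."""
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    match = _DECL_RE.match(raw.lstrip())
    encoding = match.group(1).decode("ascii").lower() if match else default
    return raw, encoding


sample = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="Windows-1252"?><AuditFile/>'
body, enc = sniff_encoding(sample)
print(enc)  # windows-1252
```

In practice lxml reads the declaration itself when given raw bytes, so a helper like this would mostly be useful for logging or for pre-checks before parsing.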


saft_validate

Validate the loaded file against the official XSD schema and Portuguese business rules.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| rules | list[string] | all | Specific rules to check |

Available rules:

| Rule | What it checks |
| --- | --- |
| xsd | XML structure against SAF-T PT 1.04_01 XSD schema |
| numbering | Sequential invoice numbering within each series |
| nif | NIF (tax ID) mod-11 check digit validation |
| tax_codes | Tax percentages match known Portuguese VAT rates |
| atcud | ATCUD unique document codes are present and well-formed |
| hash_chain | Hash continuity across invoice sequences |
| control_totals | Calculated totals match declared control totals |

Returns results with severity (error/warning), location, and fix suggestions.
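To make the `nif` rule concrete, here is a minimal sketch of the standard mod-11 control-digit check for a 9-digit Portuguese NIF. The function name `is_valid_nif` is an assumption for illustration, and the server's own validator may apply further checks (for example on the leading digit) that are not shown here.

```python
def is_valid_nif(nif: str) -> bool:
    """Check the mod-11 control digit of a 9-digit Portuguese NIF."""
    if len(nif) != 9 or not (nif.isascii() and nif.isdigit()):
        return False
    digits = [int(ch) for ch in nif]
    # Weights 9, 8, ..., 2 apply to the first eight digits.
    total = sum(d * w for d, w in zip(digits[:8], range(9, 1, -1)))
    remainder = total % 11
    # Remainders 0 and 1 map to control digit 0; otherwise it is 11 - remainder.
    expected = 0 if remainder < 2 else 11 - remainder
    return digits[8] == expected


print(is_valid_nif("999999990"))  # True (the generic "final consumer" NIF)
print(is_valid_nif("123456780"))  # False
```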


saft_summary

Generate an executive summary of the loaded file. No parameters needed.

Returns:

  • Revenue totals (gross, credit notes, net)
  • Invoice and credit note counts
  • VAT breakdown by rate
  • Top 10 customers by revenue
  • Document type distribution (FT, FR, NC, ND, FS)

saft_query_invoices

Search and filter invoices with full pagination.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| date_from | string | - | Start date (YYYY-MM-DD) |
| date_to | string | - | End date (YYYY-MM-DD) |
| customer_nif | string | - | Filter by tax ID (partial match) |
| customer_name | string | - | Filter by name (case-insensitive, partial) |
| doc_type | string | - | FT, FR, NC, ND, or FS |
| min_amount | number | - | Minimum gross total |
| max_amount | number | - | Maximum gross total |
| status | string | - | N (normal), A (cancelled), F (invoiced) |
| limit | integer | 50 | Results per page (max 500) |
| offset | integer | 0 | Pagination offset |

Returns matching invoices with document number, date, type, customer, amounts, status, and line count.
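The filter-then-paginate behaviour described above can be sketched as follows. The `Invoice` record and `query_invoices` function are simplified, hypothetical stand-ins rather than the server's real models, and only two of the documented filters are shown.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

MAX_LIMIT = 500  # mirrors the documented page-size cap


@dataclass(frozen=True)
class Invoice:  # hypothetical, simplified record
    number: str
    issued: date
    doc_type: str
    gross_total: Decimal


def query_invoices(invoices, *, doc_type=None, min_amount=None, limit=50, offset=0):
    """Apply the filters, then return one page plus the total match count."""
    matches = [
        inv for inv in invoices
        if (doc_type is None or inv.doc_type == doc_type)
        and (min_amount is None or inv.gross_total >= min_amount)
    ]
    limit = max(1, min(limit, MAX_LIMIT))  # clamp oversized page requests
    return {"total": len(matches), "results": matches[offset:offset + limit]}


invoices = [
    Invoice("FT 2025/1", date(2025, 1, 5), "FT", Decimal("100.00")),
    Invoice("NC 2025/1", date(2025, 2, 1), "NC", Decimal("600.00")),
    Invoice("NC 2025/2", date(2025, 2, 9), "NC", Decimal("50.00")),
]
page = query_invoices(invoices, doc_type="NC", min_amount=Decimal("500"))
print(page["total"], page["results"][0].number)  # 1 NC 2025/1
```

Returning the total alongside the page lets a caller decide whether another request with a higher `offset` is needed.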


saft_tax_summary

Generate a VAT analysis grouped by rate, month, or document type.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| date_from | string | - | Start date (YYYY-MM-DD) |
| date_to | string | - | End date (YYYY-MM-DD) |
| group_by | string | rate | Group by rate, month, or doc_type |

Returns taxable base, VAT amount, and gross total per group, plus overall totals.
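A grouping by rate along the lines described above might look like the sketch below. The input shape (pairs of tax percentage and taxable base) and the per-line rounding to cents are assumptions made for this example; the actual tool may read VAT amounts from the file rather than recompute them.

```python
from collections import defaultdict
from decimal import Decimal

CENT = Decimal("0.01")


def vat_by_rate(lines):
    """lines: iterable of (tax_percentage, taxable_base) Decimal pairs."""
    groups = defaultdict(lambda: {"base": Decimal("0"), "vat": Decimal("0")})
    for rate, base in lines:
        groups[rate]["base"] += base
        # Assumption: VAT is rounded to cents on each line before summing.
        groups[rate]["vat"] += (base * rate / Decimal("100")).quantize(CENT)
    for totals in groups.values():
        totals["gross"] = totals["base"] + totals["vat"]
    return dict(groups)


summary = vat_by_rate([
    (Decimal("23"), Decimal("100.00")),
    (Decimal("23"), Decimal("50.00")),
    (Decimal("6"), Decimal("200.00")),
])
print(summary[Decimal("23")]["vat"])  # 34.50
```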


saft_query_customers

Search and filter customer master data with revenue enrichment.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | - | Company name (case-insensitive, partial) |
| nif | string | - | Tax ID (partial match) |
| city | string | - | Billing city (case-insensitive, partial) |
| country | string | - | Country code (exact, e.g. "PT", "ES") |
| limit | integer | 50 | Results per page (max 500) |
| offset | integer | 0 | Pagination offset |

Returns customers with invoice count and total revenue per customer.


saft_query_products

Search and filter the product catalog with sales statistics.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| description | string | - | Product description (case-insensitive, partial) |
| code | string | - | Product code (partial match) |
| product_type | string | - | P (product), S (service), O (other), E (excise duties), I (taxes, fees, and parafiscal charges) |
| group | string | - | Product group (case-insensitive, partial) |
| limit | integer | 50 | Results per page (max 500) |
| offset | integer | 0 | Pagination offset |

Returns products with times sold, total quantity, and total revenue.


saft_get_invoice

Get full detail for a single invoice including all line items.

| Parameter | Type | Description |
| --- | --- | --- |
| invoice_no | string | Exact invoice number (e.g. "FR 2025A15/90") |

Returns complete invoice with header, document totals, special regimes, and all lines with product, quantity, price, tax, exemptions, and references.


saft_anomaly_detect

Detect suspicious patterns and irregularities in the loaded file.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| checks | list[string] | all | Specific checks to run |

Available checks:

| Check | What it detects |
| --- | --- |
| duplicate_invoices | Same customer + amount + date combinations |
| numbering_gaps | Missing sequential numbers within each series |
| weekend_invoices | Invoices issued on Saturdays or Sundays |
| unusual_amounts | Invoice amounts > 3 standard deviations from the mean |
| cancelled_ratio | High cancellation rates per series |
| zero_amount | Invoices with zero gross total |

Returns anomalies with type, severity, description, and affected documents.
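The `unusual_amounts` check boils down to a z-score test, which could be sketched like this. The function name is hypothetical, and the use of the population standard deviation is an assumption; the server may use the sample standard deviation instead. Floats are acceptable here because the result is a ranking signal, not a monetary total.

```python
from statistics import mean, pstdev


def unusual_amounts(amounts, threshold=3.0):
    """Return indexes of amounts more than `threshold` std devs from the mean."""
    values = [float(a) for a in amounts]
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # all amounts identical, nothing can be an outlier
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


flagged = unusual_amounts([100] * 20 + [10_000])
print(flagged)  # [20]
```

Note that with very few invoices no value can exceed three standard deviations, so a check like this only becomes meaningful on larger series.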


saft_compare

Compare the loaded SAF-T file against a second file (e.g. month-over-month, year-over-year).

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| file_path | string | - | Path to the second SAF-T XML file |
| metrics | list[string] | all | Metrics to compare |

Available metrics: revenue, customers, products, doc_types, vat.

Returns period labels and a changes dict with before/after/delta per metric. Includes top new/lost customers, top movers, and percentage changes.
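The before/after/delta structure described above can be approximated with a few lines of `Decimal` arithmetic and set differences. Both helper names and the `pct_change` key are illustrative assumptions, not the tool's documented output schema.

```python
from decimal import Decimal


def compare_metric(before: Decimal, after: Decimal) -> dict:
    """Build a before/after/delta entry with a percentage change."""
    delta = after - before
    # A percentage change is undefined when the baseline is zero.
    pct = None if before == 0 else (delta / before * 100).quantize(Decimal("0.01"))
    return {"before": before, "after": after, "delta": delta, "pct_change": pct}


def customer_changes(before_nifs, after_nifs) -> dict:
    """List customers present in only one of the two files, keyed by NIF."""
    before_set, after_set = set(before_nifs), set(after_nifs)
    return {"new": sorted(after_set - before_set), "lost": sorted(before_set - after_set)}


revenue = compare_metric(Decimal("1000.00"), Decimal("1250.00"))
print(revenue["delta"], revenue["pct_change"])  # 250.00 25.00
```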


saft_aging

Compute accounts receivable aging from invoices and payments.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| reference_date | string | today | Date to age from (YYYY-MM-DD) |
| buckets | list[int] | [30,60,90,120] | Aging bucket boundaries in days |

Returns per-customer aging with amounts in each bucket, sorted by total outstanding. Uses FIFO allocation of payments against invoices.
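One way to picture the FIFO allocation is the sketch below, which handles a single customer: payments settle the oldest invoices first, and whatever remains open is placed in a bucket by age. The function name, the bucket labels, and the choice to treat each boundary as inclusive are assumptions for illustration.

```python
from bisect import bisect_left
from datetime import date
from decimal import Decimal


def age_receivables(invoices, paid, reference_date, buckets=(30, 60, 90, 120)):
    """invoices: list of (issue_date, amount) for one customer; paid: total received."""
    labels = [f"<={b}" for b in buckets] + [f">{buckets[-1]}"]
    aged = {label: Decimal("0") for label in labels}
    remaining = paid
    for issued, amount in sorted(invoices):  # oldest first (FIFO)
        applied = min(remaining, amount)
        remaining -= applied
        outstanding = amount - applied
        if outstanding > 0:
            days = (reference_date - issued).days
            # bisect_left picks the first bucket whose boundary is >= days.
            aged[labels[bisect_left(buckets, days)]] += outstanding
    return aged


aging = age_receivables(
    [(date(2025, 1, 10), Decimal("100")),
     (date(2025, 5, 1), Decimal("200")),
     (date(2025, 6, 20), Decimal("50"))],
    paid=Decimal("150"),
    reference_date=date(2025, 6, 30),
)
print(aging["<=30"], aging["<=60"])  # 50 150
```

Here the 150 received clears the January invoice and a quarter of the May one, so nothing lands in the oldest bucket even though the January invoice is more than 120 days old.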


saft_export

Export data to CSV files for use in spreadsheets or other tools.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| export_type | string | - | invoices, customers, products, tax_summary, or anomalies |
| file_path | string | - | Output CSV file path |
| filters | dict | - | Optional filters (same as corresponding query tool) |

Returns file path, row count, and column names.
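A CSV export with the return shape described above could be written with the standard library alone. This is a minimal sketch; `export_csv` and the dictionary keys it returns are hypothetical names, and rows are assumed to be dictionaries that share the same keys.

```python
import csv
from pathlib import Path


def export_csv(rows, file_path):
    """Write a list of dicts to CSV; report path, row count, and columns."""
    columns = list(rows[0].keys()) if rows else []
    path = Path(file_path)
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns)
        writer.writeheader()
        writer.writerows(rows)
    return {"file_path": str(path), "row_count": len(rows), "columns": columns}
```

Calling `export_csv([{"invoice_no": "FT 2025/1", "gross_total": "123.00"}], "invoices.csv")` would write a header row plus one data row and report `row_count` as 1.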


saft_stats

Generate a statistical overview of invoicing data.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| date_from | string | - | Start date (YYYY-MM-DD) |
| date_to | string | - | End date (YYYY-MM-DD) |

Returns invoice statistics (mean, median, std deviation), daily/weekly/monthly distributions, customer concentration (Pareto analysis), and top/bottom invoices.
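Customer concentration of the kind mentioned above is commonly expressed as the share of revenue held by the top N customers. The sketch below assumes a plain mapping from customer to revenue; the function name and the rounding to one decimal place are choices made for this example only.

```python
from decimal import Decimal


def top_n_share(revenue_by_customer, n=5):
    """Percentage of total revenue contributed by the n largest customers."""
    total = sum(revenue_by_customer.values(), Decimal("0"))
    if total == 0:
        return Decimal("0")
    top = sorted(revenue_by_customer.values(), reverse=True)[:n]
    return (sum(top, Decimal("0")) / total * 100).quantize(Decimal("0.1"))


share = top_n_share(
    {"A": Decimal("500"), "B": Decimal("300"), "C": Decimal("200")}, n=2
)
print(share)  # 80.0
```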


Typical Workflow

1. saft_load            -> Parse the XML file
2. saft_validate        -> Check compliance (XSD + business rules)
3. saft_summary         -> Get the big picture (revenue, top customers, VAT)
4. saft_query_invoices  -> Drill into specific invoices
5. saft_get_invoice     -> Full detail for a single invoice
6. saft_tax_summary     -> VAT analysis by rate, month, or doc type
7. saft_anomaly_detect  -> Flag suspicious patterns
8. saft_stats           -> Statistical distributions and trends
9. saft_compare         -> Diff against another SAF-T file
10. saft_export         -> Export results to CSV

Example questions you can ask after loading a file:

  • "How much revenue did the company make this year?"
  • "Show me all credit notes above 500 euros"
  • "What's the monthly VAT breakdown?"
  • "Are there any validation errors in this file?"
  • "List invoices for customer XPTO in Q3"
  • "What percentage of revenue comes from the top 5 customers?"
  • "Are there any suspicious patterns or anomalies?"
  • "Compare this file against last month's SAF-T"
  • "What's the accounts receivable aging?"
  • "Export all invoices to CSV"

Configuration

All settings are configurable via environment variables with the SAFT_MCP_ prefix:

| Variable | Default | Description |
| --- | --- | --- |
| SAFT_MCP_STREAMING_THRESHOLD_BYTES | 52428800 (50 MB) | Files above this use streaming parser |
| SAFT_MCP_MAX_FILE_SIZE_BYTES | 524288000 (500 MB) | Maximum file size accepted |
| SAFT_MCP_SESSION_TIMEOUT_SECONDS | 1800 (30 min) | Session expiry after inactivity |
| SAFT_MCP_MAX_CONCURRENT_SESSIONS | 5 | Maximum simultaneous loaded files |
| SAFT_MCP_DEFAULT_QUERY_LIMIT | 50 | Default results per page |
| SAFT_MCP_MAX_QUERY_LIMIT | 500 | Maximum results per page |
| SAFT_MCP_LOG_LEVEL | INFO | Logging level |
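The project structure lists pydantic-settings for this job; the sketch below is a standard-library approximation of the same idea, shown only to illustrate how the `SAFT_MCP_` prefix maps onto setting names. The `Settings` class and `load_settings` function here are hypothetical and are not the contents of `config.py`.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Settings:
    # Defaults copied from the table above.
    streaming_threshold_bytes: int = 52_428_800
    max_file_size_bytes: int = 524_288_000
    session_timeout_seconds: int = 1800
    max_concurrent_sessions: int = 5
    default_query_limit: int = 50
    max_query_limit: int = 500
    log_level: str = "INFO"


def load_settings(env=None, prefix="SAFT_MCP_"):
    """Override defaults with any PREFIX + FIELD_NAME variables found in env."""
    env = os.environ if env is None else env
    overrides = {}
    for field in fields(Settings):
        raw = env.get(prefix + field.name.upper())
        if raw is not None:
            # Cast using the type of the default value (int or str).
            overrides[field.name] = type(field.default)(raw)
    return Settings(**overrides)


settings = load_settings({"SAFT_MCP_MAX_QUERY_LIMIT": "200"})
print(settings.max_query_limit, settings.default_query_limit)  # 200 50
```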

Architecture

AI Assistant (Claude, Cursor, etc.)
        |
        | MCP Protocol (stdio)
        v
+--------------------------------------------+
|           saft-mcp server                  |
|                                            |
|  server.py       FastMCP entry point       |
|  state.py        Session management        |
|                                            |
|  parser/                                   |
|    detector.py   Namespace detection       |
|    encoding.py   Charset handling          |
|    full_parser.py   DOM parse (< 50 MB)    |
|    models.py     Pydantic data models      |
|                                            |
|  tools/                                    |
|    load.py            saft_load            |
|    validate.py        saft_validate        |
|    summary.py         saft_summary         |
|    query_invoices.py  saft_query_invoices  |
|    query_customers.py saft_query_customers |
|    query_products.py  saft_query_products  |
|    get_invoice.py     saft_get_invoice     |
|    tax_summary.py     saft_tax_summary     |
|    anomaly_detect.py  saft_anomaly_detect  |
|    compare.py         saft_compare         |
|    aging.py           saft_aging           |
|    export.py          saft_export          |
|    stats.py           saft_stats           |
|                                            |
|  validators/                               |
|    xsd_validator.py   XSD 1.04_01          |
|    business_rules.py  Numbering, totals    |
|    nif.py             NIF mod-11           |
|    hash_chain.py      Hash continuity      |
|                                            |
|  schemas/                                  |
|    saftpt1.04_01.xsd  Official XSD         |
+--------------------------------------------+

Key design decisions:

  • All monetary values use Decimal to avoid floating-point rounding in tax calculations
  • lxml for XML parsing, with automatic XSD 1.1 feature stripping (the official Portuguese XSD uses xs:assert and xs:all with unbounded children, which lxml's XSD 1.0 engine cannot handle natively)
  • Pydantic v2 models validated against real PHC Corporate exports
  • Namespace auto-detection by scanning the first 4 KB of the file (never hardcoded)
  • Windows-1252 encoding handled natively via the XML declaration
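The namespace auto-detection mentioned in the list above can be sketched as a regular-expression scan over the first 4 KB of the file. The function name and the assumption that the namespace URN starts with `urn:OECD:StandardAuditFile-Tax:` are made for this example; `detector.py` may work differently.

```python
import re

# Assumption: SAF-T namespaces share this URN prefix and differ in the version suffix.
_NS_RE = re.compile(rb'xmlns(?::\w+)?=["\'](urn:OECD:StandardAuditFile-Tax:[^"\']+)["\']')


def detect_namespace(head: bytes):
    """Return the SAF-T namespace declared in the probed bytes, or None."""
    match = _NS_RE.search(head)
    return match.group(1).decode("ascii") if match else None


def detect_namespace_in_file(path, probe_bytes=4096):
    """Read only the first few KB so large files are never loaded just for this."""
    with open(path, "rb") as handle:
        return detect_namespace(handle.read(probe_bytes))


head = (
    b'<?xml version="1.0" encoding="Windows-1252"?>'
    b'<AuditFile xmlns="urn:OECD:StandardAuditFile-Tax:PT_1.04_01">'
)
print(detect_namespace(head))  # urn:OECD:StandardAuditFile-Tax:PT_1.04_01
```

Scanning raw bytes avoids decoding the file before its charset is known, which works because the namespace itself is plain ASCII.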

Development

# Install with dev dependencies
uv sync --extra dev

# Run tests (152 tests)
pytest

# Lint
ruff check src/ tests/

# Format
ruff format src/ tests/

# Type check
mypy src/

Project structure

saft-mcp/
  src/saft_mcp/         # Source code
    server.py           # FastMCP entry point, tool registration
    config.py           # Settings (pydantic-settings, env vars)
    state.py            # Session store, parsed file state
    exceptions.py       # SaftError hierarchy
    parser/             # XML parsing (encoding, detection, models)
    tools/              # One file per MCP tool
    validators/         # XSD, business rules, NIF, hash chain
    schemas/            # Official XSD file
  tests/                # Mirrors src/ structure
  pyproject.toml        # Project config (hatch build, ruff, mypy, pytest)

Roadmap

  • [x] saft_query_customers -- search and filter customer master data
  • [x] saft_query_products -- search and filter product catalog
  • [x] saft_get_invoice -- full invoice detail with line items
  • [x] saft_anomaly_detect -- flag duplicate invoices, numbering gaps, unusual amounts
  • [x] saft_compare -- diff two SAF-T files (e.g. month-over-month)
  • [x] saft_aging -- accounts receivable aging analysis
  • [x] saft_export -- export data to CSV
  • [x] saft_stats -- statistical overview and distributions
  • [ ] Streaming parser for large files (>= 50 MB)
  • [ ] Accounting SAF-T support (journal entries, general ledger, trial balance)
  • [ ] saft_trial_balance -- generate trial balance from accounting data
  • [ ] saft_ies_prepare -- pre-fill IES annual tax return fields
  • [ ] saft_cross_check -- cross-reference invoicing vs accounting SAF-T
  • [x] PyPI package (pip install saft-mcp)
  • [x] GitHub Actions CI (pytest + ruff + mypy)

Supported SAF-T versions

  • SAF-T PT 1.04_01 (current Portuguese standard)

Tested with real exports from PHC Corporate. Should work with SAF-T files from any compliant Portuguese software (Sage, Primavera, PHC, Moloni, InvoiceXpress, etc.).


License

MIT


Built by bybloom.ai, a business unit of Bloomidea
