mcp-units

An MCP server that provides deterministic unit conversions via Pint. LLMs guess at unit conversions — this server makes them exact.

What this does

Exposes 5 tools, 3 resources, and 2 prompts over the Model Context Protocol. Any MCP client (Claude Code, Claude Desktop, Cursor) can convert units, check dimensional compatibility, parse quantity strings, and simplify expressions — all backed by Pint's 400+ unit registry instead of LLM arithmetic.

How it works

A FastMCP server wraps Pint's UnitRegistry and exposes it through MCP primitives:

  • Tools: convert, check_compatibility, parse_quantity, list_compatible_units, simplify
  • Resources: units://systems, units://systems/{system}, units://dimensions
  • Prompts: convert_document (extract and convert all quantities in text), check_calculations (verify dimensional consistency)

The server runs over stdio by default (for Claude Code / Claude Desktop) or Streamable HTTP via fastmcp run (for remote / containerized deployment).

Quickstart

Prerequisites

  • Python 3.12+
  • uv

Install and run

git clone https://github.com/quantumleeps/mcp-units.git
cd mcp-units
uv sync

Add to Claude Code

claude mcp add --transport stdio mcp-units -- \
  uv run --directory /path/to/mcp-units mcp-units

Add to Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "mcp-units": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/mcp-units", "mcp-units"]
    }
  }
}

Run over HTTP

uv run fastmcp run src/mcp_units/server.py --transport http --port 8000

Docker

docker build -t mcp-units .
docker run -p 8000:8000 mcp-units

Tests

uv sync --all-extras
uv run pytest

Evaluation

Does giving an LLM access to a unit conversion tool actually improve its accuracy on physics problems?

Tool impact across 6 Claude models

Evaluated on 70 SciBench college-level physics problems requiring 2+ unit types, across 6 Claude models (840 total runs: 70 problems × 6 models × 2 conditions).

Opus 4.6, the latest model, shows the largest gain (+8.6pp, 70.0% → 78.6%), suggesting that its combination of broad knowledge and refined tool use lets it leverage unit conversion as a reliable augmentation. 4.5-Sonnet, a strong reasoner and tool user, also improves (+2.9pp).

The older 3.7-Sonnet regresses (-2.9pp). Analysis shows it sometimes treats an intermediate conversion result as the final answer, or spins through repeated tool calls without converging, consistent with less mature tool-use capabilities. The surprise is 4.5-Haiku: same generation as 4.5-Sonnet with capable reasoning and tool use, yet it declines (-1.4pp). With a smaller model, the tool appears to be a distraction rather than an augmentation: the model has the sophistication to use it but not always the judgment to know when it helps.

With only 70 problems and a single run per model, these per-model deltas carry real uncertainty. The 4.5-Haiku result in particular could reflect noise rather than a meaningful pattern.
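To put the size of these deltas in perspective, the arithmetic below converts each reported percentage-point change back into a net number of problems. It assumes every model was scored on all 70 problems in both conditions; the variable and function names are illustrative only.

```python
N_PROBLEMS = 70

# Accuracy deltas in percentage points, as reported in this section.
deltas_pp = {
    "Opus 4.6": 8.6,
    "4.5-Sonnet": 2.9,
    "3.7-Sonnet": -2.9,
    "4.5-Haiku": -1.4,
}

# One problem flipping between correct and incorrect shifts accuracy by this much.
pp_per_problem = 100 / N_PROBLEMS


def problems_changed(delta_pp: float) -> int:
    """Net number of problems that a percentage-point delta corresponds to."""
    return round(delta_pp / pp_per_problem)


net_problems = {model: problems_changed(d) for model, d in deltas_pp.items()}

for model, n in net_problems.items():
    print(f"{model}: {n:+d} problems")
```

Under that assumption, the Opus 4.6 gain is a net change of six problems and the 4.5-Haiku decline a net change of one, which is why the smaller deltas are hard to separate from noise.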

Next steps

  • Unit normalization — Models write cm3 but Pint needs cm^3. A lightweight normalize_unit() preprocessor plus better tool descriptions with formatting guidance would eliminate the 12 parsing failures observed in the eval.
  • Expression evaluation — Models sometimes pass math expressions (-1.602e-19 * 1.33e-39 / ...) as the value parameter to convert(). Pint rejects these since it expects a float. Accepting and evaluating simple arithmetic expressions would let the tool handle intermediate calculations.
  • Offset unit handling — Pint raises OffsetUnitCalculusError for °C and °F in compound expressions. The parse_quantity tool needs special handling for temperature offsets.
  • Larger problem set — 70 problems demonstrates the evaluation framework but limits statistical confidence on per-model deltas. Run-to-run variance within a single model is also unknown. Expanding to 200+ problems with multiple runs per problem would quantify both effects.
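The first two items could be prototyped with the standard library alone. The sketch below is illustrative, not the project's implementation: normalize_unit() is the name proposed above, but its regex is an assumption, as are the evaluate_value() helper and its operator whitelist. It expects a bare unit string with no magnitude, and the evaluator deliberately supports only +, -, *, / and parentheses.

```python
import ast
import operator
import re

# A run of digits (optionally negative) directly after a letter is read as an
# exponent: "cm3" -> "cm^3", "m/s2" -> "m/s^2". Already-normalized input such
# as "cm^3" is left alone because the digit follows "^", not a letter.
_IMPLICIT_EXPONENT = re.compile(r"(?<=[A-Za-z])(-?\d+)\b")


def normalize_unit(unit: str) -> str:
    """Insert the '^' that models tend to omit from unit exponents."""
    return _IMPLICIT_EXPONENT.sub(r"^\1", unit.strip())


# Whitelisted operators; anything else (names, calls, attributes) is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate_value(expr: str) -> float:
    """Evaluate a simple arithmetic expression without using eval()."""

    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant):
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return float(node.value)
        elif isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    return _eval(ast.parse(expr, mode="eval").body)


if __name__ == "__main__":
    print(normalize_unit("cm3"))
    print(evaluate_value("-1.5e3 / 3"))
```

Walking the syntax tree against a whitelist keeps the tool from executing arbitrary Python that a model might pass in. Exponentiation is left out here because unbounded powers are an easy way to stall the server.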

Run the eval

uv sync --group eval
uv run python -m eval.runner          # run all 6 models × 2 conditions (requires ANTHROPIC_API_KEY)
uv run python -m eval.visualize       # generate charts from results
uv run python -m eval.analyze         # print detailed analysis

Project Structure

mcp-units/
  src/mcp_units/
    server.py       # FastMCP instance — tools, resources, prompts
    registry.py     # Pint UnitRegistry + compatible units workaround
    models.py       # Result dataclasses for structured tool output
  eval/
    runner.py       # Async eval runner — baseline vs tool-augmented
    problems.py     # SciBench problem loading (70 problems, 2+ unit types)
    scorer.py       # Answer extraction + 5% tolerance scoring
    mcp_tools.py    # FastMCP Client wrapper for tool execution
    results.py      # RunResult dataclass + JSON persistence
    visualize.py    # Grouped bar chart + error histograms
    analyze.py      # 16-section detailed analysis
  tests/
    test_tools.py   # 18 Pint logic tests
    test_server.py  # 17 MCP Client integration tests
  Dockerfile        # HTTP transport for containerized deployment
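
The tree above describes eval/scorer.py as applying 5% tolerance scoring. A minimal sketch of such a check follows, assuming the tolerance is relative to the reference answer. The function name, signature, and handling of a zero reference are hypothetical and not taken from the repository.

```python
def within_tolerance(predicted: float, reference: float, rel_tol: float = 0.05) -> bool:
    """Return True when predicted is within rel_tol of reference (relative error)."""
    if reference == 0:
        # Relative error is undefined at zero; require an exact match (assumption).
        return predicted == 0
    return abs(predicted - reference) <= rel_tol * abs(reference)


if __name__ == "__main__":
    print(within_tolerance(104.0, 100.0))
    print(within_tolerance(106.0, 100.0))
```

Measuring the error against the reference rather than the prediction keeps the acceptance band fixed per problem, regardless of what the model answers.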

Contributing

PRs welcome. Run pre-commit install after cloning and ensure uv run pytest passes before submitting.

License

MIT
