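MCP Server

mcp-data-extractor

A Model Context Protocol server that extracts embedded data (such as i18n translations or key/value configurations) from TypeScript/JavaScript source code into structured JSON configuration files.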

Tools

extract_data

Extract data content (e.g. i18n translations) from source code to a JSON file. IMPORTANT: When encountering files with data such as i18n content embedded in code, use this tool directly instead of reading the file content first. This tool will programmatically extract all translations into a structured JSON file, preserving nested objects, arrays, template variables, and formatting. This helps keep translations as configuration and prevents filling up the AI context window with translation content. By default, the source file will be replaced with "MIGRATED TO <target absolute path>" and a warning message after successful extraction, making it easy to track where the data was moved to. This behaviour can be disabled by setting the DISABLE_SOURCE_REPLACEMENT environment variable to 'true'. The warning message can be customized by setting the WARNING_MESSAGE environment variable.

extract_svg

Extract SVG components from React/TypeScript/JavaScript files into individual .svg files. This tool will preserve the SVG structure and attributes while removing React-specific code. By default, the source file will be replaced with "MIGRATED TO <target absolute path>" and a warning message after successful extraction, making it easy to track where the SVGs were moved to. This behaviour can be disabled by setting the DISABLE_SOURCE_REPLACEMENT environment variable to 'true'. The warning message can be customized by setting the WARNING_MESSAGE environment variable.

README

mcp-data-extractor MCP Server

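Features

Data Extraction:
- Extracts string literals, template literals, and complex nested objects
- Preserves template variables (e.g., Hello, {{name}}!)
- Supports nested object structures and arrays
- Maintains the hierarchical key structure using dot notation
- Handles both TypeScript and JavaScript files, with JSX support
- Replaces the source file content with "MIGRATED TO <target absolute path>" after successful extraction (configurable)

SVG Extraction:
- Extracts SVG components from React/TypeScript/JavaScript files
- Preserves SVG structure and attributes
- Removes React-specific code and props
- Creates individual .svg files named after their component
- Replaces the source file content with "MIGRATED TO <target absolute path>" after successful extraction (configurable)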

Usage

Add to your MCP client configuration:

{
  "mcpServers": {
    "data-extractor": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-data-extractor"
      ],
      "disabled": false,
      "autoApprove": [
        "extract_data",
        "extract_svg"
      ]
    }
  }
}

Basic Usage

The server provides two tools:

1. Data Extraction

Use extract_data to extract data (such as i18n translations) from a source file:

<use_mcp_tool>
<server_name>data-extractor</server_name>
<tool_name>extract_data</tool_name>
<arguments>
{
  "sourcePath": "src/translations.ts",
  "targetPath": "src/translations.json"
}
</arguments>
</use_mcp_tool>

2. SVG Extraction

Use extract_svg to extract SVG components into individual files:

<use_mcp_tool>
<server_name>data-extractor</server_name>
<tool_name>extract_svg</tool_name>
<arguments>
{
  "sourcePath": "src/components/icons/InspectionIcon.tsx",
  "targetDir": "src/assets/icons"
}
</arguments>
</use_mcp_tool>

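Source File Replacement

By default, after a successful extraction, the server replaces the content of the source file with:

- "MIGRATED TO <target path>" for data extraction
- "MIGRATED TO <target directory>" for SVG extraction

This helps track which files have already been processed and prevents duplicate extraction. It also makes it easy for LLMs and developers to see where the extracted data now lives when they encounter the source file later.

To disable this behaviour, set the DISABLE_SOURCE_REPLACEMENT environment variable to true in your MCP configuration: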

{
  "mcpServers": {
    "data-extractor": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-data-extractor"
      ],
      "env": {
        "DISABLE_SOURCE_REPLACEMENT": "true"
      },
      "disabled": false,
      "autoApprove": [
        "extract_data",
        "extract_svg"
      ]
    }
  }
}
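
To make the effect of these environment variables concrete, here is a minimal sketch of how a server could decide what to write back into the source file. It is not taken from the mcp-data-extractor source: the function name `buildReplacementContent`, the `Env` type, the default warning text, and the exact layout of the marker and warning are all assumptions for illustration. Only the variable names DISABLE_SOURCE_REPLACEMENT and WARNING_MESSAGE and the "MIGRATED TO" marker come from the tool descriptions.

```typescript
// Hypothetical sketch; names and default text are assumptions, not the real implementation.
type Env = Record<string, string | undefined>;

// Assumed default; the actual warning text used by the server is not documented here.
const DEFAULT_WARNING = "This file has been migrated. Edit the target instead.";

// Returns the text that would replace the source file,
// or null when replacement is disabled via the environment.
function buildReplacementContent(targetPath: string, env: Env): string | null {
  if (env.DISABLE_SOURCE_REPLACEMENT === "true") {
    return null; // leave the source file untouched
  }
  const warning = env.WARNING_MESSAGE ?? DEFAULT_WARNING;
  return `MIGRATED TO ${targetPath}\n${warning}`;
}

// Example: replacement enabled, with a custom warning message.
const replacement = buildReplacementContent("/abs/src/translations.json", {
  WARNING_MESSAGE: "Do not edit this file.",
});
```

Passing the environment in as a parameter, rather than reading it globally, keeps the sketch easy to test; a real server would most likely read the process environment directly.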

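Supported Patterns

Data Extraction Patterns

The data extractor supports a variety of patterns commonly used in TypeScript/JavaScript applications:

Simple object export: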

export default {
  welcome: "Welcome to our app",
  greeting: "Hello, {name}!",
  submit: "Submit form"
};

Nested objects:

export default {
  header: {
    title: "Book Your Flight",
    subtitle: "Find the best deals"
  },
  footer: {
    content: [
      "Please refer to {{privacyPolicyUrl}} for details",
      "© {{year}} {{companyName}}"
    ]
  }
};

Complex structures with arrays:

export default {
  faq: {
    heading: "Common questions",
    content: [
      {
        heading: "What if I need to change my flight?",
        content: "You can change your flight online if:",
        list: [
          "You have a flexible fare type",
          "Your flight is more than 24 hours away"
        ]
      }
    ]
  }
};

Template literals with variables:

export default {
  greeting: `Hello, {{username}}!`,
  message: `Welcome to {{appName}}`
};

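Output Format

Data Extraction Output

The extracted data is saved as a JSON file in which nested structures are flattened into dot-notation keys (array elements are addressed by index):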

{
  "welcome": "Welcome to our app",
  "header.title": "Book Your Flight",
  "footer.content.0": "Please refer to {{privacyPolicyUrl}} for details",
  "footer.content.1": "© {{year}} {{companyName}}",
  "faq.content.0.heading": "What if I need to change my flight?"
}
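
The flattening shown above can be sketched as a short recursive function. This is an illustrative helper, not code from the extractor: the `Nested` type and the `flatten` function are hypothetical names, and the sketch works on plain values rather than on the Babel AST the server actually traverses.

```typescript
// Hypothetical helper illustrating dot-notation flattening; not the extractor's own code.
type Nested = string | Nested[] | { [key: string]: Nested };

function flatten(
  value: Nested,
  path: string[] = [],
  out: Record<string, string> = {}
): Record<string, string> {
  if (typeof value === "string") {
    // Leaf: join the accumulated path segments with dots.
    out[path.join(".")] = value;
  } else if (Array.isArray(value)) {
    // Arrays contribute their numeric index as a path segment.
    value.forEach((item, index) => flatten(item, [...path, String(index)], out));
  } else {
    // Objects contribute each property name as a path segment.
    for (const [key, child] of Object.entries(value)) {
      flatten(child, [...path, key], out);
    }
  }
  return out;
}

const flattened = flatten({
  header: { title: "Book Your Flight" },
  footer: { content: ["Please refer to {{privacyPolicyUrl}} for details"] },
});
// flattened now has the keys "header.title" and "footer.content.0"
```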

SVG Extraction Output

SVG components are extracted into individual .svg files with React-specific code removed. For example:

Input (React component):

const InspectionIcon: React.FC<InspectionIconProps> = ({ title }) => (
  <svg className="c-tab__icon" width="40px" id="Layer_1" data-name="Layer 1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32">
    <title>{title}</title>
    <path className="cls-1" d="M18.89,12.74a3.18,3.18,0,0,1-3.24-3.11..." />
  </svg>
);

Output (InspectionIcon.svg):

<svg width="40px" id="Layer_1" data-name="Layer 1" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 32 32">
    <path class="cls-1" d="M18.89,12.74a3.18,3.18,0,0,1-3.24-3.11..." />
</svg>

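Extending Supported Patterns

The extractor uses Babel to parse the source code and traverse its AST (Abstract Syntax Tree). You can extend the supported patterns by modifying the source code:

Adding new node types: The extractStringValue method in src/index.ts handles the different kinds of string values. Extend it to support new node types: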

private extractStringValue(node: t.Node): string | null {
  if (t.isStringLiteral(node)) {
    return node.value;
  } else if (t.isTemplateLiteral(node)) {
    return node.quasis.map(quasi => quasi.value.raw).join('{{}}');
  }
  // Add support for new node types here
  return null;
}
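
Note that, as written, the template-literal branch above joins the static parts with a literal '{{}}', so an interpolation such as `${name}` would come out as an empty placeholder. If you want to keep the variable name, one possible extension is sketched below. It is an assumption-laden illustration, not the project's code: `TemplateLiteralLike` and `ExpressionLike` are simplified stand-ins for the Babel node shapes, `templateToPlaceholderString` is a hypothetical name, and only plain identifiers are given a named placeholder.

```typescript
// Simplified, hypothetical stand-ins for Babel's TemplateLiteral node shape.
interface ExpressionLike {
  type: string;
  name?: string; // present on Identifier nodes
}

interface TemplateLiteralLike {
  quasis: Array<{ value: { raw: string } }>; // static text parts
  expressions: ExpressionLike[];             // interpolated parts, one fewer than quasis
}

// Interleaves static parts with {{name}} placeholders for identifier expressions.
// Any other expression kind falls back to an empty {{}} placeholder.
function templateToPlaceholderString(node: TemplateLiteralLike): string {
  let result = "";
  node.quasis.forEach((quasi, index) => {
    result += quasi.value.raw;
    const expr = node.expressions[index];
    if (expr !== undefined) {
      const isNamed = expr.type === "Identifier" && expr.name !== undefined;
      result += isNamed ? `{{${expr.name}}}` : "{{}}";
    }
  });
  return result;
}

// Shape corresponding to the template literal `Hello, ${name}!`
const greeting = templateToPlaceholderString({
  quasis: [{ value: { raw: "Hello, " } }, { value: { raw: "!" } }],
  expressions: [{ type: "Identifier", name: "name" }],
});
```

In the real method you would apply the same interleaving to `node.quasis` and `node.expressions` after the `t.isTemplateLiteral(node)` check.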

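Custom value processing: The processValue method handles the different value types (strings, arrays, objects). Extend it to support new value types or custom processing: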

private processValue(value: t.Node, currentPath: string[]): void {
  if (t.isStringLiteral(value) || t.isTemplateLiteral(value)) {
    // Process string values
  } else if (t.isArrayExpression(value)) {
    // Process arrays
  } else if (t.isObjectExpression(value)) {
    // Process objects
  }
  // Add support for new value types here
}

Custom AST traversal: The server uses Babel's traverse to walk the AST. You can add new visitors to handle other node types:

traverse(ast, {
  ExportDefaultDeclaration(path: NodePath<t.ExportDefaultDeclaration>) {
    // Handle default exports
  },
  // Add new visitors here
});

Development

Install dependencies:

npm install

Build the server:

npm run build

For development with automatic rebuilds:

npm run watch

Debugging

Because MCP servers communicate over stdio, debugging can be challenging. We recommend using the MCP Inspector, which is available as a package script:

npm run inspector

The Inspector will provide a URL for accessing the debugging tools in your browser.

License

This project is licensed under the MIT License. See the LICENSE file for details.