Ollama MCP Server

License: MIT | Python 3.10+ | MCP Compatible

A self-contained Model Context Protocol (MCP) server for local Ollama management, developed with Claude AI assistance. Features include listing local models, chatting, starting/stopping the server, and a 'local model advisor' to suggest the best local model for a given task. The server is designed to be a robust, dependency-free, and cross-platform tool for managing a local Ollama instance.

⚠️ Current Testing Status

Currently tested on: Windows 11 with NVIDIA RTX 4090
Status: Beta on Windows, Other Platforms Need Testing
Cross-platform code: Ready for Linux and macOS but requires community testing
GPU support: NVIDIA fully tested, AMD/Intel/Apple Silicon implemented but needs validation

We welcome testers on different platforms and hardware configurations! Please report your experience via GitHub Issues.

🎯 Key Features

🔧 Self-Contained Architecture

  • Zero External Dependencies: No external MCP servers required
  • MIT License Ready: All code internally developed and properly licensed
  • Enterprise-Grade: Professional error handling with actionable troubleshooting

🌐 Universal Compatibility

  • Cross-Platform: Windows, Linux, macOS with automatic platform detection
  • Multi-GPU Support: NVIDIA, AMD, Intel detection with vendor-specific optimizations
  • Smart Installation Discovery: Automatic Ollama detection across platforms

Complete Local Ollama Management

  • Model Operations: List, suggest, and remove local models.
  • Server Control: Start and monitor the Ollama server with intelligent process management.
  • Direct Chat: Communicate with any locally installed model.
  • System Analysis: Assess hardware compatibility and monitor resources.

🚀 Quick Start

Installation

git clone https://github.com/paolodalprato/ollama-mcp-server.git
cd ollama-mcp-server
pip install -e .

Configuration

Add to your MCP client configuration (e.g., Claude Desktop config.json):

{
  "mcpServers": {
    "ollama-mcp": {
      "command": "python",
      "args": [
        "X:\\PATH_TO\\ollama-mcp-server\\src\\ollama_mcp\\server.py"
      ],
      "env": {}
    }
  }
}

Note: Adjust the path to match your installation directory. On Linux/macOS, use forward slashes: /path/to/ollama-mcp-server/src/ollama_mcp/server.py

Requirements

  • Python 3.10+ (required by MCP SDK dependency)
  • Ollama installed and accessible in PATH
  • MCP-compatible client (Claude Desktop, etc.)

Ollama Configuration Compatibility

This MCP server automatically respects your Ollama configuration. If you have customized your Ollama setup (e.g., changed the models folder via OLLAMA_MODELS environment variable), the MCP server will work seamlessly without any additional configuration.
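For illustration, the sketch below shows one way such passthrough can work: spawning ollama serve with the caller's environment intact, so variables like OLLAMA_MODELS (and, assuming it is set, OLLAMA_HOST) take effect. This is a simplified sketch, not the project's actual implementation.

# Simplified sketch only: inherit the user's environment so custom Ollama
# settings (e.g. OLLAMA_MODELS) apply to the spawned server process.
import os
import subprocess

def start_ollama_server_process() -> subprocess.Popen:
    """Spawn `ollama serve` with the current environment unchanged."""
    env = os.environ.copy()  # keeps OLLAMA_MODELS, OLLAMA_HOST, etc.
    return subprocess.Popen(["ollama", "serve"], env=env)

# The API address can likewise honor an override (OLLAMA_HOST is assumed here):
api_base = os.environ.get("OLLAMA_HOST", "http://localhost:11434")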

🛠️ Available Tools

Model Management

  • list_local_models - List all locally installed models with their details.
  • local_llm_chat - Chat directly with any locally installed model.
  • remove_model - Safely remove a model from local storage.
  • suggest_models - Recommends the best locally installed model for a specific task (e.g., "suggest a model for coding").

Server and System Operations

  • start_ollama_server - Starts the Ollama server if it's not already running.
  • ollama_health_check - Performs a comprehensive health check of the Ollama server.
  • system_resource_check - Analyzes system hardware and resource availability.

Diagnostics

  • test_model_responsiveness - Checks the responsiveness of a specific local model by sending a test prompt, helping to diagnose performance issues.
  • select_chat_model - Presents a list of available local models to choose from before starting a chat.
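In normal use these tools are invoked by your MCP client (see the next section), but for development or debugging they can also be called programmatically with the official MCP Python SDK (the mcp package). A minimal sketch, with the server path as a placeholder:

# Hedged sketch: drive the server over stdio with the MCP Python SDK.
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

params = StdioServerParameters(
    command="python",
    args=["/path/to/ollama-mcp-server/src/ollama_mcp/server.py"],  # adjust to your install
)

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])  # e.g. list_local_models, local_llm_chat, ...
            result = await session.call_tool("list_local_models", arguments={})
            print(result.content)  # tool output as MCP content items

asyncio.run(main())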

💬 How to Interact with Ollama-MCP

Ollama-MCP works through your MCP client (like Claude Desktop) - you don't interact with it directly. Instead, you communicate with your MCP client using natural language, and the client translates your requests into tool calls.

Basic Interaction Pattern

You speak to your MCP client in natural language, and it automatically uses the appropriate ollama-mcp tools:

You: "List my installed Ollama models"
→ Client calls: list_local_models
→ You get: Formatted list of your models

You: "Chat with llama3.2: explain machine learning"  
→ Client calls: local_llm_chat with model="llama3.2" and message="explain machine learning"
→ You get: AI response from your local model

You: "Check if Ollama is running"
→ Client calls: ollama_health_check  
→ You get: Server status and troubleshooting if needed

Example Interactions

Model Management

  • "What models do I have installed?"list_local_models
  • "I need a model for creative writing, which of my models is best?"suggest_models
  • "Remove the old mistral model to save space"remove_model

System Operations

  • "Start Ollama server"start_ollama_server
  • "Is my system capable of running large AI models?"system_resource_check

AI Chat

  • "Chat with llama3.2: write a Python function to sort a list"local_llm_chat
  • "Use deepseek-coder to debug this code: [code snippet]"local_llm_chat
  • "Ask phi3.5 to explain quantum computing simply"local_llm_chat

Key Points

  • No Direct Commands: You never call ollama_health_check() directly
  • Natural Language: Speak normally to your MCP client
  • Automatic Tool Selection: The client chooses the right tool based on your request
  • Conversational: You can ask follow-up questions and the client maintains context

🎯 Real-World Use Cases

Daily Development Workflow

"I need to work on a coding project. Which of my local models is best for coding? Let's check its performance and then ask it a question."

This could trigger:

  1. suggest_models - Recommends the best local model for "coding".
  2. test_model_responsiveness - Checks if the recommended model is responsive.
  3. local_llm_chat - Starts a chat with the model.

Model Management Session

"Show me what models I have and recommend one for writing a story. Then let's clean up any old models I don't need."

Triggers:

  1. list_local_models - Current inventory
  2. suggest_models - Recommends a local model for "writing a story".
  3. remove_model - Cleanup unwanted models.

Troubleshooting Session

"Ollama isn't working. Check what's wrong, try to fix it, and test with a simple chat."

Triggers:

  1. ollama_health_check - Diagnose issues
  2. start_ollama_server - Attempt to start server
  3. local_llm_chat - Verify working with test message

🏗️ Architecture

Design Principles

  • Self-Contained: Zero external MCP server dependencies
  • Fail-Safe: Comprehensive error handling with actionable guidance
  • Cross-Platform First: Universal Windows/Linux/macOS compatibility
  • Enterprise Ready: Professional-grade implementation and documentation

Technical Highlights

  • Internal Process Management: Advanced subprocess handling with timeout control
  • Multi-GPU Detection: Platform-specific GPU identification without confusing metrics
  • Intelligent Model Selection: Fallback to first available model when none specified
  • Progressive Health Monitoring: Smart server startup detection with detailed feedback
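As an illustration of the progressive health monitoring idea, the sketch below polls the standard Ollama HTTP endpoint until it answers or a deadline passes. It is a simplified sketch, not the code in server.py.

# Simplified sketch of progressive startup monitoring: poll the local Ollama
# API until it responds or the timeout expires (default endpoint assumed).
import time
import urllib.error
import urllib.request

def wait_for_ollama(base_url: str = "http://localhost:11434", timeout: float = 15.0) -> bool:
    """Return True once the Ollama server answers /api/version, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/api/version", timeout=2) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            time.sleep(0.5)  # not up yet; back off briefly and retry
    return False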

📋 System Compatibility

Operating Systems

  • Windows: Full support with auto-detection in Program Files and AppData ✅ Tested
  • Linux: XDG configuration support with package manager integration ⚠️ Needs Testing
  • macOS: Homebrew detection with Apple Silicon GPU support ⚠️ Needs Testing

GPU Support

  • NVIDIA: Full detection via nvidia-smi with memory and utilization info ✅ Tested RTX 4090
  • AMD: ROCm support via vendor-specific tools ⚠️ Needs Testing
  • Intel: Basic detection via system tools ⚠️ Needs Testing
  • Apple Silicon: M1/M2/M3 detection with unified memory handling ⚠️ Needs Testing
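For the NVIDIA path above, detection can be as simple as querying nvidia-smi for device names; the sketch below illustrates the approach (the real logic lives in hardware_checker.py and may differ).

# Hedged sketch of NVIDIA GPU detection via nvidia-smi; other vendors need
# their own tools (rocm-smi, system profilers, etc.).
import shutil
import subprocess

def detect_nvidia_gpus() -> list[str]:
    """Return GPU names reported by nvidia-smi, or an empty list if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=5, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return []
    return [line.strip() for line in out.stdout.splitlines() if line.strip()]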

Hardware Requirements

  • Minimum: 4GB RAM, 2GB free disk space
  • Recommended: 8GB+ RAM, 10GB+ free disk space
  • GPU: Optional but recommended for model acceleration

🔧 Development

Project Structure

ollama-mcp-server/
├── src/
│   ├── __init__.py               # Defines the package version
│   └── ollama_mcp/
│       ├── __init__.py           # Makes 'ollama_mcp' a package
│       ├── server.py             # Main MCP server implementation
│       ├── client.py             # Ollama API client
│       ├── config.py             # Configuration management
│       ├── model_manager.py      # Local model operations
│       ├── hardware_checker.py   # System hardware analysis
│       └── ... (and other modules)
├── tests/
│   ├── test_client.py            # Unit tests for the client
│   └── test_tools.py             # Integration tests for tools
├── .gitignore                    # Specifies intentionally untracked files
└── pyproject.toml                # Project configuration and dependencies

Key Technical Achievements

Self-Contained Implementation

  • Challenge: Eliminating the external desktop-commander dependency
  • Solution: Internal process management with advanced subprocess handling
  • Result: Zero external MCP dependencies, MIT license compatible

Intelligent GPU Detection

  • Challenge: Complex VRAM reporting causing user confusion
  • Solution: Simplified to GPU name display only
  • Result: Clean, reliable hardware identification

Enterprise Error Handling

  • Implementation: 6-level exception framework with specific error types
  • Coverage: Platform-specific errors, process failures, network issues
  • UX: Actionable troubleshooting steps for every error scenario
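The actual exception class names are not documented here; purely as a hypothetical illustration, a layered hierarchy of that kind might look like the following.

# Hypothetical sketch only: the project's real exception classes may differ.
# The idea is a common base that carries an actionable suggestion for the user.
class OllamaMCPError(Exception):
    """Base error with an attached troubleshooting hint."""
    def __init__(self, message: str, suggestion: str = ""):
        super().__init__(message)
        self.suggestion = suggestion

class OllamaNotInstalledError(OllamaMCPError): ...   # Ollama binary not found on PATH
class ServerStartupError(OllamaMCPError): ...        # `ollama serve` failed or timed out
class ModelNotFoundError(OllamaMCPError): ...        # requested model is not installed
class HardwareDetectionError(OllamaMCPError): ...    # GPU/system probing failed
class NetworkTimeoutError(OllamaMCPError): ...       # local API did not respond in time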

🤝 Contributing

We welcome contributions! Areas where help is especially appreciated:

  • Platform Testing: Different OS and hardware configurations ⭐ High Priority
  • GPU Vendor Support: Additional vendor-specific detection
  • Performance Optimization: Startup time and resource usage improvements
  • Documentation: Usage examples and integration guides
  • Testing: Edge cases and error condition validation

Immediate Testing Needs

  • Linux: Ubuntu, Fedora, Arch with various GPU configurations
  • macOS: Intel and Apple Silicon Macs with different Ollama installations
  • GPU Vendors: AMD ROCm, Intel Arc, Apple unified memory
  • Edge Cases: Different Python versions, various Ollama installation methods

Development Setup

git clone https://github.com/paolodalprato/ollama-mcp-server.git
cd ollama-mcp-server

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Code formatting
black src/
isort src/

# Type checking
mypy src/

🐛 Troubleshooting

Common Issues

Ollama Not Found

# Verify Ollama installation
ollama --version

# Check PATH configuration
which ollama  # Linux/macOS
where ollama  # Windows

Server Startup Failures

# Check port availability
netstat -an | grep 11434     # Linux/macOS
netstat -an | findstr 11434  # Windows

# Manual server start for debugging
ollama serve

Permission Issues

  • Windows: Run as Administrator if needed
  • Linux/macOS: Check user permissions for service management

Platform-Specific Issues

If you encounter issues on Linux or macOS, please report them via GitHub Issues with:

  • Operating system and version
  • Python version
  • Ollama version and installation method
  • GPU hardware (if applicable)
  • Complete error output

📊 Performance

Typical Response Times (Windows RTX 4090)

  • Health Check: <500ms
  • Model List: <1 second
  • Server Start: 1-15 seconds (hardware dependent)
  • Model Chat: 2-30 seconds (model and prompt dependent)

Resource Usage

  • Memory: <50MB for MCP server process
  • CPU: Minimal when idle, scales with operations
  • Storage: Configuration files and logs only

🔐 Security

  • Data Flow: User → MCP Client (Claude) → ollama-mcp-server → Local Ollama → back through the same chain

👨‍💻 About This Project

This is my first MCP server, created by adapting a personal tool I had developed for my own Ollama management needs.

The Problem I Faced

I started using Claude to interact with Ollama because it allows me to use natural language instead of command-line interfaces. Claude also provides capabilities that Ollama alone doesn't have, particularly intelligent model suggestions based on both my system capabilities and specific needs.

My Solution

I built this MCP server to streamline my own workflow, and then refined it into a stable tool that others might find useful. The design reflects real usage patterns:

  • Self-contained: No external dependencies that can break
  • Intelligent error handling: Clear guidance when things go wrong
  • Cross-platform: Works consistently across different environments
  • Practical tools: Features I actually use in daily work

Design Philosophy

I initially developed this for my personal use to manage Ollama models more efficiently. When the MCP protocol became available, I transformed my personal tool into an MCP server to share it with others who might find it useful.

Development Approach: This project was developed with Claude using "vibe coding" - an iterative, conversational development process where AI assistance helped refine both the technical implementation and user experience. It's a practical example of AI-assisted development creating tools for AI management. Jules was also involved in the final refactoring phase.

📄 License

MIT License - see LICENSE file for details.

🙏 Acknowledgments

  • Ollama Team: For the excellent local AI platform
  • MCP Project: For the Model Context Protocol specification
  • Claude Desktop/Code by Anthropic: Used as tools for the MCP client integration, testing, and refactoring
  • Jules by Google: Used as a tool during the refactoring phase

📞 Support

For bug reports, questions, or platform testing feedback, please open a GitHub Issue on the repository.

Changelog

  • v0.9.0 (August 17, 2025): Critical bugfix release - Fixed datetime serialization issue that prevented model listing from working with Claude Desktop. All 9 tools now verified working correctly.
  • August 2025: Project refactoring and enhancements. Overhauled the architecture for modularity, implemented a fully asynchronous client, added a test suite, and refined the tool logic based on a "local-first" philosophy.
  • July 2025: Initial version created by Paolo Dalprato with Claude AI assistance.

For detailed changes, see CHANGELOG.md.


Status: Beta on Windows, Other Platforms Need Testing
Testing: Windows 11 + RTX 4090 validated, Linux/macOS require community validation
License: MIT
Dependencies: Zero external MCP servers required
