Memorised them All
<div align="center">
<img src="https://raw.githubusercontent.com/GRU-953/memorised-them-all/main/docs/social-preview.png" alt="Memorised them All — give Claude a private, local memory of your files" width="100%">
<h1>Memorised them All</h1>
<h3>Give Claude a private memory of your files — without paying for it in tokens.</h3>
<p>Point it at a folder of PDFs, Word/Excel files, images, or audio. It reads and remembers them <b>entirely on your own computer</b>, so you can just <i>ask Claude</i> about them later — no copy-pasting, no uploads, no API keys, no surprise token bills.</p>
<p> <a href="#-what-is-this"><b>What is this?</b></a> · <a href="#-get-started-in-about-a-minute"><b>Get started</b></a> · <a href="#-your-first-memory">First memory</a> · <a href="#-what-can-i-use-it-for">Use cases</a> · <a href="#-questions--troubleshooting">FAQ</a> · <a href="#-for-power-users">Advanced</a> </p>
</div>
🧠 What is this?
Imagine you could hand Claude a filing cabinet of your documents and say "remember all of this." Later you just ask questions, and Claude answers from what it remembers — citing which document each fact came from.
That's Memorised them All. It's a small add-on (an MCP server) for Claude Desktop and Claude Code that:
- Reads your files — PDFs, Word/Excel/PowerPoint, web pages, images (with OCR), even audio — and converts them to clean text on your computer.
- Builds a memory — a searchable map of the people, topics, and facts inside them (a "knowledge graph"), plus a tidy summary and an interactive mind map.
- Lets you ask — Claude recalls just the relevant bits when you ask, instead of you pasting whole files into the chat.
The one-line idea: Claude tokens cost money; your computer's effort is free. So all the heavy lifting happens locally, and Claude only ever receives a tiny answer. Memorising a 500-page folder costs roughly zero chat tokens.
💬 See it in action
Once it's installed, you just talk to Claude normally:
You: Memorise everything in ~/Documents/contracts.
Claude: ✅ Digested 38 files → 421 facts across 7 themes. (took ~30s, all local)
You: Which contracts mention an auto-renewal clause, and when do they renew?
Claude: Three do — the Globex MSA (renews 1 Jan, 60-day notice), … [cites each source]
You: Open the mind map.
Claude: Here's your interactive map: /…/mindmap.html
Nothing left your machine. Claude never saw the 38 files — only the small answers.
🚀 Get started in about a minute
You need Python 3.10 or newer (most Macs and Linux PCs already have it; Windows users can install it from python.org — tick "Add to PATH"). Everything else installs automatically the first time you use it.
Pick whichever matches how you use Claude:
▶ Claude Desktop (easiest — no terminal)
- Download memorised-them-all.mcpb from the latest release.
- Double-click it. Claude Desktop opens and offers to install the extension; click Install.
- Done. Start a chat and say "Memorise my Documents folder."
▶ Claude Code
claude
# then, inside Claude Code:
/plugin marketplace add GRU-953/memorised-them-all
/plugin install memorised-them-all
▶ Any other setup (pip)
pip install memorised-them-all
Then register it with Claude — easiest is to let it configure itself:
mta setup-claude # writes the MCP server into Claude Desktop (and Claude Code) config
(The install.sh installer runs this for you automatically.) Or add it by hand — it just runs mta serve:
{
"mcpServers": {
"memorised-them-all": { "command": "mta", "args": ["serve"] }
}
}
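If you would rather script the by-hand route, the merge can be sketched in Python. This is a minimal sketch and not what `mta setup-claude` actually does: the `add_mta_server` helper is hypothetical, and finding and saving your Claude config file is left out because its location depends on your platform.

```python
import json


def add_mta_server(config: dict) -> dict:
    """Return a copy of a Claude MCP config with the mta entry merged in."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    # Same entry as the JSON snippet above: the server is just `mta serve`.
    servers["memorised-them-all"] = {"command": "mta", "args": ["serve"]}
    merged["mcpServers"] = servers
    return merged


# Example: an existing config that already lists another (made-up) server.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
print(json.dumps(add_mta_server(existing), indent=2))
```

Merging into a copy, rather than overwriting the file's contents, keeps any servers you have already registered.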
💡 Prefer Homebrew or Docker?
brew install GRU-953/memorised-them-all/mta, or see Run it in Docker. All paths give you the same thing.
Do I need to install AI models?
No — it works the moment it's installed. Out of the box it uses fast, built-in techniques (no downloads, fully offline).
For sharper summaries and search, it can optionally use a free local AI model via Ollama (still 100% on your machine). If Ollama is present it's used automatically; if not, you're never blocked. To check what you have and get one-line setup tips, run:
mta doctor
📁 Your first memory
- Tell Claude what to remember. Point it at a folder, a file, or a pattern:
  "Memorise everything in ~/Documents/research."
  (Behind the scenes Claude calls the digest tool. The first run may take a little longer while it sets things up.)
- Ask away, in plain language:
  "What do my documents say about the Q3 budget?" "Summarise everything about Project Apollo." "Who is mentioned most often, and in which files?"
- Explore visually (optional):
  "Open the mind map." This opens an interactive, offline map of how everything connects.
- Keep it tidy by separating memories per topic with projects:
  "Memorise ~/work/clientA into a project called clientA." "Using the clientA project, what were the agreed deliverables?"
Your memory lives in a folder on your computer (~/.memorised-them-all by default) and persists between chats. Re-running "memorise" updates it.
🎯 What can I use it for?
- 📚 Research & study — digest a pile of papers or a textbook, then ask for explanations, comparisons, and citations.
- 📑 Contracts & policies — load all your agreements and ask "which ones auto-renew?" or "what are the termination clauses?"
- 🗂️ Personal knowledge base — point it at years of notes, receipts, or manuals and actually find things.
- 🎧 Meetings & lectures — drop in audio recordings; it transcribes locally and remembers the content.
- 🖼️ Scanned documents & images — it reads text from photos and scans (OCR) so they become searchable.
- 🔒 Sensitive material — legal, medical, financial, or confidential files that must never leave your machine.
✨ Why is it free of token cost?
When you normally share a document with Claude, the whole thing is sent into the conversation — and you pay (in tokens) for every word, every time. A few big PDFs can blow your whole context window.
Memorised them All flips that around:
- Converting, reading, and summarising your files happens on your computer.
- Claude only ever receives a tiny result — a count, a short summary, or a few relevant snippets (capped small).
- So memorising a giant folder, and asking about it again and again, stays near-zero context tokens.
It's the difference between mailing someone an entire library versus asking a librarian a question.
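A rough back-of-the-envelope comparison makes the point. The figures below are illustrative assumptions, not measurements: about 4 characters per token is a common rule of thumb for English text, and the page and snippet sizes are invented for the example.

```python
CHARS_PER_TOKEN = 4  # rule-of-thumb assumption for English text


def approx_tokens(chars: int) -> int:
    """Very rough token estimate from a character count."""
    return chars // CHARS_PER_TOKEN


# Hypothetical sizes, chosen only for illustration.
pages = 500
chars_per_page = 3_000
snippet_chars = 2_000  # one capped recall result

paste_cost = approx_tokens(pages * chars_per_page)
recall_cost = approx_tokens(snippet_chars)

print(f"pasting the folder: ~{paste_cost:,} tokens")
print(f"one recall answer:  ~{recall_cost:,} tokens")
```

Under these assumptions, pasting the folder costs hundreds of times more context than a single recalled answer, and that gap repeats every time you ask again.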
🔒 Is my data private?
Yes — that's the whole point. By default:
- ✅ 100% local. Your files are read, converted, and remembered on your own machine. Their contents are never sent to Claude's servers, to us, or to anyone.
- ✅ No telemetry, no tracking, no accounts, no API keys.
- ✅ Works fully offline. Disconnect the internet and it still memorises and answers.
- ✅ Open source (MIT). You (or anyone) can read exactly what it does.
The only times anything touches the network are optional and under your control: installing or updating software, a daily check for a new version (on by default; turn it off with MTA_AUTO_UPDATE=off), or explicitly pointing it at a remote AI backend. With the defaults, your documents stay with you. See SECURITY.md for the full threat model.
🛡️ Built to be reliable
It's been hardened through repeated, deliberate stress-testing so it stays calm on real, messy folders:
- It always finishes. A broken, enormous, or stuck file can't freeze the job — every file has a time limit and is skipped if it jams, so the rest still go through.
- One bad file won't sink the batch. Unreadable files, looping shortcuts, and odd or over-long filenames are skipped, never fatal.
- Your memory is crash-safe. If the computer is interrupted mid-write, your memory isn't corrupted or lost — writes are atomic, and anything that looks damaged is backed up before it's touched.
- It's honest about its mode. If the local AI engine isn't responding, it tells you plainly and labels the memory as "basic" instead of pretending it's fine (see the FAQ below).
- It reads awkward files. Windows "Unicode" (UTF-16) text, scanned images (OCR), audio, and even legacy Bengali (Bijoy/SutonnyMJ) fonts are handled; empty and binary files are skipped cleanly.
- It can't be tricked into reading elsewhere. A shortcut planted in a folder that points outside that folder is ignored — a digest only ever reads what you pointed it at.
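Two of the guarantees above, crash-safe writes and staying inside the folder you pointed at, follow well-known patterns. The sketch below shows those general patterns in Python; it is not the project's actual code, and the `write_atomic` and `is_inside` names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def write_atomic(path: Path, text: str) -> None:
    """Write text so readers see the old file or the new one, never half of it."""
    # The temp file lives in the same directory so the final rename stays
    # on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # atomic swap of the complete file
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def is_inside(root: Path, candidate: Path) -> bool:
    """True if candidate, after resolving symlinks, still lives under root."""
    return candidate.resolve().is_relative_to(root.resolve())
```

Resolving both paths before comparing is what defeats a planted shortcut: a symlink inside the folder that points elsewhere resolves to its real target and fails the check.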
❓ Questions & troubleshooting
<details> <summary><b>Is it really free?</b></summary>
Yes — the software is free and open-source (MIT), and it runs on your own computer, so there are no per-use fees or token charges. The only "cost" is a little of your computer's time and disk space. </details>
<details> <summary><b>Claude says it doesn't have the tool / it's not showing up.</b></summary>
Make sure the extension/plugin is installed and enabled, then fully restart Claude Desktop (or your Claude Code session). To confirm the engine itself works, run mta status in a terminal — it should print your setup. Still stuck? Run mta doctor.
</details>
<details> <summary><b>The first "memorise" was slow.</b></summary>
The first run sets things up (and, if you have Ollama, may load a model). Later runs are much faster, and re-memorising only processes what changed. Add fast ("memorise … in fast mode") to skip the AI step entirely for a quick, fully-deterministic pass.
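One common way to process only what changed is to compare content fingerprints between runs. The sketch below assumes that approach; the README does not say how the tool detects changes, and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> str:
    """SHA-256 of a file's bytes; changes whenever the content changes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_to_process(current: dict[str, str], previous: dict[str, str]) -> list[str]:
    """Paths that are new, or whose fingerprint differs from the last run."""
    return sorted(p for p, digest in current.items() if previous.get(p) != digest)


# Example with made-up fingerprints: "b" changed, "c" is new, "a" is untouched.
last_run = {"a": "1", "b": "9"}
this_run = {"a": "1", "b": "2", "c": "3"}
print(files_to_process(this_run, last_run))
```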
</details>
<details> <summary><b>What files can it read?</b></summary>
PDFs, Word/Excel/PowerPoint, plain text/Markdown, HTML, CSV/JSON/XML, RTF, EPUB, common images (via OCR), and audio (transcribed locally). Beyond those, any other text-based file is digested too (source code, .log, .ini, .tex, …); only genuine binaries are skipped — so a whole folder gets captured. Ask Claude to "list what's digestible in this folder" to see.
</details>
<details> <summary><b>What languages does it understand?</b></summary>
Text in any language works (it's Unicode throughout). For scanned documents and images, OCR runs English + Bangla by default (eng+ben); set MTA_OCR_LANG to other Tesseract codes (e.g. eng+hin+ara). Any language pack you don't have installed is dropped automatically, so it never errors.
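The "dropped automatically" behaviour amounts to filtering the requested codes against what is installed. A minimal sketch, assuming you already have the set of installed Tesseract language codes (how the tool discovers them is not stated here, and `usable_ocr_langs` is a hypothetical name):

```python
def usable_ocr_langs(requested: str, installed: set[str]) -> str:
    """Keep the requested Tesseract codes that are installed, preserving order."""
    kept = [code for code in requested.split("+") if code and code in installed]
    return "+".join(kept)


# Example: Hindi is requested but its language pack is missing.
print(usable_ocr_langs("eng+hin+ara", {"eng", "ara"}))
```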
</details>
<details> <summary><b>How do I delete a memory?</b></summary>
Tell Claude "forget the clientA project" (it asks you to name the project, on purpose). It deletes that project's memory from your disk — irreversibly. </details>
<details> <summary><b>Does it need an internet connection?</b></summary>
No. It's built to work completely offline. Internet is only used for optional things like installing updates and the daily version check (disable it with MTA_AUTO_UPDATE=off). </details>
<details> <summary><b>It says "basic mode" or "degraded" — what does that mean?</b></summary>
It's being honest with you. "Basic mode" (also called classical) means a memory was built without the local AI model — either because that's the default for your machine (the safe micro profile), or because the AI engine (Ollama) wasn't responding. The memory is still complete and searchable; it's just less detailed than the AI-assisted "accurate" mode.
If you expected the sharper mode, ask Claude to "check memory status". The health line tells you plainly what's wrong — most often "Ollama is running but its AI engine isn't responding", which is fixed by fully quitting and reopening Ollama and making sure your model is installed (ollama pull <model>). The tool no longer hides this behind a long silent run — it warns you at the start of memorising and labels the result.
</details>
🧰 The tools Claude gets
Once installed, Claude can use these nine tools on your behalf (you just talk normally — Claude picks the right one):
| Tool | What it does for you |
|---|---|
| digest | Reads files/folders and builds (or updates) the memory. |
| convert | Just converts files to clean Markdown (no memory) — handy for exporting or fixing legacy Bengali. |
| recall | Answers a question from memory with a few relevant, cited snippets. |
| memory_overview | Gives the big picture — a synopsis and the main themes. |
| list_digestible | Shows which files in a folder it can read. |
| open_mindmap | Opens the interactive, offline mind map. |
| export_memory | Saves the memory as portable Markdown notes you can keep or share. |
| memory_status | Reports your local setup (models, tools, projects). |
| forget | Deletes a project's memory (you name it explicitly). |
Every tool returns only small results — never your documents' contents.
🛠️ For power users
You don't need any of this to use the app — but it's here if you want it.
<details> <summary><b>Command line (no Claude needed)</b></summary>
The same engine ships as an mta command:
mta digest ~/Documents/research # build/update memory (--fast to skip the LLM)
mta recall "what about the Q3 budget?" # query it
mta overview # synopsis + themes
mta status # local stack health · mta doctor (fix deps)
mta export ./notes # export portable Markdown
mta mindmap --open # open the mind map
</details>
<details> <summary><b>Use it from other AI apps (OpenAI, Gemini, plain HTTP)</b></summary>
The same tools listed above can be served beyond Claude:
mta serve --http # MCP over HTTP (loopback + an auto-generated bearer token)
mta serve --rest # plain JSON: POST http://127.0.0.1:8765/tools/<name>
mta export-schema # tool schemas as OpenAI / Gemini / OpenAPI 3.1 (no drift)
mta recipes # copy-paste connection snippets for every client
Both HTTP modes are loopback-only by default and require a bearer token. See mta recipes for ready-to-paste setup.
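As an illustration of the REST mode, here is a hedged Python sketch that builds a call to one tool using only the standard library. The URL shape comes from the `mta serve --rest` line above; the JSON body (a `query` field for `recall`) and the exact header expectations are assumptions, so check `mta recipes` or `mta export-schema` for the real schema.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8765"  # from the `mta serve --rest` example


def build_tool_request(name: str, arguments: dict, token: str) -> urllib.request.Request:
    """Build a POST to /tools/<name> carrying the bearer token."""
    body = json.dumps(arguments).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/tools/{name}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def call_tool(name: str, arguments: dict, token: str) -> str:
    """Send the request and return the raw response text (needs a running server)."""
    with urllib.request.urlopen(build_tool_request(name, arguments, token)) as response:
        return response.read().decode("utf-8")
```

With the server running, `call_tool("recall", {"query": "what about the Q3 budget?"}, token)` would send the request, where `token` is the bearer token the server printed at startup.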
</details>
<details> <summary><b>Run it in Docker</b><a id="run-it-in-docker"></a></summary>
A multi-arch image (amd64 + arm64) is published to GHCR:
docker run -d --name mta -p 127.0.0.1:8765:8765 -v mta-data:/data \
ghcr.io/gru-953/memorised-them-all:latest
docker logs mta # copy the printed bearer token + the `claude mcp add …` line
It serves the tools over MCP HTTP and keeps memory in the /data volume. Mount documents read-only (-v /path/to/docs:/docs:ro) and digest the in-container path.
</details>
<details> <summary><b>Use a different (or remote) AI model</b></summary>
By default the optional AI step runs on local Ollama. To use another local server (LM Studio, llama.cpp, vLLM, …) set MTA_BACKEND:
MTA_BACKEND=lmstudio mta digest ~/docs # OpenAI-compatible server on :1234
MTA_BACKEND=openai MTA_BACKEND_URL=http://127.0.0.1:8080/v1 mta digest ~/docs
Set MTA_EXTRACT_MODEL / MTA_EMBED_MODEL to that server's model names. Pointing it at a non-local URL sends content off your machine — that's your explicit choice (you'll get a one-time warning).
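Before setting MTA_BACKEND_URL, you can check for yourself whether a URL counts as local. The sketch below treats only `localhost` and loopback IP addresses as local; that rule is an assumption for illustration, not necessarily the test the tool applies before showing its warning.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_url(url: str) -> bool:
    """True if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # any other DNS name is treated as remote


for candidate in ("http://127.0.0.1:8080/v1", "https://api.example.com/v1"):
    print(candidate, "->", "local" if is_local_url(candidate) else "remote")
```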
</details>
<details> <summary><b>Configuration</b></summary>
Everything has sensible defaults. Common knobs (set as environment variables):
| Variable | Default | Meaning |
|---|---|---|
| MTA_HOME | ~/.memorised-them-all | where memory is stored |
| MTA_OCR_LANG | eng+ben | OCR languages (Tesseract codes; missing packs dropped automatically) |
| MTA_EXTRACT_MODEL | (set by profile) | extraction LLM (overrides the profile); alternatives under "Choosing a model" below |
| MTA_EMBED_MODEL | qwen3-embedding:0.6b | multilingual embeddings (1024-d, incl. Bangla) |
| MTA_VISION_MODEL | qwen3-vl:4b-instruct | image caption / OCR-assist (32-language) |
| MTA_WHISPER_MODEL | small | on-device speech-to-text size |
| MTA_NO_OLLAMA | unset | force fully-offline mode (no AI model) |
| MTA_AUTO_UPDATE | on | daily update check (off to disable) |
| MTA_PROFILE | micro | sizing tier: micro (4 GB / no-GPU, the safe default) · auto (size to this machine) · small · standard (16 GB) · large (32 GB+) · offline |
| MTA_CONVERT_TIMEOUT | 120 | per-file conversion timeout (seconds); a file that hangs the parser is skipped, never stalls the batch. 0 disables |
| MTA_MEMORY_GB | auto | override detected RAM (for containers/VMs that misreport it, to pick the right profile) |
| MTA_BACKEND / MTA_BACKEND_URL | auto | use another local model server (see above) |
| MTA_HTTP_* | off | options for the opt-in HTTP/REST servers |
</details>
<details> <summary><b>Choosing a local model (lighter / multilingual alternatives)</b><a id="choosing-a-model"></a></summary>
The default is safe on a 4 GB machine with no graphics card. Out of the box (profile micro) the plugin runs fully offline — classical extraction + a tiny embedding model — so a digest always completes and never thrashes, on any computer. To use sharper local AI, pick a bigger profile with one setting — the easiest is MTA_PROFILE=auto, which sizes the models to your computer (a 16 GB machine gets the qwen3 stack below; a 4 GB one stays on micro). You can also set any model directly (env var or extension setting), which overrides the profile. All tags are verified-real Ollama models pulled on demand. Sizes are q4-class downloads.
| Profile | Good for | What it uses |
|---|---|---|
| micro (default) | 4 GB, no GPU | offline/classical + tiny embedder + vision off |
| auto | recommended | sizes the stack to your machine's RAM |
| standard | ~16 GB | the qwen3 stack below |
| large | 32 GB+ | qwen3:8b + vision |
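To make the sizing idea concrete, here is a simplified sketch of how RAM could map onto the tiers in the table. The thresholds are read straight off the table; the real auto logic is not documented here and may also consider the GPU or the small tier, and both function names are hypothetical.

```python
def effective_ram_gb(detected_gb: float, env: dict[str, str]) -> float:
    """Honour an MTA_MEMORY_GB override when it is set to a number."""
    override = env.get("MTA_MEMORY_GB", "auto")
    return detected_gb if override == "auto" else float(override)


def pick_profile(ram_gb: float) -> str:
    """Map RAM to a tier using the table's thresholds (assumed, simplified)."""
    if ram_gb >= 32:
        return "large"
    if ram_gb >= 16:
        return "standard"
    return "micro"


# Example: a container that misreports 4 GB, corrected with an override.
print(pick_profile(effective_ram_gb(4, {"MTA_MEMORY_GB": "32"})))
```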
The model tables below apply when a profile (or you) turns the local LLM on:
Extraction LLM — MTA_EXTRACT_MODEL (entity/relation/fact extraction + summaries):
| Model | Size | Best for |
|---|---|---|
| qwen3:4b-instruct (default) | 2.5 GB | Optimum on 16 GB: newer-gen, non-thinking (clean JSON), 119 languages incl. Bangla |
| qwen3:8b | 5.2 GB | Higher quality if you have RAM: best Bangla + instruction-following |
| gemma3:4b-it-qat | 4.0 GB | QAT ≈ BF16 quality, 140+ languages |
| llama3.2:3b | 2.0 GB | Lightest solid English-centric option |
| qwen2.5:7b | 4.7 GB | Previous default (older generation) |
Pin the -instruct builds. Bare qwen3:4b / qwen3-vl:4b are thinking models that emit chain-of-thought (bad for strict JSON / captions). Newest/experimental (mid-2026, less battle-tested): qwen3.5:4b (text) and gemma4:e2b-it-qat (text+vision) exist now; they are fine to try, but the picks above are the stable, instruct-guaranteed defaults.
Embeddings — MTA_EMBED_MODEL (entity resolution + recall):
| Model | Size | Best for |
|---|---|---|
| qwen3-embedding:0.6b (default) | 0.64 GB | Optimum: 1024-d, 100+ languages incl. Bangla, top multilingual retrieval (MMTEB ≈ 64) |
| bge-m3 | 1.2 GB | Explicit Bengali + hybrid dense/sparse (helps fuzzy entity matching) |
| embeddinggemma:300m | 0.62 GB | 768-d multilingual; smaller footprint |
| nomic-embed-text | 0.27 GB | English-only (previous default) |
Switching the embedding model changes the vector dimension, so re-digest with reset: true afterwards; recall transparently falls back to lexical scoring until you do (it never errors).
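The fallback can be pictured as a dimension check followed by a word-overlap score. This is a sketch of that idea only: Jaccard overlap stands in for whatever lexical scoring the tool really uses, and both function names are hypothetical.

```python
def choose_scoring(query_dim: int, stored_dim: int) -> str:
    """Vectors are only comparable when both sides have the same dimension."""
    return "vector" if query_dim == stored_dim else "lexical"


def lexical_score(question: str, snippet: str) -> float:
    """Jaccard overlap between the lowercase word sets of two texts."""
    q, s = set(question.lower().split()), set(snippet.lower().split())
    return len(q & s) / len(q | s) if q | s else 0.0


# Example: memory was built with 1024-d vectors, the new embedder emits 768-d.
print(choose_scoring(768, 1024))
print(lexical_score("q3 budget", "the Q3 budget plan"))
```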
Vision — MTA_VISION_MODEL (captions images OCR can't read):
| Model | Size | Best for |
|---|---|---|
| qwen3-vl:4b-instruct (default) | 3.3 GB | 32-language OCR incl. Bangla; reads charts / diagrams / forms |
| qwen3-vl:2b-instruct | 1.9 GB | Same OCR engine, lighter |
| gemma3:4b | 3.3 GB | 140+ languages |
| granite3.2-vision:2b | 2.4 GB | Document / table / chart OCR (IBM) |
| moondream | 1.7 GB | Tiniest / fastest (English-only; previous default) |
Speech-to-text — MTA_WHISPER_MODEL: default small (good speed/accuracy on 16 GB); medium or large-v3-turbo for maximum accuracy, tiny/base for low-resource. Runs on the Apple GPU via MLX-Whisper.
The default stack is already optimal for 16 GB. To favour maximum quality (needs more RAM), escalate the extractor and re-digest:
MTA_EXTRACT_MODEL=qwen3:8b mta digest ~/docs --reset
</details>
<details> <summary><b>Legacy Bengali (Bijoy / SutonnyMJ) → Unicode, and the <code>convert</code> command</b></summary>
Millions of Bengali documents were typed with the Bijoy keyboard in SutonnyMJ (and 110+ other ANSI fonts); read as plain text they come out as mojibake. Memorised them All upgrades them to standard Unicode Bengali automatically during conversion, so digest / recall / embeddings work on real text instead of garbage.
- Font-aware for Office files (.docx/.pptx/.xlsx): only runs whose font is a Bijoy-family font are converted, so mixed English + Bengali documents come out clean (the English is left exactly as-is). Plain text uses a conservative density check that never touches ordinary English.
- A faithful pure-Python port of the Mukti converter: no new dependency, fully local, on by default (MTA_BANGLA_LEGACY=off to disable).
Convert a folder to Markdown (with the legacy upgrade) without building memory:
mta convert ~/docs # writes ~/docs/markdown_converted/*.md
mta convert ~/docs --out ~/md_out # …or choose the output folder
digest runs the very same conversion as its first step, so converting to Markdown is the default everywhere — reach for convert only when you want the .md files themselves.
</details>
<details> <summary><b>How it works under the hood</b></summary>
convert (files → Markdown, locally) → extract (entities, relations, facts) → graph (build + detect communities/themes) → summarise (layered: per-theme + a global synopsis) → embed (vectors for search) → materialise (memory.md, per-doc notes, mind map). recall embeds your question and returns the closest, capped, cited snippets. With no AI model available it falls back to fast classical techniques, so a digest always succeeds. See CHANGELOG.md and SECURITY.md for details.
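The recall step described above (embed the question, return the closest, capped, cited snippets) can be sketched with plain cosine similarity. This is a toy illustration, not the project's implementation: the tiny 2-d vectors, the `top_k` and `max_chars` parameters, and the function names are all made up for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(question_vec, memory, top_k=3, max_chars=200):
    """memory is a list of (source, text, vector); returns capped, cited snippets."""
    ranked = sorted(memory, key=lambda item: cosine(question_vec, item[2]), reverse=True)
    return [(source, text[:max_chars]) for source, text, _ in ranked[:top_k]]


# Toy memory with hand-written 2-d "embeddings".
memory = [
    ("a.pdf", "budget text", [1.0, 0.0]),
    ("b.docx", "holiday text", [0.0, 1.0]),
    ("c.md", "mixed", [1.0, 1.0]),
]
print(recall([1.0, 0.0], memory, top_k=2))
```

Capping both the number of snippets and their length is what keeps the result small no matter how large the memory grows.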
</details>
💻 Platforms
macOS (Apple-silicon optimised), Linux, and Windows · Python 3.10–3.12 · tested on all three in CI.
🙏 Credits & license
Built on the shoulders of MarkItDown, Ollama, NetworkX, and the Model Context Protocol. Optional community-detection extras (python-igraph, leidenalg) are GPL-licensed and not installed by the MIT core. See ACKNOWLEDGEMENTS.md.
MIT licensed · made by GRU-953. Issues and contributions welcome — start with SECURITY.md for the security model.
<div align="center"> <sub>100% local · token-free · free & open-source · your files never leave your machine.</sub> </div>