
MemPalace: The Local-First AI Memory System With 48,000 Stars

April 19, 2026 · 9 min read · By T.W. Ghost
Tags: MemPalace · AI Memory · Local-First · ChromaDB · MCP · Claude Code · Knowledge Graph · RAG

Every AI Assistant Forgets Everything

Open a new Claude Code session. It has no idea what you worked on yesterday. Start a fresh ChatGPT thread. The decisions you made last week are gone. Your AI coding partner is brilliant for sixty minutes, then amnesiac again.

People solve this with three types of systems:

  • Flat files. CLAUDE.md, README notes, rolling session logs. Simple, but retrieval is keyword grep.
  • Graph RAG. LightRAG, GraphRAG, AnythingLLM. Semantic search plus relationships, usually on a server.
  • Structured local memory. This is where MemPalace lives.

What MemPalace Actually Is

MemPalace is an open-source AI memory system that stores your conversations verbatim, organizes them with a physical-space metaphor, and retrieves them via semantic search. It runs on your laptop, in Python, with no mandatory API keys.

The structure is the interesting part. Instead of dumping everything into one flat vector store, MemPalace organizes memory like a physical building:

  • Wings are people or projects. The "LLM Match Maker" wing. The "client Acme Corp" wing.
  • Rooms are topics within a wing. "Stripe checkout bugs." "VAT strategy meetings."
  • Drawers are where the actual content lives. Verbatim conversation chunks.

When you ask a question, retrieval is scoped rather than flat-corpus. You can ask "what did I decide about pricing in the LLM Match Maker wing" and it searches only that wing's rooms and drawers. Much less noise than a global semantic search across every note you ever wrote.
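To make the scoping concrete, here is a minimal sketch in plain Python. The names (`palace`, `scoped_search`) and the keyword-overlap scoring are illustrative assumptions, not MemPalace's real API; the point is only that retrieval ranks drawers inside one wing instead of the whole corpus:

```python
# Hypothetical wing -> room -> drawer layout. MemPalace's real store is
# ChromaDB with embeddings; keyword overlap stands in for semantic scoring.
palace = {
    "LLM Match Maker": {                      # wing
        "pricing": [                          # room, holding verbatim drawers
            "Decided on $29/mo flat pricing for the starter tier.",
            "Rejected usage-based pricing after churn modeling.",
        ],
        "stripe-bugs": [
            "Checkout failed on 3DS cards; fixed by upgrading stripe-python.",
        ],
    },
    "Acme Corp": {
        "vat": ["VAT registration threshold discussed with accountant."],
    },
}

def scoped_search(palace, wing, query):
    """Rank drawers in one wing by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    hits = []
    for room, drawers in palace[wing].items():
        for text in drawers:
            score = len(terms & set(text.lower().split()))
            if score:
                hits.append((score, room, text))
    return [(room, text) for score, room, text in sorted(hits, reverse=True)]

results = scoped_search(palace, "LLM Match Maker", "what did I decide about pricing")
print(results[0][0])  # top hit comes from the "pricing" room
```

Because the search never touches the "Acme Corp" wing, a pricing question about one project cannot surface another client's notes.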


The Numbers Matter

MemPalace hits 96.6% retrieval recall on LongMemEval, a benchmark specifically for long-term memory in AI agents. For comparison, naive flat RAG usually sits in the 60-75% range on the same test. The gains come from combining semantic search with the scoped structure and a temporal knowledge graph layer that tracks when facts were true.
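For readers who have not worked with retrieval benchmarks, recall here is just the fraction of test questions whose relevant memory shows up in the top-k retrieved results. A simplified single-gold version of that score (an assumption about the metric's shape, not LongMemEval's exact harness):

```python
def recall_at_k(retrieved, gold, k=5):
    """Fraction of queries whose gold memory id appears in the top-k results.

    `retrieved` maps query id -> ranked list of memory ids;
    `gold` maps query id -> the single relevant memory id.
    """
    hits = sum(1 for q, want in gold.items() if want in retrieved[q][:k])
    return hits / len(gold)

retrieved = {"q1": ["m3", "m1"], "q2": ["m9", "m2"], "q3": ["m4"]}
gold = {"q1": "m1", "q2": "m2", "q3": "m7"}
print(recall_at_k(retrieved, gold, k=2))  # 2 of 3 gold ids found -> 0.666...
```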

It ships with 29 MCP (Model Context Protocol) tools exposing palace operations, agent diaries, and cross-wing navigation. Claude Code can call them natively. So can any MCP-aware agent.

And it is local-first. No API key required. ChromaDB by default for vector storage. SQLite for the temporal graph. Your conversations never leave your machine unless you want them to. The repo ships under MIT license with tens of thousands of GitHub stars and an active Discord.


The Part That Makes It Click

Two features make MemPalace feel different from a "better note taker":

Temporal knowledge graph. Every fact has a validity window. If you decided on Monday that your API rate limit was 100 requests per minute, and on Friday you raised it to 500, MemPalace remembers both decisions and their dates. Ask it today what the rate limit is, and it returns 500. Ask what it was on Tuesday, and it returns 100. Flat notes cannot do this without manual versioning.
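The mechanism behind validity windows is simple enough to sketch. MemPalace keeps its temporal graph in SQLite; the schema below is a guess at the idea, not the project's actual tables. Each fact carries a `valid_from`/`valid_to` window, and an "as of" query filters on it:

```python
import sqlite3

# Hypothetical temporal fact table: setting a new value closes the old
# version's validity window instead of overwriting it.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE facts (
    key TEXT, value TEXT, valid_from TEXT, valid_to TEXT)""")

def set_fact(key, value, valid_from):
    db.execute("UPDATE facts SET valid_to = ? WHERE key = ? AND valid_to IS NULL",
               (valid_from, key))
    db.execute("INSERT INTO facts VALUES (?, ?, ?, NULL)", (key, value, valid_from))

def fact_as_of(key, date):
    row = db.execute(
        """SELECT value FROM facts
           WHERE key = ? AND valid_from <= ?
             AND (valid_to IS NULL OR valid_to > ?)""",
        (key, date, date)).fetchone()
    return row[0] if row else None

set_fact("api_rate_limit", "100/min", "2026-04-13")  # Monday's decision
set_fact("api_rate_limit", "500/min", "2026-04-17")  # raised on Friday

print(fact_as_of("api_rate_limit", "2026-04-14"))  # Tuesday -> 100/min
print(fact_as_of("api_rate_limit", "2026-04-20"))  # today   -> 500/min
```

Both decisions survive; which one you get back depends on the date you ask about.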

Agent diaries. When an AI agent (Claude Code, Cursor, etc.) takes actions in your project, the diary captures what happened. Which files were edited, which commands were run, what errors came up, what fixed them. It is audit log plus memory plus retrievable context in one structure.
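The shape of a diary entry might look something like the sketch below. The field names and `what_fixed` helper are invented for illustration; only the idea (one structured record per agent action, queryable by error text) comes from the source:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DiaryEntry:
    """One recorded agent action: what ran, what broke, what fixed it."""
    action: str
    files: list[str] = field(default_factory=list)
    command: str = ""
    error: str = ""
    fix: str = ""
    when: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

diary: list[DiaryEntry] = []
diary.append(DiaryEntry(
    action="edit", files=["checkout.py"],
    command="pytest tests/test_checkout.py",
    error="KeyError: 'payment_intent'",
    fix="guarded missing key before confirming the intent",
))

def what_fixed(diary, error_fragment):
    """Return recorded fixes for entries whose error matches the fragment."""
    return [e.fix for e in diary if error_fragment.lower() in e.error.lower()]

print(what_fixed(diary, "payment_intent"))
```

Next session, the agent can ask "have I seen this error before?" and get the recorded fix back instead of rediscovering it.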

Together, those two features mean the system knows not just what you talked about, but what actually got done and when.


Installing Is Not a Weekend Project

MemPalace is Python 3.9+. You install it via pip, point it at a directory, and it starts watching. If you use Claude Code, enable the auto-save hook and every session gets captured automatically. If you want MCP tools exposed, start the server and add it to your Claude Code config.

The defaults are sane: ChromaDB for vectors, SQLite for graph, a small embedding model that runs on CPU. If you want cloud embeddings later you can plug in OpenAI, but nothing forces you to.

First-run ingestion of existing notes is the slow part. A couple gigabytes of markdown might take an hour to embed and graph. After that, incremental updates are fast.
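One plausible mechanism for the fast incremental updates (an assumption about the approach, not MemPalace's actual code): hash every note on each run and only re-embed files whose content hash changed since last time.

```python
import hashlib
import tempfile
from pathlib import Path

def changed_files(notes_dir, seen_hashes):
    """Return paths whose content hash differs from the previous run."""
    changed = []
    for path in sorted(Path(notes_dir).glob("**/*.md")):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if seen_hashes.get(str(path)) != digest:
            seen_hashes[str(path)] = digest  # remember for the next run
            changed.append(path)
    return changed

notes = Path(tempfile.mkdtemp())
(notes / "ideas.md").write_text("pricing notes")
seen = {}
print(len(changed_files(notes, seen)))  # first run: 1 file to embed
print(len(changed_files(notes, seen)))  # unchanged re-run: 0
```

The expensive embedding step then only runs over the returned list, which is why the first ingestion of gigabytes is slow but every run after it is nearly free.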


MemPalace vs Claude Code Native Memory

Claude Code ships with a built-in memory system. CLAUDE.md files per project plus auto-memory under ~/.claude/projects/. It is flat markdown. Simple, fast, and already in your hands.


MemPalace is a different tier:

| | Claude Code native memory | MemPalace |
|---|---|---|
| Storage | Flat markdown files | ChromaDB + SQLite graph |
| Retrieval | Grep, always-loaded CLAUDE.md | Semantic search, scoped by wing |
| Cross-session | Yes, via auto-memory | Yes, with richer structure |
| Temporal awareness | None | Built-in (validity windows) |
| Capacity | Limited by context window | Unbounded, scales with disk |
| Cross-project | Manual | Native (wings handle scoping) |
| Cost | Free | Free (local) or small embedding API bill |
| Setup time | Zero | ~30 min |

Use Claude Code native memory when you have one or two projects and a few hundred decisions to remember. Use MemPalace when you have five, ten, thirty projects and need scoped recall.


MemPalace vs LightRAG

We deployed LightRAG on a VPS earlier this year to solve the same memory problem. So which should you use?

| | MemPalace | LightRAG |
|---|---|---|
| Hosting | Local (laptop) | Server / VPS |
| Language | Python 3.9+ | Python, Docker-friendly |
| Vector store | ChromaDB (pluggable) | Internal, pluggable |
| Graph | SQLite, temporal | Neo4j-style, entity-relation |
| Access | Local only by default | HTTP API, multi-device |
| Phone access | No (unless you expose it) | Yes, native |
| Organization | Wings/rooms/drawers | Flat corpus + knowledge graph |
| MCP | 29 tools | Via third-party wrappers |
| Use case | Personal, single-operator | Team or multi-device |

Short version: MemPalace for a single developer, LightRAG for a server you query from anywhere. They are complementary, not competitive. A common setup is MemPalace locally as the per-machine diary and LightRAG on the VPS as the shared long-term store.


When MemPalace Is the Right Pick

You want privacy. Your conversations never leave your machine. No OpenAI logging, no Anthropic retention, no "we improve our models with your data." For work involving client code, medical data, or anything sensitive, this matters.

You work offline. A VPS-based system goes down when your internet does. MemPalace keeps working.

You want Claude Code to remember itself. The auto-save hook captures every session. Next time you open Claude Code, it can query what happened last time, what was decided, what bugs got fixed. Without you managing any files.

You want to avoid API bills. Semantic search locally costs you disk and CPU, not tokens. Heavy users of LightRAG or Gemini embeddings can burn $20-100/month on embeddings. MemPalace with local embedding models costs zero.


When MemPalace Is the Wrong Pick

You need team access. MemPalace lives on your laptop. If three people need to query the same memory, you want a server.

You work from multiple machines. Your work laptop at the office and your personal laptop at home each have their own MemPalace. You can sync them, but it is not native.

You want phone access. MemPalace is not designed to be queried from a Telegram bot or a mobile app. LightRAG is.

You hate managing Python environments. Pip, conda, pyenv, venv: one of these is going to eat a weekend if you do not already have it dialed in.


The Bigger Picture

Your AI assistant does not have a memory problem. It has a retrieval problem. Everything you need is written down somewhere. In a session log, a markdown note, a Slack thread, a commit message. The question is whether the AI can find it in the next conversation.

Flat files answer that question with "maybe, if you remember the keyword." Graph RAG answers it with "probably, if the traversal reaches it." MemPalace answers it with "yes, and scoped to the right wing, and dated to the right moment in time."

For a solo developer or researcher who wants that layer without standing up a server, MemPalace is the cleanest starting point in the AI memory space today.


Start Here

The repo is at github.com/MemPalace/mempalace. MIT licensed. Python 3.9+. ChromaDB or BYO vector store.

If you want structured AI memory on a server (not local), our LightRAG Pro Track walks through VPS deployment, Claude Code + MCP integration, phone access via Telegram, and embedding strategy. Covers the server-side piece MemPalace does not.

Not sure which memory system fits your workflow? Take the free quiz and get matched in two minutes.