
How I Added Persistent Memory to My Claude Code Telegram Bot in 20 Minutes

I run a Telegram bot that wraps Claude Code. It lets me talk to Claude from my phone, kick off tasks, manage projects — all from Telegram. It works great. One problem though: every time a session resets, Claude forgets everything.

Every. Single. Time.

“Hey, what were we working on yesterday?” Nothing. Blank stare. “Remember that bug we debugged for an hour?” Nope. Gone. Six months of conversations, decisions, debugging sessions — all evaporating the moment a session expires.

I fixed it with MemPalace by Milla Jovovich. Yes, that Milla Jovovich. WTF, right!? Twenty minutes, zero code changes, persistent memory that survives session resets. If you want to skip the story and set it up yourself: github.com/hamen/claude-code-telegram — the README has a step-by-step guide. It’s my fork of RichardAtCT/claude-code-telegram with two additions: runtime model switching (/model) and persistent memory via MemPalace.

The Problem Is Architectural

My bot uses the Claude Agent SDK. Sessions are tracked in SQLite with metadata — cost, turns, tools used — but zero conversation content is persisted. When a session times out or I hit /new, Claude starts fresh with no memory of what happened before.

This is fine for one-off questions. It’s terrible for ongoing projects where context is everything. I’d have entire conversations about architectural decisions, debug sessions that uncovered critical bugs, agreements on how to approach a refactor — and the next day, Claude would ask me what project I was working on.

The session storage tracks session_id, user_id, project_path, total_cost, total_turns, and message_count. Notice what’s missing: no conversation content, no decisions, no user preferences. It’s a billing ledger, not a memory.

Why Not Just Summarize Sessions?

The obvious approach: summarize each session when it ends, then inject the summaries into the next session’s system prompt. There’s actually an open PR on the upstream repo that does exactly this. It was closed because of implementation issues, but the architecture itself has deeper problems:

| | Session summaries | MemPalace |
|---|---|---|
| What’s stored | LLM-generated summaries (lossy) | Verbatim content + knowledge graph (lossless) |
| Search | None — dumps last 5 summaries into system prompt | Semantic search via ChromaDB |
| Cost | Extra Claude API call every /new | Zero — all local |
| Prompt bloat | Up to 2,000 chars injected every session | ~120 tokens on wake-up, then on-demand |
| Knowledge graph | No | Yes — entity tracking, contradiction detection |
| Code changes | 1,492 lines across 19 files | Zero |

Summaries are lossy by definition. Claude decides what’s worth keeping. MemPalace stores everything and uses semantic search to surface what’s relevant. The difference matters when you need to recall a specific decision from three months ago.

Enter MemPalace

I saw Ben Sigman’s tweet about MemPalace — an open-source AI memory system he built with Milla Jovovich. The pitch: give your AI persistent, searchable memory that runs entirely on your machine. No cloud, no subscription, no API key.

What caught my eye: it runs as an MCP server. My bot already had MCP infrastructure built in (I just never enabled it). This meant integration would be mostly configuration, not code.

MemPalace organizes memories into a “palace” structure inspired by the ancient Greek method of loci — wings (projects/people), halls (memory types), rooms (topics), closets (compressed summaries), and drawers (verbatim content). On disk, it’s ChromaDB for semantic search and SQLite for the knowledge graph. The MCP server exposes 19 tools that Claude can call to search, add, delete, query entities, write diary entries, and navigate between related topics.
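To make the hierarchy concrete, here’s a toy model of that palace structure as plain Python objects. This is only an illustration of the wings → halls → rooms → closets/drawers nesting described above — MemPalace’s actual storage is ChromaDB plus SQLite, not Python objects, and these class names are mine, not the library’s.

```python
from dataclasses import dataclass, field

@dataclass
class Drawer:
    """Verbatim content — the lossless leaf of the palace."""
    text: str

@dataclass
class Room:
    """A topic. Holds a compressed summary (the closet) plus drawers."""
    name: str
    closet_summary: str = ""
    drawers: list[Drawer] = field(default_factory=list)

@dataclass
class Hall:
    """A memory type, grouping rooms."""
    name: str
    rooms: dict[str, Room] = field(default_factory=dict)

@dataclass
class Wing:
    """A project or person — the top level of the palace."""
    name: str
    halls: dict[str, Hall] = field(default_factory=dict)

# Build one tiny branch: project wing -> decisions hall -> one topic room.
room = Room("chromadb", closet_summary="chose ChromaDB for search")
room.drawers.append(Drawer("decided to use ChromaDB for semantic search"))
hall = Hall("decisions", rooms={room.name: room})
wing = Wing("solar-inverter", halls={hall.name: hall})
```

The point of the nesting is that Claude can navigate coarse-to-fine: skim closet summaries cheaply, then open a drawer only when it needs the verbatim detail.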

The Integration: 20 Minutes

Here’s what I actually had to do:

1. Install it

pip install mempalace

2. Create an MCP config file (7 lines)

{
  "mcpServers": {
    "mempalace": {
      "command": "python",
      "args": ["-m", "mempalace.mcp_server"]
    }
  }
}

3. Set three environment variables

ENABLE_MCP=true
MCP_CONFIG_PATH=config/mcp.json
DISABLE_TOOL_VALIDATION=true
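For intuition on what these flags drive, here’s a sketch of how a bot might gate its MCP setup on them. The helper name and exact logic are hypothetical — this is not the actual code from the bot, just the shape of “if enabled, read the JSON config from step 2”:

```python
import json
import os

def load_mcp_servers() -> dict:
    """Return configured MCP servers, or {} when MCP is disabled.

    Hypothetical helper mirroring the three settings above; the real
    bot's internals may differ.
    """
    if os.environ.get("ENABLE_MCP", "false").lower() != "true":
        return {}
    config_path = os.environ.get("MCP_CONFIG_PATH", "config/mcp.json")
    with open(config_path, encoding="utf-8") as f:
        return json.load(f).get("mcpServers", {})
```

With `ENABLE_MCP=true` and the config from step 2, this would return a dict containing the `mempalace` server entry; with MCP disabled, an empty dict.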

4. Initialize the palace

mempalace init /path/to/your/project

5. Add memory instructions to CLAUDE.md

This is the key part — and the insight that took me the longest to figure out. Without explicit instructions, Claude will acknowledge information but not actually save it. It’ll say “Got it, I’ll remember that” and then forget everything on the next session reset. You need a CLAUDE.md in your working directory that tells Claude to proactively use the memory tools:

  • Call mempalace_status on wake-up to load the palace
  • Use mempalace_add_drawer and mempalace_kg_add when told to remember things
  • Call mempalace_search before answering questions about past work
  • Write a diary entry with mempalace_diary_write at the end of meaningful sessions

The instructions are the product. The tools alone aren’t enough.
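A minimal CLAUDE.md along these lines works; the exact wording below is my paraphrase, not the template from the fork’s README, but the tool names are the real MemPalace MCP tools listed above:

```markdown
# Memory Protocol

On every wake-up:
1. Call `mempalace_status` to load the palace before doing anything else.

During the session:
- When I say "remember", store it with `mempalace_add_drawer` and add
  facts to the knowledge graph with `mempalace_kg_add`.
- Before answering any question about past work, call `mempalace_search`.

At the end of a meaningful session:
- Write a short diary entry with `mempalace_diary_write`.
```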

6. Restart the bot

That’s it. Zero code changes to the bot itself. The existing MCP infrastructure just picked it up.

The full setup with every step, config snippet, and CLAUDE.md template is in the README of my fork.

The Test

First, I sent the bot: “remember that my name is Ivan and I’m working on a solar inverter project with ZCS Azzurro 6kW hybrid.”

In the bot’s verbose output, I could see it working — it called mempalace_status to check the palace, mempalace_search to look for existing context, then mempalace_add_drawer to store the information and mempalace_kg_add to add facts to the knowledge graph. Real MCP tool calls, not just in-session acknowledgment.

I verified the data was persisted by querying the knowledge graph directly:

Ivan → has_name → Ivan
Ivan → works_on → ZCS Azzurro 6kW hybrid solar inverter project

Then the real test: /new to reset the session completely. Fresh start. Brand new Claude session. “What do you know about me?”

It came back with my name, my projects, technical details about my setup — everything it had stored. Pulled from the local ChromaDB database, not from session context.

Persistent memory that survives session resets. That’s it. That’s the whole point.

What Claude Actually Sees

When Claude wakes up in a new session, the CLAUDE.md instructions tell it to call mempalace_status. The response includes the full memory protocol and the AAAK dialect spec (a 30x compression format for fitting more context into fewer tokens). From that point, Claude knows:

  • What wings exist (people, projects)
  • How many drawers are stored
  • How to search, add, and manage memories
  • The AAAK shorthand for efficient storage

On a search query, Claude gets back relevant drawers ranked by semantic similarity. On a knowledge graph query, it gets structured triples with temporal validity — who works on what, when decisions were made, what changed.
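To illustrate what “ranked by semantic similarity” means, here’s a deliberately tiny stand-in: bag-of-words cosine similarity over a few drawers, stdlib only. Real MemPalace uses ChromaDB embeddings, which match on meaning rather than shared words — this toy only matches shared words, but the ranking mechanic is the same:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_drawers(query: str, drawers: list[str]) -> list[str]:
    """Return drawers ordered by similarity to the query (best first)."""
    q = Counter(query.lower().split())
    return sorted(
        drawers,
        key=lambda d: cosine(q, Counter(d.lower().split())),
        reverse=True,
    )

drawers = [
    "decided to use ChromaDB for semantic search",
    "debugged the solar inverter modbus timeout for an hour",
    "agreed on the refactor plan for the session manager",
]
# The inverter drawer ranks first for an inverter question.
print(rank_drawers("what did we decide about the inverter bug", drawers)[0])
```

An embedding model would also surface the first drawer for a query like “which vector store did we pick”, where no words overlap — that’s the gap between this toy and ChromaDB.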

The bot’s system prompt stays lean. Memory loads on demand, not all at once.

What I’d Watch Out For

  • You MUST instruct Claude to use the tools. I cannot stress this enough. Without a CLAUDE.md telling Claude to proactively save and search, it will just store things in session context and lose them on reset. I burned my first test on this — Claude said “I’ll remember that” and then didn’t call any MCP tools. The CLAUDE.md fixed it immediately.
  • ChromaDB adds memory overhead. My bot’s memory usage increases while the MCP server is running. Monitor this if you’re on a constrained machine. The MCP server process starts and stops with each Claude session (stdio transport), so it’s not permanently resident.
  • It’s brand new. MemPalace is version 3.0.0, just released. The init command has interactive prompts that don’t fully support --yes for all steps yet. Expect rough edges and rapid iteration.
  • Palace directory discovery matters. The MCP server inherits its working directory from the Claude SDK options. Make sure mempalace init was run in the same directory your bot’s APPROVED_DIRECTORY points to, or the server won’t find the .mempalace folder.
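That last pitfall is cheap to guard against. A hypothetical pre-flight check like this — not something the bot does today — would catch the directory mismatch at startup instead of letting Claude silently run without memory:

```python
import os
from pathlib import Path

def palace_exists(approved_directory: str) -> bool:
    """True if `mempalace init` was run in the given directory.

    Checks for the .mempalace folder the MCP server looks for; the
    helper name and check are my own, not part of the bot or MemPalace.
    """
    return (Path(approved_directory) / ".mempalace").is_dir()

approved = os.environ.get("APPROVED_DIRECTORY", ".")
if not palace_exists(approved):
    print(f"warning: no .mempalace in {approved}; run `mempalace init {approved}`")
```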

FAQ

Does this work with the upstream claude-code-telegram, or only your fork?

Both. The MCP infrastructure exists in the upstream RichardAtCT/claude-code-telegram project. My fork just has the setup guide in the README and the /model command for switching between Opus, Sonnet, and Haiku from Telegram.

How much disk space does it use?

Minimal. ChromaDB stores embeddings, and the knowledge graph is SQLite. After initial setup with a few memories, it’s under 10MB. It grows with usage but stays small — these are text memories, not media files.

Can I use MemPalace with other AI tools, not just Telegram bots?

Yes. MemPalace works with anything that supports MCP servers — Claude Code desktop, Cursor, or any local model via the CLI. The Telegram bot is just one integration point.

Does it work with local models like Llama or Mistral?

MemPalace itself is model-agnostic. The MCP server doesn’t call any AI APIs. For local models that don’t support MCP yet, MemPalace has a wake-up command that dumps context to a text file you can paste into your model’s system prompt.

What happens if the palace gets corrupted?

The palace is just ChromaDB + SQLite files in ~/.mempalace/palace. Back it up like any other local database. If it gets corrupted, worst case is you lose memories and start fresh — your bot still works, it just forgets.

Bottom Line

If you’re running any kind of AI agent that needs to remember things across sessions — a Telegram bot, a development assistant, a project manager — MemPalace is worth trying. The MCP integration means you can bolt it onto anything that supports MCP servers without changing your existing code.

Twenty minutes of configuration for persistent AI memory. Running locally on my machine. Open source. I’ll take that trade any day.

Get it running: github.com/hamen/claude-code-telegram — fork it, follow the MemPalace setup in the README, and your bot remembers everything.


MemPalace is MIT licensed by Ben Sigman and Milla Jovovich. The Telegram bot is a fork of RichardAtCT/claude-code-telegram. I’m Ivan Morgillo — I write about Linux, dev tools, and the things I build.