Agent Memory: Context That Survives a Restart
Humans have working, long-term, and procedural memory — an agent needs the same. In 2026 memory stopped being an optimization and became an architectural requirement: without it every session starts from scratch, and the user explains the same thing for the third time in a row.
Advanced · AI Agents · 25 min · Mem0, JSON file storage, Claude API
1
Short, long, working — an agent has three memories
In humans, working memory holds the current conversation, long-term memory stores facts about the world, and procedural memory holds skills like riding a bike. An agent has the same structure: context window = working memory, external storage = long-term, system prompt with rules = procedural.
Inside long-term, three more types exist: episodic (what happened in a specific session), semantic (stable facts about the user or project), procedural (how to do a task correctly). Mixing them in one store is a classic mistake: the fact "user prefers TypeScript" lives forever, while "today I fixed an auth bug" is stale in a week. Separate them by type from day one.
Episodic
- What happened in a session
- Example: "yesterday we fixed a Redis bug"
- Goes stale fast
Semantic
- Stable facts
- Example: "project uses Next.js 16, pnpm"
- Changes rarely
Procedural
- How to do X correctly
- Example: "run npm run lint before commit"
- Lives in the system prompt
Mixing types is the first mistake. Store them separately, otherwise the agent will blend facts with experience, and procedural rules will drown in stale events.
2
File or vector DB? Most projects get this wrong
Counterintuitive but true: for most agents, file-based memory outperforms vector retrieval. The reason is simple — if the entire memory fits in 5–10K tokens, loading it whole is cheaper and more reliable than searching for relevant chunks. A vector DB adds infrastructure, retrieval miss risk, and one more service that can fail.
When is a file justified? A personal assistant, a coding agent for one project, a support bot with a fixed FAQ base. When do you actually need a vector DB? When you have thousands of documents, hundreds of users with different context, or a sub-second latency requirement. Choose by real scale, not "future-proofing".
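The "fits in 5–10K tokens" rule of thumb can be made concrete with a quick check. A sketch assuming roughly 4 characters per token, a common approximation for English text:

```python
def fits_in_file(memory_text: str, budget_tokens: int = 10_000) -> bool:
    """True if the whole memory fits the token budget, in which case
    a plain file beats a vector DB. ~4 chars/token is a rough
    heuristic; a real tokenizer gives exact counts."""
    approx_tokens = len(memory_text) / 4
    return approx_tokens <= budget_tokens
```

If this returns True for your entire memory, loading it whole is simpler and more reliable than any retrieval pipeline.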
File-based or vector DB?
Memory < 10K tokens → a file is enough
Single user / single project → a file
Searching across thousands of documents → vector DB
Hard latency requirement (< 1 s) → vector DB
"Future-proofing" without real data → not a reason
Mem0's 2026 benchmark: file-based memory scores 72.9% accuracy vs 66.9% for vector retrieval, but vector answers in 0.2 s instead of 17 s. Pick by your latency requirement, not by memory size.
3
An index, not a dump: how the agent decides what to read
The naive approach: one giant memory.md loaded every session. A month in, it's 30K tokens, half of it stale, the agent burns context on noise. The right pattern is an index: MEMORY.md lists files with one line per file. Details live in separate files, loaded only when needed.
This works like a book's table of contents. You don't read the whole book to find the authentication chapter — you check the TOC and open the right page. The agent does the same: reads the index, picks 1–2 relevant files based on the user's task, loads only those. Context stays lean, and memory scales without quality degradation.
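The index-then-load loop can be sketched in a few lines, assuming a MEMORY.md where each line is `- filename.md: one-line summary` (the format is illustrative). Relevance here is naive keyword overlap; a real agent would ask the LLM to pick files from the index instead.

```python
from pathlib import Path

MEMORY_DIR = Path("memory")

def read_index() -> dict[str, str]:
    """Parse MEMORY.md: each line is '- filename.md: one-line summary'."""
    index = {}
    for line in (MEMORY_DIR / "MEMORY.md").read_text().splitlines():
        line = line.strip().lstrip("- ")
        if ":" in line:
            name, summary = line.split(":", 1)
            index[name.strip()] = summary.strip()
    return index

def load_relevant(task: str, max_files: int = 2) -> dict[str, str]:
    """Pick the few files whose summaries overlap the task, load only those."""
    index = read_index()
    words = set(task.lower().split())
    scored = sorted(
        index,
        key=lambda name: -len(words & set(index[name].lower().split())),
    )
    chosen = scored[:max_files]
    return {name: (MEMORY_DIR / name).read_text() for name in chosen}
```

Only the index plus one or two detail files ever enter the context window; everything else stays on disk.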
Session start
Read MEMORY.md (index)
Pick relevant files
Load only chosen
Work on the task
session_start:
read MEMORY.md (index: list of files + one line per file)
from the user's task → pick the relevant files
load only the chosen ones → not everything in a row
during the session:
learned a new fact → write it to file.md
add a link to MEMORY.md (one line)
4
What to save, and what to squeeze out
Memory is not an event log, it's a distillation of what can't be recovered from the current state of the world. Four types worth saving: user profile (name, role, preferences), feedback and corrections (what the user fixed — this teaches the agent), project state (stack, conventions, open questions), references to external resources.
What NOT to save? Code — it's already in the repo. Git history — `git log` remembers. Intermediate task state — once solved, it's trash. Debugging noise — attempts, errors, reverts. Simple rule: if a fact is recoverable via `grep` or `git log`, it's not memory, it's a duplicate. Save what lives only in the user's head.
| Save | Do not save |
|---|---|
| Profile: name, role, stack | Code — it's already in the repo |
| User corrections and their "not like that" | Git history — it's in `git log` |
| Project conventions, open questions | Intermediate task state |
| Links to external resources and docs | Debug noise: attempts, reverts |
If a fact is recoverable via `git log` or `grep` — don't save it. Memory is for what's NOT in the code.
5
Memory goes stale — and that's more dangerous than having none
An agent without memory re-asks for context every time — slow but safe. An agent with stale memory acts confidently on wrong data — and that's far worse. Typical scenario: memory says "AuthService.ts contains login logic" → the agent edits this file → but three months ago it was renamed to auth/service.ts, logic moved, the old file is an empty shell. The agent confidently breaks what's no longer there.
Three defenses against staleness. First, verify before acting: before relying on a fact from memory, check it in the current state (grep, ls, API call). Memory is a hint, not a source of truth. Second, periodic review: once a month walk through MEMORY.md and delete what's no longer relevant. This can't be automated — it's manual hygiene. Third, immediate deletion on detected error: if the agent finds something wrong in memory, remove it right away, not "fix later". Every stale fact is a landmine under the next session.
Fundamental difference: the current state of code always wins over memory. Memory is a snapshot of the past, code is the truth of the present. Conflict? Always pick the present.
Good habit: before using a fact from memory, verify it in the code. "Memory says X exists" ≠ "X exists now".
Result
You understand that agent memory is an architecture of three types (episodic, semantic, procedural), why a file is more often right than a vector DB, and how an index with separate files scales without context bloat. The key point: memory goes stale, and verifying before acting matters more than the act of saving itself.