Agent memory is not mainly a storage problem.

Storing more text is easy. The harder question is deciding what should be remembered, when it should be retrieved, how strongly it should influence the current task, and when it should be forgotten.

That makes memory a retrieval policy problem.

Memory has different jobs

People often talk about agent memory as if it were one box. In practice, different kinds of memory do different work.

An agent may need:

  • Short-term context for the current conversation.
  • Persistent facts that should carry across sessions.
  • Episodic history of past interactions and outcomes.
  • Semantic knowledge about concepts, systems, and relationships.
  • Procedural memory for repeatable workflows.

Mixing these together creates confusion. A user preference is not the same thing as a runbook. A project fact is not the same thing as a past conversation. A workflow instruction is not the same thing as a summary.

The boundaries matter because each layer has a different failure mode.
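
To make that concrete, here is a minimal sketch in Python of keeping the layers distinct. The MemoryKind names and the MemoryRecord shape are illustrative choices, not any particular library's API.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
import time


class MemoryKind(Enum):
    SHORT_TERM = auto()  # current conversation only
    FACT = auto()        # persistent, carries across sessions
    EPISODE = auto()     # past interactions and outcomes
    SEMANTIC = auto()    # concepts, systems, relationships
    PROCEDURE = auto()   # repeatable workflows and runbooks


@dataclass
class MemoryRecord:
    kind: MemoryKind
    text: str
    created_at: float = field(default_factory=time.time)

    def survives_session_end(self) -> bool:
        # Short-term context is discarded at session end;
        # the other layers persist and fail in their own ways.
        return self.kind is not MemoryKind.SHORT_TERM
```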

Retrieval can make the agent worse

Bad memory is not neutral.

Over-aggressive retrieval fills the context with stale, irrelevant, or over-specific information. The agent may follow an old constraint, overweight a past decision, or treat a rough note as durable truth.

Under-aggressive retrieval has the opposite problem. The agent forgets important context, repeats past analysis, or asks for information it already has.

Useful memory sits between those failures:

  • Store only information worth reusing.
  • Attach enough metadata to judge relevance.
  • Retrieve based on the current task, not just keyword overlap.
  • Keep old facts reviewable and replaceable.
  • Separate stable instructions from searchable history.

This is one reason LLM Maintained Knowledge Bases Need Boundaries matters. A knowledge base should not be an undifferentiated pile of memory.
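
As a sketch of what that policy could look like in code (the topic field, the confidence scores, and the ninety-day cutoff are placeholder assumptions; a real system would score relevance with embeddings or a reranker, not exact topic match):

```python
from dataclasses import dataclass
import time


@dataclass
class Memory:
    text: str
    topic: str         # metadata attached at store time to judge relevance
    confidence: float  # 0.0 for a rough note, 1.0 for a verified fact
    created_at: float
    superseded: bool = False


def relevance(memory: Memory, task_topics: set[str]) -> float:
    # Placeholder: exact topic match stands in for embedding
    # similarity or a reranker.
    return 1.0 if memory.topic in task_topics else 0.0


def retrieve(store: list[Memory], task_topics: set[str],
             max_items: int = 5, max_age_days: float = 90.0) -> list[Memory]:
    # Retrieve for the current task, weighted by how much the
    # stored fact deserves to be trusted.
    now = time.time()
    scored = []
    for m in store:
        if m.superseded:
            continue  # still reviewable in the store, never injected
        if (now - m.created_at) / 86400 > max_age_days:
            continue  # stale facts need re-verification, not reuse
        score = relevance(m, task_topics) * m.confidence
        if score > 0:
            scored.append((score, m))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:max_items]]
```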

Always-loaded memory should be small

Some context is important enough to load every time.

That set should be small: current priorities, persistent preferences, durable constraints, and operating rules that shape most work. If the always-loaded memory becomes too large, it stops being guidance and starts becoming noise.

Larger history should usually stay searchable. The agent can pull it in when the task needs it.
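
Reusing the Memory and retrieve sketch above, context assembly might look like this. The always-loaded entries are made-up examples; the point is that the core stays tiny while everything else is pulled on demand.

```python
# Illustrative always-loaded core: priorities, preferences,
# constraints, operating rules. Kept deliberately small.
ALWAYS_LOADED = [
    "Current priority: finish the data migration.",
    "Preference: concise answers, metric units.",
    "Constraint: never write to the production database.",
]


def build_context(store: list[Memory], task_topics: set[str]) -> str:
    core = "\n".join(ALWAYS_LOADED)          # loaded every time
    recalled = retrieve(store, task_topics)  # searched only when needed
    return core + "\n\n" + "\n".join(m.text for m in recalled)
```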

This is close to the pattern in A Three-Layer Pattern for AI-Maintained Notes: raw sources, generated knowledge, and working memory need different treatment. Not everything belongs in the prompt.

Forgetting is a feature

Memory systems need deletion, expiry, and correction.

Without those, the agent accumulates stale assumptions. Old project status, outdated architecture choices, and abandoned plans can become an invisible drag on future work.

Forgetting does not have to mean losing history. It can mean moving old information out of always-loaded memory, marking it as superseded, or requiring fresh verification before use.
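
Continuing the same sketch, forgetting can be a metadata change rather than a hard delete. Both helpers here are illustrative, not a prescribed design.

```python
import time


def supersede(store: list[Memory], old: Memory, replacement: Memory) -> None:
    # History is preserved; the old record simply stops being retrieved.
    old.superseded = True
    store.append(replacement)


def needs_verification(memory: Memory, max_trusted_days: float = 30.0) -> bool:
    # Old facts are not deleted, but past a certain age the agent
    # should confirm them before acting on them.
    return (time.time() - memory.created_at) / 86400 > max_trusted_days
```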

The useful question is not “can the agent remember everything?”

The useful question is “can the agent retrieve the right thing, with the right confidence, at the right time?”