ADR-0003 — Five-layer memory composition
Status: Accepted · Date: 2026-05-09 · Decider: maintainer · Verification: A (architectural foundation)
Context
Most agent frameworks treat memory as RAG: a vector store + a retriever. You jam relevant chunks into context per turn, hope the model picks the right ones, and call it memory.
This is insufficient for agents that grow over time:
- No identity layer — the agent has no model of who you are across threads, only what you said in retrieved chunks
- No procedural distinction — code/skill knowledge mixes with episodic conversation in the same vector space, polluting both
- No reflection — there’s no place for the agent to think about past sessions and synthesize what was learned
- No working memory — the current turn’s context is undifferentiated from long-term memory; everything is “context”
A composed-runtime architecture wants five memory layers (working, identity, episodic, procedural, reflection) as separate concerns, each with its own writer, retention policy, and access pattern.
Decision
Memory composes as five layers:
| Layer | Storage | Writer | Cadence | Read by |
|---|---|---|---|---|
| L1 Working | Claude session state + Auto Memory file | Claude itself | Implicit per turn | Claude every boot |
| L2 Identity | Honcho (Workspace > Peer > Session) | Bridge per-turn | 1.5s dialectic + async ingest | Per turn (pre-spawn) |
| L3 Episodic | SQLite episodes table + Float32 BLOB embeddings | Bridge post-success | Per turn (after reply sent) | First turn of new threads |
| L4 Procedural | .claude/skills/<slug>/SKILL.md git-tracked + persona files | Reflection writer (drafts) → founder ✅ | At session boot | Claude Code skill auto-discovery |
| L5 Reflection | Drafts to disk + Slack DMs | claude -p --effort low at session end | At session end + nightly | Founder review, then layer-4 + identity update |
Each layer is independently writable, queryable, and revocable.
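As a rough illustration of what "independently writable, queryable, and revocable" could look like for one layer, here is a minimal sketch of the episodic layer using SQLite with float32 BLOB embeddings. The class name `EpisodicLayer`, the column names, the helper functions, and the brute-force cosine ranking are all assumptions made for the example; the ADR only specifies an episodes table with Float32 BLOB embeddings.

```python
import math
import sqlite3
from array import array


def pack_f32(vec):
    """Serialize a sequence of floats as a raw float32 BLOB."""
    return array("f", vec).tobytes()


def unpack_f32(blob):
    """Inverse of pack_f32: decode a float32 BLOB into a list of floats."""
    out = array("f")
    out.frombytes(blob)
    return list(out)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class EpisodicLayer:
    """Hypothetical episodic layer: one row per turn, embedding stored as a BLOB."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS episodes ("
            "id INTEGER PRIMARY KEY, thread TEXT, text TEXT, embedding BLOB)"
        )

    def write(self, thread, text, embedding):
        """Store one episode and return its row id."""
        cur = self.db.execute(
            "INSERT INTO episodes (thread, text, embedding) VALUES (?, ?, ?)",
            (thread, text, pack_f32(embedding)),
        )
        self.db.commit()
        return cur.lastrowid

    def query(self, embedding, k=3):
        """Return the k most similar episodes as (id, text) pairs."""
        rows = self.db.execute("SELECT id, text, embedding FROM episodes").fetchall()
        scored = [(cosine(embedding, unpack_f32(blob)), rid, text) for rid, text, blob in rows]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [(rid, text) for _, rid, text in scored[:k]]

    def revoke(self, episode_id):
        """Delete a single episode without touching any other layer."""
        self.db.execute("DELETE FROM episodes WHERE id = ?", (episode_id,))
        self.db.commit()


# Usage: write two episodes, recall the closest one, then revoke it.
layer = EpisodicLayer()
first_id = layer.write("thread-a", "discussed deploy plan", [1.0, 0.0])
layer.write("thread-b", "talked about lunch", [0.0, 1.0])
closest = layer.query([0.9, 0.1], k=1)
layer.revoke(first_id)
```

The full-table scan is only reasonable at small scale; a real deployment would likely want an index or a candidate filter before scoring, which the ADR does not specify.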
Consequences
Positive
- Layered concerns: identity changes don’t pollute episodic; skills don’t pollute persona; reflection has its own surface
- Different retention per layer: episodes can age out; persona is permanent; identity slowly evolves
- Different writers: Claude writes its own working memory; the bridge writes episodic; reflection writes skills (founder-gated); Honcho derives identity facts
- Replayability: episodes can be rewound (per SPEC-bitemporal-memory); identity can be exported (per Honcho’s API)
- Mythic mapping: the layers map to roles in the Sefirot tree (Working = Malkuth, Identity = Binah, Episodic = Chokmah, Procedural = Yesod, Reflection = Kether), making the architecture visually legible in the dashboard
Negative
- More complex than RAG — new readers must understand all five layers
- More moving parts — each layer is a potential failure point
- Migration cost — switching to Thoth requires mapping existing memory to the layered shape
Mitigation
- Layers are optional: Honcho can be disabled via HONCHO_DISABLED=true, episodic via EPISODIC_DISABLED=true, reflection via REFLECTION_DISABLED=true. The runtime degrades gracefully to fewer layers.
- Documentation explains each layer with diagrams + use cases
- Each layer’s write/read patterns are documented in docs/concepts/memory
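The opt-out flags could be resolved into a set of active layers along these lines. The three environment variable names come from this ADR; the function `active_layers`, the layer keys, the case-insensitive parsing of "true", and the assumption that the working and procedural layers have no disable flag are illustrative guesses, not the runtime's actual behavior.

```python
import os

# Canonical layer order, L1 through L5.
LAYER_ORDER = ("working", "identity", "episodic", "procedural", "reflection")

# Flag names are from the ADR; the mapping to layer keys is an assumption.
DISABLE_FLAGS = {
    "identity": "HONCHO_DISABLED",
    "episodic": "EPISODIC_DISABLED",
    "reflection": "REFLECTION_DISABLED",
}


def active_layers(env=None):
    """Return the layers that remain enabled, in L1..L5 order."""
    env = os.environ if env is None else env

    def disabled(layer):
        flag = DISABLE_FLAGS.get(layer)
        # Layers without a flag (working, procedural) are assumed always on.
        return flag is not None and env.get(flag, "").strip().lower() == "true"

    return [layer for layer in LAYER_ORDER if not disabled(layer)]


# Usage: with Honcho disabled, the runtime would compose four layers.
remaining = active_layers({"HONCHO_DISABLED": "true"})
```

Passing the environment as a plain mapping keeps the resolution testable without mutating `os.environ`.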
Alternatives considered
Alternative 1 — Single vector store (RAG)
Pros: Simple, well-understood, easy to swap providers.
Cons: Fails to distinguish identity from episode from skill; no clear path for reflection; no founder-gated procedural growth.
Rejected because: this is the failure mode we’re trying to escape.
Alternative 2 — Just Honcho + skills
Pros: Lighter, two well-defined layers.
Cons: No episodic recall; no reflection writers; no working memory distinction. Loses cross-thread continuity.
Rejected because: the cross-thread story is core to the value.
Alternative 3 — Three layers (working, episodic, procedural)
Pros: Simpler than five.
Cons: Conflates identity (per-peer model) with episodic (per-turn record); loses the reflection-as-its-own-layer surface.
Rejected because: the per-peer identity model is a real and distinct concern; reflection needs its own writers.
Alternative 4 — Letta-style hierarchical memory
Pros: OS-inspired, well-researched.
Cons: Single hierarchy doesn’t cleanly express the cross-cutting concerns (identity is cross-cutting; reflection is cross-cutting).
Rejected because: we want the agent to grow, not just to swap context blocks in/out.
Review trigger
Revisit this decision when:
- Research demonstrates a 6th layer is fundamental (e.g., a “world model” layer per Dreamer-style approaches)
- A specific layer is consistently disabled by users (signal we should restructure)
- A new memory backend ships that subsumes multiple layers (unlikely within a 2-3 year horizon)