ADR-0003 — Five-layer memory composition

Status: Accepted · Date: 2026-05-09 · Decider: maintainer · Verification: A (architectural foundation)

Context

Most agent frameworks treat memory as RAG: a vector store + a retriever. You jam relevant chunks into context per turn, hope the model picks the right ones, and call it memory.

This is insufficient for agents that grow over time:

  • No identity layer — the agent has no model of who you are across threads, only what you said in retrieved chunks
  • No procedural distinction — code/skill knowledge mixes with episodic conversation in the same vector space, polluting both
  • No reflection — there’s no place for the agent to think about past sessions and synthesize what was learned
  • No working memory — the current turn’s context is undifferentiated from long-term memory; everything is “context”

A composed-runtime architecture wants these four concerns, plus episodic recall, as five separate layers, each with its own writer, retention policy, and access pattern.

Decision

Memory composes as five layers:

| Layer | Storage | Writer | Cadence | Read by |
|---|---|---|---|---|
| L1 Working | Claude session state + Auto Memory file | Claude itself | Implicit per turn | Claude every boot |
| L2 Identity | Honcho (Workspace > Peer > Session) | Bridge per-turn | 1.5s dialectic + async ingest | Per turn (pre-spawn) |
| L3 Episodic | SQLite episodes table + Float32 BLOB embeddings | Bridge post-success | Per turn (after reply sent) | First turn of new threads |
| L4 Procedural | .claude/skills/<slug>/SKILL.md git-tracked + persona files | Reflection writer (drafts) → founder ✅ | At session boot | Claude Code skill auto-discovery |
| L5 Reflection | Drafts to disk + Slack DMs | claude -p --effort low at session end | At session end + nightly | Founder review, then layer-4 + identity update |

Each layer is independently writable, queryable, and revocable.
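
To make "independently writable, queryable, and revocable" concrete, here is a minimal sketch of what a shared layer interface could look like. Everything in it is hypothetical: the `MemoryLayer` protocol, the `InMemoryLayer` stand-in, and the key names are illustrative assumptions, not the runtime's actual API. Real layers would delegate to their own backends (Honcho, SQLite, skill files, and so on).

```python
from typing import Protocol


class MemoryLayer(Protocol):
    """Hypothetical interface: each layer is written, queried, and revoked on its own."""

    def write(self, key: str, value: str) -> None: ...
    def query(self, term: str) -> list[str]: ...
    def revoke(self, key: str) -> bool: ...


class InMemoryLayer:
    """Toy stand-in for a real backend; used only for illustration."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._items: dict[str, str] = {}

    def write(self, key: str, value: str) -> None:
        self._items[key] = value

    def query(self, term: str) -> list[str]:
        # Naive substring match; a real layer would use its own retrieval.
        return [v for v in self._items.values() if term.lower() in v.lower()]

    def revoke(self, key: str) -> bool:
        # True only if an item was actually removed.
        return self._items.pop(key, None) is not None


LAYER_NAMES = ("working", "identity", "episodic", "procedural", "reflection")
layers: dict[str, MemoryLayer] = {name: InMemoryLayer(name) for name in LAYER_NAMES}

layers["identity"].write("pref:tone", "Prefers terse replies")
layers["episodic"].write("ep:1", "Asked about terse reply formatting")

# Revoking from one layer leaves every other layer untouched.
layers["identity"].revoke("pref:tone")
```

The point of the sketch is the isolation: after the revoke, a query for "terse" returns nothing from the identity layer but still finds the episode.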

Consequences

Positive

  • Layered concerns: identity changes don’t pollute episodic; skills don’t pollute persona; reflection has its own surface
  • Different retention per layer: episodes can age out; persona is permanent; identity slowly evolves
  • Different writers: Claude writes its own working memory; the bridge writes episodic; reflection writes skills (founder-gated); Honcho derives identity facts
  • Replayability: episodes can be rewound (per SPEC-bitemporal-memory); identity can be exported (per Honcho’s API)
  • Mythic mapping: the layers map to roles in the Sefirot tree (Working = Malkuth, Identity = Binah, Episodic = Chokmah, Procedural = Yesod, Reflection = Kether), making the architecture visually legible in the dashboard
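
The episodic layer's storage shape (a SQLite episodes table with Float32 BLOB embeddings) and its "episodes can age out" retention can be sketched as follows. This is an illustration under assumptions: the column names, the `add_episode`, `recall`, and `age_out` helpers, and the brute-force cosine ranking are all hypothetical, and the sketch packs floats in native byte order, which may differ from what the bridge writes.

```python
import math
import sqlite3
import time
from array import array


def to_blob(vec):
    # Pack floats as 32-bit values (native byte order) for storage in a BLOB column.
    return array("f", vec).tobytes()


def from_blob(blob):
    out = array("f")
    out.frombytes(blob)
    return list(out)


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Hypothetical schema; only "episodes table + Float32 BLOB embeddings" comes from the ADR.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE episodes ("
    "id INTEGER PRIMARY KEY, thread TEXT, text TEXT, created_at REAL, embedding BLOB)"
)


def add_episode(thread, text, vec, created_at=None):
    ts = time.time() if created_at is None else created_at
    db.execute(
        "INSERT INTO episodes (thread, text, created_at, embedding) VALUES (?, ?, ?, ?)",
        (thread, text, ts, to_blob(vec)),
    )


def recall(query_vec, k=3):
    # Brute-force scan: rank every stored episode by cosine similarity.
    rows = db.execute("SELECT text, embedding FROM episodes").fetchall()
    ranked = sorted(rows, key=lambda r: cosine(query_vec, from_blob(r[1])), reverse=True)
    return [text for text, _ in ranked[:k]]


def age_out(max_age_s, now=None):
    # Retention: delete episodes older than max_age_s seconds; returns rows removed.
    now = time.time() if now is None else now
    cur = db.execute("DELETE FROM episodes WHERE created_at < ?", (now - max_age_s,))
    return cur.rowcount
```

Because retention is a plain `DELETE` on one table, aging out episodes cannot touch identity or persona data, which is the separation the bullets above describe.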

Negative

  • More complex than RAG — new readers must understand all five layers
  • More moving parts — each layer is a potential failure point
  • Migration cost — switching to Thoth requires mapping existing memory to the layered shape

Mitigation

  • Layers are optional — Honcho can be disabled via HONCHO_DISABLED=true, episodic via EPISODIC_DISABLED=true, reflection via REFLECTION_DISABLED=true. The runtime degrades gracefully to fewer layers.
  • Documentation explains each layer with diagrams + use cases
  • Each layer’s write/read patterns are documented in docs/concepts/memory
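
As a rough sketch of the graceful degradation described above, the active layer set could be derived from the three environment flags like this. The flag names come from this ADR; the `enabled_layers` helper, the layer identifiers, and the rule that only the literal value `true` (case-insensitive) disables a layer are assumptions made for illustration.

```python
import os

# Flag names are taken from the ADR; the mapping to layer names is illustrative.
DISABLE_FLAGS = {
    "identity": "HONCHO_DISABLED",
    "episodic": "EPISODIC_DISABLED",
    "reflection": "REFLECTION_DISABLED",
}
ALL_LAYERS = ("working", "identity", "episodic", "procedural", "reflection")


def enabled_layers(env=None):
    """Return the layers that remain active given the disable flags."""
    env = os.environ if env is None else env
    disabled = {
        layer
        for layer, flag in DISABLE_FLAGS.items()
        # Assumption: only the value "true" switches a layer off.
        if env.get(flag, "").strip().lower() == "true"
    }
    return [layer for layer in ALL_LAYERS if layer not in disabled]


# Example: with Honcho and reflection off, three layers remain.
print(enabled_layers({"HONCHO_DISABLED": "true", "REFLECTION_DISABLED": "true"}))
```

Working and procedural memory have no flag in this sketch because the ADR names none for them.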

Alternatives considered

Alternative 1 — Single vector store (RAG)

Pros: Simple, well-understood, easy to swap providers.

Cons: Fails to distinguish identity from episode from skill; no clear path for reflection; no founder-gated procedural growth.

Rejected because: this is the failure mode we’re trying to escape.

Alternative 2 — Just Honcho + skills

Pros: Lighter, two well-defined layers.

Cons: No episodic recall; no reflection writers; no working memory distinction. Loses cross-thread continuity.

Rejected because: the cross-thread story is core to the value.

Alternative 3 — Three layers (working, episodic, procedural)

Pros: Simpler than five.

Cons: Conflates identity (per-peer model) with episodic (per-turn record); loses the reflection-as-its-own-layer surface.

Rejected because: the per-peer identity model is a real and distinct concern; reflection needs its own writers.

Alternative 4 — Letta-style hierarchical memory

Pros: OS-inspired, well-researched.

Cons: A single hierarchy doesn’t cleanly express cross-cutting concerns (identity and reflection both cut across every tier).

Rejected because: we want the agent to grow, not just to swap context blocks in/out.

Review trigger

Revisit this decision when:

  • Research demonstrates a 6th layer is fundamental (e.g., a “world model” layer per Dreamer-style approaches)
  • A specific layer is consistently disabled by users (a signal that we should restructure)
  • A new memory backend ships that subsumes multiple layers (unlikely within a 2-3 year horizon)