Deep Dive into Claude's Memory System — Auto Memory, Auto Dream, and Sleep-time Compute

Posted Mar 25, 2026

Advanced

By Sehyup

14 min read

TL;DR — Key Takeaways

Claude Code's memory has evolved into three stages — CLAUDE.md (explicit rules) + Auto Memory (automatic learning) + Auto Dream (sleep consolidation) — mimicking the human 'write → sleep → organize → remember' cycle
Auto Dream is built on the theoretical foundation of the Sleep-time Compute paper (2025) — pre-computing before user queries can reduce test-time computation by approximately 5x
Recent research (2025–2026) classifies agent memory into Factual, Experiential, and Working memory, managed through a Formation → Evolution → Retrieval lifecycle

Introduction

The ability of AI agents to learn and remember beyond a single conversation is a central topic of AI research in 2025–2026. Claude Code is among the products experimenting most aggressively in this space, and the recently discovered, unreleased Auto Dream feature shows clearly where it is heading.

This article dissects Claude’s memory system along three axes:

Product level: Claude Code’s memory architecture (CLAUDE.md → Auto Memory → Auto Dream)
Theoretical level: The “pre-computation during sleep” paradigm from the Sleep-time Compute paper
Research level: Agent memory taxonomies and latest survey papers from 2025–2026

1. The Complete Picture of Claude Code’s Memory Architecture

1-1. Three-Layer Memory System

Claude Code’s memory is divided into three layers based on who writes it and when it loads.

Layer	Author	Content	Loading	Scope
CLAUDE.md	User	Explicit rules & conventions	Every session start (full)	Project/User/Org
Auto Memory	Claude	Learned patterns & preferences	Every session start (200 lines)	Per git repository
Auto Dream	Claude (background)	Consolidated memories	When conditions are met	Per git repository

CLAUDE.md — Declarative Rule Hierarchy

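CLAUDE.md is an instruction file that can be defined at several scopes, not only per project. Among the files users control, the narrower scope (project) takes precedence over the broader one (user), and the managed policy set by IT administrators sits above both.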

Priority (high → low):
Managed Policy — /Library/Application Support/ClaudeCode/CLAUDE.md (IT admin)
Project       — ./CLAUDE.md or ./.claude/CLAUDE.md (team shared)
User          — ~/.claude/CLAUDE.md (personal global)

Requirements for effective CLAUDE.md:

Rule compliance rate is 92%+ under 200 lines, dropping to 71% beyond 400 lines
Verifiable instructions (“use 2-space indentation”) work better than vague ones (“format code nicely”)
Modularize with .claude/rules/ directory for conditional loading via glob patterns
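
The first two requirements can be checked mechanically. Below is a minimal sketch of such a check, not an official tool: the 200-line threshold comes from the guideline above, while `lint_claude_md` and the `VAGUE_PHRASES` list are hypothetical names and examples chosen for illustration.

```python
# Threshold taken from the guideline above; it is not an enforced limit.
MAX_LINES = 200

# Hypothetical examples of vague wording; adjust the list for your own team.
VAGUE_PHRASES = ("clean code", "nicely", "properly", "organized")


def lint_claude_md(text: str, max_lines: int = MAX_LINES) -> list[str]:
    """Return human-readable warnings for the body of a CLAUDE.md file."""
    warnings = []
    lines = text.splitlines()
    if len(lines) > max_lines:
        warnings.append(f"{len(lines)} lines exceeds the {max_lines}-line guideline")
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for phrase in VAGUE_PHRASES:
            if phrase in lowered:
                warnings.append(f"line {number}: vague wording ({phrase!r})")
    return warnings
```

Running it on the contents of `./CLAUDE.md` before committing gives a quick signal that the file is growing too long or drifting toward unverifiable rules.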

Auto Memory — Notes Claude Writes for Itself

Auto Memory automatically records patterns Claude discovers during sessions, without the user writing anything.

~/.claude/projects/<project>/memory/
├── MEMORY.md          # Index (200 lines loaded per session)
├── debugging.md       # Debugging patterns
├── api-conventions.md # API design decisions
└── ...                # Claude creates freely

How it works:

First 200 lines of MEMORY.md are injected into context at session start
Detects user corrections, preferences, and recurring patterns during conversation
Selectively saves based on “Will this be useful in a future conversation?”
Detailed content goes to topic files, MEMORY.md stays as an index only
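
The loading behavior described above can be pictured with a short sketch. This is an illustration under assumptions, not Claude Code's actual implementation: the function names are hypothetical, and only the `MEMORY.md` file name and the 200-line figure come from the text.

```python
from pathlib import Path

INDEX_LINE_LIMIT = 200  # figure quoted above


def load_memory_index(memory_dir: Path, limit: int = INDEX_LINE_LIMIT) -> str:
    """Return the first `limit` lines of MEMORY.md, or "" if it is missing."""
    index = memory_dir / "MEMORY.md"
    if not index.is_file():
        return ""
    with index.open(encoding="utf-8") as handle:
        # zip stops once the range is exhausted, so later lines are never read.
        head = [line for _, line in zip(range(limit), handle)]
    return "".join(head)


def list_topic_files(memory_dir: Path) -> list[str]:
    """Names of topic files that are read on demand rather than injected."""
    return sorted(p.name for p in memory_dir.glob("*.md") if p.name != "MEMORY.md")
```

The split matters: anything past line 200 of the index is invisible at session start, which is why detailed notes belong in topic files.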

1-2. The Fundamental Problem with Auto Memory

Auto Memory is powerful, but descends into chaos over time.

After 20+ sessions:

Contradictory entries accumulate (“use pnpm” vs “use npm”)
Relative dates lose meaning (“bug fixed yesterday” from 3 months ago)
Transient notes and essential learnings are stored at the same level
Priority competition within the 200-line limit

This isn’t a simple implementation bug — it’s the structural limitation of a write-only system that never consolidates.

2. Auto Dream — AI That “Organizes Memories While Sleeping”

2-1. How It Was Discovered

Auto Dream was first reported in March 2026 by the Japanese developer Akari, writing on Zenn, who found it in Claude Code’s /memory command. It appears in the UI but cannot be activated, because it is controlled by a server-side feature flag.

Codename: tengu_onyx_plover
Default settings:
  enabled: false
  minHours: 24        # Minimum 24-hour interval
  minSessions: 5      # Runs after 5 sessions accumulated

This dual-gate design appears intentional: consolidation does not trigger under light usage and runs only periodically on actively developed projects.
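
The dual gate can be expressed as a small predicate. This is a sketch assuming both conditions must hold at once, as the settings suggest. `DreamConfig` and `should_consolidate` are hypothetical names, and only the three default values come from the reported flag.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DreamConfig:
    # Defaults mirror the flag values reported above.
    enabled: bool = False
    min_hours: int = 24
    min_sessions: int = 5


def should_consolidate(
    config: DreamConfig,
    last_run: datetime,
    sessions_since_last_run: int,
    now: datetime,
) -> bool:
    """Return True only when the feature is on and both gates are open."""
    if not config.enabled:
        return False
    waited_long_enough = now - last_run >= timedelta(hours=config.min_hours)
    enough_sessions = sessions_since_last_run >= config.min_sessions
    return waited_long_enough and enough_sessions
```

With these defaults, nine sessions in a single afternoon would not trigger a run, and neither would one session per week.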

2-2. Auto Dream’s Four-Phase Process

Auto Dream is analogous to human REM sleep. Information collected while awake (sessions) is organized and consolidated while sleeping (background).

Phase 1 — Orientation

Scans the memory directory and assesses the current state — which topic files exist, the current size of MEMORY.md, and how much time has elapsed since the last consolidation.

Phase 2 — Gather Signal

Selectively extracts important information from session transcripts. Instead of reading all files, it performs narrow grep searches:

User corrections (“No, not that…”)
Explicit save commands (“Remember this”)
Recurring topics and patterns
Important technical decisions
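
A narrow search of this kind might look like the following sketch. The trigger phrases are assumptions made up for illustration, since the real patterns are not public, and recurring-topic detection is left out because it needs counting across sessions rather than per-line matching.

```python
import re

# Hypothetical trigger phrases, one pattern per signal type listed above.
SIGNAL_PATTERNS = {
    "correction": re.compile(r"\b(no, not that|that's wrong|actually,)", re.IGNORECASE),
    "save_request": re.compile(r"\b(remember this|don't forget)\b", re.IGNORECASE),
    "decision": re.compile(r"\b(we decided|let's go with|always use)\b", re.IGNORECASE),
}


def gather_signals(transcript_lines: list[str]) -> list[tuple[int, str, str]]:
    """Return (line_number, signal_type, text) for lines matching a pattern."""
    hits = []
    for number, line in enumerate(transcript_lines, start=1):
        for kind, pattern in SIGNAL_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, kind, line.strip()))
                break  # one label per line is enough for this sketch
    return hits
```

The point of the design is cost: matching a handful of patterns over transcripts is far cheaper than having a model reread every session in full.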

Phase 3 — Consolidation

Merges new information into existing files while:

Converting relative dates (“yesterday”) to absolute dates (“2026-03-24”)
Keeping only the latest among contradictory facts
Removing outdated, no longer valid memories
Merging duplicate entries into one
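
Three of these four operations are easy to sketch in code. The example below assumes each memory entry carries the date it was written plus a key identifying what it is about. That structure is hypothetical, as are the function names. Removing memories that are outdated but not contradicted needs judgment and is not modeled here.

```python
import re
from datetime import date, timedelta

RELATIVE_DAYS = {"today": 0, "yesterday": 1}


def absolutize_dates(text: str, written_on: date) -> str:
    """Replace 'today'/'yesterday' with the ISO date they referred to."""

    def replace(match: re.Match) -> str:
        offset = RELATIVE_DAYS[match.group(0).lower()]
        return (written_on - timedelta(days=offset)).isoformat()

    return re.sub(r"\b(today|yesterday)\b", replace, text, flags=re.IGNORECASE)


def consolidate(entries: list[tuple[date, str, str]]) -> dict[str, str]:
    """entries are (written_on, key, value); the newest value per key wins."""
    latest = {}
    for written_on, key, value in sorted(entries, key=lambda entry: entry[0]):
        # Later entries overwrite earlier ones, which also merges duplicates.
        latest[key] = absolutize_dates(value, written_on)
    return latest
```

Given an old "use npm" note, a newer "use pnpm" note, and "bug fixed yesterday" written on 2026-03-25, the result keeps only "use pnpm" and rewrites the last entry as "bug fixed 2026-03-24".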

Phase 4 — Prune and Index

Updates the MEMORY.md index to stay under 200 lines.
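
How entries are chosen for removal is not documented. One plausible approach, sketched here purely as an assumption, is to score each index line and keep the highest-scoring ones while preserving their original order. The `priority` value and the `prune_index` name are hypothetical.

```python
def prune_index(entries: list[tuple[int, str]], limit: int = 200) -> list[str]:
    """entries are (priority, line); keep the top `limit` lines in file order."""
    # Rank positions by priority, highest first; ties keep their file order.
    ranked = sorted(range(len(entries)), key=lambda i: -entries[i][0])
    keep = sorted(ranked[:limit])
    return [entries[i][1] for i in keep]
```

Preserving file order keeps the index readable for humans who open it through /memory.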

2-3. Auto Memory + Auto Dream = Complete Memory

  Auto Memory (While Awake)            Auto Dream (While Sleeping)
  ┌─────────────────────┐         ┌─────────────────────┐
  │  Collect experiences │         │  Extract signals     │
  │  Detect patterns     │  ───▶   │  Resolve conflicts   │
  │  Write notes         │         │  Consolidate/Organize│
  │  Split topic files   │         │  Update index        │
  └─────────────────────┘         └─────────────────────┘
           ↑                                │
           └────────────────────────────────┘
                  Reflected in next session

This cycle mirrors the human “experience → sleep → memory strengthening” pattern. In neuroscience, memory consolidation is the process in which the hippocampus replays daytime experiences during sleep and transfers them to the neocortex. Auto Dream plays an analogous role for the agent’s memory files.

2-4. Why Hasn’t It Been Released Yet?

The feature appears technically ready for release, but three product decisions seem to remain open:

Cost: A sub-agent consumes tokens in the background without user request
Transparency: Trust issues around restructuring memory without user awareness
Defaults: opt-in vs opt-out — what’s the right default behavior?

3. Theoretical Foundation — Sleep-time Compute

3-1. Paper Overview

The theoretical basis for Auto Dream is the “Sleep-time Compute: Beyond Inference Scaling at Test-time” paper published in April 2025.

Authors: Kevin Lin, Charlie Snell, Yu Wang, Charles Packer, Sarah Wooders, Ion Stoica, Joseph E. Gonzalez
arXiv: 2504.13171
Core idea: Pre-computing before the user asks questions can drastically reduce computation needed during actual inference.

Figure 1: Sleep-time compute pre-processes the original context, performing additional computations that may be useful for future queries. (Lin et al., 2025)

3-2. Key Results

The paper demonstrates impressive efficiency improvements on two modified reasoning tasks (Stateful GSM-Symbolic, Stateful AIME).

Figure 3: Test-time compute vs. accuracy tradeoff on Stateful GSM-Symbolic. The shaded area shows where sleep-time compute improves the Pareto frontier.

Key figures:

~5x reduction in test-time compute: the same accuracy is reached with about one fifth of the inference-time computation
Up to 13% accuracy improvement: scaling up sleep-time compute raises accuracy by up to 13% on Stateful GSM-Symbolic (and up to 18% on Stateful AIME)
2.5x reduction in average cost per query: amortizing sleep-time compute across multiple related queries about the same context lowers the per-query cost
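
The amortization argument behind the third figure is simple arithmetic. The sketch below uses made-up token counts and an assumed weighting in which a test-time token costs ten times as much as a sleep-time token. None of these numbers are taken from the paper's experiments, and the function name is hypothetical.

```python
def average_cost_per_query(
    sleep_tokens: float,
    test_tokens_per_query: float,
    num_queries: int,
    test_token_weight: float = 10.0,  # assumed relative price of a test-time token
) -> float:
    """Average weighted cost per query when sleep-time work is shared."""
    if num_queries < 1:
        raise ValueError("num_queries must be at least 1")
    total = sleep_tokens + test_token_weight * test_tokens_per_query * num_queries
    return total / num_queries


# Illustrative numbers only.
baseline = average_cost_per_query(0, 1000, 1)         # no sleep-time work
one_query = average_cost_per_query(5000, 200, 1)      # sleep cost paid once, used once
ten_queries = average_cost_per_query(5000, 200, 10)   # sleep cost shared by 10 queries
```

With these inputs the per-query cost falls from 10,000 to 7,000 for a single query and to 2,500 when ten queries share the same pre-computed context. The fixed sleep-time cost is paid once, so it matters less with every additional query.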

Figure 9: On Multi-Query GSM-Symbolic, the cost-accuracy Pareto improves as the number of questions per context increases.

3-3. Correlation Between Predictability and Effectiveness

A particularly interesting finding is that the more predictable the user’s question is from context, the greater the effect of sleep-time compute.

Figure 10: As question predictability increases, the performance gap between sleep-time compute and standard inference widens.

The implications for Auto Dream are clear:

Recurring patterns in development projects (build commands, coding style, architecture decisions) have high predictability
Therefore, pre-organizing these patterns allows Claude to generate more accurate responses with fewer tokens in the next session
Auto Dream is essentially sleep-time compute specialized for development context

3-4. Real-World Software Engineering Application

The paper also validated on SWE-Features, a real software engineering task.

Figure 11: On SWE-Features, sleep-time compute shows higher F1 scores than the standard approach at lower test-time budgets.

4. Academic Context — The Current State of Agent Memory Research

Claude’s memory system sits within a broader academic research stream. Let’s understand this context through key survey papers published in 2025–2026.

4-1. Mapping Human Memory → AI Memory

“From Human Memory to AI Memory” (Wu et al., 2025) proposed a framework mapping human memory taxonomies to AI memory systems.

Figure 1: Correspondence between human memory (sensory, working, explicit/implicit) and LLM-based AI system memory. (Wu et al., 2025)

This paper classifies memory using 3 dimensions (object, form, time) and 8 quadrants:

Dimension	Description	Example
Object	Whose memory is it?	Individual vs. collective
Form	How is it stored?	Text, vector, parameter
Time	How long does it persist?	Short-term vs. long-term

Mapping to Claude Code:

CLAUDE.md = Explicit long-term memory (user-declared rules)
Auto Memory = Implicit long-term memory (auto-extracted from experience)
Context window = Working memory (current session)
Auto Dream = Memory consolidation (sleep-time organization)

4-2. Memory Taxonomy for the Age of AI Agents

“Memory in the Age of AI Agents” (Hu, Liu et al., 2025/2026) is a large-scale survey with 47 authors, proposing a new taxonomy that goes beyond simple “long-term/short-term memory” classification.

arXiv: 2512.13564 (December 2025, v2: January 2026)

The paper first distinguishes agent memory from LLM memory, RAG, and context engineering.

Figure 1: Conceptual comparison of Agent Memory with LLM memory, RAG, and context engineering. (Hu et al., 2025)

Forms Dimension — Physical Structure of Memory

The physical forms of agent memory are classified into three types.

Figure 4: Comparison of three forms — token-level, parametric, and latent memory.

Token-level memory is the most intuitive form, storing information as text tokens. It’s further subdivided by topological complexity:

Figure 2: Topological classification of token-level memory — flat (1D), planar (2D graph/tree), hierarchical (3D multi-layer).

Form	Structure	Example	Claude Code Mapping
Flat (1D)	Linear sequence	Simple text log	`MEMORY.md` entry listing
Planar (2D)	Tree/Graph	Knowledge graph	Cross-references between topic files
Hierarchical (3D)	Multi-layer	Pyramid memory	`MEMORY.md` → topic file hierarchy

Parametric memory encodes information in model weights themselves. Fine-tuning and LoRA are representative examples. Claude Code does not use this.

Latent memory stores information in hidden states or KV caches. Anthropic’s prompt caching is close to this category.

Functions Dimension — Cognitive Roles of Memory

Figure 6: Functional taxonomy of agent memory — Factual, Experiential, and Working memory.

Factual Memory

The agent’s declarative knowledge base, divided into two subtypes:

User Factual Memory: Information for maintaining consistency in user interactions (“This project uses pnpm”)
Environment Factual Memory: Ensuring consistency with external world knowledge (“This API was deprecated in Node.js 20”)

In Claude Code, CLAUDE.md is the primary store for factual memory.

Experiential Memory

Learnings extracted from past experience. Three subtypes:

Case-based: Records of past episodes (“Last build error was resolved this way”)
Strategy-based: Learning of reasoning patterns (“In this project, always run type checks first”)
Skill-based: Procedural capabilities (“Test → Build → Deploy pipeline”)

In Claude Code, Auto Memory handles experiential memory.

Working Memory

Capacity-limited active context that maintains the state of ongoing tasks in the current session.

In Claude Code, the context window itself is working memory, and the 200-line limit of MEMORY.md is reminiscent of the human working memory capacity limit (Miller’s 7±2 law).

Dynamics Dimension — Memory Lifecycle

Figure 8: Agent memory lifecycle — Formation, Evolution, Retrieval.

Stage	Description	Claude Code Mapping
Formation	Writing new information to memory	Auto Memory detects and saves patterns during sessions
Evolution	Updating, transforming, deleting existing memory	Auto Dream resolves contradictions, consolidates, refines
Retrieval	Accessing memory at the needed moment	200 lines loaded at session start + on-demand topic file reading

Figure 7: Detailed mechanisms of memory evolution — update, reinforcement, forgetting, restructuring.

4-3. The Write–Manage–Read Loop

“Memory for Autonomous LLM Agents” (2026) formalized agent memory as a Write–Manage–Read loop.

┌──────────┐    ┌──────────┐    ┌──────────┐
│  Write   │───▶│  Manage  │───▶│   Read   │
│ (Record) │    │ (Curate) │    │(Retrieve)│
└──────────┘    └──────────┘    └──────────┘
     ↑                                │
     └────────────────────────────────┘
              Feedback Loop

Applying this framework to Claude Code:

Stage	Implementation	Owner
Write	Recording to memory files during sessions	Auto Memory
Manage	Periodic consolidation and refinement	Auto Dream
Read	Loading 200 lines at session start + accessing topic files on demand	Claude Code Runtime
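
The loop is easier to see in a toy implementation. The class below is a hypothetical in-memory model, not how Claude Code stores anything: writes are appended blindly, a separate manage step resolves conflicts, and reads see only the consolidated view.

```python
from typing import Optional


class MemoryStore:
    """Toy append-only memory with an explicit consolidation step."""

    def __init__(self) -> None:
        self._log: list[tuple[str, str]] = []  # raw notes, oldest first
        self._index: dict[str, str] = {}       # consolidated view

    def write(self, key: str, value: str) -> None:
        """Write: append a note without checking for conflicts."""
        self._log.append((key, value))

    def manage(self) -> None:
        """Manage: fold the log into the index; the newest value per key wins."""
        for key, value in self._log:
            self._index[key] = value
        self._log.clear()

    def read(self, key: str) -> Optional[str]:
        """Read: return only what has already been consolidated."""
        return self._index.get(key)
```

Writing "npm" and then "pnpm" under the same key leaves both notes in the log until `manage()` runs, after which `read()` returns "pnpm". Keeping write cheap and deferring cleanup to a separate step is the same division of labor as Auto Memory and Auto Dream.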

5. Claude’s Memory in a Broader Context

5-1. Chat Memory vs Claude Code Memory

Claude’s memory system provides different layers depending on user type.

Layer	Target	Mechanism	Consolidation Cycle
Chat Memory	All Claude users	Auto-extraction from conversations (Memory Synthesis)	~24 hours
CLAUDE.md + Auto Memory	Claude Code developers	Explicit rules + automatic learning	Manual / Auto Dream
API Memory Tool	App builders	Programmatic CRUD	Per app logic

Chat Memory is the simplest form: roughly every 24 hours, it extracts information from conversations that is likely to be useful in the long term. Claude Code’s Auto Memory + Auto Dream combination is a considerably more sophisticated evolution of the same idea.

5-2. Three Layers of Infrastructure Scheduling

Auto Dream’s execution infrastructure is also evolving in three stages:

Layer	Scope	Persistence	Example
CLI `/loop`	Active session only	Dies on session end	`/loop 10m /simplify`
Desktop Scheduled Tasks	Local machine persistent	Dies on machine shutdown	crontab-based
Cloud Scheduled Tasks	Anthropic infrastructure	Always running	Serverless execution

When Auto Dream moves to Cloud Scheduled Tasks, memory consolidation can proceed even when the user’s computer is off. This aligns with Anthropic’s $100M investment in the Claude Partner Network in March 2026.

5-3. Competitive Landscape

Product	Memory Form	Consolidation	Developer Integration
Claude Code	File-based (CLAUDE.md + MEMORY.md)	Auto Dream (upcoming)	Per git repository
ChatGPT	Server-side key-value	Auto summary	Limited
GitHub Copilot	`.github/copilot-instructions.md`	None	Per repository
Cursor	`.cursorrules`	None	Per project

Claude Code’s differentiator is the existence of the consolidation phase (Auto Dream). While other tools offer “write-only” memory, Claude Code aims to implement the complete cycle of “write, sleep, organize, and remember.”

6. Practical: Effective Memory Management Strategies

6-1. CLAUDE.md Writing Principles

  
# Good: Specific and verifiable
- Use 2-space indentation
- Run `npm test` before committing
- API handlers go in `src/api/handlers/`

# Bad: Vague and unverifiable
- Write clean code
- Test properly
- Keep files organized

6-2. Auto Memory Tips

Check regularly with /memory: Periodically review what Claude remembers
Fix contradictions immediately: Say “that’s no longer correct” and Claude will update
Encourage topic file splitting beyond 200 lines: Request “please organize the memory”
Watch for sensitive information: Ensure API keys and passwords aren’t stored
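
The last tip can be partly automated. The following sketch scans memory files for lines that look like credentials. The two patterns are illustrative assumptions and will miss many real secrets, so treat a clean result as a hint rather than a guarantee. The function names are hypothetical.

```python
import re
from pathlib import Path

# Illustrative patterns only; real secret scanners use far larger rule sets.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)\b(password|passwd|secret|api[_-]?key)\b\s*[:=]\s*\S+"),
]


def find_secret_lines(text: str) -> list[int]:
    """Return 1-based line numbers that look like they contain a credential."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if any(pattern.search(line) for pattern in SECRET_PATTERNS)
    ]


def scan_memory_dir(memory_dir: Path) -> dict[str, list[int]]:
    """Map each Markdown file in memory_dir to its suspicious line numbers."""
    report = {}
    for path in sorted(memory_dir.glob("*.md")):
        hits = find_secret_lines(path.read_text(encoding="utf-8"))
        if hits:
            report[path.name] = hits
    return report
```

The functions return line numbers rather than the matched text so that the report itself does not leak the secret. Point `scan_memory_dir` at the project's memory directory under `~/.claude/projects/` to check every topic file at once.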

6-3. Preparing for Auto Dream

When Auto Dream officially launches, you can expect:

Automatic conflict resolution: Contradictory records for the same setting consolidated to the latest
Relative date conversion: “yesterday” automatically converted to absolute dates
Priority-based cleanup: Recurring patterns weighted higher than transient notes
Manual trigger: /dream command for immediate cleanup after major refactoring

Conclusion — The Era of AI That Dreams

The evolution of Claude’s memory system can be summarized in one line:

“Writing (Auto Memory) alone isn’t enough. You need to sleep and organize (Auto Dream) for real memories to form.”

This isn’t just a feature addition — it’s a signal that LLM agents are beginning to mimic fundamental mechanisms of human cognition. The Sleep-time Compute paper provides the theory, agent memory surveys provide the taxonomy, and Claude Code’s Auto Dream implements it as a product.

An analogy from game development: this follows the same trajectory as NPC AI evolving from simple finite state machines (FSMs) to behavior trees and then to utility AI. Just as each step qualitatively changed the complexity of situations agents could handle, AI memory systems are preparing for a qualitative leap along the trajectory of “simple storage → automatic learning → sleep consolidation.”

References

Papers

Kevin Lin et al., “Sleep-time Compute: Beyond Inference Scaling at Test-time”, arXiv:2504.13171, 2025.
Yuyang Hu, Shichun Liu et al., “Memory in the Age of AI Agents: A Survey”, arXiv:2512.13564, 2025/2026.
Yaxiong Wu et al., “From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs”, arXiv:2504.15965, 2025.
“Memory for Autonomous LLM Agents”, arXiv:2603.07670, 2026.
“A Survey on the Memory Mechanism of Large Language Model-based Agents”, ACM TOIS, 2025.

Articles and Documentation

Akari, “The Day We Might Say ‘Sweet Dreams’ to Claude”, Zenn, 2026-03-24.
Anthropic, “How Claude remembers your project”, Claude Code Docs.
“Claude Memory Guide: Understanding the 3-Layer Architecture”, ShareUHack, 2026.
“Claude Code Auto Dream: Memory Consolidation Feature Explained”, ClaudeFast.

This post is licensed under CC BY 4.0 by the author.