When you talk to Claude, it doesn't "remember" your previous conversations. It has no brain that stores memories. Instead, it has a context window: a fixed-size workspace that holds everything Claude can "see" right now.
Think of it as a desk. Everything Claude needs to do its job has to fit on this desk. If the desk fills up, older stuff gets pushed off the edge. When a new session starts, the desk is cleared and reset.
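The desk analogy can be sketched as a fixed-capacity buffer measured in tokens. This is an illustrative model only (the class name, sizes, and eviction policy are assumptions for the sketch, not Claude's actual implementation):

```python
from collections import deque

class ContextWindow:
    """Illustrative 'desk': fixed token capacity, oldest items fall off first."""
    def __init__(self, max_tokens=200_000):
        self.max_tokens = max_tokens
        self.items = deque()          # (label, token_count) pairs, oldest first
        self.used = 0

    def add(self, label, tokens):
        self.items.append((label, tokens))
        self.used += tokens
        # When the desk fills up, the oldest items get pushed off the edge.
        while self.used > self.max_tokens:
            _, dropped = self.items.popleft()
            self.used -= dropped

    def reset(self):
        # A new session clears the desk entirely.
        self.items.clear()
        self.used = 0

window = ContextWindow(max_tokens=100)
window.add("CLAUDE.md", 40)
window.add("file read", 50)
window.add("tool output", 30)        # pushes CLAUDE.md off the edge
print([label for label, _ in window.items])  # ['file read', 'tool output']
```

Note what happened in the usage example: the instructions loaded first (CLAUDE.md) were the first thing evicted once the window overflowed.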
The @.claude/rules/ references in CLAUDE.md pull in the hard rules, session protocol, team structure, and marketing rules at boot. Beyond that, everything flows in as you work: every search_docs result, every mac-shell output, every file read — it ALL goes into the context window. A single large file read can consume thousands of tokens.
| Model | Context Window | Roughly... |
|---|---|---|
| Claude Opus 4.6 | 200,000 tokens | ~150,000 words or ~500 pages |
| Claude Sonnet 4.6 | 200,000 tokens | Same size, faster, cheaper |
| Claude Haiku 4.5 | 200,000 tokens | Same size, fastest, cheapest |
| GPT-4o (OpenAI) | 128,000 tokens | ~64% of Claude's window |
| Gemini 1.5 Pro (Google) | 1,000,000+ tokens | 5x Claude — but accuracy drops at the edges |
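The "Roughly..." column follows from common rules of thumb: one token is about 0.75 English words, and a page holds about 300 words. Both ratios are approximations, not exact figures:

```python
def window_estimate(tokens, words_per_token=0.75, words_per_page=300):
    """Back-of-envelope conversion from tokens to words and pages."""
    words = int(tokens * words_per_token)
    pages = int(words / words_per_page)
    return words, pages

print(window_estimate(200_000))   # (150000, 500)  Claude's window
print(window_estimate(128_000))   # (96000, 320)   GPT-4o's window
print(128_000 / 200_000)          # 0.64: GPT-4o's window as a fraction of Claude's
```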
When a session ends and a new one starts, the context window is completely cleared. Claude doesn't carry anything over. This is why CLAUDE.md exists — it's the only thing that automatically reloads. Everything else (supplements, conversation, decisions made) must be explicitly preserved through files (HANDOFF.md, KB, session-memory.md).
When the context window fills up during a session, the system automatically summarizes older conversation to free space. This is lossy — details, nuance, and specific instructions from early in the session may be simplified or lost. This is what happened earlier today when our session compacted and you got that "continued from previous conversation" summary.
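A minimal sketch of compaction, assuming the simplest possible policy: when the transcript exceeds a token budget, the oldest messages are collapsed into a one-line summary. Real summarization preserves far more than this, but the key property is the same — it is lossy:

```python
def compact(messages, budget, count_tokens=lambda m: len(m.split())):
    """Collapse oldest messages into a summary until the transcript fits the budget."""
    total = sum(count_tokens(m) for m in messages)
    dropped = []
    while total > budget and len(messages) > 1:
        oldest = messages.pop(0)          # details from early messages...
        total -= count_tokens(oldest)
        dropped.append(oldest)
    if dropped:
        summary = f"[summary of {len(dropped)} earlier messages]"
        messages.insert(0, summary)       # ...survive only as a lossy summary
    return messages

log = [
    "use British spelling in all docs",
    "here is the full file dump " + "x " * 50,
    "latest question",
]
print(compact(log, budget=20))
```

After compaction, the specific instruction about British spelling exists only inside an opaque summary — exactly the failure mode described above.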
Even within a session where nothing is lost, Claude's "attention" to instructions degrades as the conversation grows. CLAUDE.md instructions that were perfectly followed at message 3 may be inconsistently followed at message 30 — not because they're gone, but because they're now competing with 50,000+ tokens of newer content for attention weight. This is the "laziness" you've been experiencing, compounded by the Opus 4.6 regression.
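The dilution effect can be shown with simple arithmetic. This is an assumption-level model (a hypothetical 2,000-token CLAUDE.md), not how attention weighting actually works, but it captures the trend: a fixed instruction block becomes an ever-smaller share of the window as the conversation grows.

```python
instructions = 2_000   # hypothetical CLAUDE.md size in tokens
for conversation in (4_000, 20_000, 50_000, 120_000):
    share = instructions / (instructions + conversation)
    print(f"{conversation:>7} conversation tokens -> instructions are {share:.1%} of context")
```

By message 30, the same instructions that dominated the window at message 3 are competing against content twenty-five times their size.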
| Architecture Decision | Why It Works (Context Window Reason) |
|---|---|
| Modular CLAUDE.md with @-imports | Keeps boot context organized without wasting tokens on section headers and navigation |
| Separate sessions for different roles | Each session gets its own fresh 200K window instead of one session trying to hold everything |
| Knowledge base with on-demand retrieval | Hundreds of docs NOT loaded at boot — only retrieved when needed. Massive savings vs. stuffing everything into CLAUDE.md |
| Supplements separate from CLAUDE.md | Company-level context always loaded (cheap). Product-level loaded per session (only when needed) |
| Periodic checkpoint rule | Forces doc updates before the window gets too full to act on them reliably |
| Session close-out protocol | Writes critical state to files BEFORE compaction can erase it from the window |
| "Write it down or it doesn't exist" | Anything only in the context window dies at session end. Files are permanent. |
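The on-demand retrieval row in the table can be checked with rough numbers. These figures are hypothetical (300 KB docs, ~1,500 tokens each, 3 docs actually needed per session), chosen only to show the order of magnitude:

```python
docs = 300                     # knowledge-base documents
avg_tokens_per_doc = 1_500
load_all_at_boot = docs * avg_tokens_per_doc
retrieved_per_session = 3 * avg_tokens_per_doc

print(load_all_at_boot)        # 450000 tokens: more than 2x the entire 200K window
print(retrieved_per_session)   # 4500 tokens: about 2% of the window
```

Loading everything at boot wouldn't just be wasteful — it wouldn't fit at all. Retrieval keeps the same knowledge available at a tiny fraction of the cost.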