Foundations
Context Window Management
Claude's working memory fills up, and most people don't notice until things go wrong. Here's what the context window actually is, when it becomes a problem, and how to manage it.
What Is the Context Window?
Think of it like working memory. Everything in your current session (your messages, Claude's responses, file contents it has read, command outputs) occupies space in this window. When it fills up, earlier parts of the conversation start getting pushed out. Claude Code begins forgetting things you said or files it already read.
How Big Is It?
Claude Code's models (Opus 4.6, Sonnet 4.6, Haiku 4.5) all support 1 million token context windows. That's roughly 750,000 words. Sounds massive. It still fills faster than you think in agentic workflows where Claude reads files, runs commands, and processes outputs in a loop.
A rough sense of scale:
| Content | Approximate tokens |
|---|---|
| A typical source file (200-400 lines) | 500-2,000 |
| A large source file (1,000 lines) | 3,000-5,000 |
| Reading 50 files in a session | ~50,000 (5% of your window) |
| A long back-and-forth conversation (30+ exchanges) | 20,000-40,000 |
| Command output from a failing test suite | 5,000-15,000 |
A focused single task usually fits fine. A marathon session with multiple tasks, lots of file reading, and long debugging conversations will hit the wall.
Signs You Are Running Out
You'll notice these before any explicit warning:
- Responses get less accurate. Claude Code "forgets" constraints you mentioned earlier or re-reads files it already looked at.
- It loses track of decisions. You agreed on an approach 20 messages ago. Now it suggests something different.
- Responses get shorter or cut off. The model is running out of room to generate.
- You see context length warnings. Claude Code will tell you when things are getting large.
How Tokens Work
Roughly 4 characters equals 1 token. A word averages about 1.3 tokens. Code tends to be slightly more token-dense than prose because of syntax, indentation, and special characters.
Quick math: a 300-line TypeScript file is about 1,200 tokens. If Claude Code reads 40 files during a session, that's 48,000 tokens just from file content, before counting any conversation.
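The back-of-envelope math above can be sketched as a tiny estimator. This uses the ~4 characters per token heuristic from this section; Claude's actual tokenizer will differ somewhat (code skews denser), so treat the numbers as rough estimates, not exact counts:

```python
WINDOW = 1_000_000  # 1M-token context window

def estimate_tokens(char_count: int) -> int:
    """Rough heuristic: ~4 characters per token."""
    return char_count // 4

# Example: 40 files at ~4,800 characters each (~300 lines of TypeScript)
files = 40 * estimate_tokens(4_800)      # 48,000 tokens from file content
conversation = estimate_tokens(120_000)  # a long back-and-forth on top
used = files + conversation

print(f"~{used:,} tokens ({used / WINDOW:.0%} of the window)")
# → ~78,000 tokens (8% of the window)
```

Even a modest session lands in the tens of thousands of tokens before you add command outputs, which is why the habits below matter.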
The /compact Command
When context gets large, run /compact. This summarizes the entire conversation so far and starts fresh with just the summary. Think of it as taking notes from a meeting, then starting a new meeting with only the notes.
When to use it:
- After completing a task, before starting a new one
- When you notice Claude Code forgetting earlier context
- When you get a context length warning
- Proactively, after any long debugging session
What it preserves: Key decisions, file modifications, current state of work.
What it loses: Exact file contents, detailed error messages, nuanced back-and-forth reasoning. If you need to reference specific details after compaction, Claude Code will re-read the relevant files.
Strategies to Stay Under the Limit
1. One session, one task
Don't let a single session accumulate context from five different tasks. Finish a task, start a new session for the next one. This is the single most effective strategy and most people never do it.
2. Use handoffs instead of marathon sessions
Instead of one 3-hour session, work in focused 30-60 minute blocks. End each block with a handoff (what was done, what's next). Start a fresh session that reads only the handoff. Fresh context every time.
See the session lifecycle guide for handoff protocols.
3. Point to files instead of pasting them
Let Claude Code read files on demand. Don't paste file contents into the chat. When you paste, that content stays in the conversation forever. When Claude Code reads a file with its tools, it reads only what it needs, and the contents can be summarized away during compaction and re-read later on demand.
4. Use sub-agents for research
Sub-agents (via the Task tool) get their own context window. If you need Claude Code to explore a large codebase or research something, offload it to a sub-agent. The main conversation only gets the summary back.
5. Be specific in your prompts
Vague prompts cause Claude Code to read more files searching for what you mean. "Fix the auth bug" makes it read every auth-related file. "Fix the token refresh logic in src/auth/refresh.ts where expired tokens are not being caught" sends it straight to the right file.
6. Use /compact between tasks
Even within one session, compact between distinct tasks. Finished adding tests? Compact. Now moving to a different feature? Clean slate, prior work summarized.
The Handoff Strategy
This is the power move for heavy users. Instead of fighting the context window, work with it:
- Work in a focused block (30-60 minutes, one task)
- End the session with a handoff: what was done, what's next, key decisions
- Start a new session. Load only the handoff and relevant CLAUDE.md. Fresh 1M tokens, zero wasted context.
You lose nothing because the handoff captures everything that matters. You gain a full context window for the next task. Over a full day of work, this approach is dramatically more effective than one long session that degrades as context fills up.
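One concrete way to end a block is to drop a short handoff file in the repo root for the next session to read first. The filename, headings, and entries below are just one possible convention (a sketch, not anything Claude Code requires), and the bullet contents are placeholders:

```python
from datetime import date
from pathlib import Path

# Hypothetical convention: a HANDOFF.md the next session loads first.
# The sections mirror the handoff checklist: done, next, key decisions.
handoff = f"""# Handoff: {date.today()}

## Done
- Added retry logic to the token refresh flow
- All auth tests passing

## Next
- Wire the refresh flow into the session middleware

## Key decisions
- Expired tokens are refreshed lazily, not on a timer
"""

Path("HANDOFF.md").write_text(handoff)
```

Starting the next session with "read HANDOFF.md and continue" loads a few hundred tokens instead of a few hundred thousand.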
1M Tokens Is a Lot. You Still Need Strategy.
All current Claude models support 1 million tokens. That's a massive working memory. But the same principles apply: context fills up during long agentic sessions, and the strategies above still matter; they just become critical later in a session rather than sooner. More context also means higher cost per session, which is one more reason to stay disciplined.