Foundations
Context Window Management
Claude's working memory fills up, and most people don't notice until things go wrong. Here's what the context window actually is, when it becomes a problem, and how to manage it.
What Is the Context Window?
Think of it like working memory. Everything in your current session (your messages, Claude's responses, file contents it has read, command outputs) occupies space in this window. When it fills up, earlier parts of the conversation start getting pushed out. Claude Code begins forgetting things you said or files it already read.
How Big Is It?
Claude Code's models (Opus 4.6, Sonnet 4.6, Haiku 4.5) all support 1 million token context windows. That's roughly 750,000 words. Sounds massive. It still fills faster than you think in agentic workflows where Claude reads files, runs commands, and processes outputs in a loop.
A rough sense of scale:
| Content | Approximate tokens |
|---|---|
| A typical source file (200-400 lines) | 500-2,000 |
| A large source file (1,000 lines) | 3,000-5,000 |
| Reading 50 files in a session | ~50,000 (5% of your window) |
| A long back-and-forth conversation (30+ exchanges) | 20,000-40,000 |
| Command output from a failing test suite | 5,000-15,000 |
A focused single task usually fits fine. A marathon session with multiple tasks, lots of file reading, and long debugging conversations will hit the wall.
Signs You Are Running Out
You'll notice these before any explicit warning:
- Responses get less accurate. Claude Code "forgets" constraints you mentioned earlier or re-reads files it already looked at.
- It loses track of decisions. You agreed on an approach 20 messages ago. Now it suggests something different.
- Responses get shorter or cut off. The model is running out of room to generate.
- You see context length warnings. Claude Code will tell you when things are getting large.
How Tokens Work
Roughly 4 characters equals 1 token. A word averages about 1.3 tokens. Code tends to be slightly more token-dense than prose because of syntax, indentation, and special characters.
Quick math: a 300-line TypeScript file is about 1,200 tokens. If Claude Code reads 40 files during a session, that's 48,000 tokens just from file content, before counting any conversation.
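The back-of-envelope math above can be sketched as a tiny estimator. This uses the ~4 characters per token heuristic from this section; Claude's actual tokenizer will differ somewhat (code skews denser), so treat the numbers as rough estimates, not exact counts:

```python
WINDOW = 1_000_000  # 1M-token context window

def estimate_tokens(char_count: int) -> int:
    """Rough heuristic: ~4 characters per token."""
    return char_count // 4

# Example: 40 files at ~4,800 characters each (~300 lines of TypeScript)
files = 40 * estimate_tokens(4_800)      # 48,000 tokens from file content
conversation = estimate_tokens(120_000)  # a long back-and-forth on top
used = files + conversation

print(f"~{used:,} tokens ({used / WINDOW:.0%} of the window)")
# → ~78,000 tokens (8% of the window)
```

Even a modest session lands in the tens of thousands of tokens before you add command outputs, which is why the habits below matter.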
The /compact Command
When context gets large, run /compact. This summarizes the entire conversation so far and starts fresh with just the summary. Think of it as taking notes from a meeting, then starting a new meeting with only the notes.
When to use it:
- After completing a task, before starting a new one
- When you notice Claude Code forgetting earlier context
- When you get a context length warning
- Proactively, after any long debugging session
What it preserves: Key decisions, file modifications, current state of work.
What it loses: Exact file contents, detailed error messages, nuanced back-and-forth reasoning. If you need to reference specific details after compaction, Claude Code will re-read the relevant files.
Strategies to Stay Under the Limit
1. One session, one task
Don't let a single session accumulate context from five different tasks. Finish a task, start a new session for the next one. This is the single most effective strategy and most people never do it.
2. Use handoffs instead of marathon sessions
Instead of one 3-hour session, work in focused 30-60 minute blocks. End each block with a handoff (what was done, what's next). Start a fresh session that reads only the handoff. Fresh context every time.
See the session lifecycle guide for handoff protocols.
3. Point to files instead of pasting them
Let Claude Code read files on demand. Don't paste file contents into the chat. When you paste, that content stays in the conversation forever. When Claude Code reads a file with its tools, it reads only what it needs, and the contents can be summarized away during compaction and re-read later on demand.
4. Use sub-agents for research
Sub-agents (via the Task tool) get their own context window. If you need Claude Code to explore a large codebase or research something, offload it to a sub-agent. The main conversation only gets the summary back.
5. Be specific in your prompts
Vague prompts cause Claude Code to read more files searching for what you mean. "Fix the auth bug" makes it read every auth-related file. "Fix the token refresh logic in src/auth/refresh.ts where expired tokens are not being caught" sends it straight to the right file.
6. Use /compact between tasks
Even within one session, compact between distinct tasks. Finished adding tests? Compact. Now moving to a different feature? Clean slate, prior work summarized.
The Handoff Strategy
This is the power move for heavy users. Instead of fighting the context window, work with it:
- Work in a focused block (30-60 minutes, one task)
- End the session with a handoff: what was done, what's next, key decisions
- Start a new session. Load only the handoff and relevant CLAUDE.md. Fresh 1M tokens, zero wasted context.
You lose nothing because the handoff captures everything that matters. You gain a full context window for the next task. Over a full day of work, this approach is dramatically more effective than one long session that degrades as context fills up.
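One concrete way to end a block is to drop a short handoff file in the repo root for the next session to read first. The filename, headings, and entries below are just one possible convention (a sketch, not anything Claude Code requires), and the bullet contents are placeholders:

```python
from datetime import date
from pathlib import Path

# Hypothetical convention: a HANDOFF.md the next session loads first.
# The sections mirror the handoff checklist: done, next, key decisions.
handoff = f"""# Handoff: {date.today()}

## Done
- Added retry logic to the token refresh flow
- All auth tests passing

## Next
- Wire the refresh flow into the session middleware

## Key decisions
- Expired tokens are refreshed lazily, not on a timer
"""

Path("HANDOFF.md").write_text(handoff)
```

Starting the next session with "read HANDOFF.md and continue" loads a few hundred tokens instead of a few hundred thousand.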
1M Tokens Is a Lot. You Still Need Strategy.
All current Claude models support 1 million tokens. That's a massive working memory. But the same principles apply: context fills up during long agentic sessions, and the strategies above still matter; they just become critical later in a session rather than sooner. More context also means higher cost per session, which is one more reason to stay disciplined.