Foundations
Which Claude Model Should I Use?
Sonnet handles 80% of what you'll ever do. Here's exactly when to reach for the others, what they cost, and how to stop paying for more than you need.
On this page (9 sections)
Start with Sonnet. That's the answer.
Seriously. If you're new to Claude Code, just use Sonnet 4.6 for everything and stop reading here. You can come back to this page when Sonnet doesn't cut it.
Still here? Good. There are three models, they're meaningfully different, and picking wrong costs you either money or quality.
When to Use Opus 4.6
Use Opus when getting it wrong the first time is expensive.
- Designing system architecture or making foundational decisions
- Refactoring that touches 10+ files
- Debugging cross-file issues where your first approach already failed
- Writing PRDs, competitive analysis, strategy documents
- Security reviews and code audits where you need to be thorough
It generates at 20-30 tokens per second versus Sonnet's 40-60. You feel that. For a quick bug fix it's irritating. For a 2-hour architecture session it's worth every second of waiting.
Pricing: $5 input / $25 output per million tokens.
When to Use Sonnet 4.6
This is your daily driver. 80% of tasks, Sonnet is the right call.
- Single-file edits and bug fixes
- Writing tests, docs, code reviews
- CSS tweaks and UI changes
- Git operations and daily workflow
- Features that are mostly straightforward
- Answering questions about your codebase
Here's the thing most people discover after a week: for most tasks, the quality difference between Sonnet and Opus is negligible. You're not leaving capability on the table. You're being efficient.
Pricing: $3 input / $15 output per million tokens.
When to Use Haiku 4.5
Haiku is for volume work where you don't need reasoning. Think of it like the intern who types really fast.
- Quick code formatting
- Simple lookups ("what does this function do?")
- Running through checklists
- Basic text transformations
- Syntax checks
The real use case here is sub-agents. When you're running background code review or spinning up a test-execution agent, Haiku is fast and cheap enough that you can run several in parallel without watching your bill climb.
Pricing: Cheapest of the three. Exact numbers fluctuate but it's a fraction of Sonnet.
How Claude Code Picks For You
On Claude Pro ($20/month) you get Sonnet by default with generous daily usage. That's enough for most people. Full stop.
On Claude Max ($100/month) you still default to Sonnet but get Opus access and 20x the usage of Pro. Get here when you're hitting Pro limits daily, not before.
On the API (pay per use) you pick the model per request. Full control, full responsibility for the bill.
Switching Models Mid-Session
/model opus # Switch to Opus for a complex task
/model sonnet # Switch back for regular work
/model haiku # Switch to Haiku for quick tasksYou can do this mid-conversation, mid-task. The context carries over. Switch to Opus when you hit something hard, switch back when it's done.
Which Plan Should I Get?
If you're just trying it out: start free, see if you like it.
If you're using it regularly: Pro at $20/month handles most people's actual usage. I'd start here and upgrade only when you start hitting limits and it frustrates you. You'll know.
If you're a power user building things all day: Max at $100/month gives you the headroom. The Opus access is nice but honestly it's the 20x usage that makes Max worth it.
If you're building apps that call Claude: API, pay per token, pick your model per call.
The Decision in One Line
Sonnet for everything. Opus when Sonnet's first pass wasn't good enough. Haiku for background agents and volume work.
That's it. Start there and adjust based on what you actually experience, not what you imagine might be better.
Pro Tips
- Start every session on Sonnet. Only switch to Opus if the first attempt clearly isn't cutting it.
- Use Haiku for sub-agents. Code review and test agents running in the background don't need Opus-level reasoning.
- Watch your context window. The 1M token limit sounds massive, but loading whole codebases repeatedly adds up. Use
/compactto trim. - Extended thinking costs output tokens, not input. The model reasons internally before answering but you pay for the final response, not the thinking. Worth knowing.
New guides, when they ship
One email, roughly weekly. CLAUDE.md templates, workflows I actually use, and the cut-for-length stuff that does not make the public guides. One-click unsubscribe.
Or follow on Substack