When people hear "200,000 token context window," they usually think in words or sentences. That's the wrong unit. Tokens are the actual unit Claude thinks in — and they don't map cleanly to words, sentences, or characters.
Understanding tokens explains why some operations feel expensive, why code costs more than prose, why other languages burn through context faster, and why "context window" is not the same as "how much Claude can read."
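Claude's tokenizer isn't public, so you can't count its tokens exactly offline, but a common rule of thumb (~4 characters of English per token) gets you a usable estimate. A minimal sketch, with the caveat that this is a heuristic, not Claude's real tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate for English prose.

    Claude's tokenizer is not public; ~4 characters per token
    is a heuristic that holds loosely for plain English and
    undercounts for code and non-English text.
    """
    return max(1, round(len(text) / 4))

print(estimate_tokens("Get all users from the database"))  # roughly 8
```

For exact counts, the Anthropic API exposes a token-counting endpoint; the heuristic above is just for quick back-of-envelope checks.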
| Content Type | Tokens per unit | What fits in 200K tokens |
|---|---|---|
| Average English word | ~1.3 tokens | ~150,000 words |
| Novel page (250 words) | ~330 tokens | ~600 pages (a long novel) |
| This lesson (full text) | ~2,500 tokens | 80 lessons like this one |
| Tweet (280 chars) | ~80 tokens | ~2,500 tweets |
| Python function (20 lines) | ~150-250 tokens | ~1,000 small functions |
| JSON API response (large) | ~2,000-5,000 tokens | 40-100 API calls |
| PDF page (dense text) | ~400-600 tokens | ~350-500 pages |
| Spanish/French text | ~1.5-2x English | Roughly half the English capacity |
| Chinese/Japanese/Arabic | ~2-4x English | A quarter or less of English capacity |
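The ratios in the table turn into a quick capacity check. A sketch, assuming rough midpoint ratios (1.3 tokens per English word, 2.0 for Spanish/French, 3.0 for Chinese/Japanese/Arabic) — approximations, not measured values:

```python
# Approximate tokens-per-word ratios, taken from the table above.
TOKENS_PER_WORD = {"english": 1.3, "spanish": 2.0, "chinese": 3.0}

def fits_in_window(word_count: int, language: str = "english",
                   window: int = 200_000) -> bool:
    """Estimate whether a text fits in the context window."""
    estimated_tokens = word_count * TOKENS_PER_WORD[language]
    return estimated_tokens <= window

print(fits_in_window(150_000))             # English novel: just fits
print(fits_in_window(150_000, "spanish"))  # same length in Spanish: does not
```

The same 150,000-word text fits in English but overflows in Spanish — the "half the capacity" row in practice.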
Code is tokenized differently from prose. Variable names, symbols, whitespace, and syntax all fragment into more tokens than equivalent English text.
| Text Type | Example | Tokens |
|---|---|---|
| Plain English | "Get all users from the database" | ~8 |
| SQL equivalent | SELECT * FROM users WHERE active = 1; | ~15 |
| Python equivalent | users = db.query(User).filter(User.active==1).all() | ~22 |
| JSON config (indented) | 4-space indented JSON object, 10 fields | ~60-100 |
| Markdown with headers | # Heading / **bold** / - list items | ~10-15% overhead vs plain text |
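You can see the fragmentation effect with a toy pre-tokenizer that counts words and symbols as separate pieces. This is not Claude's tokenizer — real counts differ — but it shows why symbol-heavy code splits into far more pieces than plain prose:

```python
import re

def rough_pieces(text: str) -> int:
    """Toy pre-tokenizer: each word run and each punctuation
    character counts as one piece. Illustrative only."""
    return len(re.findall(r"\w+|[^\w\s]", text))

print(rough_pieces("Get all users from the database"))                      # 6
print(rough_pieces("users = db.query(User).filter(User.active==1).all()"))  # 22
```

Six pieces of prose versus twenty-two pieces of Python, for the same idea — every dot, paren, and equals sign is its own fragment.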
Most people only think about message length. The real costs are often invisible.
Tally up the tool calls and messages in a typical session and you can see where the tokens actually go.
Anthropic charges by token on the API. The consumer plans bundle tokens into monthly limits. Either way, tokens are the unit.
| Model | Input tokens | Output tokens | When to use |
|---|---|---|---|
| Claude Opus 4.6 | $15 / 1M tokens | $75 / 1M tokens | Complex reasoning, critical decisions |
| Claude Sonnet 4.6 | $3 / 1M tokens | $15 / 1M tokens | Most tasks — the smart default |
| Claude Haiku 4.5 | $0.80 / 1M tokens | $4 / 1M tokens | High-volume, simple tasks |
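The table's rates make per-request cost a one-line calculation. A sketch using those listed prices (the short model keys are just shorthand for this example):

```python
# (input, output) prices in dollars per million tokens, from the table above.
PRICES = {
    "opus":   (15.00, 75.00),
    "sonnet": (3.00, 15.00),
    "haiku":  (0.80, 4.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call at the listed per-million-token rates."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# The same 10K-input / 2K-output request on two models:
print(f"{request_cost('opus', 10_000, 2_000):.4f}")    # 0.3000
print(f"{request_cost('sonnet', 10_000, 2_000):.4f}")  # 0.0600
```

Note how the 2,000 output tokens cost as much as the 10,000 input tokens — output rates are 5x input rates across the lineup.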
When Claude generates output, it's doing computation for every single token — predicting the next most likely token, one at a time. Reading input is comparatively cheap (one forward pass through the model). Writing output requires generating tokens sequentially, which is why a long response costs significantly more than a long input to read.
With this mental model, conservation strategies make intuitive sense:
| Strategy | Why It Works (Token Reason) | Savings |
|---|---|---|
| Use Sonnet instead of Opus | Same token count, 80% cost reduction. Doesn't reduce tokens used, but reduces cost per token. | $$$ |
| KB retrieval vs. boot-context loading | search_docs returns ~200-500 tokens of targeted info. Loading the entire doc into CLAUDE.md would cost that every single turn. | High |
| Separate sessions for separate roles | Each session has its own clean context. No cross-contamination. No need to re-explain role A to session B. | Medium |
| Short, specific messages | Input is cheap, but also sets up a longer response. Vague prompts produce longer, less useful responses, costing more output tokens. | Low-Medium |
| Read specific file sections, not whole files | Reading lines 200-250 of a file costs ~500 tokens. Reading the whole file might cost 8,000. | High (for large files) |
| Start fresh sessions for new topics | A new session has minimal conversation history. Continuing in an old session means Claude re-reads everything from 3 hours ago every turn. | High (late sessions) |
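Several of these rows compound across turns, because conversation history is re-sent on every turn. A sketch with the table's illustrative figures (8,000 tokens for a whole file versus ~500 for a targeted section — estimates, not measurements):

```python
def session_cost(per_turn_tokens: int, turns: int) -> int:
    """Total tokens re-sent when the same material rides along every turn."""
    return per_turn_tokens * turns

whole_file = session_cost(8_000, 20)  # entire file in context, 20-turn session
targeted   = session_cost(500, 20)    # just the 50-line section you needed

print(whole_file, targeted, whole_file - targeted)  # 160000 10000 150000
```

A one-time choice (read 50 lines instead of the whole file) saves 150,000 tokens over a 20-turn session — nearly a full context window.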
People obsess over context window size. The number that actually determines quality is how much of the window is left for reasoning once the system prompt, CLAUDE.md, conversation history, and tool results are loaded.
Everything you do to reduce the non-thinking parts is directly invested into Claude's ability to reason. Lean CLAUDE.md, targeted retrieval, short sessions — these aren't style preferences. They're how you buy yourself more thinking space per dollar.
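The arithmetic is simple: window minus overhead equals room to reason. A sketch with illustrative numbers (all the overhead figures here are assumptions, not measurements):

```python
WINDOW = 200_000  # total context window in tokens

def thinking_space(system_prompt: int, claude_md: int,
                   history: int, tool_results: int) -> int:
    """Tokens left for reasoning after the non-thinking parts are loaded."""
    return WINDOW - (system_prompt + claude_md + history + tool_results)

bloated = thinking_space(3_000, 20_000, 120_000, 40_000)  # old, heavy session
lean    = thinking_space(3_000, 2_000, 15_000, 5_000)     # fresh, lean session

print(bloated, lean)  # 17000 175000
```

Same window, same model, same price — but the lean session has ten times the room to reason.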