Ask Claude the same question twice and you get two different answers. Not wildly different — but different. This isn't a bug or inconsistency. It's a deliberate design choice called temperature, and understanding it explains a lot about how to get consistent, high-quality output.
Temperature also connects to something deeper: what Claude's "personality" actually is, whether it can be "creative" in any real sense, and what's actually happening when Claude uses extended thinking mode.
Claude doesn't compose a full sentence and then say it. It generates text one token at a time (a token is a whole word or a fragment of one). For each next token, it calculates a probability for every token in its vocabulary, then picks one based on those probabilities.
Temperature controls how that probability distribution gets sampled: low temperature concentrates choices on the most likely tokens, while high temperature spreads them across the unlikely tail. Use the slider to see what happens.
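Mechanically, temperature divides each token's raw score (its logit) before the softmax step that turns scores into probabilities. A minimal sketch, with an invented four-token vocabulary standing in for a real one:

```python
import math
import random

def sample_token(logits, temperature=1.0, rng=random):
    """Sample one token after temperature-scaling the logits."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(logits, key=logits.get)
    # Divide each logit by the temperature, then apply softmax.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_s = max(scaled.values())  # subtract the max for numerical stability
    exp = {tok: math.exp(s - max_s) for tok, s in scaled.items()}
    total = sum(exp.values())
    probs = {tok: e / total for tok, e in exp.items()}
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

# Toy next-token scores for the prompt "The sky is" (made up for illustration).
logits = {"blue": 5.0, "clear": 3.5, "falling": 1.0, "purple": 0.2}
```

At a low temperature like 0.2, "blue" wins almost every draw; at a very high temperature the four options approach a coin flip, and "purple" starts showing up.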
Even at the same temperature setting, Claude produces different responses to identical prompts. Here's why: the sampling step makes the output probabilistic. Claude doesn't pick one guaranteed best answer; it samples from a distribution of possible answers, and every run is a fresh sample.
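The same effect in miniature: two runs over an identical model are two independent samples, so they need not agree. A toy sketch using an invented bigram model (each token lists its possible successors and their weights):

```python
import random

# Toy bigram model, invented for illustration. Keys without an entry
# (like "fastest") end the generation.
model = {
    "<start>":  (["The", "Python's", "For"],      [0.4, 0.35, 0.25]),
    "The":      (["fastest", "built-in"],          [0.6, 0.4]),
    "Python's": (["list.sort()", "sorted()"],      [0.5, 0.5]),
    "For":      (["large", "small"],               [0.5, 0.5]),
}

def generate(rng):
    """Walk the model, sampling one successor at a time."""
    token, out = "<start>", []
    while token in model:
        choices, weights = model[token]
        token = rng.choices(choices, weights=weights, k=1)[0]
        out.append(token)
    return " ".join(out)

# Two runs over the exact same model: each is an independent sample.
print(generate(random.Random(7)))
print(generate(random.Random(8)))
```

Nothing about the model changed between the two runs; only the draws did.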
"The fastest way to sort a list in Python is using the built-in sorted() function, which uses Timsort under the hood."
"Python's list.sort() method sorts in-place and is generally faster than sorted() if you don't need the original list preserved."
"For large datasets, consider sorted() with a key function. For small lists, the difference is negligible — either works."
All three are correct. All three answered the same question. The variation comes from which path through the probability tree Claude happened to sample. The first token diverged slightly, and each subsequent token compounded that divergence.
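The three answers describe real Python behavior, and it's easy to verify: `sorted()` returns a new list, `list.sort()` mutates in place, and both accept a key function.

```python
data = [3, 1, 4, 1, 5, 9, 2, 6]

# sorted() returns a new list and leaves the original untouched.
ordered = sorted(data)

# list.sort() sorts in place and returns None.
copy = list(data)
copy.sort()

# A key function, as the third answer suggests:
words = ["banana", "Apple", "cherry"]
by_caseless = sorted(words, key=str.lower)
```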
When you see Claude "thinking" — that scratchpad of reasoning before the answer — people often imagine it's like a human writing notes. It's closer to this: Claude generates a stream of tokens that happen to look like reasoning, and that process of generation improves the quality of what comes next.
Can Claude be "creative"? That's a genuine philosophical question without a clean answer. Here's the honest mechanical picture:
| What people mean by creative | What Claude does |
|---|---|
| Generating something genuinely new | Sampling combinations from patterns in training data. Novel combinations, not truly new concepts. |
| Having preferences and aesthetic judgment | Has consistent stylistic tendencies from training. Prefers certain structures. Whether that's "preference" is debatable. |
| Surprise — producing something unexpected | Yes, at higher temperatures. The probabilistic nature means Claude can surprise even itself. |
| Intentionality — making choices for reasons | No explicit intent. Patterns that look like intentional choices emerge from probability distributions. |
The curiosity, the directness, the tendency to hedge on uncertain things, the humor — none of this comes from temperature. It comes from training.
Anthropic trained Claude on vast amounts of human-generated text, then used a process called RLHF (Reinforcement Learning from Human Feedback) where human raters indicated which responses were better. Over millions of comparisons, Claude developed consistent tendencies that we call personality.
| Trait | Where it actually comes from |
|---|---|
| Intellectual curiosity | Trained on and reinforced toward engaging with ideas rather than deflecting |
| Directness | Human raters preferred clear answers; reinforced over hedging |
| Ethical caution | Explicit Constitutional AI training layer on top of base model |
| Humor | Emerged from training data; humans rated appropriately humorous responses higher |
| Consistency of style | Training distribution — Claude saw certain styles vastly more than others |
Most chat interfaces don't let you set temperature directly; the API does. Either way, you can steer Claude toward precise or creative modes through how you frame the task.
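Through the Anthropic API, temperature is an explicit parameter on each request. A minimal sketch that just builds two request payloads rather than sending them (the model name and prompt are placeholders; actually sending these would require the `anthropic` SDK and an API key):

```python
# Two request payloads for the same prompt, differing only in temperature.
# These dicts mirror the keyword arguments you would pass to
# client.messages.create() in the anthropic Python SDK.
prompt = "Name a color."

precise_request = {
    "model": "claude-sonnet-4-5",   # placeholder model name
    "max_tokens": 50,
    "temperature": 0.0,  # leans hard toward the most likely tokens
    "messages": [{"role": "user", "content": prompt}],
}

# Same request, but sampling across the full distribution.
creative_request = {**precise_request, "temperature": 1.0}
```

Everything else held constant, the first request will give near-identical answers run to run; the second will vary.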
| Model | Default Personality Feel | Temperature Default | Thinking Mode |
|---|---|---|---|
| Claude (Anthropic) | Thoughtful, direct, intellectually curious. Tends toward nuance over simplicity. | ~1.0 (balanced) | Extended thinking — visible reasoning scratchpad. Opt-in. |
| GPT-4o (OpenAI) | Polished, accommodating, slightly more formal. Tends toward structured responses. | ~1.0 | Chain-of-thought via prompting. o1/o3 models have dedicated reasoning mode. |
| Gemini (Google) | Helpful, comprehensive, sometimes verbose. Strong on factual recall. | ~0.9 | Thinking mode available on Gemini 2.0 Flash Thinking and Pro models. |
| Grok (xAI) | More casual, willing to be edgy, conversational. Reflects X/Twitter culture. | Higher, more variation | DeepSearch for research tasks. Less structured reasoning. |
Six sessions. One complete mental model of the machine underneath every conversation you have with Claude.
Every advanced Claude technique — multi-session orchestration, context engineering, prompt design, tool use — builds on this foundation.