What is Chain of Thought (CoT)?
A prompting technique that asks an LLM to 'think step by step' before answering — significantly boosts accuracy on hard problems.
Chain of Thought (CoT) is a prompting technique that asks an LLM to write out its reasoning steps before reaching a conclusion, instead of jumping straight to the answer. It noticeably improves accuracy on logic, math, and planning problems.
Example
Plain prompt:
Roger has 5 tennis balls. Today he buys 2 cans, each containing 3 balls. How many balls does Roger have?
LLM (no CoT): “11 balls.” Correct here, but a model that leaps straight to the answer often gets harder problems wrong.
CoT prompt:
Roger has 5 tennis balls. Today he buys 2 cans, each containing 3 balls. How many balls does Roger have? Let’s think step by step.
LLM:
Step 1: Roger starts with 5 balls. Step 2: 2 cans × 3 balls = 6 new balls. Step 3: 5 + 6 = 11 balls. Answer: 11 balls.
By “exposing” its reasoning, the model makes fewer arithmetic and logic mistakes.
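One practical side effect: a CoT response needs a final-answer line you can parse. A minimal Python sketch; `extract_answer` is a hypothetical helper name, and the response string below is the example transcript hard-coded in place of a real model call:

```python
import re

def extract_answer(response: str) -> str:
    # Pull the final answer out of a step-by-step transcript.
    # Assumes the model ends its reasoning with "Answer: <value>".
    match = re.search(r"Answer:\s*(.+)", response)
    return match.group(1).strip() if match else response.strip()

response = ("Step 1: Roger starts with 5 balls. "
            "Step 2: 2 cans × 3 balls = 6 new balls. "
            "Step 3: 5 + 6 = 11 balls. Answer: 11 balls.")
print(extract_answer(response))  # 11 balls.
```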
Why does CoT work?
The intuition: an LLM predicts one token at a time. When you ask it to write out steps, each new token is conditioned on the steps already written, so there are fewer chances to skip ahead and get it wrong.
The original paper (Wei et al., 2022) showed CoT lifting accuracy on the GSM8K math benchmark from roughly 18% to 57% with a 540B-parameter model.
Variants
Zero-shot CoT
Just appending "Let's think step by step" to the end of the prompt is enough to trigger reasoning.
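In code, zero-shot CoT is literally string concatenation. A minimal sketch (`make_cot_prompt` is a hypothetical helper name):

```python
def make_cot_prompt(question: str) -> str:
    # Zero-shot CoT: append the trigger phrase to the plain question.
    return f"{question} Let's think step by step."

print(make_cot_prompt("How many balls does Roger have?"))
# How many balls does Roger have? Let's think step by step.
```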
Few-shot CoT
Provide 2-3 step-by-step examples before asking the real question:
Q: 23 + 47 = ?
A: 23 + 47. 23 = 20+3, 47 = 40+7. 20+40 = 60, 3+7 = 10, 60+10 = 70. Answer: 70.
Q: 89 - 34 = ?
A: ...
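Assembling a few-shot CoT prompt is again just string formatting: worked Q/A examples first, then the real question with an open "A:" for the model to complete. A sketch using the worked example from the text; `few_shot_prompt` is a hypothetical helper:

```python
# (question, step-by-step answer) pairs shown to the model as examples
EXAMPLES = [
    ("23 + 47 = ?",
     "23 + 47. 23 = 20+3, 47 = 40+7. 20+40 = 60, 3+7 = 10, 60+10 = 70. Answer: 70."),
]

def few_shot_prompt(question: str) -> str:
    # Render each example in the Q:/A: format, then append the target
    # question with a dangling "A:" for the model to continue.
    parts = [f"Q: {q}\nA: {a}" for q, a in EXAMPLES]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

print(few_shot_prompt("89 - 34 = ?"))
```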
Tree of Thoughts (ToT)
Branching reasoning — try multiple paths, evaluate them, pick the best one.
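The branch-evaluate-select loop can be sketched as a toy beam search. Here `propose` and `score` are stubs standing in for LLM calls (in a real system each would query the model), and the scoring heuristic is invented purely for illustration:

```python
def propose(path):
    # Stub: in a real system, ask the model for candidate next thoughts.
    return [path + [c] for c in "abc"]

def score(path):
    # Stub: in a real system, ask the model to rate the partial reasoning.
    # Toy heuristic: penalize every 'c' step.
    return -sum(1 for c in path if c == "c")

def tree_of_thoughts(depth=3, beam=2):
    frontier = [[]]                       # start from an empty reasoning path
    for _ in range(depth):
        # Branch: expand every surviving path into several candidates.
        candidates = [p for path in frontier for p in propose(path)]
        # Evaluate and select: keep only the best `beam` paths.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam]
    return frontier[0]

print(tree_of_thoughts())  # ['a', 'a', 'a']
```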
Self-consistency
Generate many different chains of thought (with high temperature) and majority-vote the most common answer.
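The voting step is simple to sketch; the sampled answers below are hard-coded stand-ins for the final answers extracted from real high-temperature model outputs:

```python
from collections import Counter

def majority_vote(answers):
    # Return the most common final answer across sampled chains.
    return Counter(answers).most_common(1)[0][0]

# One extracted answer per sampled chain of thought (stand-in data).
sampled_answers = ["11", "11", "12", "11", "10"]
print(majority_vote(sampled_answers))  # 11
```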
Reasoning models (2025-26)
The newest models (OpenAI o1/o3, Claude 3.7 Sonnet with extended thinking, Gemini 2.5) perform CoT internally before answering, with no special prompting needed.
You’ll see “thinking…” for a few seconds to a few minutes before the result appears. More expensive, but far more accurate on hard tasks.
When you DON’T need CoT
- Simple questions (definitions, facts)
- Open-ended creative work (poetry, fiction)
- When you need a fast response and latency matters