Tree of Thoughts
Branching reasoning
The Problem: Some problems have many possible paths, and the first one you try might be wrong. How can AI explore multiple approaches and backtrack when needed?
The Solution: Think Like a Chess Player
Tree of Thoughts (ToT) lets the AI explore multiple reasoning branches, evaluate each one, and prune dead ends. Instead of committing to one path, it considers several options at each step, like a chess player thinking several moves ahead. It extends Chain-of-Thought into a tree structure. Self-Consistency also samples several attempts, but they are independent and it only takes a majority vote over their final answers; ToT evaluates partial progress at every step and can abandon a branch before wasting more reasoning on it.
How it works
ToT runs a small search over “thoughts” — coherent intermediate steps toward a solution. At each node the model does three things. First it generates several candidate next thoughts (e.g. three different ways to continue). Then a value step scores how promising each candidate is, either by asking the model to rate it (“sure / maybe / impossible”) or by comparing candidates head-to-head. Finally a search strategy — usually breadth-first (keep the best k branches at each level) or depth-first (dive into the strongest branch, backtrack on failure) — decides which nodes to expand next. Because every branch is just a normal prompt, ToT needs no fine-tuning; it is an orchestration layer that calls the model many times and keeps the high-value paths.
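The generate, evaluate, and search loop described above can be sketched as a breadth-first search with pluggable functions. This is a minimal illustration, not the paper's implementation: the names `tree_of_thoughts_bfs`, `propose`, `value`, and `is_solution` are hypothetical, and the toy task (pick three digits from 1 to 3 that sum to 7) uses plain Python functions where a real system would call the model.

```python
from typing import Callable, List, Tuple

State = Tuple[str, ...]  # a partial solution: the thoughts chosen so far


def tree_of_thoughts_bfs(
    root: State,
    propose: Callable[[State], List[str]],
    value: Callable[[State], float],
    is_solution: Callable[[State], bool],
    beam_width: int = 2,
    max_depth: int = 3,
) -> List[State]:
    """Breadth-first sketch: keep the `beam_width` best partial solutions per level."""
    frontier = [root]
    for _ in range(max_depth):
        # 1. Generate: extend every kept state with each proposed next thought.
        candidates = [s + (t,) for s in frontier for t in propose(s)]
        if not candidates:
            break
        # 2. Evaluate: score each candidate (a model call in a real system).
        ranked = sorted(candidates, key=value, reverse=True)
        # 3. Search: only the top-k branches survive to the next level.
        frontier = ranked[:beam_width]
    return [s for s in frontier if is_solution(s)]


# Toy stand-ins (assumptions for illustration only).
TARGET, STEPS = 7, 3


def propose(state: State) -> List[str]:
    return ["1", "2", "3"]


def value(state: State) -> float:
    # 1.0 if the target is still reachable with the remaining steps, else 0.0.
    remaining = TARGET - sum(int(t) for t in state)
    steps_left = STEPS - len(state)
    return 1.0 if steps_left * 1 <= remaining <= steps_left * 3 else 0.0


def is_solution(state: State) -> bool:
    return len(state) == STEPS and sum(int(t) for t in state) == TARGET


solutions = tree_of_thoughts_bfs((), propose, value, is_solution)
print(solutions)
```

Swapping `propose` and `value` for prompts that ask the model for next steps and for a rating would turn the same loop into an actual ToT run; the search code itself would not need to change.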
When to use it — and the tradeoffs
Reach for ToT on problems where a single greedy chain often goes wrong: tasks with a large search space, an early choice that locks in later failure, or a clear way to check partial progress (puzzles, constraint satisfaction, multi-step planning, code that must pass tests). The catch is cost: exploring many branches and scoring each one multiplies token usage and latency, sometimes 10× or more versus plain Chain-of-Thought. It also only helps when the model can evaluate a partial solution reasonably well; if it can’t tell a good branch from a bad one, the search just wastes calls. For straightforward questions a single reasoning chain is cheaper and just as good.
Worked example, Game of 24: given the numbers 4, 9, 10, 13, reach 24 using each number exactly once with basic arithmetic (+, −, ×, ÷). ToT proposes first moves like 10 − 4 = 6, 13 − 9 = 4, and 4 × 9 = 36, scores each on whether the remaining numbers could still reach 24, prunes the dead ends, and expands the survivors, finding (13 − 9) × (10 − 4) = 24. In the original paper, with GPT-4, this lifted the success rate from about 4% with Chain-of-Thought prompting to 74%.
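To make the scoring step of the Game of 24 example concrete, here is a small sketch that rates the three first moves mentioned above. One important assumption: it replaces the model's heuristic judgment with an exact brute-force check (`can_reach_24`, a hypothetical helper), so the ratings are ground truth rather than the kind of estimate an LLM would give.

```python
from fractions import Fraction
from itertools import combinations


def next_states(nums):
    """Every state reachable by replacing two numbers with the result of one operation."""
    states = set()
    for i, j in combinations(range(len(nums)), 2):
        a, b = nums[i], nums[j]
        rest = [n for k, n in enumerate(nums) if k not in (i, j)]
        results = {a + b, a - b, b - a, a * b}
        if b != 0:
            results.add(a / b)
        if a != 0:
            results.add(b / a)
        for r in results:
            states.add(tuple(sorted(rest + [r])))
    return states


def can_reach_24(nums):
    """Exact check: can the remaining numbers be combined into 24?"""
    nums = tuple(Fraction(n) for n in nums)  # exact arithmetic, no float error
    if len(nums) == 1:
        return nums[0] == 24
    return any(can_reach_24(state) for state in next_states(nums))


def rate(nums):
    # Stand-in for the model's "sure / maybe / impossible" rating.
    return "sure" if can_reach_24(nums) else "impossible"


# First moves from the worked example, with the numbers left after each move.
first_moves = {
    "10 - 4 = 6": (6, 9, 13),
    "13 - 9 = 4": (4, 4, 10),
    "4 * 9 = 36": (10, 13, 36),
}
ratings = {move: rate(left) for move, left in first_moves.items()}
survivors = [move for move, r in ratings.items() if r != "impossible"]
print(ratings)
print(survivors)
```

Under this exact check the 4 × 9 = 36 branch is a dead end, since 10, 13, and 36 cannot be combined into 24, while the other two first moves both survive.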
Think of it like a chess player planning moves:
1. Generate options: "I could move the knight, bishop, or queen..."
2. Evaluate each: "Knight looks promising, queen seems risky..."
3. Explore deeper: "If knight, then opponent might... then I could..."
4. Backtrack if needed: "That path leads to checkmate against me, try another"
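The four steps above map directly onto a depth-first search with backtracking. As a loose illustration in the spirit of the chess analogy, the sketch below places queens on a small board so that none attack each other. It is an assumption-laden stand-in: the evaluator `is_safe` is a hard rule, whereas ToT would ask the model how promising a branch looks, and the function names are hypothetical.

```python
def is_safe(placed, col):
    """Evaluate: can a queen go in `col` of the next row without being attacked?"""
    row = len(placed)
    return all(c != col and abs(c - col) != row - r for r, c in enumerate(placed))


def solve(n, placed=()):
    """Depth-first search; `placed[r]` is the column of the queen in row r."""
    if len(placed) == n:
        return list(placed)
    for col in range(n):                    # 1. generate options
        if not is_safe(placed, col):        # 2. evaluate each, prune bad ones
            continue
        result = solve(n, placed + (col,))  # 3. explore deeper
        if result is not None:
            return result
        # 4. backtrack: this branch failed, so try the next column
    return None


print(solve(4))
```

The first column tried in row 0 leads to dead ends two levels down, so the search returns to the top and tries another option, which is the behavior step 4 describes.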
Where Is This Used?
- Puzzles: Sudoku, Game of 24, logic puzzles
- Creative Writing: Exploring different plot directions
- Planning: Finding optimal paths through complex decisions
- Code Architecture: Evaluating different design approaches
Fun Fact: In the original paper, Tree of Thoughts raised the success rate on the Game of 24 (making 24 from four numbers) from 4% with Chain-of-Thought prompting to 74%. Being able to drop weak branches early and pursue alternatives makes a large difference.
Frequently asked questions
How is Tree of Thoughts different from Chain-of-Thought?
Chain-of-Thought builds one linear chain of reasoning and follows it to the end. Tree of Thoughts turns reasoning into a tree: at each step the model generates several candidate next thoughts, scores them, and only expands the promising branches while pruning dead ends. This lets it backtrack and try another path, whereas a plain CoT chain that makes an early mistake runs all the way to a wrong answer with no chance to recover.
When should I use Tree of Thoughts?
ToT pays off on problems with a large search space, where an early wrong choice causes later failure, and where partial progress can be checked: puzzles, constraint satisfaction, multi-step planning, code that must pass tests. For straightforward questions it is overkill — a single Chain-of-Thought chain is cheaper and just as good.
Why is Tree of Thoughts more expensive than standard prompting?
ToT calls the model many times: at each node it generates several candidates and scores each one separately. Branch exploration and evaluation together multiply the number of requests, so token usage and latency can grow 10x or more versus plain Chain-of-Thought. The extra cost is only worth it when the accuracy gain genuinely matters.
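A rough way to see where the extra requests come from is to count them level by level. The function below is a back-of-the-envelope sketch under simplifying assumptions: one generation call per kept state, one evaluation call per candidate, and a fixed branching factor. The name `estimate_calls` and its parameters are hypothetical, and real implementations may batch or repeat calls differently.

```python
def estimate_calls(depth, beam_width, branching, evals_per_candidate=1):
    """Count model calls for a breadth-first ToT run under the stated assumptions."""
    calls = 0
    frontier = 1  # start from a single root state
    for _ in range(depth):
        propose_calls = frontier                 # one generation call per kept state
        candidates = frontier * branching        # each call yields `branching` thoughts
        eval_calls = candidates * evals_per_candidate
        calls += propose_calls + eval_calls
        frontier = min(beam_width, candidates)   # prune to the beam
    return calls


tot_calls = estimate_calls(depth=3, beam_width=5, branching=3)
print(tot_calls)  # compare with a single call for one Chain-of-Thought chain
```

With three levels, a beam of five, and three candidates per state, this count comes to 36 calls against one for a single chain, which is consistent with the "10x or more" figure.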
Does Tree of Thoughts require fine-tuning the model?
No. ToT is an orchestration layer on top of ordinary prompts: both generating thoughts and evaluating them are normal model calls. It works on off-the-shelf LLMs as long as the model can reasonably judge which branch is more promising.
Try it yourself
Three friends (Ann, Bob, Vicky) take three places (1, 2, 3). Ann is not first. Bob is not next to Vicky. Who is in which place?
Single-pass answer (without ToT): Ann in place 2, Bob in place 1, Vicky in place 3
Step 1: All possible arrangements (6 options), in place order 1-2-3: ABV, AVB, BAV, BVA, VAB, VBA
Step 2: Ann is not first (remove A in 1st place)
- ❌ ABV
- ❌ AVB
- ✓ BAV
- ✓ BVA
- ✓ VAB
- ✓ VBA
Remaining: 4 options
Step 3: Bob is not next to Vicky
- BAV: B(1)-V(3) not adjacent ✓
- BVA: B(1)-V(2) adjacent ❌
- VAB: V(1)-B(3) not adjacent ✓
- VBA: V(1)-B(2) adjacent ❌
Step 4: Remaining options
- Bob-Ann-Vicky (BAV)
- Vicky-Ann-Bob (VAB)
Answer: Two solutions! Both are valid.
By branching on every option and pruning the ones that break a constraint, Tree of Thoughts finds both solutions. Without ToT, the model gave one answer and missed the second.
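The same puzzle can be checked with a short search that builds the line-up one place at a time and abandons a branch as soon as a rule is broken. This is a sketch for verification only: the constraints are coded as plain Python rules rather than model judgments, and the helper names `violates` and `search` are hypothetical.

```python
PEOPLE = ("Ann", "Bob", "Vicky")


def violates(partial):
    """True if the partial line-up already breaks one of the two rules."""
    if partial and partial[0] == "Ann":          # Ann is not first
        return True
    for left, right in zip(partial, partial[1:]):
        if {left, right} == {"Bob", "Vicky"}:    # Bob is not next to Vicky
            return True
    return False


def search(partial=()):
    """Expand the tree place by place, pruning branches that violate a rule."""
    if violates(partial):
        return []                                # dead end: prune this branch
    if len(partial) == len(PEOPLE):
        return [partial]                         # complete, valid arrangement
    solutions = []
    for person in PEOPLE:
        if person not in partial:
            solutions += search(partial + (person,))
    return solutions


solutions = search()
for line_up in solutions:
    print(" - ".join(line_up))
```

Because the search keeps going after the first valid arrangement, it returns both Bob-Ann-Vicky and Vicky-Ann-Bob instead of stopping at one.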