Lesson 5

Tree of Thoughts

Branching reasoning

The Problem: Some problems have many possible paths, and the first one you try might be wrong. How can AI explore multiple approaches and backtrack when needed?

The Solution: Think Like a Chess Player

Tree of Thoughts (ToT) lets the AI explore multiple reasoning branches, evaluate each one, and prune dead ends. Instead of committing to one path, it considers several options at each step, like a chess player thinking several moves ahead. It extends Chain-of-Thought into a tree structure. Self-Consistency samples several independent chains and takes a majority vote over their final answers; ToT instead evaluates partial progress at every step and can abandon a branch before spending more reasoning on it.

How it works

ToT runs a small search over “thoughts” — coherent intermediate steps toward a solution. At each node the model does three things. First it generates several candidate next thoughts (e.g. three different ways to continue). Then a value step scores how promising each candidate is, either by asking the model to rate it (“sure / maybe / impossible”) or by comparing candidates head-to-head. Finally a search strategy — usually breadth-first (keep the best k branches at each level) or depth-first (dive into the strongest branch, backtrack on failure) — decides which nodes to expand next. Because every branch is just a normal prompt, ToT needs no fine-tuning; it is an orchestration layer that calls the model many times and keeps the high-value paths.
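The generate, evaluate, and search loop can be sketched as a small breadth-first routine. This is a minimal illustration, not the paper's implementation: `tree_of_thoughts_bfs`, `propose`, `score`, `is_solution`, and `beam_width` are hypothetical names, and the toy string-building task stands in for real model calls.

```python
from typing import Callable, List

def tree_of_thoughts_bfs(
    root: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    is_solution: Callable[[str], bool],
    beam_width: int = 2,
    max_depth: int = 4,
) -> List[str]:
    """Breadth-first ToT: expand kept states, score children, keep the best few."""
    frontier = [root]
    for _ in range(max_depth):
        # 1. Generate: every kept state proposes several next thoughts.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            break
        # Stop as soon as any candidate is a complete solution.
        solved = [c for c in candidates if is_solution(c)]
        if solved:
            return solved
        # 2. Evaluate: rank partial states by how promising they look.
        ranked = sorted(candidates, key=score, reverse=True)
        # 3. Search: keep only the top `beam_width` states for the next level.
        frontier = ranked[:beam_width]
    return []

# Toy stand-ins for model calls: spell TARGET one letter at a time.
TARGET = "tot"
result = tree_of_thoughts_bfs(
    root="",
    propose=lambda s: [s + ch for ch in "ot"],
    score=lambda s: sum(a == b for a, b in zip(s, TARGET)),
    is_solution=lambda s: s == TARGET,
)
print(result)  # ['tot']
```

In a real system `propose` and `score` would each wrap a prompt to the model, while the surrounding loop stays the same.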

When to use it — and the tradeoffs

Reach for ToT on problems where a single greedy chain often goes wrong: tasks with a large search space, an early choice that locks in later failure, or a clear way to check partial progress (puzzles, constraint satisfaction, multi-step planning, code that must pass tests). Cost is the catch: exploring many branches and scoring each one multiplies token usage and latency, sometimes 10× or more versus plain Chain-of-Thought. ToT also helps only when the model can evaluate a partial solution reasonably well; if it can't tell a good branch from a bad one, the search just wastes calls. For straightforward questions a single reasoning chain is cheaper and just as good.

Worked example (Game of 24): given the numbers 4, 9, 10, 13, reach 24 using each exactly once. ToT proposes first moves like 10 − 4 = 6, 13 − 9 = 4, and 4 × 9 = 36, scores each on whether the remaining numbers could still reach 24, prunes the dead ends (4 × 9 = 36 leaves 10, 13, 36, which cannot make 24), and expands the survivors, finding (13 − 9) × (10 − 4) = 24. In the original paper, using GPT-4, this lifted the success rate from 4% with Chain-of-Thought prompting (about 7% with standard prompting) to 74%.
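The Game of 24 search can be reproduced in a few lines. This is a sketch under one loud assumption: the evaluator `can_reach_24` is an exact brute-force check standing in for the model's rating, which a real LLM only approximates. `propose`, `solve_24`, and `beam_width` are hypothetical names, not taken from the paper's code.

```python
from fractions import Fraction
from itertools import combinations
from typing import List, Tuple

# A state is (remaining numbers, steps taken so far).
State = Tuple[Tuple[Fraction, ...], Tuple[str, ...]]

def propose(state: State) -> List[State]:
    """Generate every next thought: combine two remaining numbers with one operation."""
    nums, steps = state
    children = []
    for i, j in combinations(range(len(nums)), 2):
        a, b = nums[i], nums[j]
        rest = tuple(n for k, n in enumerate(nums) if k not in (i, j))
        options = [(a + b, f"{a} + {b}"), (a * b, f"{a} * {b}"),
                   (a - b, f"{a} - {b}"), (b - a, f"{b} - {a}")]
        if b != 0:
            options.append((a / b, f"{a} / {b}"))
        if a != 0:
            options.append((b / a, f"{b} / {a}"))
        for value, text in options:
            children.append((rest + (value,), steps + (f"{text} = {value}",)))
    return children

def can_reach_24(nums: Tuple[Fraction, ...]) -> bool:
    """Stand-in evaluator (assumption): exact check instead of a model's rating."""
    if len(nums) == 1:
        return nums[0] == 24
    return any(can_reach_24(child) for child, _ in propose((nums, ())))

def solve_24(numbers: List[int], beam_width: int = 5) -> Tuple[str, ...]:
    """Breadth-first search that prunes dead ends and keeps a few survivors per level."""
    frontier: List[State] = [(tuple(Fraction(n) for n in numbers), ())]
    for _ in range(len(numbers) - 1):
        candidates = [child for state in frontier for child in propose(state)]
        survivors = [c for c in candidates if can_reach_24(c[0])]  # prune dead ends
        frontier = survivors[:beam_width]
    return frontier[0][1] if frontier else ()

for step in solve_24([4, 9, 10, 13]):
    print(step)
```

Because the stand-in evaluator is exact, this search never keeps a dead branch; with a model as evaluator, misjudged branches are the main source of failures and wasted calls.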

Think of it like a chess player planning moves:

1. Generate options: "I could move the knight, bishop, or queen..."
2. Evaluate each: "Knight looks promising, queen seems risky..."
3. Explore deeper: "If knight, then opponent might... then I could..."
4. Backtrack if needed: "That path leads to checkmate against me, try another"
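The four steps above map onto a depth-first search with backtracking. The sketch below assumes a hand-made move tree and fixed scores (`MOVES`, `SCORES`) in place of model calls; `tot_dfs`, `threshold`, and `trace` are hypothetical names chosen for illustration.

```python
from typing import Callable, List, Optional

def tot_dfs(
    state: str,
    propose: Callable[[str], List[str]],
    score: Callable[[str], float],
    is_solution: Callable[[str], bool],
    threshold: float,
    trace: List[str],
    max_depth: int = 5,
) -> Optional[List[str]]:
    """Depth-first ToT: follow the strongest branch, back up when it fails."""
    trace.append(state)  # record visit order so backtracking is visible
    if is_solution(state):
        return [state]
    if max_depth == 0:
        return None
    # Steps 1-2: generate options and evaluate each, strongest first.
    for child in sorted(propose(state), key=score, reverse=True):
        if score(child) < threshold:
            break  # every remaining option is weaker still: prune them
        # Step 3: explore deeper along this branch.
        path = tot_dfs(child, propose, score, is_solution,
                       threshold, trace, max_depth - 1)
        if path is not None:
            return [state] + path
        # Step 4: the branch failed, so loop on to the next-best option.
    return None

# Hypothetical move tree and scores standing in for model calls.
MOVES = {"start": ["knight", "queen"], "knight": ["knight-trap"], "queen": ["queen-win"]}
SCORES = {"knight": 0.9, "queen": 0.6, "knight-trap": 0.0, "queen-win": 1.0}

trace: List[str] = []
path = tot_dfs(
    "start",
    propose=lambda s: MOVES.get(s, []),
    score=lambda s: SCORES[s],
    is_solution=lambda s: s == "queen-win",
    threshold=0.5,
    trace=trace,
)
print(path)   # ['start', 'queen', 'queen-win']
print(trace)  # ['start', 'knight', 'queen', 'queen-win']
```

The trace shows the knight branch being tried first, abandoned once its only continuation scores below the threshold, and the search then succeeding through the queen.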

Where Is This Used?

Puzzles: Sudoku, Game of 24, logic puzzles
Creative Writing: Exploring different plot directions
Planning: Finding optimal paths through complex decisions
Code Architecture: Evaluating different design approaches

Fun Fact: In the original paper, Tree of Thoughts raised the success rate on the Game of 24 (making 24 from four numbers) from 4% with Chain-of-Thought prompting to 74%. The ability to backtrack and try different paths makes a large difference on search-heavy tasks.

Frequently asked questions

How is Tree of Thoughts different from Chain-of-Thought?

Chain-of-Thought builds one linear chain of reasoning and follows it to the end. Tree of Thoughts turns reasoning into a tree: at each step the model generates several candidate next thoughts, scores them, and only expands the promising branches while pruning dead ends. This lets it backtrack and try another path, whereas a plain CoT chain that makes an early mistake runs all the way to a wrong answer with no chance to recover.

When should I use Tree of Thoughts?

ToT pays off on problems with a large search space, where an early wrong choice causes later failure, and where partial progress can be checked: puzzles, constraint satisfaction, multi-step planning, code that must pass tests. For straightforward questions it is overkill — a single Chain-of-Thought chain is cheaper and just as good.

Why is Tree of Thoughts more expensive than standard prompting?

ToT calls the model many times: at each node it generates several candidates and scores each one separately. Exploring branches plus the evaluation steps multiply the number of requests, so token usage and latency can grow 10x or more versus plain Chain-of-Thought. The extra cost is only worth it when the accuracy gain genuinely matters.
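A rough count of model calls makes the multiplier concrete. The estimate below assumes one call per proposal step and one call per evaluation of each candidate in a breadth-first search; `estimate_calls` and its parameters are hypothetical, and real implementations may batch calls or sample evaluations several times.

```python
def estimate_calls(depth: int, beam_width: int, branching: int,
                   evals_per_candidate: int = 1) -> int:
    """Approximate number of model calls for a breadth-first ToT run."""
    calls = 0
    frontier = 1  # the search starts from a single root state
    for _ in range(depth):
        generated = frontier * branching
        calls += frontier                         # one proposal call per kept state
        calls += generated * evals_per_candidate  # scoring calls for each candidate
        frontier = min(beam_width, generated)     # keep at most beam_width states
    return calls

print(estimate_calls(depth=3, beam_width=5, branching=3))                         # 36
print(estimate_calls(depth=3, beam_width=5, branching=3, evals_per_candidate=3))  # 90
```

Under these assumptions a three-level search costs dozens of calls where a single chain costs one, and each call also carries the growing history in its prompt, so tokens grow faster than the call count alone suggests.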

Does Tree of Thoughts require fine-tuning the model?

No. ToT is an orchestration layer on top of ordinary prompts: both generating thoughts and evaluating them are normal model calls. No fine-tuning is needed. It works on off-the-shelf LLMs as long as the model can reasonably judge which branch is more promising.
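Since both steps are ordinary prompts, the orchestration layer mostly builds strings and parses replies. The helpers below are a sketch: the prompt wording, the function names, and the numeric weights in `VALUE_SCORES` are assumptions for illustration, not a prescribed format.

```python
import re
from typing import List

# Hypothetical numeric weights for the three ratings; tune them for your task.
VALUE_SCORES = {"sure": 1.0, "maybe": 0.5, "impossible": 0.0}

def _history(steps: List[str]) -> str:
    return "\n".join(steps) or "(none yet)"

def build_propose_prompt(problem: str, steps: List[str], n: int) -> str:
    """Prompt asking the model for n candidate next thoughts."""
    return (
        f"Problem: {problem}\n"
        f"Steps so far:\n{_history(steps)}\n"
        f"Propose {n} different next steps, one per line."
    )

def build_value_prompt(problem: str, steps: List[str]) -> str:
    """Prompt asking the model to rate a partial solution."""
    return (
        f"Problem: {problem}\n"
        f"Steps so far:\n{_history(steps)}\n"
        "Can these steps still lead to a solution? "
        "End your answer with one word: sure, maybe, or impossible."
    )

def parse_value(reply: str) -> float:
    """Map the last word of the reply to a score; unknown replies count as 'maybe'."""
    words = re.findall(r"[a-z]+", reply.lower())
    last = words[-1] if words else ""
    return VALUE_SCORES.get(last, VALUE_SCORES["maybe"])

print(build_value_prompt("Reach 24 with 4, 9, 10, 13", ["13 - 9 = 4"]))
print(parse_value("4 and 6 remain, and 4 * 6 = 24. Sure."))  # 1.0
```

Treating an unparseable reply as "maybe" is a design choice: it keeps the branch alive at a middling score rather than pruning on a formatting glitch.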

Try it yourself

Technique Comparison

Task · Intermediate · Logic

Three friends (Ann, Bob, Vicky) take three places (1, 2, 3). Ann is not first. Bob is not next to Vicky. Who is in which place?

Without technique

Prompt

Three friends (Ann, Bob, Vicky) take three places (1, 2, 3). Ann is not first. Bob is not next to Vicky. Who is in which place?

Response

Ann — 2, Bob — 1, Vicky — 3

Tokens: 52/18

Time: 340 ms

With Tree of Thoughts

Prompt

Solve using Tree of Thoughts: consider all possibilities and eliminate invalid ones.
Problem: Three friends (Ann, Bob, Vicky) take three places (1, 2, 3). Ann is not first. Bob is not next to Vicky. Who is in which place?
Step 1: All possible arrangements
Step 2: Filter by "Ann is not first"
Step 3: Filter by "Bob is not next to Vicky"
Step 4: Remaining options

Response

Step 1: All possible arrangements (6 options)
Places 1-2-3: ABV, AVB, BAV, BVA, VAB, VBA

Step 2: Ann is not first (remove A in 1st place)
❌ ABV, ❌ AVB
✓ BAV, ✓ BVA, ✓ VAB, ✓ VBA
Remaining: 4 options

Step 3: Bob is not next to Vicky
BAV: B(1)-V(3) not adjacent ✓
BVA: B(1)-V(2) adjacent ❌
VAB: V(1)-B(3) not adjacent ✓
VBA: V(1)-B(2) adjacent ❌

Step 4: Remaining options

Bob-Ann-Vicky (BAV)
Vicky-Ann-Bob (VAB)

Answer: Two solutions! Both are valid.

👁️ Generate all 6 permutations

🧠 Filter 1: Ann not first → 4 options

🧠 Filter 2: Bob not next to Vicky → 2 options

✅ Result: BAV or VAB, both correct

Tokens: 95/245

Time: 920 ms

Why this works

Tree of Thoughts systematically explores all options. Without ToT, the model gave one answer and missed the second solution.
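The puzzle is small enough to check the elimination by exhaustive enumeration. This is plain brute force rather than a model-driven search, and the variable names are illustrative, but it confirms the counts at each filtering step.

```python
from itertools import permutations

FRIENDS = ("Ann", "Bob", "Vicky")

# Step 1: all arrangements of the three friends over places 1, 2, 3.
arrangements = list(permutations(FRIENDS))

# Step 2: Ann is not first.
after_rule_1 = [order for order in arrangements if order[0] != "Ann"]

# Step 3: Bob is not next to Vicky (their positions differ by more than one).
after_rule_2 = [
    order for order in after_rule_1
    if abs(order.index("Bob") - order.index("Vicky")) != 1
]

print(len(arrangements), len(after_rule_1), len(after_rule_2))  # 6 4 2
print(after_rule_2)  # [('Bob', 'Ann', 'Vicky'), ('Vicky', 'Ann', 'Bob')]
```

Both surviving arrangements put Ann in place 2, which is why the single answer from the plain prompt was valid but incomplete.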


Related lessons: Chain-of-Thought, Self-Consistency
