ART — Automatic Reasoning and Tool-use
The Problem: AI needs to both reason AND use external tools, but combining these seamlessly is tricky. How can we get the best of both worlds?
The Solution: A Detective With Gadgets
ART (Automatic Reasoning and Tool-use) automatically selects and integrates reasoning patterns with tool usage from a library of demonstrations. It's like a detective who knows when to think and when to reach for their gadgets. The technique merges Chain-of-Thought reasoning with tool calls, using few-shot demonstrations to teach the model when each move is appropriate. It was introduced in the 2023 paper “ART: Automatic multi-step Reasoning and Tool-use for large language models.”
How it works
ART keeps a small task library of human-written examples. Each example is written in a structured format that interleaves reasoning steps with explicit tool calls, like [search]("...") or [calculate]("..."). When a new question arrives, ART retrieves a few demonstrations from tasks that resemble it and pastes them into the prompt as a template. The model then generates its own reasoning. Crucially, whenever it emits a tool call, generation is paused, the external tool actually runs, its output is spliced back into the context, and the model continues. So the model is not hallucinating answers to its own tool calls: it is reasoning over real results from a calculator, a search index, or code execution.
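The pause-and-resume loop described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `scripted_model` is a hypothetical stand-in for a real LLM call, the `[name]("argument")` syntax is assumed from the examples in the text, and the ` -> result` splice format is an arbitrary choice.

```python
import ast
import operator
import re

# Assumed call syntax, modeled on the [calculate]("...") examples in the text.
CALL_PATTERN = re.compile(r'\[(\w+)\]\("([^"]*)"\)')

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def calculate(expression: str) -> str:
    """Evaluate +, -, *, / arithmetic by walking the AST instead of using eval()."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {expression!r}")
    return str(walk(ast.parse(expression, mode="eval").body))


TOOLS = {"calculate": calculate}


def scripted_model(context: str) -> str:
    """Hypothetical stand-in for an LLM: returns the next chunk of text."""
    if "->" not in context:
        # First turn: reason, then stop right after emitting a tool call.
        return 'I need 12 times 7. [calculate]("12 * 7")'
    # Second turn: the tool result is now in the context.
    return " So the answer is 84."


def run(question: str, model, tools, max_steps: int = 5) -> str:
    """Alternate between model generation and real tool execution."""
    context = f"Q: {question}\n"
    for _ in range(max_steps):
        chunk = model(context)
        context += chunk
        match = CALL_PATTERN.search(chunk)
        if match is None:
            return context  # no tool call, so the model has finished
        name, argument = match.groups()
        result = tools[name](argument)   # the tool actually runs here
        context += f" -> {result}\n"     # splice the real output back in
    return context


print(run("What is 12 times 7?", scripted_model, TOOLS))
```

The `max_steps` cap is a design choice for the sketch: it bounds cost if a model keeps emitting calls without ever reaching an answer.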
When to use it and what to watch for
Reach for ART on multi-step tasks where pure reasoning is fragile: arithmetic, lookups of current facts, or anything that needs verification against an external source. Because the tool library and the parsing format are decoupled from the model, you can add a new tool by writing one or two demonstrations rather than retraining, and a human can edit a demonstration to fix a systematic mistake. The main tradeoffs are latency and cost (each tool call is a round trip and re-prompt), brittleness if the model emits a malformed call the parser can't read, and the risk that a bad search result quietly poisons the chain.
Concrete example: for “What is 23% of the population of France?” a plain LLM might guess. ART instead reasons “I need France's population,” calls [search]("population of France") → ~68 million, then calls [calculate]("0.23 * 68000000") → 15.64 million, and reports a number it can actually defend.
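The brittleness tradeoff mentioned above (a malformed call the parser can't read) is easier to manage if the parser fails loudly. The sketch below is one possible approach under stated assumptions: the `[name]("argument")` syntax is taken from the examples in the text, while `parse_step`, `ToolCall`, and `KNOWN_TOOLS` are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class ToolCall(NamedTuple):
    name: str
    argument: str


# Assumed syntax: [tool]("argument"), as in the examples in the text.
STRICT_CALL = re.compile(r'\[(\w+)\]\("([^"]*)"\)')
# Anything that merely looks like the start of a call: "[word](".
CALL_ATTEMPT = re.compile(r'\[\w+\]\s*\(')

KNOWN_TOOLS = {"search", "calculate"}  # hypothetical tool registry


def parse_step(text: str) -> Optional[ToolCall]:
    """Return the tool call found in `text`, or None if there is none.

    Raises ValueError when the text contains something call-like that the
    parser cannot read, so the failure is visible instead of silent.
    """
    match = STRICT_CALL.search(text)
    if match:
        name, argument = match.groups()
        if name not in KNOWN_TOOLS:
            raise ValueError(f"unknown tool: {name}")
        return ToolCall(name, argument)
    if CALL_ATTEMPT.search(text):
        raise ValueError(f"malformed tool call: {text!r}")
    return None
```

A caller could catch the `ValueError` and re-prompt the model with the error message, rather than letting the chain continue on a call that never ran.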
Think of it like a detective with gadgets:
1. Analyze task: What kind of problem is this?
2. Select approach: Find similar solved cases in the library
3. Combine reasoning + tools: "I need to think about X, then use tool Y"
4. Execute smoothly: Seamless blend of thinking and tool use
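Step 2 above ("find similar solved cases in the library") can be illustrated with a toy retriever. This sketch assumes a simple word-overlap (Jaccard) score as a stand-in for whatever similarity measure a real system uses; the library entries, `select_demonstrations`, and the stopword list are all hypothetical.

```python
import re

# Hypothetical task library: (example question, worked program) pairs.
LIBRARY = [
    ("What is 15% of 200?",
     '[calculate]("0.15 * 200") -> 30.0'),
    ("What is the capital of Australia?",
     '[search]("capital of Australia") -> Canberra'),
    ("Reverse the letters in the word tool.",
     '[code]("word[::-1]") -> loot'),
]

STOPWORDS = {"what", "is", "the", "of", "a", "in"}


def tokens(text: str) -> set:
    """Lowercase, map every number to 'num', and drop stopwords."""
    text = re.sub(r"\d+(?:\.\d+)?", "num", text.lower())
    return set(re.findall(r"[a-z]+%?", text)) - STOPWORDS


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def select_demonstrations(question: str, library, k: int = 2) -> list:
    """Return the k demonstrations whose example question best matches."""
    query = tokens(question)
    ranked = sorted(library,
                    key=lambda entry: jaccard(query, tokens(entry[0])),
                    reverse=True)
    return [f"Q: {q}\n{program}" for q, program in ranked[:k]]


for demo in select_demonstrations("What is 40% of 250?", LIBRARY, k=1):
    print(demo)
```

Word overlap is crude (it cannot tell that "population of France" needs a search tool), which is why the quality and coverage of the library matter more than the scoring function in a sketch like this.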
Where Is This Used?
- Research Tasks: Searching, calculating, then synthesizing
- Data Analysis: Querying databases and reasoning about results
- Complex Q&A: Combining web search with logical deduction
- Multi-Modal Tasks: Using vision, code, and reasoning together
Fun Fact: In the original paper, ART is reported to outperform few-shot prompting and automatic Chain-of-Thought on unseen tasks from the BigBench and MMLU benchmarks. The key is having a good library of demonstrations that show how to combine thinking with actions for different task types.
ART: Auto Reasoning with Tools
LLM automatically selects and uses tools
Example question: If I bought 50 shares of Apple at $150 and the current price is $185, what is my profit?
How ART Works
1. LLM analyzes task and determines needed steps
2. For each step, a tool is automatically selected
3. Tool executes, result returns to LLM
4. LLM integrates results and continues reasoning
5. Process repeats until final answer is reached
| Aspect | Simple Prompt | ReAct | ART |
|---|---|---|---|
| Tool Use | No | Fixed | Auto-select |
| Reasoning | Implicit | Explicit | Explicit + tools |
| Math Accuracy | Low | High | High |
| Flexibility | High | Medium | Very High |
You have access to the following tools:
- calculator: for mathematical calculations
- search: for current information lookup
- wikipedia: for fact lookup
For each reasoning step:
1. Decide if a tool is needed
2. If yes, call it in format: [TOOL: name](parameters)
3. Use the result in reasoning
4. Continue until final answer
Question: {question}
Reasoning:
When to use ART:
- ✓ Tasks requiring precise calculations
- ✓ Questions about current events
- ✓ Fact and data verification
- ✓ Complex multi-step tasks
- ✓ When answer verifiability matters
Frequently asked questions
What is ART (Automatic Reasoning and Tool-use)?
ART is a prompting technique where a language model automatically blends step-by-step Chain-of-Thought reasoning with calls to external tools such as a calculator, a search index, or code execution. It retrieves similar examples from a library of demonstrations, reasons along their template, and pauses to run a real tool whenever one is needed. It was introduced in the 2023 paper 'ART: Automatic multi-step Reasoning and Tool-use for large language models.'
How is ART different from ReAct?
ReAct uses a fixed set of tools and follows a rigid thought-action-observation loop. ART goes further: it automatically selects relevant demonstrations from a task library for each new question and decides which tool fits. Because the tool library and the parsing format are decoupled from the model, adding a new tool in ART means writing one or two demonstrations rather than rewriting the loop logic.
When should you use ART, and when not?
Use ART for multi-step tasks where pure reasoning is fragile: precise arithmetic, looking up current facts, verifying data against an external source, or any answer that must be verifiable. Avoid it for simple questions the model already answers correctly: every tool call adds latency and cost, and a malformed call or a bad search result can quietly poison the whole reasoning chain.
How do you write an ART prompt?
List the available tools (e.g. calculator, search, wikipedia) and define a call format such as [TOOL: name](parameters). Provide a few few-shot demonstrations that interleave reasoning steps with tool calls. At each step the model decides whether a tool is needed, calls it, receives the real result, and continues reasoning until a final answer. Quality hinges on a good library of demonstrations covering different task types.
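As a rough illustration of the answer above, the prompt can be assembled programmatically from a tool list, a few demonstrations, and the question. The tool names and the `[TOOL: name](parameters)` format come from the template earlier in this lesson; `build_prompt` and the sample demonstration are hypothetical.

```python
# Tool names and descriptions as listed in the lesson's prompt template.
TOOLS = {
    "calculator": "for mathematical calculations",
    "search": "for current information lookup",
    "wikipedia": "for fact lookup",
}

INSTRUCTIONS = [
    "For each reasoning step:",
    "1. Decide if a tool is needed",
    "2. If yes, call it in format: [TOOL: name](parameters)",
    "3. Use the result in reasoning",
    "4. Continue until final answer",
]


def build_prompt(question: str, demonstrations: list, tools: dict = TOOLS) -> str:
    """Assemble tool list, instructions, few-shot demos, and the question."""
    lines = ["You have access to the following tools:"]
    lines += [f"- {name}: {purpose}" for name, purpose in tools.items()]
    lines += [""] + INSTRUCTIONS
    for demo in demonstrations:
        lines += ["", demo]          # each demo interleaves reasoning and calls
    lines += ["", f"Question: {question}", "Reasoning:"]
    return "\n".join(lines)


# A hypothetical demonstration written in the same call format.
demo = ("Question: What is 12 * 7?\n"
        "Reasoning: This is arithmetic. [TOOL: calculator](12 * 7) = 84\n"
        "Final answer: 84")

print(build_prompt("What is 23% of the population of France?", [demo]))
```

Ending the prompt on "Reasoning:" leaves the model to continue in the same interleaved style the demonstrations establish.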
Try it yourself
Calculate the final cost of an order with 3 items, different discounts, and tax
Direct answer without tool calls: Laptop: 1020; Mouse: 40.50; Bag: 1140.50; With tax: 1232
Step 1: Laptop with 15% discount ($180 off $1200). Price: [calc(1200 - 180)] = $1020.00
Step 2: Mouse with 10% discount ($4.50 off $45). Price: [calc(45 - 4.50)] = $40.50
Step 3: Bag with no discount = $80.00
Step 4: Subtotal [calc(1020 + 40.50 + 80)] = $1140.50
Step 5: Tax at 8.5% [calc(1140.50 * 0.085)] = $96.94
Step 6: Final total [calc(1140.50 + 96.94)] = $1237.44
Final order cost: $1,237.44
ART separates the task into "thinking" and "computing". The model reasons well about the order of operations but struggles with exact arithmetic. The calculator tool compensates for this weakness.
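The "computing" half of the example above can be reproduced exactly with a small calculator-style function. This is only a sketch of what the calc tool might do: it uses Python's `decimal` module so that cents round predictably, and the function name `order_total` and the half-up rounding rule are assumptions, not something the lesson specifies.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def money(value: Decimal) -> Decimal:
    """Round to whole cents (half-up rounding is an assumption)."""
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


def order_total(items, tax_rate: Decimal):
    """items: (list price, discount amount) pairs. Returns (subtotal, tax, total)."""
    subtotal = sum((money(price - discount) for price, discount in items),
                   Decimal("0"))
    tax = money(subtotal * tax_rate)
    return subtotal, tax, subtotal + tax


# Figures taken from the worked steps: laptop, mouse, bag.
items = [
    (Decimal("1200"), Decimal("180")),   # 15% off
    (Decimal("45"), Decimal("4.50")),    # 10% off
    (Decimal("80"), Decimal("0")),       # no discount
]

subtotal, tax, total = order_total(items, Decimal("0.085"))
print(subtotal, tax, total)  # 1140.50 96.94 1237.44
```

Using `Decimal` rather than floats avoids binary rounding surprises in the tax step, which is the kind of exactness the calculator tool is meant to supply.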