Structured Output
JSON, schemas & validated responses
The Problem: LLMs produce beautiful prose, but your code needs JSON. Free-text output breaks parsers, varies in format, and requires fragile regex to extract data. How do you get reliable, machine-readable output?
The Solution: Taming Free Text into Structured Data
Structured output constrains LLM responses to a specific schema: JSON objects, typed function parameters, or validated data structures. Use JSON mode, function calling, or output schemas to remove most parsing failures and format inconsistencies, and rely on validation to catch what remains.
Think of it like filling out a form instead of writing an essay:
- 1. Define output schema: Write a JSON Schema, Pydantic model, or TypeScript interface that describes the exact shape
- 2. Enable JSON mode or function calling: Pass the schema to the API so the model's output is constrained to that structure (plain JSON mode only ensures valid JSON, so describe the schema in the prompt as well)
- 3. LLM generates schema-compliant output: Model fills in fields like a form; field names, types, and nesting follow the schema when the API enforces it
- 4. Parse and validate against schema: Deserialise and run Pydantic/Zod validation to catch semantic errors, not just format errors
- 5. Retry with error feedback if invalid: On validation failure, feed the error message back to the model for self-correction (max 2 retries)
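The five steps above can be sketched with the standard library alone. This is a minimal illustration, not a production client: `INVOICE_SCHEMA`, `validate`, `extract_with_retry`, and `make_fake_model` are hypothetical names, and the "model" is a stub that replays canned replies so the retry path can be seen without an API key. In a real project, Pydantic or Zod would replace the hand-written type checks.

```python
import json

# Step 1 (assumed schema for illustration): field name -> expected Python type.
INVOICE_SCHEMA = {"vendor": str, "amount": float, "paid": bool}


def type_ok(value, expected):
    """Return True if value matches the expected type (ints count as floats, bools do not)."""
    if expected is float:
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, expected)


def validate(raw, schema):
    """Step 4: parse the raw reply and check it against the schema; raise ValueError on failure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    for field, expected in schema.items():
        if field not in data:
            raise ValueError(f"missing field '{field}'")
        if not type_ok(data[field], expected):
            raise ValueError(f"field '{field}' should be of type {expected.__name__}")
    return data


def extract_with_retry(call_model, prompt, schema, max_retries=2):
    """Step 5: call the model, and on validation failure feed the error back (max 2 retries)."""
    message = prompt
    last_error = None
    for _ in range(max_retries + 1):
        raw = call_model(message)  # Steps 2-3 happen inside the (stubbed) model call
        try:
            return validate(raw, schema)
        except ValueError as err:
            last_error = err
            message = (
                f"{prompt}\n\nYour previous reply was rejected: {err}. "
                "Reply with corrected JSON only."
            )
    raise RuntimeError(f"no valid output after {max_retries} retries: {last_error}")


def make_fake_model(replies):
    """Stand-in for a real LLM call: returns the canned replies one by one."""
    it = iter(replies)
    return lambda message: next(it)


# The first reply has a string where a number is required; the second is corrected.
fake = make_fake_model([
    '{"vendor": "Acme Corp", "amount": "12.50", "paid": true}',
    '{"vendor": "Acme Corp", "amount": 12.5, "paid": true}',
])
result = extract_with_retry(fake, "Extract the invoice as JSON.", INVOICE_SCHEMA)
print(result)
```

Capping retries keeps cost and latency bounded; if the model still fails after the cap, raising an error is usually safer than passing unvalidated data downstream.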
Where Is This Used?
- API Response Formatting: Guaranteed JSON responses that parsers can consume without error handling for format issues
- Data Extraction Pipelines: Pulling structured fields (name, date, amount) from unstructured documents at scale
- Form Filling Automation: Converting free-text intake forms or emails into validated database records
- Configuration Generation: Turning natural language feature descriptions into typed config objects or infrastructure-as-code
Common Pitfall: Semantically wrong values. Even with JSON mode, LLMs may produce valid JSON with wrong values (hallucinated dates, inverted booleans). Always validate values, not just schema shape, with Pydantic or Zod.
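To make the pitfall concrete, here is a small sketch of value-level checks that run after the shape has already been validated. The invoice fields and the rules in `check_invoice_values` are assumptions chosen for illustration; a Pydantic or Zod validator would express the same rules declaratively.

```python
import json
from datetime import date
from typing import List


def check_invoice_values(record: dict, today: date) -> List[str]:
    """Return a list of semantic problems in a record whose shape is already valid."""
    problems = []
    issued = date.fromisoformat(record["issued"])
    due = date.fromisoformat(record["due"])
    if issued > today:
        problems.append("issued date is in the future")
    if due < issued:
        problems.append("due date is before issued date")
    if record["amount"] <= 0:
        problems.append("amount must be positive")
    # Cross-field check that catches an inverted boolean.
    if record["paid"] and record["balance"] != 0:
        problems.append("paid is true but balance is not zero")
    return problems


# Well-formed JSON with the right types, but two values are implausible.
reply = (
    '{"issued": "2024-03-10", "due": "2024-03-01", '
    '"amount": 250.0, "paid": true, "balance": 250.0}'
)
today = date(2024, 4, 1)
problems = check_invoice_values(json.loads(reply), today)
print(problems)
```

The resulting problem list can be fed back to the model as the error message in a retry.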
Fun Fact: Without JSON mode, LLMs add "helpful" text around JSON about 30% of the time: "Here's the JSON you requested: {...}". With JSON mode, this drops to 0%. Function calling goes further: it aims for not just valid JSON, but valid JSON matching your exact schema. The difference in downstream parsing errors: about 30% → 0.5%.
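The wrapping problem is easy to reproduce. The sketch below shows `json.loads` rejecting a reply with a prose prefix, plus a fallback that pulls out the first JSON object when JSON mode is not available. `try_parse` and `extract_first_object` are illustrative helper names, and the fallback is a workaround rather than a replacement for constrained output.

```python
import json

# A reply wrapped in "helpful" prose, as described above.
wrapped = 'Here\'s the JSON you requested: {"name": "Sarah Chen", "company": "Acme Corp"}'


def try_parse(text):
    """Strict parse: return the decoded value, or None if the text is not pure JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def extract_first_object(text):
    """Fallback: decode the first JSON object embedded in the text, or return None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # This brace did not start a valid object; try the next one.
            start = text.find("{", start + 1)
    return None


print(try_parse(wrapped))             # strict parsing fails on the prose prefix
print(extract_first_object(wrapped))  # the embedded object is recovered
```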
Structured Output Explorer
Input text: Hi, I'm Sarah Chen from Acme Corp. You can reach me at sarah@acme.com or call 555-0123. I'm based in San Francisco and I'd love to discuss the Q3 partnership proposal.
Free-text output: The sender is Sarah Chen who works at Acme Corp. Her email is sarah@acme.com and her phone number is 555-0123. She's located in San Francisco and wants to discuss a Q3 partnership proposal.
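For contrast, the same information can be held in a typed record and serialised to JSON. The `Contact` dataclass and its field names are one possible schema, assumed here for illustration; the values come from the message above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Contact:
    """Hypothetical schema for the contact details in the message."""
    name: str
    company: str
    email: str
    phone: str
    location: str
    topic: str


contact = Contact(
    name="Sarah Chen",
    company="Acme Corp",
    email="sarah@acme.com",
    phone="555-0123",
    location="San Francisco",
    topic="Q3 partnership proposal",
)

# Every field is addressable by name, with no regex over prose.
structured = json.dumps(asdict(contact), indent=2)
print(structured)
```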
- JSON mode eliminates the roughly 30% of responses that wrap JSON in "helpful" prose.
- Function calling enforces your schema: fields, types, required/optional.
- Always validate and retry: most format errors resolve in 1 retry with error feedback.
Task: Extract contact info as JSON from an email signature
Input: Name: John Smith Title: Marketing Director Email: john@techcorp.com Phone: +1 (555) 123-4567
Output: { "name": "John Smith", "title": "Marketing Director", "company": null, "email": "john@techcorp.com", "phone": "+1 (555) 123-4567", "website": null }
For structured data extraction, give the model an exact JSON schema with types and a rule that absent fields must be null. This makes the output programmatically parseable without post-processing.
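One way to apply this tip is to embed a JSON Schema with nullable string fields in the prompt and then confirm that the reply has exactly the expected keys. The sketch below assumes the six field names shown; `CONTACT_SCHEMA`, `build_prompt`, and `check_shape` are illustrative, and the reply is a hand-written stand-in for a model response.

```python
import json

# Every field is required but may be null, so "absent" is explicit rather than guessed.
FIELDS = ["name", "title", "company", "email", "phone", "website"]
CONTACT_SCHEMA = {
    "type": "object",
    "properties": {field: {"type": ["string", "null"]} for field in FIELDS},
    "required": FIELDS,
    "additionalProperties": False,
}


def build_prompt(signature: str) -> str:
    """Assemble an extraction prompt that carries the schema and the null rule."""
    return (
        "Extract contact info from the email signature below.\n"
        "Reply with one JSON object matching this JSON Schema:\n"
        f"{json.dumps(CONTACT_SCHEMA, indent=2)}\n"
        "If a field is not present in the text, set it to null; do not guess.\n\n"
        f"Signature:\n{signature}"
    )


def check_shape(reply: str) -> dict:
    """Parse the reply and confirm it has exactly the schema's keys with string-or-null values."""
    data = json.loads(reply)
    if not isinstance(data, dict) or set(data) != set(FIELDS):
        raise ValueError("reply does not have exactly the expected fields")
    for key, value in data.items():
        if value is not None and not isinstance(value, str):
            raise ValueError(f"field '{key}' must be a string or null")
    return data


# Hypothetical reply: the signature names no company or website, so both are null.
reply = (
    '{"name": "John Smith", "title": "Marketing Director", "company": null, '
    '"email": "john@techcorp.com", "phone": "+1 (555) 123-4567", "website": null}'
)
record = check_shape(reply)
print(record["company"], record["website"])
```

Requiring every key while allowing null keeps downstream code free of missing-key checks and discourages the model from inferring values, such as a company name guessed from an email domain.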