Code Generation
Writing code with AI
The Problem: Writing code is time-consuming, and it's easy to forget syntax or make mistakes. How can AI help write code faster and more reliably?
The Solution: Your AI Junior Developer
Code Generation uses LLMs to write, complete, and transform code based on natural language descriptions. The model was trained on billions of lines of public source code, so it has learned the statistical patterns of how real programs are structured: common API signatures, idiomatic loops, error-handling conventions, and the way tests usually look. When you describe what you want, the model predicts the most probable sequence of tokens that satisfies that description. It is not reasoning about your program the way a compiler does; it is pattern-matching against everything it has seen. That distinction explains both why it is so fast and why it can be confidently wrong.
How it works in practice
The quality of the output depends heavily on how much context you give. A bare one-line request leaves the model guessing about the language, the libraries, the naming style, and the edge cases. Two cheap techniques fix most of this. With chain-of-thought prompting you ask the model to outline its approach before writing code, which surfaces flawed assumptions early. With few-shot examples you paste one or two existing functions from your codebase so the output matches your conventions instead of generic Stack Overflow style. Modern coding agents go further: they read neighbouring files, run the tests, see the failure, and patch the code in a loop. That is why an agent with access to your project usually beats a single chat completion.
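The generate, test, and patch loop described above can be sketched roughly as follows. This is a minimal illustration, not how any particular agent is implemented: `generate` and `runTests` are hypothetical stand-ins for an LLM call and a test runner, and the loop is synchronous only to keep the sketch short (real calls would be asynchronous).

```typescript
// Hypothetical interfaces: a real agent would call an LLM API and a test runner here.
type TestResult = { passed: boolean; failureLog: string };
type Generate = (prompt: string) => string;
type RunTests = (code: string) => TestResult;

type LoopOutcome = { code: string; attempts: number; passed: boolean };

function generateUntilGreen(
  task: string,
  generate: Generate,
  runTests: RunTests,
  maxAttempts = 3
): LoopOutcome {
  let prompt = task;
  let code = "";
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    code = generate(prompt);
    const result = runTests(code);
    if (result.passed) {
      return { code, attempts: attempt, passed: true };
    }
    // Feed the previous attempt and its failure back into the next prompt,
    // so the model can patch the code instead of starting from scratch.
    prompt = [
      task,
      "Previous attempt:",
      code,
      "Test failure:",
      result.failureLog,
      "Fix the code so the tests pass.",
    ].join("\n");
  }
  // Give up after maxAttempts and hand the last draft to a human reviewer.
  return { code, attempts: maxAttempts, passed: false };
}
```

The cap on attempts is a design choice for the sketch: without it, a model that keeps producing the same wrong patch would loop forever.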
Tradeoffs and a worked example
The danger is that generated code looks correct. The model produces plausible syntax even when the logic is subtly wrong, and it can invent library functions that do not exist, a coding-specific form of hallucination. Treat every output as a draft from an intern: read it, run it, and test the edge cases. Concretely, ask for "a Python function to validate an email address" and most models hand you a one-line regex. It accepts a@b.c, but depending on the pattern it may reject perfectly valid addresses like user+tag@sub.domain.co or accept garbage like a@@b.com. The fix is to drive the spec yourself: state the input/output types, name the edge cases (empty string, unicode, plus-addressing, very long input), and ask for unit tests. The model is excellent at writing the boilerplate around a precise specification; it is poor at inventing the specification for you.
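To make the email example concrete, here is a sketch in TypeScript (chosen to match the other code in this lesson, even though the prompt above asks for Python). The two one-line patterns are illustrative guesses at what a model might return, not output from any specific model. The rules inside `isValidEmail` are an assumed spec for this sketch: a deliberately simplified, ASCII-only subset of what real email standards allow.

```typescript
// Two plausible one-line answers (illustrative only).
const tooStrict = /^\w+@\w+\.\w+$/; // rejects plus-addressing and subdomains
const tooLoose = /^.+@.+\..+$/;     // accepts a second "@"

/**
 * Assumed spec for this sketch (not RFC-complete):
 * - total length between 1 and 254 characters
 * - exactly one "@"
 * - local part: 1 to 64 ASCII letters, digits, or . _ % + -
 *   with no leading, trailing, or consecutive dots
 * - domain: at least two dot-separated labels, each 1 to 63 ASCII
 *   letters, digits, or hyphens, not starting or ending with a hyphen
 */
function isValidEmail(input: string): boolean {
  if (input.length === 0 || input.length > 254) return false;

  const at = input.indexOf("@");
  if (at === -1 || at !== input.lastIndexOf("@")) return false;

  const local = input.slice(0, at);
  const domain = input.slice(at + 1);

  if (local.length === 0 || local.length > 64) return false;
  if (!/^[A-Za-z0-9._%+-]+$/.test(local)) return false;
  if (/^\.|\.$|\.\./.test(local)) return false;

  const labels = domain.split(".");
  if (labels.length < 2) return false;
  const labelPattern = /^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/;
  return labels.every((label) => labelPattern.test(label));
}
```

With these definitions, `tooStrict.test("user+tag@sub.domain.co")` is `false` and `tooLoose.test("a@@b.com")` is `true`, while `isValidEmail` returns `true` for the first address and `false` for the second. Rejecting non-ASCII input is a choice made by this spec, which is exactly the kind of decision the prompt should state rather than leave to the model.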
Think of it like a programming intern:
- 1. Write specification: "Write a function to validate email addresses — return true/false"
- 2. Include types and signatures: Specify input/output types, language, and expected interface
- 3. Add edge cases: Empty input, unicode, very long strings, concurrent access
- 4. AI generates code: Complete implementation with types, error handling, and docs
- 5. Review for subtle bugs: Check for race conditions, null checks, off-by-one errors, and missing cleanup
Where Is This Used?
- Code Completion: IDE integrations like GitHub Copilot
- Boilerplate Generation: Creating repetitive code structures
- Language Translation: Converting code between languages
- Test Generation: Writing unit tests from function signatures
- Gotchas: Subtle Bugs: LLM-generated code can look correct but hide race conditions, off-by-one errors, missing error handling, or incorrect edge case logic — always review critically
Fun Fact: LLM-generated code has a dangerous property: it looks plausible at first glance. Some studies suggest developers accept AI-generated code with less scrutiny than human-written code, yet it can contain subtle logical errors: swapped comparison operators, missing null checks, or async code that works 99% of the time but deadlocks under load.
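As an illustration of how such a bug hides in plain sight, consider a page-count helper. The draft below is a hypothetical example of plausible-looking generated code, not output from a real model. It is correct whenever the item count divides evenly, so a quick manual check passes.

```typescript
// Hypothetical draft: reads fine and passes the obvious check (10 items, 5 per page).
// Subtle issues: Math.floor drops the last partial page, and a pageSize of 0
// silently produces Infinity or NaN instead of failing.
function pageCountDraft(totalItems: number, pageSize: number): number {
  return Math.floor(totalItems / pageSize);
}

// Reviewed version: rounds up and validates its inputs.
function pageCount(totalItems: number, pageSize: number): number {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError("pageSize must be a positive integer");
  }
  if (!Number.isInteger(totalItems) || totalItems < 0) {
    throw new RangeError("totalItems must be a non-negative integer");
  }
  return Math.ceil(totalItems / pageSize);
}
```

For 11 items at 5 per page, `pageCountDraft` returns 2 and `pageCount` returns 3. Only a test at the boundary exposes the difference, which is why edge cases belong in the review step.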
Key Insight
- 1. Specificity = quality. A bare "Write a function" gets you the minimum; each added requirement improves the output.
- 2. If you don't ask for it, you won't get it: LLMs take the shortest path.
- 3. Good prompts are checklists, not essays: each requirement is one line in the prompt.
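The checklist idea can be sketched as a small prompt builder in which every requirement contributes one line. The option names and the wording of each line are assumptions made for this sketch, not a fixed format that any model expects.

```typescript
// Hypothetical option names: each one maps to a single line in the prompt.
type PromptOptions = {
  language?: string;
  signature?: string;
  edgeCases?: string[];
  errorHandling?: boolean;
  docs?: boolean;
  tests?: boolean;
};

function buildPrompt(task: string, options: PromptOptions = {}): string {
  const lines: string[] = [task];
  if (options.language) lines.push(`- Language: ${options.language}`);
  if (options.signature) lines.push(`- Signature: ${options.signature}`);
  if (options.edgeCases && options.edgeCases.length > 0) {
    lines.push(`- Handle edge cases: ${options.edgeCases.join(", ")}`);
  }
  if (options.errorHandling) lines.push("- Validate inputs and throw descriptive errors");
  if (options.docs) lines.push("- Add doc comments with a usage example");
  if (options.tests) lines.push("- Include unit tests for every edge case");
  return lines.join("\n");
}
```

Calling `buildPrompt("Write a function to validate email addresses")` with no options returns just the task line. Adding `{ language: "TypeScript", tests: true }` appends two checklist lines, which makes it easy to see which requirement produced which change in the generated code.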
Frequently asked questions
How does the prompt affect generated code quality?
Dramatically. Specifying language, types, error handling, edge cases, and desired patterns in your prompt can shift output from prototype-quality to production-ready code with proper structure and documentation.
Should I review LLM-generated code?
Always. LLMs can produce code with subtle bugs, security vulnerabilities, or incorrect assumptions. Treat generated code as a draft from a junior developer — review, test, and validate before using.
Which programming languages do LLMs generate best?
Python, JavaScript/TypeScript, and Java are heavily represented in public code and tend to give the best results. Less common languages (Rust, Haskell, Elixir) work but tend to produce more errors. Always verify syntax and idioms.
How can I use LLMs for refactoring existing code?
Provide the code with explicit instructions: 'Refactor to reduce duplication', 'Convert to async/await', 'Add TypeScript types'. Include context about patterns and constraints. Review diffs carefully.
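As a small illustration of a "Refactor to reduce duplication" request, the sketch below shows a before and after pair. The function names are hypothetical, and the refactored version is one reasonable result rather than the only one. Keeping both versions around briefly makes it easy to check that behavior did not change.

```typescript
// Before: the capitalization logic is written out twice.
function formatUserLabelBefore(first: string, last: string): string {
  const f = first.trim();
  const l = last.trim();
  return (
    f.charAt(0).toUpperCase() + f.slice(1) + " " + l.charAt(0).toUpperCase() + l.slice(1)
  );
}

// After: the duplicated logic is extracted into one helper.
function capitalize(word: string): string {
  const trimmed = word.trim();
  return trimmed.charAt(0).toUpperCase() + trimmed.slice(1);
}

function formatUserLabelAfter(first: string, last: string): string {
  return `${capitalize(first)} ${capitalize(last)}`;
}

// Review aid: compare old and new behavior on inputs that include edge cases.
function sameBehavior(inputs: Array<[string, string]>): boolean {
  return inputs.every(
    ([first, last]) => formatUserLabelBefore(first, last) === formatUserLabelAfter(first, last)
  );
}
```

Running `sameBehavior` over a handful of inputs, including empty and whitespace-padded strings, is a cheap complement to reading the diff.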
Try it yourself
Prompt:
Write a function to filter an array of objects by multiple criteria
Output from the bare prompt:
function filter(arr, key, value) { return arr.filter(item => item[key] === value); }
Output from a structured prompt:
type Criteria<T> = {
  [K in keyof T]?: T[K] | T[K][];
};
/**
 * Filters an array of objects by multiple criteria.
 * A criterion can be a value (exact match) or array (inclusion check).
 *
 * @example
 * const users = [{ name: "Alice", role: "admin", status: "active" }];
 * filterByMultipleCriteria(users, { role: "admin", status: ["active", "pending"] });
 * // => [{ name: "Alice", role: "admin", status: "active" }]
 */
function filterByMultipleCriteria<T extends Record<string, unknown>>(
  items: T[],
  criteria: Criteria<T>
): T[] {
  const entries = Object.entries(criteria);
  if (entries.length === 0) return [...items];
  return items.filter((item) =>
    entries.every(([key, expected]) => {
      const actual = item[key];
      return Array.isArray(expected)
        ? expected.includes(actual)
        : actual === expected;
    })
  );
}
A structured prompt with a usage example, typing constraints, and an explicit edge case transforms a one-liner without types into a production-ready generic function.