Context Windows
Memory limits
The Problem: AI forgets the beginning of long conversations and can't process unlimited text. What limits AI memory, and how do you work within it?
The Solution: Understand Working Memory
The context window is the maximum amount of text an LLM can process at once: its working memory. Like your computer's RAM, everything the model needs to actively consider must fit in this space. Size is measured in tokens, so managing your token budget is key to staying within the limit.
Think of it like computer RAM:
1. Limited size: 8K, 32K, 128K, or 200K tokens, depending on the model
2. Includes everything: system prompt + conversation history + current message (plus room for the model's reply)
3. FIFO when full: when the limit is reached, applications typically drop the oldest messages first
4. Cost scales with size: more tokens processed means a more expensive request
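The budget arithmetic above can be sketched in a few lines. This is a minimal illustration, not a real tokenizer: it assumes the common rule of thumb of roughly 4 characters per token for English text, and the limit and reserve values are arbitrary examples.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text.
    Real counts depend on the model's tokenizer."""
    return max(1, len(text) // 4)

def fits_in_context(system_prompt: str, history: list[str], new_message: str,
                    context_limit: int = 8_000, reply_reserve: int = 1_000) -> bool:
    """Check whether everything fits, leaving room for the model's reply."""
    used = sum(estimate_tokens(t) for t in [system_prompt, new_message] + history)
    return used + reply_reserve <= context_limit
```

For production use, count tokens with the actual tokenizer for your model; heuristics can be off by 2x or more for code, non-English text, or unusual formatting.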
Managing Context
- Summarization: Compress old conversation into summaries
- Selective Inclusion: Only include relevant prior messages
- RAG: Pull in relevant docs dynamically instead of storing everything
- Chunking: Break long documents into processable pieces
Fun Fact: Context windows have grown from 4K tokens (GPT-3) to 200K+ tokens (Claude 3) in just a few years! But "needle in a haystack" tests show that attention quality degrades in very long contexts — bigger isn't always better.
Try It Yourself!
Use the interactive example below to see how context window limits affect AI memory and learn strategies to manage them.
📦 The context window is the model's "memory": everything that doesn't fit is forgotten. Add messages and watch the window fill up.
When the context overflows, old messages are "forgotten". That's why it's important to: 1) choose a model with enough context, 2) compress history, and 3) keep important information close to the end of the prompt.
Example: processing a long document, where information is lost when the context window overflows.

Summary produced after overflow: "The report describes company financial metrics, revenue growth, and development plans. No critical issues were found."

What the dropped portion actually contained: "Critical issue: a vulnerability in the authorization module (p. 43). 12,000 accounts affected. Patch fully deployed Jan 18. A follow-up audit is recommended for Q2."
More context is not always better. Strategic document chunking with summarization of irrelevant parts beats "paste everything and pray."
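The chunking strategy above can be sketched as a simple splitter with overlap; each chunk is then summarized separately and the summaries merged. The chunk and overlap sizes below are illustrative assumptions, not recommendations for any particular model.

```python
def chunk_text(text: str, chunk_chars: int = 2000, overlap: int = 200) -> list[str]:
    """Split a long document into overlapping chunks so each fits the context.

    The overlap keeps sentences that straddle a boundary intact in at least
    one chunk. Character-based splitting is a simplification; real pipelines
    often split on paragraph or sentence boundaries instead.
    """
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_chars])
        start += chunk_chars - overlap
    return chunks
```

In a map-reduce summarization pipeline, you would send each chunk to the model for a short summary, then summarize the concatenated summaries; a chunk that mentions a critical issue (like the vulnerability above) keeps that detail alive in the final result.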