Vector Databases
Pinecone, Chroma, Weaviate
The Problem: RAG needs to quickly find relevant documents from millions of entries. Regular databases search by keywords, but we need to search by meaning. How?
The Solution: A Smart Filing Cabinet
Vector databases store embeddings (numerical representations of meaning) and enable lightning-fast similarity search across millions of documents. Think of a library with a card catalog — but instead of organizing books alphabetically, the librarian arranges them by meaning. The HNSW algorithm acts like a librarian who knows shortcuts between sections, finding the right book in milliseconds. They are the backbone of RAG pipelines, turning your documents into searchable knowledge.
The pipeline from raw documents to searchable knowledge has five steps:
1. Split documents into chunks: Break large documents into 200-500 token chunks with overlap — preserve paragraph and section boundaries
2. Generate embeddings: Run each chunk through an embedding model (e.g., text-embedding-3-small) to get a 1536-dimensional vector
3. Store in vector DB: Index vectors with HNSW for sub-millisecond search — store metadata (source, page, date) alongside each vector
4. Embed the query: When a user query arrives, convert it to a vector using the same embedding model
5. Find nearest chunks: Vector DB returns the top-K most similar chunks (typically K=3-10) — these become context for the LLM
Chunk size matters: too small = missing context, too large = noise. Typical sweet spot: 200-500 tokens with 10-20% overlap between chunks.
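Step 1 can be sketched in a few lines. This is a minimal illustration, not a production chunker: it approximates tokens with whitespace words (a real pipeline would use a tokenizer such as tiktoken) and ignores the paragraph-boundary preservation mentioned above.

```python
def chunk_text(text, chunk_size=300, overlap=30):
    """Split text into chunks of `chunk_size` words, each sharing
    `overlap` words with the previous chunk (here 10% overlap)."""
    words = text.split()
    step = chunk_size - overlap  # advance by chunk size minus overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last chunk reached the end of the document
    return chunks
```

For example, `chunk_text("0 1 2 3 4 5 6 7 8 9", chunk_size=4, overlap=1)` yields `["0 1 2 3", "3 4 5 6", "6 7 8 9"]` — note how each chunk repeats the last word of the previous one.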
Popular Vector Databases
- Pinecone: Fully managed, easy to use
- Weaviate: Open source, rich features
- Chroma: Lightweight, developer-friendly
- pgvector: Postgres extension, familiar tooling
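All of these databases expose roughly the same add/query shape. The toy store below sketches that interface in plain Python — the class and method names are illustrative, not any real client API, and the brute-force cosine scan stands in for the ANN index (e.g. HNSW) a real database would use.

```python
import math

class ToyVectorStore:
    """In-memory stand-in for a vector DB: stores vectors with
    metadata and answers top-K cosine-similarity queries."""

    def __init__(self):
        self.ids, self.vectors, self.metadata = [], [], []

    def add(self, id_, vector, meta=None):
        # Step 3: store the vector plus metadata (source, page, date, ...)
        self.ids.append(id_)
        self.vectors.append(vector)
        self.metadata.append(meta or {})

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    def query(self, query_vector, top_k=3):
        # Steps 4-5: score every stored vector, return the top-K.
        # Real DBs avoid this O(n) scan with an HNSW graph index.
        scored = [(self._cosine(query_vector, v), i)
                  for i, v in enumerate(self.vectors)]
        scored.sort(reverse=True)
        return [(self.ids[i], round(s, 2)) for s, i in scored[:top_k]]
```

Usage mirrors the real thing: `store.add("doc1", embedding, {"source": "faq.md"})`, then `store.query(query_embedding, top_k=5)` returns the most similar document IDs with scores.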
Fun Fact: Vector search can find relevant documents even when there's zero word overlap! Searching "how to fix a broken heart" in a medical database correctly returns cardiology articles, not poetry — because embeddings capture domain context, not just word similarity.
Try It Yourself!
See how vector similarity search finds semantically related content.
Find relevant documents: keyword search vs semantic search
Keyword search found 1 result:
- "How to cancel subscription: go to settings → subscriptions → cancel"
Missed:
- "Refund for unused period" (relevant but no word "cancel")
- "Pause monthly payment" (synonym but different keywords)
- "Delete account and all data" (related topic)
Semantic search: top-5 by cosine similarity:
- (0.95) "How to cancel subscription: settings → subscriptions → cancel"
- (0.89) "Refund for unused subscription period"
- (0.85) "Pause monthly payments during vacation"
- (0.82) "Change plan or switch to free tier"
- (0.78) "Delete account and all associated data"
Vector search finds documents by MEANING, not words. "Cancel subscription" also finds "get refund" and "pause payments" — because embeddings encode semantics.
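The demo above can be reproduced with hand-crafted toy vectors. Note the embeddings here are made-up 3-dimensional vectors that place related topics near each other by construction — a real system would get them from an embedding model such as text-embedding-3-small.

```python
import math

texts = {
    "cancel_doc": "How to cancel subscription: settings -> subscriptions -> cancel",
    "refund_doc": "Refund for unused subscription period",
    "recipe_doc": "Best pancake recipe for breakfast",
}

# Toy "embeddings": cancel and refund docs sit close together,
# the recipe doc sits far away.
toy_vectors = {
    "cancel_doc": [0.9, 0.1, 0.0],
    "refund_doc": [0.8, 0.3, 0.0],
    "recipe_doc": [0.0, 0.1, 0.9],
}

def keyword_search(word):
    # Literal substring match: misses synonyms like "refund"
    return [k for k, t in texts.items() if word in t.lower()]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def semantic_search(query_vec, top_k=2):
    ranked = sorted(toy_vectors,
                    key=lambda k: cosine(query_vec, toy_vectors[k]),
                    reverse=True)
    return ranked[:top_k]
```

Searching the word "cancel" by keyword returns only `cancel_doc`, while a semantic query vector near the cancellation topic (e.g. `[0.85, 0.2, 0.0]`) also surfaces `refund_doc`, which never contains the word "cancel".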
This lesson is part of a structured LLM course.