Vector Databases
Pinecone, Chroma, Weaviate
The Problem: RAG needs to quickly find relevant documents from millions of entries. Regular databases search by keywords, but we need to search by meaning. How?
The Solution: A Smart Filing Cabinet
Vector databases store embeddings (numerical representations of meaning) and enable lightning-fast similarity search across millions of documents. Think of a library with a card catalog — but instead of organizing books alphabetically, the librarian arranges them by meaning. The HNSW algorithm acts like a librarian who knows shortcuts between sections, finding the right book in milliseconds. They are the backbone of RAG pipelines, turning your documents into searchable knowledge.
The pipeline from raw documents to searchable knowledge has five steps:
1. Split documents into chunks: Break large documents into 200-500 token chunks with overlap — preserve paragraph and section boundaries
2. Generate embeddings: Run each chunk through an embedding model (e.g., text-embedding-3-small) to get a 1536-dimensional vector
3. Store in vector DB: Index vectors with HNSW for sub-millisecond search — store metadata (source, page, date) alongside each vector
4. Embed the query: When a user query arrives, convert it to a vector using the same embedding model
5. Find nearest chunks: Vector DB returns the top-K most similar chunks (typically K=3-10) — these become context for the LLM
Chunk size matters: too small = missing context, too large = noise. Typical sweet spot: 200-500 tokens with 10-20% overlap between chunks.
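Step 1 can be sketched in a few lines. This is a minimal illustration, not a production chunker: it approximates tokens with whitespace words (a real pipeline would use a tokenizer such as tiktoken) and ignores the paragraph-boundary preservation mentioned above.

```python
def chunk_text(text, chunk_size=300, overlap=30):
    """Split text into chunks of `chunk_size` words, each sharing
    `overlap` words with the previous chunk (here 10% overlap)."""
    words = text.split()
    step = chunk_size - overlap  # advance by chunk size minus overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last chunk reached the end of the document
    return chunks
```

For example, `chunk_text("0 1 2 3 4 5 6 7 8 9", chunk_size=4, overlap=1)` yields `["0 1 2 3", "3 4 5 6", "6 7 8 9"]` — note how each chunk repeats the last word of the previous one.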
Popular Vector Databases
- Pinecone: Fully managed, easy to use
- Weaviate: Open source, rich features
- Chroma: Lightweight, developer-friendly
- pgvector: Postgres extension, familiar tooling
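All of these databases expose roughly the same add/query shape. The toy store below sketches that interface in plain Python — the class and method names are illustrative, not any real client API, and the brute-force cosine scan stands in for the ANN index (e.g. HNSW) a real database would use.

```python
import math

class ToyVectorStore:
    """In-memory stand-in for a vector DB: stores vectors with
    metadata and answers top-K cosine-similarity queries."""

    def __init__(self):
        self.ids, self.vectors, self.metadata = [], [], []

    def add(self, id_, vector, meta=None):
        # Step 3: store the vector plus metadata (source, page, date, ...)
        self.ids.append(id_)
        self.vectors.append(vector)
        self.metadata.append(meta or {})

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    def query(self, query_vector, top_k=3):
        # Steps 4-5: score every stored vector, return the top-K.
        # Real DBs avoid this O(n) scan with an HNSW graph index.
        scored = [(self._cosine(query_vector, v), i)
                  for i, v in enumerate(self.vectors)]
        scored.sort(reverse=True)
        return [(self.ids[i], round(s, 2)) for s, i in scored[:top_k]]
```

Usage mirrors the real thing: `store.add("doc1", embedding, {"source": "faq.md"})`, then `store.query(query_embedding, top_k=5)` returns the most similar document IDs with scores.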
Fun Fact: Vector search can find relevant documents even when there's zero word overlap! Searching "how to fix a broken heart" in a medical database correctly returns cardiology articles, not poetry — because embeddings capture domain context, not just word similarity.
Try It Yourself!
See how vector similarity search finds semantically related content.
Find relevant documents: keyword search vs semantic search
Keyword search found 1 result:
- "How to cancel subscription: go to settings → subscriptions → cancel"
Missed:
- "Refund for unused period" (relevant but no word "cancel")
- "Pause monthly payment" (synonym but different keywords)
- "Delete account and all data" (related topic)
Semantic search: top-5 by cosine similarity:
- (0.95) "How to cancel subscription: settings → subscriptions → cancel"
- (0.89) "Refund for unused subscription period"
- (0.85) "Pause monthly payments during vacation"
- (0.82) "Change plan or switch to free tier"
- (0.78) "Delete account and all associated data"
Vector search finds documents by MEANING, not words. "Cancel subscription" also finds "get refund" and "pause payments" — because embeddings encode semantics.
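The demo above can be reproduced with hand-crafted toy vectors. Note the embeddings here are made-up 3-dimensional vectors that place related topics near each other by construction — a real system would get them from an embedding model such as text-embedding-3-small.

```python
import math

texts = {
    "cancel_doc": "How to cancel subscription: settings -> subscriptions -> cancel",
    "refund_doc": "Refund for unused subscription period",
    "recipe_doc": "Best pancake recipe for breakfast",
}

# Toy "embeddings": cancel and refund docs sit close together,
# the recipe doc sits far away.
toy_vectors = {
    "cancel_doc": [0.9, 0.1, 0.0],
    "refund_doc": [0.8, 0.3, 0.0],
    "recipe_doc": [0.0, 0.1, 0.9],
}

def keyword_search(word):
    # Literal substring match: misses synonyms like "refund"
    return [k for k, t in texts.items() if word in t.lower()]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def semantic_search(query_vec, top_k=2):
    ranked = sorted(toy_vectors,
                    key=lambda k: cosine(query_vec, toy_vectors[k]),
                    reverse=True)
    return ranked[:top_k]
```

Searching the word "cancel" by keyword returns only `cancel_doc`, while a semantic query vector near the cancellation topic (e.g. `[0.85, 0.2, 0.0]`) also surfaces `refund_doc`, which never contains the word "cancel".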
This lesson is part of a structured LLM course.