Semantic Search
Beyond keyword matching
The Problem: Traditional keyword search fails when users use different words than your documents. Searching "headache remedy" won't find "migraine treatment". How do you bridge this gap?
The Solution: Understanding Meaning, Not Words
Semantic search uses embeddings — dense vector representations of text meaning — to find content by concept rather than by exact keyword. An embedding model reads a piece of text and outputs a list of numbers (often 384, 768, or 1536 of them) that encodes its meaning. Texts about similar ideas land close together in this high-dimensional space, while unrelated texts land far apart. Searching for "headache remedy" finds "migraine treatment" because both map to nearby vectors, even though they share no words at all.
How it works
The pipeline has two phases. First, an offline indexing phase: every document in your collection is run through the embedding model once, and the resulting vectors are stored in a vector store such as Pinecone, Qdrant, or PostgreSQL with the pgvector extension. Second, an online query phase: when a user searches, their query is embedded with the same model, and you compare the query vector against the stored document vectors. The standard comparison is cosine similarity: the cosine of the angle between two vectors, ranging from 1 (same direction) through 0 (orthogonal, treated as unrelated) to -1 (opposite direction). The documents with the highest similarity are returned as the top-k results. For large collections you do not compare against every vector exactly; an ANN (approximate nearest neighbor) index such as HNSW makes retrieval fast at the cost of occasionally missing a true match.
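To make the comparison step concrete, here is a minimal sketch of cosine similarity and an exact (brute-force) top-k search in plain Python. The three-dimensional vectors and the document labels are hand-made for illustration; a real embedding model would produce hundreds of dimensions, and a large collection would use an ANN index instead of scoring every vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Exact search: score every document, then keep the k best."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hand-made vectors, for illustration only (not output of a real model).
doc_vecs = {
    "migraine treatment": [0.9, 0.1, 0.0],
    "billing update": [0.0, 0.2, 0.9],
    "pain relief tips": [0.7, 0.3, 0.1],
}
query_vec = [1.0, 0.0, 0.0]  # pretend embedding of "headache remedy"

results = top_k(query_vec, doc_vecs, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

With these toy vectors, "migraine treatment" ranks first and "pain relief tips" second, while the billing document scores 0 and is left out of the top 2.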
When to use it, and the tradeoffs
Reach for semantic search when users phrase things in their own words, when synonyms and paraphrases matter, or when you need cross-lingual matching. It is also the retrieval backbone of RAG systems that feed context to an LLM. But it is not free of pitfalls. Embeddings blur exact details: a query for product code "X-450" or a specific person's name may rank a vaguely related document above the exact one, because keyword precision is exactly what embeddings smooth over. The fix in practice is hybrid search — blend semantic scores with a keyword signal like BM25 so you get both meaning and exact-match precision. Two more rules of thumb: you must embed queries and documents with the same model, and long documents should be split into smaller chunks before embedding, since a single vector cannot faithfully represent ten pages of mixed topics.
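The chunking rule of thumb can be sketched as a simple word-window splitter. The function name `chunk_words` and the default sizes (200 words per chunk, 40 words of overlap) are assumptions chosen for illustration, not recommendations from this lesson; production systems often split on sentences, headings, or model tokens instead.

```python
def chunk_words(text, chunk_size=200, overlap=40):
    """Split text into chunks of at most chunk_size words.

    Neighbouring chunks share `overlap` words so that a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

# Tiny demonstration with a 10-word "document".
sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Each chunk is then embedded and stored as its own vector, usually with a pointer back to the source document.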
Worked example. Imagine a support knowledge base of 3 articles: A = "Resetting your password", B = "Updating your billing card", C = "Recovering a locked account". A user types "I forgot my login". Keyword search finds nothing — none of the articles contain the word "forgot" or "login". Semantic search embeds the query and gets cosine scores like A = 0.81, C = 0.74, B = 0.22. It returns A and C at the top because "forgot my login" is semantically close to "resetting your password" and "locked account", and correctly pushes the unrelated billing article to the bottom.
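The worked example can be replayed in a few lines. The keyword side is computed for real using a simple shared-word check; the cosine scores are copied from the example above rather than produced by an embedding model, so treat the semantic half as a ranking exercise only.

```python
import re

articles = {
    "A": "Resetting your password",
    "B": "Updating your billing card",
    "C": "Recovering a locked account",
}
query = "I forgot my login"

def tokens(text):
    """Lowercase word set of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

# Keyword search: an article matches only if it shares a word with the query.
keyword_hits = [aid for aid, title in articles.items()
                if tokens(title) & tokens(query)]

# Semantic search: scores taken from the worked example, not computed here.
cosine_scores = {"A": 0.81, "C": 0.74, "B": 0.22}
semantic_ranking = sorted(cosine_scores, key=cosine_scores.get, reverse=True)

print(keyword_hits)      # []
print(semantic_ranking)  # ['A', 'C', 'B']
```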
Think of it like a librarian who understands what you mean, not just what you said:
- 1. Convert documents to embeddings: Each document is encoded into a dense vector and stored in a vector database
- 2. Convert query to embedding: The user's search query is encoded using the same embedding model
- 3. Calculate cosine similarity: Measure the angle between the query vector and every document vector
- 4. Rank by similarity score: Documents closest in meaning to the query float to the top of results
- 5. Return top-k results: Deliver the most semantically relevant matches, often combined with reranking
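The five steps above can be sketched end to end. Because a real embedding model is out of scope here, `toy_embed` is a hypothetical stand-in that maps a handful of known words onto hand-picked "concept" dimensions; a real model learns this mapping from data. The rest of the pipeline (index, embed query, score, rank, cut to top-k) has the same shape you would use with a real model.

```python
import math

# Hypothetical stand-in for an embedding model: each known word votes for
# one hand-picked concept dimension. Unknown words are ignored.
CONCEPTS = ["pain", "treatment", "billing"]
WORD_TO_CONCEPT = {
    "headache": "pain", "migraine": "pain",
    "remedy": "treatment", "treatment": "treatment",
    "invoice": "billing", "payment": "billing",
}

def toy_embed(text):
    vec = [0.0] * len(CONCEPTS)
    for word in text.lower().split():
        concept = WORD_TO_CONCEPT.get(word)
        if concept is not None:
            vec[CONCEPTS.index(concept)] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0  # zero vectors score 0

def search(query, index, k=2):
    query_vec = toy_embed(query)                                   # step 2
    scored = [(doc, cosine(query_vec, vec)) for doc, vec in index]  # step 3
    scored.sort(key=lambda pair: pair[1], reverse=True)             # step 4
    return scored[:k]                                               # step 5

docs = ["migraine treatment", "invoice payment", "headache diary"]
index = [(doc, toy_embed(doc)) for doc in docs]                     # step 1

hits = search("headache remedy", index)
for doc, score in hits:
    print(f"{doc}: {score:.2f}")
```

The query "headache remedy" ranks "migraine treatment" first even though the two share no words, because both land on the same toy concept dimensions.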
Where Is This Used?
- Knowledge Base Search: Finding relevant support articles even when users describe problems in their own words
- Documentation Search: Surfacing the right API reference page from a conceptual question
- Product Discovery: "Comfy shoes for long walks" finds "ergonomic footwear" and "orthopedic sneakers"
- Cross-Lingual Search: A query in English finds semantically matching documents written in Russian or French
- Common Pitfall: Embedding Blind Spots: Embedding models struggle with rare proper nouns, product codes, and very recent terminology — hybrid search (semantic + keyword BM25) handles these edge cases better
Fun Fact: In word-embedding spaces such as word2vec, the offset from "king" to "queen" is almost identical to the offset from "man" to "woman", so "king" - "man" + "woman" lands close to "queen". This is how embeddings capture relationships. Many modern multilingual embedding models place 100+ languages in the same vector space, so a Russian question can find an English answer.
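The king/queen relationship is usually shown with vector arithmetic. The sketch below uses hand-made two-dimensional vectors (one axis for "royalty", one for "gender") purely to show the arithmetic; real embeddings have far more dimensions and the match is only approximate.

```python
# Hand-made 2-D vectors: axis 0 = "royalty", axis 1 = "gender".
# Illustrative only, not taken from any real model.
vectors = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
}

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

offset_royal = sub(vectors["queen"], vectors["king"])   # king -> queen
offset_common = sub(vectors["woman"], vectors["man"])   # man -> woman
analogy = add(sub(vectors["king"], vectors["man"]), vectors["woman"])

print(offset_royal == offset_common)  # True
print(analogy == vectors["queen"])    # True
```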
Try It Yourself!
Try the interactive demo below to compare keyword search vs semantic search and see how meaning-based matching finds what keywords miss.
Keyword vs Semantic Search
See how the same query returns different results over these sample documents:
- Getting Started with Python Programming: A beginner guide to writing your first Python script.
- Building a REST API with Node.js: Set up routes, middleware, and deploy your backend server.
- Advanced JavaScript Patterns: Closures, prototypes, and design patterns for JS engineers.
- Introduction to Data Analysis: Use pandas and statistics to extract insights from datasets.
- Neural Network Architecture Guide: Deep dive into layers, activations, and model design.
- Understanding Transformers in AI: Attention mechanism, BERT, GPT, and the NLP revolution.
- Machine Learning Fundamentals: Core ML algorithms and how models learn from data.
Key takeaways:
- Semantic search understands synonyms: a query for "code" matches documents about "programming" even when the word "code" never appears in them.
- Keyword search is brittle: if a document lacks even one of the query's exact words, it can be missed.
- The best systems combine both (hybrid search): keyword for exact matches, semantic for conceptual intent.
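One simple way to combine the two signals is a weighted sum of a semantic score and a keyword score. The sketch below is an assumption-heavy illustration: `keyword_score` is a crude term-overlap measure standing in for BM25, the semantic scores are hypothetical numbers rather than model output, and the weight `alpha=0.5` is arbitrary. It reuses the product code "X-450" from the pitfall discussed earlier in this lesson.

```python
import re

def keyword_score(query, text):
    """Fraction of query terms found verbatim in the text.

    A crude stand-in for BM25, used here only to keep the sketch short.
    """
    q_terms = set(re.findall(r"[a-z0-9-]+", query.lower()))
    d_terms = set(re.findall(r"[a-z0-9-]+", text.lower()))
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def hybrid_rank(query, docs, semantic_scores, alpha=0.5):
    """Rank by alpha * semantic + (1 - alpha) * keyword.

    Both scores are assumed to lie in [0, 1] so the blend is meaningful.
    """
    blended = {
        doc_id: alpha * semantic_scores[doc_id]
        + (1 - alpha) * keyword_score(query, text)
        for doc_id, text in docs.items()
    }
    return sorted(blended, key=blended.get, reverse=True)

docs = {
    "exact": "X-450 replacement filter",
    "vague": "Choosing a filter for your vacuum",
}
# Hypothetical semantic scores in which the vague document narrowly wins.
semantic_scores = {"exact": 0.62, "vague": 0.66}

semantic_only = sorted(semantic_scores, key=semantic_scores.get, reverse=True)
hybrid = hybrid_rank("X-450 filter", docs, semantic_scores)

print(semantic_only)  # ['vague', 'exact']
print(hybrid)         # ['exact', 'vague']
```

The keyword signal rewards the document that contains the literal code, which is enough to flip the ranking. Real systems often normalize scores first or fuse ranks instead of raw scores.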
Frequently asked questions
How does semantic search differ from keyword search?
Keyword search matches exact words — searching 'car' won't find 'automobile'. Semantic search converts text to embeddings (numerical vectors) and compares meaning, so 'car' matches 'automobile', 'vehicle', and 'sedan'.
What are embeddings and how do they work?
Embeddings are dense numerical vectors (e.g., 1536 dimensions) that capture the meaning of text. Similar meanings produce similar vectors. They're generated by specialized models like text-embedding-3-small or Cohere embed.
What is cosine similarity?
Cosine similarity is the cosine of the angle between two vectors, a value from -1 to 1. Values close to 1 mean very similar; values close to 0 mean unrelated. It's the standard metric for comparing embeddings.
When should I use semantic search vs keyword search?
Use semantic search when users phrase queries in natural language, when synonyms matter, or for cross-lingual search. Use keyword search for exact identifiers (product codes, names). The best systems combine both approaches (hybrid search).
Try it yourself
Convert a user question into a better search query
User question: laptop running slow reasons
Primary query: laptop slow performance degradation
Alternative 1 (symptoms): laptop freezing OR sluggish app loading response time
Alternative 2 (causes): laptop performance degradation causes OR overheating CPU load disk usage high
A good search query is not just "better words" — it's multiple variants covering different phrasings of the same problem, which directly improves the recall of a semantic search system.
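To show how several variants raise recall, the sketch below merges the result lists of the three variants above by taking the union of their hits and keeping each document's best score. The document ids and similarity values are hypothetical, and `merge_variant_results` is an illustrative helper, not part of any library.

```python
def merge_variant_results(results_per_variant, k=3):
    """Union the hits of several query variants, keeping each doc's best score."""
    best = {}
    for hits in results_per_variant:
        for doc_id, score in hits:
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score
    ranked = sorted(best.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical (doc id, similarity) results for each query variant.
primary = [("slow-boot", 0.78), ("disk-full", 0.60)]
symptoms = [("app-freeze", 0.74), ("slow-boot", 0.70)]
causes = [("overheating", 0.81), ("disk-full", 0.66)]

merged = merge_variant_results([primary, symptoms, causes], k=4)
for doc_id, score in merged:
    print(f"{doc_id}: {score:.2f}")
```

The primary query alone surfaces two documents; the merged list surfaces four, which is the recall gain the variants are meant to provide.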
This lesson is part of a structured LLM course.