What does “Nearest Neighbor” mean?
Given a query vector, the nearest neighbor (NN) is the point in the collection whose vector is closest to the query according to the chosen distance or similarity measure—e.g. L2 (Euclidean), cosine similarity, or dot product. “Nearest” means smallest distance or largest similarity, depending on the metric.
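As a minimal sketch of this definition, the snippet below finds the nearest neighbor of a toy query under both L2 and cosine similarity with NumPy. The points and query are made-up illustrative data, not from any real collection.

```python
import numpy as np

# Toy collection of 2-D vectors and a query (illustrative values).
points = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([0.9, 0.1])

# L2 (Euclidean): "nearest" = smallest distance.
l2 = np.linalg.norm(points - query, axis=1)
nn_l2 = int(np.argmin(l2))

# Cosine similarity: "nearest" = largest similarity.
cos = points @ query / (np.linalg.norm(points, axis=1) * np.linalg.norm(query))
nn_cos = int(np.argmax(cos))

print(nn_l2, nn_cos)  # index of the nearest point under each measure
```

Note the direction of the comparison flips with the measure: `argmin` for a distance, `argmax` for a similarity.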
Summary
- Nearest neighbor (NN) = the point whose vector is closest to the query under the collection’s metric (L2, cosine, dot product).
- In practice we want k-NN: top k points by distance/similarity; powers semantic search, recommendations, RAG.
- Exact k-NN = linear scan; at scale we use ANN indexes, which return “good enough” neighbors much faster; the latent space is trained so that “closeness” = similarity.
Definition and metrics
“Nearest” is defined by the collection’s distance or similarity metric. For L2 (Euclidean distance), nearest = smallest distance. For cosine similarity or dot product (on normalized vectors), nearest = largest similarity. So “nearest” is always “best under the chosen measure.” The vector database guarantees that all vectors in a collection use the same metric so that ordering is consistent.
For normalized vectors, cosine similarity and dot product give the same ranking; L2 and cosine can give different orderings when vectors have different magnitudes. Choosing the metric at collection creation time ensures every query and every indexed vector are compared the same way.
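Both claims in this paragraph can be checked directly. The sketch below (made-up data) shows L2 and cosine disagreeing when magnitudes differ, and cosine and dot product agreeing once vectors are normalized.

```python
import numpy as np

# Two points with very different magnitudes (illustrative data).
points = np.array([[10.0, 0.0], [0.6, 0.8]])
query = np.array([1.0, 0.0])

# Cosine ranks by direction only; L2 is sensitive to magnitude.
cos = points @ query / (np.linalg.norm(points, axis=1) * np.linalg.norm(query))
l2 = np.linalg.norm(points - query, axis=1)
print(int(np.argmax(cos)), int(np.argmin(l2)))  # different winners

# After normalization, dot product gives the same ranking as cosine.
normed = points / np.linalg.norm(points, axis=1, keepdims=True)
dot = normed @ query
print(int(np.argmax(dot)))
```

Here cosine prefers the point aligned with the query even though it is far away in L2 terms, which is exactly why the metric must be fixed per collection.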
k-NN and ANN
In practice you usually want the k nearest neighbors (k-NN): the top k points ranked by distance or similarity. That’s what powers semantic search, recommendations, and RAG. Exact k-NN requires comparing the query to every vector (linear scan), which doesn’t scale. For large datasets we use approximate nearest neighbor (ANN) indexes so that we get “good enough” neighbors much faster, trading a small loss in recall for large gains in latency.
The value of k is chosen by the application: for RAG you might request 5–20 chunks; for recommendations you might want 10–50 items. Larger k increases the chance that the true top-k are in the result set but also increases response size and, with ANN, can slightly increase query time.
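Exact k-NN as described above is just a linear scan plus a top-k selection. A minimal sketch with NumPy (random toy data, L2 metric assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
db = rng.normal(size=(1000, 64))   # toy collection: 1000 vectors, 64 dims
q = rng.normal(size=64)            # toy query

k = 10
# Linear scan: one distance computation per stored vector.
dists = np.linalg.norm(db - q, axis=1)
# argpartition finds the k smallest in O(n), then we sort just those k.
topk = np.argpartition(dists, k)[:k]
topk = topk[np.argsort(dists[topk])]   # best-first order
```

The cost is proportional to the collection size on every query, which is the scaling problem ANN indexes address.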
Why “nearest” means “most similar”
The latent space is built (by the embedding model) so that “closeness” corresponds to semantic or structural similarity. So nearest neighbors in that space are the most relevant items for the query. The vector database exists to answer these nearest-neighbor queries at scale without scanning the full dataset on every query. The choice of metric (L2 vs. cosine) and index parameters then determines how accurate and how fast the answers are.
Frequently Asked Questions
What is the difference between nearest neighbor and exact match?
Exact match is “find the point with this exact vector” (or ID). Nearest neighbor is “find the point(s) whose vector is closest to the query vector.” Vector DBs are built for the latter; exact match on the vector is rarely needed.
Can I get more than k results?
You specify k (e.g. top 10). For pagination, some systems support offset/limit or cursor-based retrieval, but ANN doesn’t naturally support “page 2” of nearest neighbors—order is only defined for the top k. So design k and filters for what you need in one request.
Why do we use approximate instead of exact NN?
Exact k-NN is O(n) per query; at billions of vectors that’s too slow. ANN indexes (e.g. HNSW, IVF) reduce the number of distance computations and return near-optimal results with high recall at much lower latency.
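To make the IVF idea concrete, here is a rough NumPy-only sketch (toy data; parameter names `nlist` and `nprobe` follow common IVF terminology but everything else is simplified for illustration): vectors are partitioned into coarse cells around centroids, and a query scans only the few cells whose centroids are closest, instead of the whole collection.

```python
import numpy as np

rng = np.random.default_rng(1)
db = rng.normal(size=(2000, 32))   # toy collection
q = rng.normal(size=32)            # toy query

# --- build: partition vectors into nlist coarse cells (a few k-means steps) ---
nlist = 50
centroids = db[rng.choice(len(db), nlist, replace=False)]
for _ in range(5):
    assign = np.argmin(((db[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
    for c in range(nlist):
        members = db[assign == c]
        if len(members):
            centroids[c] = members.mean(axis=0)
assign = np.argmin(((db[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
lists = [np.where(assign == c)[0] for c in range(nlist)]

# --- search: probe only the nprobe closest cells, scan just their members ---
nprobe = 5
cells = np.argsort(np.linalg.norm(centroids - q, axis=1))[:nprobe]
cand = np.concatenate([lists[c] for c in cells])
best = cand[np.argmin(np.linalg.norm(db[cand] - q, axis=1))]
```

The search touches only a fraction of the collection, so the answer is approximate: the true nearest neighbor may sit in a cell that was not probed. Raising `nprobe` trades latency back for recall.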
Does “nearest” depend on the distance metric?
Yes. L2 and cosine can rank points differently. The collection’s metric is fixed so all queries and vectors are comparable. See when to use L2 vs. cosine and impact of distance metrics on recall.