Indexing Algorithms - IVF & Quantization · Topic 84

What is nlist (number of clusters)?

nlist is the number of clusters (centroids) in an IVF index. It is set at index build time and determines how finely the vector space is partitioned by K-means into Voronoi cells. Larger nlist means more, smaller clusters; smaller nlist means fewer, larger clusters. It directly affects build cost, memory for storing centroids, and the potential recall/speed trade-off at query time (together with nprobe).

Summary

Larger nlist → fewer vectors per cluster on average; same nprobe can mean faster query but finer partition → more risk of missing neighbors. Smaller nlist → coarser partition, more vectors per probed cluster. Rule of thumb: sqrt(N) to 4×sqrt(N).
nlist affects build time (K-means with more clusters); nprobe is tuned at query time.
Pipeline: set nlist at build; K-means produces nlist centroids; vectors assigned to nearest centroid. At query time nprobe (not nlist) is the lever for recall vs. latency.
Trade-off: larger nlist → finer partition, faster query for same nprobe but more centroid comparisons and build cost; smaller nlist → coarser, fewer centroid comparisons, more vectors per probed cell.
What is nprobe and the recall–latency trade-off curve for IVF complete the tuning picture.

Choosing nlist

With a larger nlist, each cluster contains fewer vectors on average. If you probe the same number of clusters (nprobe), you search fewer total vectors, so queries can be faster—but the partition is finer, so the query might fall near a cell boundary and the true nearest neighbor might be in a neighboring cell you didn’t probe, hurting recall. With a smaller nlist, each cluster is bigger; probing a few clusters can mean scanning many vectors, so query time goes up, but the coarser partition can make it more likely that the nearest neighbor lies in one of the probed cells. A common rule of thumb is to set nlist on the order of sqrt(N) to 4*sqrt(N) for N vectors, then tune nprobe for the desired recall and latency.

Build time

nlist also affects index build time: K-means with more clusters takes longer. So in practice you choose nlist to balance build time and the granularity of the partition, then use nprobe at query time to control how many of those clusters are actually searched.

Practical tip: start with nlist ≈ sqrt(N) or 2×sqrt(N); measure recall@k and latency for a range of nprobe values. If recall is low even at high nprobe, try increasing nlist and rebuilding; if build time is too high, reduce nlist and compensate with higher nprobe at query time. The role of Voronoi cells in IVF and K-means clustering for index centroids underpin how nlist is used.

Frequently Asked Questions

Can I change nlist after build?

No. Centroids are computed at build time; changing nlist would require re-running K-means and reassigning all vectors—i.e. rebuild.

How much memory do centroids use?

nlist × dimension × 4 bytes for float32 centroids. For 10k clusters and 768-d, that’s about 30 MB—usually small vs. the vectors.

What if nlist is too large?

Build gets slower; with fixed nprobe you search fewer vectors per query but recall can drop because you probe a smaller fraction of clusters.

Does nlist need to match my data distribution?

K-means adapts to the data; nlist is the number of clusters. For very non-uniform data, some cells may be empty or huge; consider sampling or balanced variants.