Filtering & Querying · Topic 137

Pagination in ANN search (Why it’s difficult).

Pagination in ANN (approximate nearest neighbor) search is difficult because ANN algorithms return a set of near neighbors without a well-defined total order beyond the top-k. There is no global “page 2” of the “full” result set—the index only approximates the true ranking, and expanding the search (e.g. to get results 11–20) may not mean “next page” in a stable, repeatable way. Workarounds include over-fetching and cursor-based strategies.

Summary

Pagination in ANN is difficult because ANN returns a set of near neighbors without a well-defined total order beyond top-k; there is no stable “page 2” of the full result set.
Practical approach: request a larger k (e.g. 100) and page in application logic (slices 1–10, 11–20). True offset-based “skip 1000, return 10” is expensive and unstable; results can shift if index or data changes.
Workarounds: large k + in-memory pagination; cursor-based (last ID or score as cursor; support varies); accept overlap/difference across pages; for deep pagination, exact search on filtered subset or two-phase ANN + re-rank. See sorting by metadata vs. distance.
Pipeline: request top-k (e.g. 100) → slice in app for page 1..n. Trade-off: over-fetch uses more latency and memory; cursor is stable but not all VDBs support it. Practical tip: cap pages or use cursor when supported.

Why ANN pagination is hard

In exact search you could sort all vectors by distance and then take offset/limit. In ANN, you typically request a larger k (e.g. 100) and then “page” in application logic by taking slices (e.g. 1–10, 11–20). That works if you only need a few pages and can over-fetch once. True offset-based pagination (e.g. “skip 1000, return 10”) is expensive and unstable: the index isn’t built to enumerate beyond the top candidates, and results can shift between requests if the index or data changes. Pipeline: request top-k (e.g. 100) once, then slice in application logic for page 1, 2, …; true offset is not recommended.

Workarounds

Workarounds: (1) Request a large k and paginate in memory (bounded by latency and k limits). (2) Cursor-based pagination: use the last returned ID or score as a cursor and run a new query with “return only results worse than cursor” (support varies by VDB). (3) Accept that “page 2” may overlap or differ from a single big query. (4) For very deep pagination, consider exact search on a filtered subset or a two-phase approach (ANN for candidate set, then re-rank and paginate). See sorting by metadata vs. distance for interaction with sort orders. Trade-off: over-fetch uses more latency and memory; cursor is more stable but not all VDBs support it. Practical tip: cap the number of pages or use cursor-based pagination when the engine supports it.

Frequently Asked Questions

Why is pagination hard in ANN?

ANN returns a set of near neighbors without a well-defined total order beyond top-k. There is no global “page 2”; expanding the search for results 11–20 may not be stable or repeatable. The index approximates the true ranking.

What is the practical approach?

Request a larger k (e.g. 100) and paginate in application logic (e.g. slices 1–10, 11–20). Works if you need only a few pages and can over-fetch once. Bounded by latency and k limits.

What is cursor-based pagination?

Use the last returned ID or score as a cursor and run a new query with “return only results worse than cursor.” Support varies by VDB. More stable than offset for deep pagination but not all systems support it.

When is offset-based pagination a bad idea?

True offset (e.g. “skip 1000, return 10”) is expensive and unstable: the index isn’t built to enumerate beyond top candidates, and results can shift between requests. For very deep pagination consider exact search on a filtered subset or two-phase (ANN for candidates, then re-rank and paginate). See sorting by metadata vs. distance.