Filtering & Querying · Topic 130

“In-bitmap” filtering during graph traversal

In-bitmap filtering means that during HNSW (or similar graph) traversal, every time the algorithm considers a neighbor or candidate node, it checks the filter bitmap: if the bit for that node is 0 (not eligible), the node is skipped—not added to the candidate set and not used to expand the search. So the graph is traversed as usual, but only “in-bitmap” (eligible) points contribute to the result set and to the search frontier. This topic covers how it works and how to tune it.

Summary

During HNSW (or similar) traversal, each neighbor/candidate is checked against the filter bitmap; ineligible nodes (bit 0) are skipped so only eligible points contribute to results and the search frontier.
Enables pre-filtering without rebuilding the graph—same graph, mask ineligible points on the fly. Traversal continues until enough bitmap-passing results or a limit is hit.
Very selective filters may require visiting many more nodes before finding K eligible ones; recall can suffer. Tuning efSearch higher when a filter is active helps recall at the cost of latency.
Bitmap is keyed by internal point/node ID; the filter (e.g. from Boolean metadata conditions) is converted to a bitmap before search; efficient implementations use SIMD or dense word-aligned bitmaps.
Practical tip: increase efSearch when using filters to improve recall; monitor ratio of nodes visited to eligible results to detect overly selective filters.

How in-bitmap filtering works

This is a practical way to do pre-filtering with graph indexes: you don’t rebuild the graph for the filtered set; you use the same graph and mask out ineligible points on the fly. The traversal continues until you have enough results that pass the bitmap (or you hit a limit).

The downside is that if the filter is very selective, you may have to visit many more nodes before finding K eligible ones, and the path to the true nearest eligible neighbor might be long because you’re skipping intermediate nodes—so recall can suffer. Pipeline: build filter bitmap → start traversal from entry point → for each candidate node, check bitmap; if 0 skip, if 1 add to candidates and frontier → stop when K results or efSearch limit. Tuning efSearch (or equivalent) higher when a filter is active helps recover recall at the cost of latency.

Implementation details

The bitmap is usually keyed by internal point or node ID. When the index is built, each vector has a stable ID; the filter (e.g. from Boolean conditions on metadata) is converted to a bitmap over those IDs before the search starts. During traversal, every neighbor lookup is accompanied by a bitmap check. Efficient implementations use SIMD or dense word-aligned bitmaps so the check is a single memory access and bit test.

Trade-off: no per-filter index rebuild vs. possible recall drop and higher node visits for selective filters. Practical tip: expose efSearch (or similar) per query so applications can increase it when a filter is present; combine with a max-visit limit to avoid runaway latency.

Frequently Asked Questions

What is in-bitmap filtering?

During HNSW or similar graph traversal, each candidate node is checked against a filter bitmap. If the bit is 0 the node is skipped; only eligible (in-bitmap) points are added to the candidate set and used to expand the search. See pre-filtering vs. post-filtering.

Why use in-bitmap instead of rebuilding the graph?

You keep one graph and mask ineligible points on the fly, avoiding per-tenant or per-filter indexes. The downside is that very selective filters can hurt recall because you may skip nodes on the path to the true nearest eligible neighbor.

How do I improve recall when using in-bitmap filtering?

Increase efSearch (or equivalent) when a filter is active so the traversal visits more nodes and has a better chance of finding K eligible results. This trades off latency for recall.

How is the filter converted to a bitmap?

The filter (e.g. from Boolean conditions on metadata) is evaluated over internal point/node IDs and stored as a bitmap before the search starts. During traversal each neighbor lookup does a single bitmap check; bitmaps and bitsets are designed for fast combination with index traversal.