
Filtering & Querying · Topic 131

Boolean expressions in vector queries (AND, OR, NOT).

Boolean expressions let you combine metadata filters with AND, OR, and NOT. For example: “category = 'electronics' AND price < 500” or “region IN ('US', 'EU') OR premium = true”. Vector databases (VDBs) evaluate these expressions to restrict which vectors are considered during, or returned from, the nearest neighbor search. AND narrows the candidate set; OR broadens it; NOT excludes points that match a condition. This topic covers how these expressions are implemented and how much of the logic different systems support.

Summary

  • AND narrows candidates; OR broadens; NOT excludes. Used to restrict which vectors are considered or returned with nearest neighbor search.
  • With pre-filtering, the expression is applied before or during traversal (e.g. via bitmaps); with post-filtering, the search runs first and the expression is applied to the results, which may require over-fetching.
  • Support varies: some VDBs support only AND of equality filters; others support full AND/OR/NOT. Range queries often combine with boolean logic. Check whether the logic is pushed into the index or applied after retrieval.
  • Trade-off: AND of equalities maps cleanly to bitmaps; OR and NOT may require multiple bitmaps and bitwise ops; complex expressions can fall back to post-filter.
  • Practical tip: prefer AND for selective filters when pre-filter is supported; use OR/IN for low-cardinality sets; check docs for max clause count or nesting depth.

Pre-filtering vs. post-filtering

How a boolean expression is evaluated depends on whether the system pre-filters or post-filters. With pre-filtering, the filter is applied before or during index traversal (e.g. via a bitmap of eligible points that is checked as candidates are visited), so only candidates satisfying the expression are scored. With post-filtering, you run the vector search first, then apply the boolean expression to the results. This is simpler, but you may need to over-fetch (request more than k results) to have enough matches left after filtering.
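To make the over-fetching behaviour concrete, here is a minimal post-filtering sketch. It is an illustration under assumptions, not any particular VDB's implementation: knn is a brute-force stand-in for an ANN index, and post_filtered_search, oversample, and the sample data are hypothetical names chosen for this example.

```python
import math

def knn(query, vectors, k):
    """Brute-force stand-in for an ANN index: ids of the k closest vectors."""
    ranked = sorted(vectors, key=lambda vid: math.dist(query, vectors[vid]))
    return ranked[:k]

def post_filtered_search(query, vectors, metadata, predicate, k, oversample=2):
    """Search first, filter afterwards; widen the fetch until k matches survive."""
    fetch = min(k * oversample, len(vectors))
    while True:
        candidates = knn(query, vectors, fetch)
        hits = [vid for vid in candidates if predicate(metadata[vid])]
        if len(hits) >= k or fetch >= len(vectors):
            return hits[:k]  # may hold fewer than k ids if too few points match
        fetch = min(fetch * 2, len(vectors))

# Hypothetical data: ten points on a line, every third one is 'electronics'.
vectors = {i: (float(i), 0.0) for i in range(10)}
metadata = {
    i: {"category": "electronics" if i % 3 == 0 else "toys", "price": 100 * i}
    for i in range(10)
}

def is_cheap_electronics(m):
    # category = 'electronics' AND price < 500
    return m["category"] == "electronics" and m["price"] < 500

result = post_filtered_search((0.0, 0.0), vectors, metadata, is_cheap_electronics, k=2)
partial = post_filtered_search((0.0, 0.0), vectors, metadata, is_cheap_electronics, k=3)
```

With k=2 the first fetch of four candidates already contains two matches. With k=3 the loop keeps widening the fetch until it has scanned every point, then returns a partial result, because only two points satisfy the expression.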

Pipeline (pre-filter): parse expression → build per-clause bitmaps (equality, range, etc.) → combine with AND/OR/NOT (bitwise) → single eligibility bitmap → use during traversal.

Pipeline (post-filter): run ANN → evaluate the expression for each result → keep only matches; if fewer than k remain, oversample and retry, or return a partial result.

Trade-off: pre-filtering avoids scoring ineligible points but requires index support; post-filtering works everywhere but may need oversampling.
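The pre-filter pipeline can be sketched with plain Python integers standing in for bitmaps. This is a simplified illustration: production systems typically use compressed bitmap structures and consult the bitmap inside the index traversal, whereas this sketch scans the eligible ids by brute force. The helper names (bitmap, prefiltered_search) and the sample rows are assumptions made for the example.

```python
import math

def bitmap(rows, test):
    """Bit i is set when test(rows[i]) holds; a Python int serves as the bitmap."""
    bits = 0
    for i, row in enumerate(rows):
        if test(row):
            bits |= 1 << i
    return bits

rows = [
    {"category": "electronics", "price": 300, "region": "US"},
    {"category": "electronics", "price": 900, "region": "EU"},
    {"category": "toys", "price": 50, "region": "US"},
    {"category": "electronics", "price": 450, "region": "JP"},
]
vectors = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (0.3, 0.0)]
ALL = (1 << len(rows)) - 1  # mask of valid row positions, needed for NOT

# Step 1: one bitmap per clause (equality, range, equality).
electronics = bitmap(rows, lambda r: r["category"] == "electronics")
cheap = bitmap(rows, lambda r: r["price"] < 500)
in_jp = bitmap(rows, lambda r: r["region"] == "JP")

# Step 2: combine bitwise.
# category = 'electronics' AND price < 500 AND NOT region = 'JP'
eligible = electronics & cheap & (ALL & ~in_jp)

def prefiltered_search(query, eligible_bits, k):
    """Step 3: score only the rows whose bit is set in the eligibility bitmap."""
    ids = [i for i in range(len(vectors)) if (eligible_bits >> i) & 1]
    ids.sort(key=lambda i: math.dist(query, vectors[i]))
    return ids[:k]

narrow = prefiltered_search((0.3, 0.0), eligible, k=2)
broad = prefiltered_search((0.3, 0.0), electronics | cheap, k=2)
```

The AND/NOT expression leaves a single eligible row, so narrow holds one id even though k is 2. The OR expression makes every row eligible, so broad holds the two nearest rows. NOT is masked with ALL because inverting a Python int would otherwise set bits beyond the last row.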

Expression complexity and support

Complex expressions (nested AND/OR/NOT, many clauses) can be expensive to evaluate. Some systems support only a subset (e.g. AND of equality filters); others support full query languages. Range queries on numeric or date fields often combine with boolean logic (e.g. date > X AND status = 'active'). Check your VDB’s filter syntax and whether it pushes boolean logic into the index or applies it after retrieval.
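As a rough illustration of what evaluating a nested expression involves, the sketch below walks a small expression tree against one metadata record. The tuple-based tree format, the evaluate and depth helpers, and the field names are all hypothetical; real VDBs each define their own filter syntax.

```python
import operator
from datetime import date

OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def evaluate(expr, row):
    """Recursively evaluate a nested filter against one metadata record."""
    kind = expr[0]
    if kind == "and":
        return all(evaluate(e, row) for e in expr[1:])
    if kind == "or":
        return any(evaluate(e, row) for e in expr[1:])
    if kind == "not":
        return not evaluate(expr[1], row)
    if kind == "in":
        _, field, values = expr
        return row.get(field) in values
    if kind == "cmp":
        _, field, op, value = expr
        return field in row and OPS[op](row[field], value)
    raise ValueError(f"unknown node type: {kind!r}")

def depth(expr):
    """Nesting depth of the tree, e.g. to compare against a documented limit."""
    if expr[0] in ("and", "or", "not"):
        return 1 + max(depth(e) for e in expr[1:])
    return 1

# date > 2024-01-01 AND status = 'active'
#   AND (region IN ('US', 'EU') OR premium = true)
expr = (
    "and",
    ("cmp", "date", ">", date(2024, 1, 1)),
    ("cmp", "status", "=", "active"),
    ("or", ("in", "region", ("US", "EU")), ("cmp", "premium", "=", True)),
)

premium_jp = {"status": "active", "date": date(2024, 5, 1), "region": "JP", "premium": True}
basic_jp = {"status": "active", "date": date(2024, 5, 1), "region": "JP", "premium": False}

matches = [evaluate(expr, premium_jp), evaluate(expr, basic_jp)]
```

Each added clause or nesting level means more work per candidate when the expression is evaluated row by row, which is one reason some systems cap clause count or depth.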

Practical tip: start with an AND of equality filters plus one range; add OR/NOT only when needed. If the VDB supports it, index every field used in the expression so that the bitmaps or range lookups the filter needs can be built quickly.

Frequently Asked Questions

What is the difference between AND and OR in vector queries?

AND narrows the candidate set (all conditions must hold); OR broadens it (any condition can hold). NOT excludes points that match a condition. Together they restrict which vectors are considered in nearest neighbor search or returned in results.

Should I use pre-filtering or post-filtering for boolean expressions?

Pre-filtering applies the expression before or during index traversal (e.g. via bitmaps), so only eligible candidates are scored. It is the better choice when the filter is selective and the index supports it. Post-filtering is simpler but may require requesting more than k results to have enough left after filtering.

Do all vector DBs support AND, OR, and NOT?

Support varies. Some support only AND of equality filters; others support full boolean expressions and range queries. Check your VDB’s filter syntax and whether complex expressions are pushed into the index or evaluated after retrieval.

Can I combine range and equality in one filter?

Yes, e.g. date > X AND status = 'active'. How it’s evaluated depends on the VDB: if supported, range and equality are combined with AND/OR into a single filter (e.g. via bitmaps) and used during pre-filtering or after retrieval.