How compaction works in a vector database
Compaction is the background process that merges multiple immutable segments (and/or WAL-derived data) into fewer, larger segments. It reduces read amplification—fewer segments to query—and reclaims space from deleted or overwritten points. For metadata stored in an LSM-style structure, compaction is the same idea as in classic LSM: merge sorted runs and drop obsolete entries. This topic covers what compaction does for segments and the vector index, and how to tune it.
Summary
- Compaction merges multiple immutable segments (and/or WAL-derived data) into fewer, larger segments; reduces read amplification and reclaims space from deletes and overwrites.
- For metadata in LSM-style stores: merge sorted runs, drop obsolete entries—same as classic LSM. For the vector index: merge small HNSW/IVF segments, or refresh the index to include data still in the WAL or a mutable buffer.
- Compaction is I/O and CPU heavy; often done incrementally or during low load. Tuning (frequency, which segments to merge, parallelism) affects write throughput and query latency.
- Works with WAL, immutable segments, and snapshots to keep segment count and “tail” volume bounded for fast queries and bounded storage.
- Practical tip: monitor segment count and WAL size; tune compaction frequency and parallelism so compaction keeps up with write rate without starving queries.
What compaction does for segments and metadata
Compaction merges multiple immutable segments into fewer, larger ones and reclaims space from deleted or overwritten points. For metadata in an LSM-style store, it’s the same as classic LSM: merge sorted runs and drop obsolete entries. That keeps read amplification under control so queries don’t have to scan too many files.
Pipeline: a background job selects segments to merge (e.g. by size or age) → reads and merges sorted runs → drops obsolete keys (overwrites, deletes) → writes the new segment → atomically switches visibility and drops the old segments. For the vector index: merge small graph or IVF segments into one, or rebuild/refresh to incorporate the WAL tail.
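The merge-and-drop step above can be sketched as a k-way merge over sorted runs. This is a hypothetical, in-memory illustration (real engines stream segments from disk): each segment is a list of `(key, seqno, value)` tuples sorted by key, newer writes carry higher sequence numbers, and a tombstone marks a delete.

```python
import heapq

TOMBSTONE = object()  # sentinel marking a deleted key

def compact(segments):
    """Merge sorted (key, seqno, value) runs into one segment,
    keeping only the newest version of each key and dropping
    tombstoned (deleted) entries."""
    # Merge all runs so that, for equal keys, the newest
    # (highest seqno) version comes first.
    merged = heapq.merge(*segments, key=lambda e: (e[0], -e[1]))
    out, last_key = [], None
    for key, seqno, value in merged:
        if key == last_key:
            continue              # older version of a key we already handled
        last_key = key
        if value is not TOMBSTONE:
            out.append((key, seqno, value))
    return out

old = [
    [("a", 1, "v1"), ("c", 2, "v2")],
    [("a", 3, TOMBSTONE), ("b", 4, "v4")],
]
print(compact(old))  # [('b', 4, 'v4'), ('c', 2, 'v2')] — 'a' was deleted
```

Note how the overwrite-and-delete semantics fall out of the sort order: the newest version of each key is seen first, so everything after it for the same key is obsolete and can be dropped.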
Compaction of the vector index
For the vector index itself, “compaction” might mean merging several small HNSW or IVF segments into one, or refreshing the index to include new vectors that were only in the WAL or a mutable buffer. That can be I/O and CPU heavy, so it’s often done incrementally or during low load.
Trade-off: aggressive compaction keeps segment count low and query latency stable but consumes I/O and CPU; lazy compaction reduces resource use but lets segment count and the WAL tail grow, increasing read merge cost. Practical tip: run compaction during off-peak hours or throttle it so it doesn’t compete with queries for I/O and CPU.
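An incremental, throttled strategy like the one described can be sketched as a loop that merges only a few of the smallest segments per round and pauses between rounds. This is a hypothetical sketch: `merge_pair` is an assumed callback that merges two segments, and the sleep stands in for a real I/O rate limiter.

```python
import time

def incremental_compact(segments, merge_pair, batch=2, pause_s=0.01):
    """Merge a few smallest segments at a time, sleeping between
    rounds so compaction I/O does not starve concurrent queries."""
    while len(segments) > 1:
        segments.sort(key=len)  # smallest-first selection keeps merges cheap
        small = [segments.pop(0) for _ in range(min(batch, len(segments)))]
        merged = small[0]
        for s in small[1:]:
            merged = merge_pair(merged, s)
        segments.append(merged)
        time.sleep(pause_s)     # crude throttle between rounds
    return segments

# Example with plain sorted lists as "segments":
result = incremental_compact([[1], [2, 3], [4]],
                             lambda a, b: sorted(a + b), pause_s=0)
print(result)  # [[1, 2, 3, 4]]
```

Tuning `batch` and `pause_s` mirrors the real trade-off: larger batches and shorter pauses converge faster on few segments but press harder on I/O and CPU.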
Why compaction matters for performance
Good compaction strategy keeps the number of segments and the volume of “tail” data bounded, so that queries stay fast and storage doesn’t grow without limit. It works in tandem with WAL, immutable segments, and snapshots to give a consistent, durable view of the data.
Frequently Asked Questions
When does compaction run?
Typically in the background: when segment count or WAL size exceeds a threshold, or on a schedule. Some systems compact incrementally (merge a few segments at a time) to avoid long pauses. Tune to balance write throughput and query latency.
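A threshold-based trigger like the one described can be as simple as the following sketch; the threshold values here are illustrative defaults, not recommendations from any particular system.

```python
def should_compact(segment_count, wal_bytes,
                   max_segments=32, max_wal_bytes=256 * 1024 * 1024):
    """Hypothetical trigger: compact when segment count or WAL size
    crosses its threshold. Thresholds are illustrative defaults."""
    return segment_count > max_segments or wal_bytes > max_wal_bytes

print(should_compact(10, 512 * 1024 * 1024))  # True: WAL grew past the limit
print(should_compact(10, 1024))               # False: nothing to do yet
```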
Does compaction block queries?
Usually no. Compaction produces new segments; once ready, the system switches visibility atomically so queries see the merged view. Old segments are dropped when no longer referenced. Heavy compaction can still compete for I/O and CPU and raise tail latency.
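The atomic visibility switch can be sketched as swapping an immutable segment-list snapshot under a lock: queries read whatever snapshot is current, and compaction installs the merged segment and retires the replaced ones in a single step. This is a hypothetical single-process sketch, not any engine’s actual mechanism.

```python
import threading

class SegmentView:
    """Queries read the current segment list; compaction installs
    a new list atomically, so readers never see a half-merged state."""
    def __init__(self, segments):
        self._lock = threading.Lock()
        self._segments = tuple(segments)   # immutable snapshot

    def snapshot(self):
        with self._lock:
            return self._segments          # queries search this snapshot

    def swap(self, merged, replaced):
        """Atomically replace the old segments with the merged one."""
        with self._lock:
            keep = [s for s in self._segments if s not in replaced]
            self._segments = tuple(keep + [merged])

view = SegmentView(["seg1", "seg2", "seg3"])
view.swap("seg12", {"seg1", "seg2"})
print(view.snapshot())  # ('seg3', 'seg12')
```

In a real engine the old segments are additionally reference-counted and deleted only after the last in-flight query over them completes.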
What if compaction can’t keep up with writes?
Segment count and WAL tail grow; query latency increases because more segments must be searched. Mitigate by scaling compaction parallelism, lowering write rate temporarily, or adding resources. Monitoring segment count and WAL size helps catch this early.
Is compaction the same for vectors and metadata?
Conceptually yes—merge and reclaim. For metadata in an LSM, it’s a merge of sorted runs. For vector indexes, it’s merging index segments (e.g. HNSW graphs) or rebuilding to incorporate new vectors from the WAL or buffer.