Database Internals & Storage · Topic 124

Handling high-frequency updates (Streaming ingestion)

When vectors arrive continuously—e.g. from a log stream, event pipeline, or real-time indexing—the vector database must accept and index them without blocking and without rebuilding the whole index on every insert. Streaming ingestion is the pattern of writing in small batches and merging into the searchable index incrementally. This topic covers WAL, buffer, segment merge, and throughput.

Summary

  • Streaming ingestion: accept continuous writes in small batches; merge into the searchable index incrementally without full rebuilds. Uses WAL + immutable segments + mutable buffer.
  • New vectors → WAL and buffer; buffer flushes as new segment and merges into main index (e.g. incremental HNSW). Queries search WAL/buffer + segments so new data is visible quickly. Atomicity at batch or segment level.
  • Batching improves throughput; rate limiting and backpressure handle spikes. Very high frequency: “hot” recent data in fast path, merge into colder index async for higher ingest QPS.
  • Trade-off: low visibility latency vs. merge/compaction cost; buffer size and flush policy tune the balance.
  • Practical tip: batch writes (e.g. 100–1000 vectors per request); monitor buffer size and compaction lag; use backpressure if ingest exceeds merge capacity.

WAL, buffer, and segment merge

Many vector databases use a write-ahead log (WAL) plus immutable segments: new vectors are appended to the WAL for durability and, typically, also to a mutable in-memory buffer for immediate searchability. When the buffer fills or a timer fires, it is flushed as a new immutable segment and merged into the main index (e.g. incremental HNSW insertion or a new IVF partition). Queries search across the WAL/buffer and all segments, so new data is visible shortly after the write while full index rebuilds are avoided. Atomicity is usually guaranteed at the batch or segment level.
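
The WAL + buffer mechanism can be sketched as follows. This is a minimal illustration, not any particular database's API; the names (`Wal`, `IngestBuffer`, `flush_threshold`) and the JSON-lines WAL format are assumptions for the example.

```python
import json
import os
import tempfile

class Wal:
    """Append-only write-ahead log, one JSON record per line (illustrative format)."""
    def __init__(self, path):
        self.f = open(path, "a")

    def append(self, record):
        self.f.write(json.dumps(record) + "\n")
        self.f.flush()
        os.fsync(self.f.fileno())   # durable before acknowledging the write

class IngestBuffer:
    """Mutable in-memory buffer that seals into an immutable segment when full."""
    def __init__(self, flush_threshold=2):
        self.rows = []
        self.flush_threshold = flush_threshold
        self.segments = []          # sealed, immutable segments

    def add(self, record):
        self.rows.append(record)
        if len(self.rows) >= self.flush_threshold:
            self.seal()

    def seal(self):
        if self.rows:
            self.segments.append(tuple(self.rows))  # immutable snapshot
            self.rows = []

wal = Wal(os.path.join(tempfile.mkdtemp(), "vectors.wal"))
buf = IngestBuffer(flush_threshold=2)
for i, vec in enumerate([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]):
    rec = {"id": i, "vector": vec}
    wal.append(rec)   # durability first
    buf.add(rec)      # then visible to queries via the buffer

print(len(buf.segments), len(buf.rows))   # 1 sealed segment, 1 row still buffered
```

A real system would, on startup, replay the WAL into the buffer to recover writes that had not yet been sealed into a segment.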

Pipeline: ingest request → append to WAL (and optionally fsync) → add to mutable buffer → when the buffer is full or a timer fires, seal it as a new segment → merge the segment into the main index (incrementally or in the background). The query path reads the buffer plus all segments and merges the results.
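
The query side of that pipeline can be sketched as below: each sealed segment and the mutable buffer are searched independently, then the partial top-k lists are merged. Brute-force L2 distance stands in here for a real per-segment index (HNSW, IVF, etc.); the function names are illustrative.

```python
import heapq
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search_one(rows, query, k):
    """Top-k within one segment (or the buffer); rows are (id, vector) pairs."""
    scored = [(l2(vec, query), vid) for vid, vec in rows]
    return heapq.nsmallest(k, scored)

def search(buffer_rows, segments, query, k=2):
    """Search buffer + every segment, then merge partial top-k lists globally."""
    partials = [search_one(buffer_rows, query, k)]
    partials += [search_one(seg, query, k) for seg in segments]
    return heapq.nsmallest(k, (hit for part in partials for hit in part))

segments = [[(0, [0.0, 0.0]), (1, [1.0, 1.0])],   # sealed segment
            [(2, [2.0, 2.0])]]                    # another sealed segment
buffer_rows = [(3, [0.1, 0.1])]                   # freshly written, not yet sealed
print(search(buffer_rows, segments, [0.0, 0.0]))
# ids 0 and 3 — the just-written vector is visible without any index rebuild
```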

Throughput and backpressure

Throughput improves with batching: accept many small writes and commit them in larger chunks to reduce compaction and merge overhead. Backpressure and rate limiting help when ingestion spikes. For very high frequencies, some systems keep “hot” recent data (e.g. the last hour) in a fast path and merge it into a colder index asynchronously, trading a short delay in searchability of the most recent data for higher ingest QPS.
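
Batching with backpressure can be sketched with a bounded queue: when the indexer falls behind, `put` blocks, slowing producers instead of letting memory grow without bound. The queue size, batch size, and sentinel protocol here are all illustrative choices, not a specific system's defaults.

```python
import queue
import threading

inbox = queue.Queue(maxsize=100)   # bound = backpressure threshold
indexed = []

def indexer(batch_size=32):
    """Drain the queue in batches to amortize per-write merge overhead."""
    while True:
        batch = [inbox.get()]                  # block for at least one item
        while len(batch) < batch_size:
            try:
                batch.append(inbox.get_nowait())
            except queue.Empty:
                break
        stop = None in batch                   # None is the shutdown sentinel
        indexed.extend(v for v in batch if v is not None)
        if stop:
            break

t = threading.Thread(target=indexer)
t.start()
for i in range(250):
    inbox.put(i)       # blocks whenever the indexer is >100 items behind
inbox.put(None)        # signal shutdown
t.join()
print(len(indexed))    # 250 — nothing dropped, producers merely slowed
```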

Trade-off: smaller buffers give faster visibility but more frequent flushes; larger buffers reduce merge frequency but increase memory and visibility delay. Practical tip: size buffer and flush interval to your target visibility and throughput; scale indexing workers if merge can’t keep up.
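A common way to express that balance is a size-or-time flush policy: seal the buffer when it reaches a row limit or when the oldest buffered write exceeds an age limit, whichever comes first. The sketch below assumes hypothetical knob names (`max_rows`, `max_age_s`); both are tuning parameters, not fixed defaults.

```python
import time

class FlushPolicy:
    """Flush when the buffer is big enough OR its oldest write is old enough."""
    def __init__(self, max_rows=1000, max_age_s=1.0):
        self.max_rows = max_rows
        self.max_age_s = max_age_s
        self.first_write = None          # timestamp of oldest buffered write

    def note_write(self, now=None):
        if self.first_write is None:
            self.first_write = now if now is not None else time.monotonic()

    def should_flush(self, row_count, now=None):
        if self.first_write is None:
            return False                 # empty buffer: nothing to flush
        now = now if now is not None else time.monotonic()
        return (row_count >= self.max_rows
                or now - self.first_write >= self.max_age_s)

    def reset(self):
        self.first_write = None          # called after sealing a segment

policy = FlushPolicy(max_rows=3, max_age_s=0.5)
policy.note_write(now=10.0)
print(policy.should_flush(row_count=2, now=10.1))   # False: small and young
print(policy.should_flush(row_count=3, now=10.1))   # True: size threshold hit
print(policy.should_flush(row_count=1, now=10.6))   # True: age threshold hit
```

Lowering `max_age_s` tightens visibility latency at the cost of more, smaller segments to merge; raising `max_rows` does the reverse.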

Frequently Asked Questions

When is my newly ingested vector searchable?

After it’s in the WAL (and usually the mutable buffer). Queries typically read from the buffer + segments, so visibility is within milliseconds to seconds depending on flush policy. Until the buffer is flushed to a segment, data may only be in WAL + buffer.

What if ingestion rate exceeds merge capacity?

Buffer and WAL grow; query latency can increase (more segments to search). Use rate limiting or backpressure to slow producers, or scale indexing workers. Monitor buffer size and compaction lag.

Does streaming ingestion support deletes?

Yes. Deletes are recorded in the WAL (e.g. tombstone or delete list) and applied when merging segments; see soft vs. hard deletes and garbage collection. Streaming and delete handling share the same WAL + segment model.
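The tombstone mechanism can be sketched as below: a delete only records the id; queries filter tombstoned rows immediately (soft delete), and a later merge rewrites the segment without them (hard delete). The data structures are illustrative.

```python
segment = [(0, [0.0]), (1, [1.0]), (2, [2.0])]   # immutable sealed segment
tombstones = set()                               # would be WAL-logged in practice

def delete(vec_id):
    tombstones.add(vec_id)           # soft delete: recorded, data not yet removed

def visible(segment):
    """Query-time view: skip tombstoned rows."""
    return [(vid, vec) for vid, vec in segment if vid not in tombstones]

def merge(segment):
    """Compaction: rewrite the segment without deleted rows (hard delete)."""
    return [(vid, vec) for vid, vec in segment if vid not in tombstones]

delete(1)
print([vid for vid, _ in visible(segment)])   # [0, 2] — hidden immediately
segment = merge(segment)                      # space reclaimed at merge time
print(len(segment))                           # 2
```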

How does this differ from batch-only ingestion?

Batch-only: you load a large dataset and build the index once (or periodically). Streaming: continuous small batches, incremental merge, so new data is always being added and is searchable soon. Streaming is for real-time or event-driven pipelines; batch for bulk backfills or nightly loads.