Atomic updates in a VDB
An atomic update in a vector database (VDB) means that a single logical change, such as upserting a vector together with its metadata or deleting a point, is applied as a whole: either every part of the change is visible to readers, or none of it is. This rules out partial state (e.g. a new vector paired with old metadata) and underpins the database's consistency guarantees. This topic covers how atomicity is implemented and what its scope is.
Summary
- Atomic update = one logical change (e.g. upsert vector + metadata, or delete) applied as a whole: all visible or none; avoids partial state and supports consistency.
- Often implemented via WAL (one entry per update) or in-place pointer/version switch; with immutable segments, update = delete + append in new segment with atomic visibility.
- Atomicity is usually per point or per batch; multi-point transactions are a stronger guarantee and not always supported. Per-point atomic upsert/delete is usually enough for search and RAG.
- Trade-off: a WAL makes atomic updates durable across crashes at the cost of an extra write; a pointer switch makes the new version visible with low latency. Immutable segments trade write amplification for simple, consistent reads.
- Practical tip: ensure upsert and delete APIs are atomic per point; for batch updates, check whether the batch is applied atomically or as a sequence of point updates.
How atomicity is implemented
Implementation often uses a write-ahead log (WAL): the update is written to the WAL as one entry, and when the log is replayed or the update is made visible, the full record is applied as a unit. For in-place updates, the VDB may write the new version and then switch a pointer or version number in one step, so queries never see a half-updated point.
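The pointer or version switch can be illustrated with a minimal in-memory sketch. It is not how any particular VDB is implemented; `PointVersion` and `PointStore` are hypothetical names chosen for this example. The idea is that the vector and metadata are bundled into one immutable record, and a single reference assignment publishes that record to readers.

```python
import threading
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class PointVersion:
    """Immutable snapshot of one point: vector and metadata travel together."""
    version: int
    vector: tuple
    metadata: Mapping


class PointStore:
    """Hypothetical store: readers only ever see complete PointVersion objects."""

    def __init__(self) -> None:
        self._current = {}                    # point id -> PointVersion
        self._write_lock = threading.Lock()   # serializes writers only

    def upsert(self, point_id: str, vector: Sequence[float], metadata: dict) -> int:
        with self._write_lock:
            old = self._current.get(point_id)
            # Build the full new version off to the side first.
            new = PointVersion(
                version=old.version + 1 if old else 1,
                vector=tuple(vector),
                metadata=MappingProxyType(dict(metadata)),
            )
            # The "pointer switch": one reference assignment publishes it.
            self._current[point_id] = new
            return new.version

    def delete(self, point_id: str) -> None:
        with self._write_lock:
            self._current.pop(point_id, None)

    def get(self, point_id: str) -> Optional[PointVersion]:
        # A reader gets either the old record or the new one, never a mix.
        return self._current.get(point_id)


store = PointStore()
store.upsert("p1", [0.1, 0.2], {"lang": "en"})
before = store.get("p1")
store.upsert("p1", [0.9, 0.8], {"lang": "de"})
after = store.get("p1")
```

A reader that fetched `before` keeps a consistent old vector and old metadata even after the second upsert. This sketch relies on a single dictionary assignment being indivisible in CPython; a real engine would typically use an atomic pointer swap or a similar primitive.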
With immutable segments, an “update” can be implemented as delete + append in a new segment, with segment visibility updated atomically. Pipeline: write request → single WAL entry (vector + metadata) → apply to index and metadata store as one unit → visibility switch. No reader sees “new vector, old metadata” or vice versa.
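As a rough sketch of the immutable-segment approach, the following code models an update as "tombstone the old copy, append a new segment, then swap the visible view in one assignment". All names (`Segment`, `View`, `SegmentedStore`, `lookup`) are hypothetical, and the one-point-per-segment layout is a simplification for readability; it also assumes a single writer.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """Treated as immutable once built: point id -> (vector, metadata)."""
    seg_id: int
    points: Dict[str, tuple]


@dataclass(frozen=True)
class View:
    """What readers see: a fixed set of segments plus hidden (tombstoned) entries."""
    segments: Tuple[Segment, ...]
    tombstones: FrozenSet[Tuple[int, str]]  # (seg_id, point_id) pairs


class SegmentedStore:
    def __init__(self) -> None:
        self._view = View(segments=(), tombstones=frozenset())
        self._next_seg_id = 0

    def snapshot(self) -> View:
        return self._view

    def _live_copies(self, view: View, point_id: str) -> set:
        return {(s.seg_id, point_id) for s in view.segments if point_id in s.points}

    def upsert(self, point_id: str, vector, metadata: dict) -> None:
        view = self._view
        dead = self._live_copies(view, point_id)           # delete part
        new_seg = Segment(self._next_seg_id,               # append part
                          {point_id: (tuple(vector), dict(metadata))})
        self._next_seg_id += 1
        # Visibility switch: delete and append become visible together.
        self._view = View(view.segments + (new_seg,), view.tombstones | dead)

    def delete(self, point_id: str) -> None:
        view = self._view
        self._view = View(view.segments,
                          view.tombstones | self._live_copies(view, point_id))


def lookup(view: View, point_id: str) -> Optional[tuple]:
    """Return the visible (vector, metadata) for a point in this view, if any."""
    for seg in reversed(view.segments):
        if point_id in seg.points and (seg.seg_id, point_id) not in view.tombstones:
            return seg.points[point_id]
    return None


store = SegmentedStore()
store.upsert("p1", [1.0, 0.0], {"tag": "old"})
old_view = store.snapshot()
store.upsert("p1", [0.0, 1.0], {"tag": "new"})
new_view = store.snapshot()
```

A query that started on `old_view` continues to see the old vector with the old metadata, while queries that start later see only the new pair. Existing segments are never modified, which is where the write amplification comes from.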
Scope of atomicity
Atomicity applies to a single point or a batch depending on the API; multi-point transactions (e.g. update A and B together) may or may not be supported and are a stronger guarantee. For most search and RAG workloads, per-point atomic upsert and delete is enough to keep indexes and metadata in sync.
Trade-off: per-point atomicity is standard and sufficient for most use cases; multi-point transactions add implementation complexity and are less common in VDBs. Practical tip: if you need “update A and B or neither,” check the vendor’s transaction or batch semantics.
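The difference between the two batch semantics can be shown with a small sketch. The functions `apply_sequential` and `apply_atomic` are hypothetical and operate on a plain dictionary rather than a real VDB client; the dimension check stands in for any reason a single point in a batch might be rejected.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[str, Sequence[float], dict]  # (point id, vector, metadata)


def _validate(point_id: str, vector: Sequence[float], dim: int) -> None:
    if len(vector) != dim:
        raise ValueError(f"point {point_id!r}: expected dim {dim}, got {len(vector)}")


def apply_sequential(store: Dict[str, tuple], batch: List[Point], dim: int) -> None:
    """Each point is atomic, the batch is not: a failure leaves earlier points applied."""
    for point_id, vector, metadata in batch:
        _validate(point_id, vector, dim)
        store[point_id] = (tuple(vector), dict(metadata))


def apply_atomic(store: Dict[str, tuple], batch: List[Point], dim: int) -> None:
    """All-or-nothing batch: stage every point first, publish only if all are valid."""
    staged = {}
    for point_id, vector, metadata in batch:
        _validate(point_id, vector, dim)
        staged[point_id] = (tuple(vector), dict(metadata))
    store.update(staged)  # reached only when the whole batch validated


# The second point has the wrong dimension, so the batch cannot fully succeed.
batch = [("a", [1.0, 2.0], {"k": 1}), ("b", [1.0], {"k": 2})]

seq_store: Dict[str, tuple] = {}
try:
    apply_sequential(seq_store, batch, dim=2)
except ValueError:
    pass

atomic_store: Dict[str, tuple] = {}
try:
    apply_atomic(atomic_store, batch, dim=2)
except ValueError:
    pass
```

After the failed batch, `seq_store` contains point `a` while `atomic_store` is still empty. If an application needs "update A and B or neither", only the second behavior is safe, which is why the vendor's batch semantics are worth checking.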
Frequently Asked Questions
What does “atomic” mean for a vector update?
Either the entire update (vector + metadata, or delete) is visible to readers, or none of it is. You never see “new vector with old metadata” or a half-applied change.
How does the WAL support atomic updates?
The write-ahead log records the full update as one entry. On replay or when making data visible, the system applies that entry as a unit, so the update is all-or-nothing from a reader’s perspective.
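A minimal sketch of this idea, assuming a text log with one checksummed JSON entry per line (the format and the `encode_entry` and `replay` helpers are hypothetical, not any vendor's WAL layout): an entry that was only partly written before a crash fails its checksum and is ignored as a whole.

```python
import json
import zlib
from typing import Optional


def encode_entry(op: str, point_id: str,
                 vector: Optional[list] = None,
                 metadata: Optional[dict] = None) -> str:
    """One update = one log line: CRC32 of the payload, a space, then the payload."""
    payload = json.dumps(
        {"op": op, "id": point_id, "vector": vector, "metadata": metadata},
        sort_keys=True,
    )
    crc = zlib.crc32(payload.encode("utf-8"))
    return f"{crc:08x} {payload}\n"


def replay(log_text: str) -> dict:
    """Rebuild state from the log, applying only complete, intact entries."""
    state = {}
    for line in log_text.splitlines():
        crc_hex, _, payload = line.partition(" ")
        try:
            intact = int(crc_hex, 16) == zlib.crc32(payload.encode("utf-8"))
        except ValueError:
            intact = False
        if not intact:
            break  # torn or corrupt entry: stop here, nothing partial is applied
        entry = json.loads(payload)
        if entry["op"] == "upsert":
            # Vector and metadata come from the same entry, so they stay in sync.
            state[entry["id"]] = (entry["vector"], entry["metadata"])
        elif entry["op"] == "delete":
            state.pop(entry["id"], None)
    return state


complete = encode_entry("upsert", "p1", [0.1, 0.2], {"lang": "en"})
# Simulate a crash partway through writing the next entry.
torn = encode_entry("upsert", "p1", [0.9, 0.8], {"lang": "de"})[:20]
state = replay(complete + torn)
```

Here `state` holds the first version of `p1` in full; the truncated second entry contributes neither its vector nor its metadata.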
Can I update two vectors in one transaction?
It depends on the VDB. Many support only per-point or per-batch atomicity. Multi-point transactions (updating A and B atomically together) are a stronger guarantee, so check your vendor’s ACID/consistency docs before relying on them.
Why do immutable segments use delete + append for updates?
With immutable segments, data is never overwritten in place. An “update” is implemented as marking the old point deleted and appending the new version to a new segment; segment visibility is then updated atomically so readers see the new state consistently.