
Cross-region replication

Cross-region replication copies vector data and indexes from one geographic region (e.g. us-east-1) to one or more others (e.g. eu-west-1, ap-south-1) so that reads (and sometimes writes) can be served locally, reducing network latency and enabling disaster recovery if a region fails.

Summary

Cross-region replication copies vector data and indexes from one region to others so reads (and sometimes writes) can be served locally, reducing network latency and enabling disaster recovery.
Replication across regions is usually asynchronous because of wide-area latency, so reads in secondary regions are eventually consistent. Replicating large vector indexes is bandwidth- and storage-intensive. In the common topology the primary accepts all writes and the secondaries are read-only; multi-region active-active is more complex because it needs conflict handling.
Choose regions close to users for read latency, and place at least one replica in a different region from the primary for availability. The pipeline is: the primary accepts writes, streams them asynchronously to secondary regions, and the secondaries apply them and serve reads. Practical tip: use async cross-region replication for read replicas and keep the primary in one region for write consistency. See also: replication.

Sync vs. async and consistency

Replication can be synchronous (the write is acknowledged only after replicas confirm) or asynchronous (the primary acknowledges immediately and replicas catch up later). Cross-region replication is usually asynchronous because a synchronous write must wait for at least one wide-area round trip to the slowest replica, which typically adds tens to hundreds of milliseconds. Asynchronous replication implies eventual consistency for reads in secondary regions: users may see slightly stale data until replication catches up. For many search and recommendation workloads, that is acceptable.
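
The latency argument can be made concrete with a toy model. The sketch below is illustrative only: the function name, the region round-trip times, and the 2 ms local commit cost are assumptions, not measurements, and "sync" is modelled as waiting for every replica to confirm.

```python
def write_ack_latency_ms(local_commit_ms, replica_rtts_ms, mode):
    """Estimate the time until the primary acknowledges a write.

    mode="sync":  wait for every replica to confirm, so the slowest
                  round trip is added to the local commit time.
    mode="async": acknowledge right after the local commit.
    """
    if mode == "sync":
        return local_commit_ms + max(replica_rtts_ms, default=0.0)
    if mode == "async":
        return local_commit_ms
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical round-trip times from a us-east-1 primary (assumed values).
rtts = {"eu-west-1": 80.0, "ap-south-1": 190.0}

sync_ms = write_ack_latency_ms(2.0, rtts.values(), "sync")
async_ms = write_ack_latency_ms(2.0, rtts.values(), "async")
print(f"sync ack: {sync_ms} ms, async ack: {async_ms} ms")
```

Under these assumed numbers the synchronous write is dominated by the farthest region, while the asynchronous write pays only the local commit cost and accepts stale reads in the secondaries instead.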

Challenges and topology

Replicating large vector indexes and bulk data is bandwidth- and storage-intensive, and index build and compaction work done in the primary region must be either mirrored (shipping the resulting segments) or replayed (repeating the work from the replicated writes) in each secondary region. Conflict resolution is simpler if the primary accepts all writes and the secondaries are read-only replicas. Multi-region active-active (writes accepted in multiple regions) requires conflict handling and is more complex. Choosing regions close to users improves read latency; placing a replica in a separate region from the primary improves availability.
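
To get a feel for the bandwidth cost, a back-of-the-envelope estimate helps. The sketch below is a rough calculation under stated assumptions: float32 components (4 bytes each), a fully utilized link, and raw vector data only. Index structures, metadata, and protocol overhead are not included, and the collection size and link speed are made-up example values.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_component=4):
    """Size of the raw vector payload, excluding index and metadata overhead."""
    return num_vectors * dim * bytes_per_component


def transfer_hours(total_bytes, link_gbps):
    """Time to ship total_bytes over a link of link_gbps (decimal gigabits/s)."""
    seconds = (total_bytes * 8) / (link_gbps * 1e9)
    return seconds / 3600


# Example: 100 million 768-dimensional float32 vectors over a 1 Gbps link.
size = raw_vector_bytes(100_000_000, 768)
hours = transfer_hours(size, link_gbps=1.0)
print(f"{size / 1e9:.1f} GB raw, about {hours:.2f} hours for an initial copy")
```

The same arithmetic applies to ongoing replication: the sustained write rate times the bytes per vector gives the steady-state bandwidth each secondary region needs.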

Pipeline: primary accepts writes, async stream (WAL or segments) to secondary regions, secondaries apply and serve reads. Trade-off: async keeps write latency low but reads in secondaries are eventually consistent. Practical tip: use async cross-region for read replicas; keep primary in one region for write consistency and simpler conflict handling.
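
The pipeline above can be sketched as a small in-memory simulation. This is a minimal illustration, not a real replication protocol: the `Primary` and `Secondary` classes, the pull-based shipping, and the sequence-number scheme are all assumptions made for the example, and there is no networking, durability, or failure handling.

```python
class Primary:
    """Accepts all writes and records them in an append-only log (WAL)."""

    def __init__(self):
        self.store = {}
        self.wal = []  # entries are (seq, key, vector), seq starts at 1

    def write(self, key, vector):
        seq = len(self.wal) + 1
        self.store[key] = vector
        self.wal.append((seq, key, vector))
        return seq  # acknowledged without waiting for any secondary


class Secondary:
    """Read-only replica that applies WAL entries in order."""

    def __init__(self):
        self.store = {}
        self.applied_seq = 0  # highest sequence number applied so far

    def pull(self, primary, max_entries):
        """Apply up to max_entries log entries that have not been applied yet."""
        batch = primary.wal[self.applied_seq:self.applied_seq + max_entries]
        for seq, key, vector in batch:
            self.store[key] = vector
            self.applied_seq = seq
        return len(batch)

    def lag(self, primary):
        """Number of acknowledged writes this replica has not applied."""
        return len(primary.wal) - self.applied_seq

    def read(self, key):
        return self.store.get(key)


primary = Primary()
eu = Secondary()

primary.write("doc-a", [0.1, 0.2])
primary.write("doc-b", [0.3, 0.4])

eu.pull(primary, max_entries=1)   # only the first entry has arrived
stale_read = eu.read("doc-b")     # not replicated yet
lag_before = eu.lag(primary)

eu.pull(primary, max_entries=10)  # replica catches up
fresh_read = eu.read("doc-b")
lag_after = eu.lag(primary)
```

Between the two pulls the secondary serves a stale view of `doc-b`, which is the eventual consistency described above. Tracking `lag` gives a simple signal for how far behind a region is.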

Frequently Asked Questions

What is cross-region replication?

Copying vector data and indexes from one geographic region to others so reads (and sometimes writes) can be served locally. Reduces network latency and enables disaster recovery if a region fails. See replication.

Why is cross-region usually async?

Wide-area network latency makes synchronous replication costly: every write would have to wait for a cross-region round trip. With async replication the primary acknowledges writes and replicas catch up later. That implies eventual consistency for reads in secondary regions, which is acceptable for many search and recommendation workloads.

What are the challenges?

Replicating large vector indexes and bulk data is bandwidth- and storage-intensive; index build and compaction in the primary must be mirrored or replayed. Conflict resolution is simpler if the primary accepts all writes and secondaries are read-only. Multi-region active-active requires conflict handling.
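
As one illustration of what conflict handling in an active-active setup can look like, the sketch below uses last-writer-wins (LWW), a common but lossy strategy chosen here as an assumption for the example; the text above does not prescribe any particular method. The `Versioned` type and the use of the region name as a tie-breaker are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    vector: tuple
    timestamp_ms: int  # write time as recorded by the accepting region
    region: str        # used only to break timestamp ties deterministically


def merge_lww(a, b):
    """Keep the write with the later timestamp; ties go to the larger region name."""
    return max(a, b, key=lambda v: (v.timestamp_ms, v.region))


# Two regions accept concurrent writes to the same key.
from_us = Versioned((0.1, 0.2), timestamp_ms=1_000, region="us-east-1")
from_eu = Versioned((0.9, 0.8), timestamp_ms=1_005, region="eu-west-1")

winner = merge_lww(from_us, from_eu)
```

LWW silently discards the losing write and depends on reasonably synchronized clocks, which is part of why a single-primary topology with read-only secondaries is the simpler choice.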

How do I choose regions?

Regions close to users improve read latency; placing a replica in a separate region from the primary improves availability for disaster recovery. Balance latency, cost, and compliance requirements.
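
On the read path, region choice often reduces to "lowest latency among regions that are healthy and permitted". The sketch below is a hypothetical client-side helper under that assumption; the function name, the latency table, and the compliance allow-list are made up for illustration.

```python
def choose_read_region(latency_ms, healthy, allowed=None):
    """Pick the lowest-latency region that is healthy and, if given, allowed.

    latency_ms: dict of region name -> measured latency from this client
    healthy:    set of regions currently serving reads
    allowed:    optional set of regions permitted by compliance rules
    """
    candidates = [
        region for region in latency_ms
        if region in healthy and (allowed is None or region in allowed)
    ]
    if not candidates:
        raise RuntimeError("no eligible region for reads")
    return min(candidates, key=lambda region: latency_ms[region])


# Assumed latencies as seen by a client in Europe.
latency = {"us-east-1": 95.0, "eu-west-1": 12.0, "ap-south-1": 140.0}

normal = choose_read_region(latency, healthy={"us-east-1", "eu-west-1", "ap-south-1"})
failover = choose_read_region(latency, healthy={"us-east-1", "ap-south-1"})
restricted = choose_read_region(
    latency,
    healthy={"us-east-1", "eu-west-1", "ap-south-1"},
    allowed={"us-east-1"},
)
```

The failover case shows the disaster-recovery benefit: when the nearest region is down, reads fall back to the next-closest replica at the cost of higher latency.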