Embeddings & Data Prep · Topic 39

Handling updates to the embedding model (Version drift)

If you change the embedding model, for example by upgrading to a new version or switching to a different model, the vectors already in your vector database were produced by the old model while query vectors come from the new one. The two vector spaces are incompatible, so similarity scores and rankings are meaningless until you fix the mismatch. The only robust fix is to re-embed and re-index.
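To make the incompatibility concrete, here is a toy sketch rather than a real embedding model. It assumes two hypothetical "models", embed_old and embed_new, that encode the same 2-D points but with the axes rotated by 90 degrees. Each model is consistent with itself, yet a query encoded by the new model retrieves the wrong document from an index built by the old one.

```python
import math

def cosine(a, b):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical "models": same information, different coordinate system.
def embed_old(point):
    return point

def embed_new(point):
    x, y = point
    return (-y, x)  # rotate by 90 degrees

# Toy "meanings" of three documents (made-up values for illustration).
TOY_DOCS = {"cat": (1.0, 0.0), "kitten": (0.9, 0.3), "car": (0.0, 1.0)}

def top_hit(query_vec, index):
    """Return the id of the most similar document in the index."""
    return max(index, key=lambda doc_id: cosine(query_vec, index[doc_id]))

old_index = {doc_id: embed_old(p) for doc_id, p in TOY_DOCS.items()}
new_index = {doc_id: embed_new(p) for doc_id, p in TOY_DOCS.items()}

query = embed_new(TOY_DOCS["cat"])  # query encoded with the NEW model

matched_hit = top_hit(query, new_index)     # same model on both sides
mismatched_hit = top_hit(query, old_index)  # version drift
print(matched_hit, mismatched_hit)
```

Real models differ in far less regular ways than a rotation, so there is generally no simple transform that maps one space onto the other.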

Summary

Version drift: index vectors come from the old model and query vectors from the new one, so the spaces are incompatible; fix by re-embedding and re-indexing.
Use batching and parallel workers; optionally use dual-write or a second collection for the transition.
Pin the model name and version in config; a dimension change requires a new index; see data drift detection.
You cannot mix old and new model vectors; use a new collection for the new model and cut over when ready. Fine-tuning counts as version drift.
For zero-downtime: build the new collection in parallel, dual-write if needed, switch reads, then retire the old index.

What is version drift

Version drift is the situation where index vectors and query vectors do not come from the same model or model version. The only robust fix is to re-embed and re-index: run all stored documents (or their source text) through the new model and replace the vectors in the vector database (VDB). That can be expensive for large collections, so use batching and parallel workers.
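A minimal sketch of a batched, parallel re-embed job follows. It assumes a hypothetical embed_batch callable that takes a list of texts and returns one vector per text; fake_embed_batch below is a stand-in, not a real model, and the batch size and worker count are arbitrary. Threads are used on the assumption that the embedding call is I/O-bound (for example a remote API); a local CPU-bound model may need a different strategy.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterator, List, Sequence

Vector = List[float]

def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def reembed_all(
    docs: Dict[str, str],
    embed_batch: Callable[[List[str]], List[Vector]],
    batch_size: int = 64,
    workers: int = 4,
) -> Dict[str, Vector]:
    """Re-embed every document with the new model, one batch per task."""
    id_batches = list(batched(list(docs), batch_size))

    def work(id_batch: Sequence[str]):
        vectors = embed_batch([docs[doc_id] for doc_id in id_batch])
        return list(zip(id_batch, vectors))

    new_vectors: Dict[str, Vector] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves batch order and re-raises worker exceptions.
        for pairs in pool.map(work, id_batches):
            new_vectors.update(pairs)
    return new_vectors

def fake_embed_batch(texts: List[str]) -> List[Vector]:
    """Stand-in for the new model: NOT a real embedding."""
    return [[float(len(t)), float(t.count(" "))] for t in texts]

docs = {f"d{i}": "word " * i for i in range(10)}
new_vectors = reembed_all(docs, fake_embed_batch, batch_size=3, workers=2)
```

In a real job the returned vectors would be written to the new index, ideally with retries and a checkpoint per batch so that a failure does not restart the whole run.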

Some systems support dual-write or a second index: build a new collection with the new model, run queries against both during a transition (encoding each query with the model that built the collection it is sent to), then cut over and retire the old index. You cannot mix old and new model vectors in one collection because they live in different latent spaces. When the dimension changes (e.g. 384 → 768), you also need a new index because the VDB schema is tied to vector size. See what is the latent space and the role of collection or index in a VDB.
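One way to enforce the "one model per collection" rule is to tag each collection with the model that produced its vectors and reject anything else at write time. The sketch below uses a toy in-memory Collection class; the class, its fields, and the model identifiers are hypothetical and do not correspond to any particular VDB client.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Collection:
    """Toy in-memory collection bound to exactly one model and dimension."""
    name: str
    model_id: str
    dim: int
    vectors: Dict[str, List[float]] = field(default_factory=dict)

    def upsert(self, doc_id: str, vector: List[float], model_id: str) -> None:
        # Refuse vectors from another model: they live in a different space.
        if model_id != self.model_id:
            raise ValueError(
                f"{self.name} holds {self.model_id} vectors, got {model_id}"
            )
        # Refuse vectors of the wrong size: the schema is tied to dimension.
        if len(vector) != self.dim:
            raise ValueError(
                f"{self.name} expects dim {self.dim}, got {len(vector)}"
            )
        self.vectors[doc_id] = list(vector)

# Separate collections for the old and new model (hypothetical names).
old_collection = Collection("docs_v1", model_id="model-a", dim=384)
new_collection = Collection("docs_v2", model_id="model-b", dim=768)
new_collection.upsert("doc-1", [0.0] * 768, model_id="model-b")
```

Note that the dimension check alone is not enough: two models can share a dimension and still produce incompatible spaces, which is why the model identifier is checked as well.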

Avoiding and managing drift

To avoid accidental drift, pin the model name and version in config (e.g. “sentence-transformers/all-mpnet-base-v2” not “latest”), and ensure the same config is used for ingestion and for query encoding. If you use a managed embedding API, pin a specific model version on the endpoint rather than relying on a floating default.
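As an illustration of pinning, the sketch below keeps the model name and revision in a single frozen config object that both the ingestion and query paths would import. The EmbeddingConfig class, the list of rejected floating tags, and the revision string "example-rev-1" are assumptions made for this example, not values from any real registry.

```python
from dataclasses import dataclass

# Tags that can silently move to a different set of weights (assumed list).
FLOATING_TAGS = {"", "latest", "main", "master"}

@dataclass(frozen=True)
class EmbeddingConfig:
    model_name: str
    revision: str

    def __post_init__(self) -> None:
        if self.revision.strip().lower() in FLOATING_TAGS:
            raise ValueError(
                f"revision {self.revision!r} is not pinned; use a fixed tag or hash"
            )

    @property
    def model_id(self) -> str:
        """Identifier to stamp on every stored vector and every collection."""
        return f"{self.model_name}@{self.revision}"

# Single source of truth, shared by ingestion and query encoding.
# "example-rev-1" is a placeholder, not a real revision of this model.
CONFIG = EmbeddingConfig(
    model_name="sentence-transformers/all-mpnet-base-v2",
    revision="example-rev-1",
)

def assert_same_model(index_model_id: str, query_config: EmbeddingConfig) -> None:
    """Fail fast if the query encoder does not match the index."""
    if index_model_id != query_config.model_id:
        raise RuntimeError(
            f"index built with {index_model_id}, queries use {query_config.model_id}"
        )

assert_same_model(CONFIG.model_id, CONFIG)  # passes: same config on both sides
```

Storing model_id alongside the index also gives you something to check at query time, which turns silent drift into an explicit error.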

When you do upgrade, plan a re-index window and consider data drift detection to catch any mixed-version state.

Minor model updates (e.g. a patch version): if the weights or tokenizer changed, re-index. If only non-embedding code changed, you might not need to, but verify with a sample.

Fine-tuning counts as version drift: a fine-tuned model produces a different space than the base model, so you must re-embed and re-index after fine-tuning.

For zero-downtime migration, build a new collection with the new model in parallel and dual-write new data to both collections if needed. When the new index is ready, switch reads to it and retire the old one. Old data may need a one-time re-embed job.
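The zero-downtime sequence (dual-write, backfill, switch reads, retire) can be sketched with a toy in-memory router. MigrationRouter, its method names, and the lambda embedders are all hypothetical; a real deployment would use the VDB's own collections and, where available, an alias or similar pointer for the read switch.

```python
from typing import Callable, Dict, List

Embedder = Callable[[str], List[float]]

class MigrationRouter:
    """Toy stand-in for two collections plus a pointer that selects reads."""

    def __init__(self, embed_old: Embedder, embed_new: Embedder) -> None:
        self.embedders: Dict[str, Embedder] = {"old": embed_old, "new": embed_new}
        self.collections: Dict[str, Dict[str, List[float]]] = {"old": {}, "new": {}}
        self.texts: Dict[str, str] = {}  # source text, kept for the backfill
        self.read_from = "old"
        self.dual_write = False

    def write(self, doc_id: str, text: str) -> None:
        """Write to the live collection, or to both while dual-write is on."""
        self.texts[doc_id] = text
        targets = ["old", "new"] if self.dual_write else [self.read_from]
        for name in targets:
            self.collections[name][doc_id] = self.embedders[name](text)

    def backfill_new(self) -> None:
        """One-time re-embed of documents written before dual-write began."""
        for doc_id, text in self.texts.items():
            if doc_id not in self.collections["new"]:
                self.collections["new"][doc_id] = self.embedders["new"](text)

    def cut_over(self) -> None:
        """Switch reads only once the new collection is complete."""
        missing = set(self.collections["old"]) - set(self.collections["new"])
        if missing:
            raise RuntimeError(f"new collection is missing {len(missing)} docs")
        self.read_from = "new"
        self.dual_write = False

    def retire_old(self) -> None:
        if self.read_from != "new":
            raise RuntimeError("cut over before retiring the old collection")
        self.collections["old"].clear()

    def encode_query(self, text: str) -> List[float]:
        """Queries always use the model that built the collection being read."""
        return self.embedders[self.read_from](text)

# Stand-in embedders with different dimensions (not real models).
router = MigrationRouter(lambda t: [float(len(t))], lambda t: [float(len(t)), 1.0])
router.write("a", "written before the migration")
router.dual_write = True                 # step 1: start dual-writing
router.write("b", "written during the migration")
router.backfill_new()                    # step 2: re-embed old data
router.cut_over()                        # step 3: switch reads
router.retire_old()                      # step 4: retire the old index
```

Keeping the old collection around for a while after the cut-over, instead of clearing it immediately, makes rollback cheap if the new model underperforms.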

Frequently Asked Questions

Can I mix old and new model vectors during migration?

No. They live in different latent spaces. Never compare a query vector with index vectors from a different model: use a new collection for the new model, query each collection only with its own model, and cut over when ready.

Do minor model updates (e.g. patch version) require re-index?

If the weights or tokenizer changed, yes—vectors can change. If it’s only non-embedding code (e.g. bug fix that doesn’t affect output), you might not need to re-index; verify with a sample.
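A sample check along these lines might look like the sketch below: embed the same texts with both versions and flag a re-index if any pair of vectors differs in dimension or falls below a similarity threshold. The needs_reindex helper, the 0.9999 threshold, and the toy embedders are assumptions for illustration; choose a threshold and a sample that suit your own data.

```python
import math
from typing import Callable, List, Sequence

Embedder = Callable[[str], List[float]]

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def needs_reindex(
    sample_texts: Sequence[str],
    embed_old: Embedder,
    embed_new: Embedder,
    min_cosine: float = 0.9999,  # assumed tolerance, tune for your setup
) -> bool:
    """Return True if any sample text embeds differently under the new version."""
    for text in sample_texts:
        old_vec, new_vec = embed_old(text), embed_new(text)
        if len(old_vec) != len(new_vec):
            return True  # dimension changed: a new index is required anyway
        if cosine(old_vec, new_vec) < min_cosine:
            return True
    return False

# Toy embedders standing in for two versions of a model.
def embed_v1(text: str) -> List[float]:
    return [len(text) + 1.0, 1.0]

def embed_v1_patched(text: str) -> List[float]:
    return embed_v1(text)  # code-only fix: output unchanged

def embed_v2(text: str) -> List[float]:
    return [1.0, len(text) + 1.0]  # different weights: output changed

sample = ["hello", "version drift", "hi"]
patch_needs_reindex = needs_reindex(sample, embed_v1, embed_v1_patched)
upgrade_needs_reindex = needs_reindex(sample, embed_v1, embed_v2)
```

A small sample can only show that outputs changed, not prove that they did not, so include texts that exercise the tokenizer (long inputs, non-ASCII, unusual punctuation).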

How do I zero-downtime migrate to a new model?

Build a new collection with the new model in parallel, and dual-write new data to both collections if needed. When the new index is ready, switch reads to it and retire the old one. Old data may need a one-time re-embed job.

Does fine-tuning count as version drift?

Yes. A fine-tuned model produces a different space than the base model, so you must re-embed and re-index after fine-tuning. See fine-tuning embedding models.