Context:
I learned that to improve the retrieval quality of a RAG-based approach, we can fine-tune an open-source embedding model.
Now I have one PDF file (my private dataset) from which I will create the train/eval dataset, fine-tune my embedding model, and store the resulting embeddings in a vector DB. Suppose n embeddings are stored in the DB.
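For concreteness, here is a minimal sketch of the workflow I have in mind, using sentence-transformers with MultipleNegativesRankingLoss and a FAISS index; the base model, the (query, passage) pairs, and the chunk texts are all placeholders:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses
import faiss

# Start from a general-purpose open-source embedder
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# (query, relevant passage) pairs mined from the PDF -- placeholder examples
train_examples = [
    InputExample(texts=["what is the refund window?",
                        "Refunds are accepted within 30 days of purchase."]),
    InputExample(texts=["how do I contact support?",
                        "Support can be reached at support@example.com."]),
]
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=2)
train_loss = losses.MultipleNegativesRankingLoss(model)

# Fine-tune on the PDF-derived pairs and save the adapted model
model.fit(train_objectives=[(train_dataloader, train_loss)],
          epochs=1, warmup_steps=10)
model.save("finetuned-embedder-v1")

# Embed the n chunks from the PDF and store them in a vector index
chunks = ["chunk 1 text ...", "chunk 2 text ..."]          # the n chunks
embeddings = model.encode(chunks, convert_to_numpy=True)   # shape (n, dim)
faiss.normalize_L2(embeddings)                             # cosine sim via inner product
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(embeddings)
```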
Question:
Now suppose a new PDF file arrives tomorrow.
Should I then re-fine-tune the previously fine-tuned embedding model on the new data?
Because the re-fine-tuned model will generate slightly different embeddings, will the n embeddings already stored in my DB become outdated? Will I have to delete them? And if there are m new embeddings from the second PDF file to store, will I have to re-embed everything and store a total of (n + m) fresh embeddings in the DB?
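If the answer is yes, I imagine the update step would look roughly like this full rebuild (a minimal sketch; the model name "finetuned-embedder-v2" and the chunk lists are hypothetical):

```python
from sentence_transformers import SentenceTransformer
import faiss

# "finetuned-embedder-v2" = the re-fine-tuned model;
# old_chunks = the n chunks from the first PDF, new_chunks = the m new ones
model_v2 = SentenceTransformer("finetuned-embedder-v2")

old_chunks = ["chunk 1 text ...", "chunk 2 text ..."]   # n chunks
new_chunks = ["new chunk 1 ...", "new chunk 2 ..."]     # m chunks

# Every stored vector must come from the same model, so the old index is
# discarded and all (n + m) chunks are re-embedded with the new model
all_chunks = old_chunks + new_chunks
embeddings = model_v2.encode(all_chunks, convert_to_numpy=True)
faiss.normalize_L2(embeddings)

index = faiss.IndexFlatIP(embeddings.shape[1])          # fresh index
index.add(embeddings)
```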
So as new data keeps arriving, this turns into a quadratic problem: after k files of roughly m chunks each, the cumulative embedding work is m + 2m + ... + km = m·k(k+1)/2, i.e. O(k²) in the number of files.
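A quick back-of-the-envelope check of that growth (CHUNKS_PER_FILE is a made-up figure):

```python
# Cumulative embedding work if every new file forces a full re-index
CHUNKS_PER_FILE = 1_000    # hypothetical: each PDF yields ~1,000 chunks

corpus_size = 0
total_work = 0
for k in range(1, 11):                 # ten files arriving one by one
    corpus_size += CHUNKS_PER_FILE     # corpus grows by m chunks
    total_work += corpus_size          # re-embed the whole corpus
    print(f"file {k}: corpus={corpus_size}, cumulative embeddings={total_work}")

# total_work == CHUNKS_PER_FILE * k * (k + 1) // 2, i.e. O(k^2) in file count
```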