How does Annoy index the embeddings?

666 Views Asked by Chaitanya Patil At 27 January 2021 at 19:42

I am trying to understand how Annoy indexing works. I have read the following documents: https://github.com/spotify/annoy#how-does-it-work
https://cloud.google.com/solutions/machine-learning/building-real-time-embeddings-similarity-matching-system
These documents explain how to get an index from Annoy, but they do not explain HOW the index is created.

Let's say I have a sentence embedding matrix with 3 dimensions (for simplicity):

[[1,2,3],
 [4,2,3],
 [1,2,3],
 [1,1,1]]

Looking at many resources has left me confused about the following:

Does Annoy first index these vectors and then use the index to find nearest neighbors?
Or does it build a nearest-neighbor tree first and then index based on the neighbors? This seems the most plausible to me. If so, how does it index? I want to know the algorithm behind it.
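
For context, the Annoy README linked above describes each tree as being built by picking two points at random, splitting the data with the hyperplane equidistant from them, and repeating this recursively, with several such trees forming a forest. Below is a minimal pure-Python sketch of that description applied to the matrix above. It is only an illustration under my own assumptions, not Annoy's actual implementation: the names `build_tree`, `query_leaf`, and `leaf_size` are hypothetical, and a real query would also rank the collected candidates by distance.

```python
import random

def build_tree(ids, vectors, leaf_size=1, rng=random):
    """Recursively split item ids with random hyperplanes (toy sketch)."""
    pts = [vectors[i] for i in ids]
    # Stop when the node is small enough or all points coincide (no split possible).
    if len(ids) <= leaf_size or all(p == pts[0] for p in pts):
        return {"leaf": ids}
    # Pick two distinct points; the split plane is equidistant from both.
    a = rng.choice(pts)
    b = rng.choice([p for p in pts if p != a])
    normal = [x - y for x, y in zip(a, b)]
    midpoint = [(x + y) / 2 for x, y in zip(a, b)]
    offset = sum(n * m for n, m in zip(normal, midpoint))

    def side(v):
        # Signed position of v relative to the plane (positive on a's side).
        return sum(n * x for n, x in zip(normal, v)) - offset

    left = [i for i in ids if side(vectors[i]) < 0]
    right = [i for i in ids if side(vectors[i]) >= 0]
    return {"normal": normal, "offset": offset,
            "left": build_tree(left, vectors, leaf_size, rng),
            "right": build_tree(right, vectors, leaf_size, rng)}

def query_leaf(tree, v):
    """Walk down one tree to the leaf whose region contains v."""
    while "leaf" not in tree:
        s = sum(n * x for n, x in zip(tree["normal"], v)) - tree["offset"]
        tree = tree["left"] if s < 0 else tree["right"]
    return tree["leaf"]

vectors = [[1, 2, 3], [4, 2, 3], [1, 2, 3], [1, 1, 1]]
rng = random.Random(0)  # fixed seed so the toy forest is reproducible

# "Indexing" here means building a small forest of random-split trees.
forest = [build_tree(list(range(len(vectors))), vectors, rng=rng) for _ in range(3)]

# Querying collects candidate ids from the matching leaf of every tree.
candidates = set()
for t in forest:
    candidates.update(query_leaf(t, [1, 2, 3]))
print(sorted(candidates))  # [0, 2]: the two rows identical to the query
```

In this reading, the index is the forest itself: the trees are built first from random splits, and neighbors are only looked up afterwards by walking the trees, which would correspond to the first of my two guesses rather than the second.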
