How does Elasticsearch create a vector representation of a document?


For its approximate nearest neighbor (ANN) search using HNSW (Hierarchical Navigable Small World) graphs, Elasticsearch measures document similarity by comparing documents represented as vectors. How are these vectors created? I am familiar with word embeddings for individual words (à la Word2Vec) and with bag-of-words (BOW) representations. Are the document vectors created directly from some amalgam of word embeddings, such as a predefined set of keywords? Any pointer to where this process is described in the Elasticsearch documentation would be helpful.
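To make the question concrete, here is a minimal sketch of what I mean by an "amalgam of word embeddings": averaging per-word vectors into one document vector and comparing documents by cosine similarity. This only illustrates my guess and is not taken from Elasticsearch. The embedding values and the names `TOY_EMBEDDINGS`, `mean_pool`, and `cosine_similarity` are made up for the example.

```python
import math

# Toy 3-dimensional word embeddings (made-up values, illustration only).
TOY_EMBEDDINGS = {
    "search": [1.0, 0.0, 0.0],
    "engine": [0.0, 1.0, 0.0],
    "vector": [0.0, 0.0, 1.0],
}


def mean_pool(tokens, embeddings=TOY_EMBEDDINGS):
    """Average the embeddings of the known tokens into one document vector."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        raise ValueError("document contains no known tokens")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Two tiny "documents" that share one word out of two.
doc_a = mean_pool("search engine".split())
doc_b = mean_pool("search vector".split())
similarity = cosine_similarity(doc_a, doc_b)
```

Is something along these lines happening inside Elasticsearch, or do the vectors come from somewhere else entirely?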
