I am using a scoring script to do prefiltering with exact k-NN. Here is what a sample query looks like:
GET /my_index/_search
{
  "query": {
    "script_score": {
      "query": {
        "bool": {
          "filter": {
            "bool": {
              "must": [
                {
                  "range": {
                    "price": {
                      "gte": 200,
                      "lte": 350
                    }
                  }
                }
              ]
            }
          }
        }
      },
      "script": {
        "source": "knn_score",
        "lang": "knn",
        "params": {
          "field": "my_vector",
          "query_value": [1.5, 5.5, 4.5, 6.4],
          "space_type": "cosinesimil"
        }
      }
    }
  }
}
Here is a sample of the response:
"max_score": 1.017859,
"hits": [
  {
    "_index": "my_index",
    "_id": "1234",
    "_score": 1.017859,
    "_source": {
      "my_vector": [
        1.7,
        4.9,
        4.8,
        5.3
      ],
      "price": 250,
      "category": 376,
      "subcategory": 3265
    },
    ...
  }
  ...
How is the score computed here, and why is it greater than 1? After the prefiltering (scoring script) part, is there a way to get the similarity score for just the k-NN search? My use case is to rank the documents by the k-NN score alone once the prefiltering is done. How do I achieve that?
OpenSearch adds 1 to the cosine similarity for every document, so with "space_type": "cosinesimil" the returned _score is 1 + cosine similarity and falls between 0 and 2. To get the actual cosine similarity, subtract 1 from the score returned by the query. This is from the documentation: "Cosine similarity returns a number between -1 and 1, and because OpenSearch relevance scores can’t be below 0, the k-NN plugin adds 1 to get the final score."
https://opensearch.org/docs/latest/search-plugins/knn/knn-score-script/
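As a rough client-side sketch of the relationship described above (score = 1 + cosine similarity), the Python below recomputes a cosinesimil-style score and converts a returned _score back to a plain cosine similarity. The function names (cosine_similarity, knn_script_score, cosine_from_score, add_cosine) and the "cosine_similarity" key added to each hit are hypothetical helpers for illustration, not part of OpenSearch or its clients.

```python
import math


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_script_score(a, b):
    """Score as described in the quoted documentation: 1 + cosine similarity."""
    return 1.0 + cosine_similarity(a, b)


def cosine_from_score(score):
    """Undo the +1 offset to recover the cosine similarity from a hit's _score."""
    return score - 1.0


def add_cosine(hits):
    """Return copies of the hits with a hypothetical 'cosine_similarity' field added."""
    return [
        {**hit, "cosine_similarity": cosine_from_score(hit["_score"])}
        for hit in hits
    ]


if __name__ == "__main__":
    # Assumed shape: the "hits" list taken from a search response body.
    sample_hits = [{"_id": "1234", "_score": 1.017859}]
    for hit in add_cosine(sample_hits):
        print(hit["_id"], hit["cosine_similarity"])
```

Adding a constant does not change the ordering, so sorting by _score and sorting by cosine similarity give the same ranking. Also, as far as I understand it, clauses in filter context do not contribute to the score, so the query in the question should already be ranking the prefiltered documents by the k-NN score alone.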