The docs in my index have the following fields:
{
"weight" : int
"tags" : string[]
}
tags is a list of strings, e.g. ["A", "B", "C", "D"]. Let's assume my index has the following data:
[
{
"weight": 1,
"tags": [
"B",
"C"
]
},
{
"weight": 2,
"tags": [
"A"
]
},
{
"weight": 3,
"tags": [
"B"
]
},
{
"weight": 4,
"tags": [
"A",
"C"
]
},
{
"weight": 5,
"tags": [
"C"
]
}
]
I have a param priority = ["A", "C"]. I want to fetch documents based on the priority list. Since "A" appears first in the list, the docs with tag "A" should appear first in the output. If doc1 and doc2 both have the same tag, then the doc with the greater weight should appear first. So the output should be:
[
{
"weight": 4,
"tags": [
"A",
"C"
]
},
{
"weight": 2,
"tags": [
"A"
]
},
{
"weight": 5,
"tags": [
"C"
]
},
{
"weight": 1,
"tags": [
"B",
"C"
]
}
]
Can we achieve this in Elasticsearch? I have also heard about Painless scripts. How can we use Painless scripts here, if we can?
The first thing you need to know is that the tags indexed in the tags array are not necessarily indexed in the same order as you specify them in the source. Usually, the lexical order prevails, and while it works with simple letters like A, B and C, your real tags might be different and not listed in lexical order. To sum up, you cannot count on the order of the tags list in order to boost certain documents relative to others.

Similarly, if you were to specify a terms clause in your query to give more importance to A over C (as in priority = ["A", "C"]), ES would not necessarily use that order to execute your query.

The solution I'm giving you below respects the conceptual ordering of your priority, by using a bool/should query, where the first element has a bigger boost factor than the second, the second has a bigger boost factor than the third, etc. In this case, we should boost A over C, so I'm giving documents having tag A a boost of 2 and the ones with tag C a boost of 1. If you had three tags, you would start at 3 instead. This will properly boost the documents as per your desired priorities.

The next part is to account for documents having equal score, and for this we can simply sort by descending weight:
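As a minimal sketch of the query described here, the following Python helper builds the search body from a priority list. It assumes tags is mapped as a keyword field (with default dynamic mapping you would likely need tags.keyword instead), and it wraps each term in constant_score so that a matching tag contributes exactly its boost. Both choices, as well as the helper name build_priority_query, are assumptions made for illustration, not the only way to write it.

```python
import json


def build_priority_query(priority, tag_field="tags", weight_field="weight"):
    """Build a bool/should search body that boosts tags by their position in `priority`.

    Hypothetical helper: the first tag gets boost len(priority), the last gets 1.
    """
    n = len(priority)
    should = [
        {
            "constant_score": {
                # Assumes `tag_field` is a keyword field, so the term matches exactly.
                "filter": {"term": {tag_field: tag}},
                "boost": n - i,
            }
        }
        for i, tag in enumerate(priority)
    ]
    return {
        # With only should clauses, a document must match at least one of them.
        "query": {"bool": {"should": should}},
        # Highest score first; ties are broken by descending weight.
        "sort": [
            {"_score": {"order": "desc"}},
            {weight_field: {"order": "desc"}},
        ],
    }


body = build_priority_query(["A", "C"])
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a _search request against your index.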
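The above query, when executed over your sample set of documents, would yield the results you expect: the documents with weights 4, 2, 5 and 1, in that order. The document with weight 3 is left out because its only tag, B, is not in the priority list.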
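To check the ordering without a cluster, the scoring can be mimicked locally in plain Python. This is only a simulation under one assumption: each matching priority tag adds exactly its boost to the score (a plain term query in Elasticsearch would also factor in term statistics). The function name rank_locally is hypothetical.

```python
def rank_locally(docs, priority):
    """Mimic boost-by-priority scoring, then break ties by descending weight."""
    n = len(priority)
    # First tag in the priority list gets the largest boost, last one gets 1.
    boosts = {tag: n - i for i, tag in enumerate(priority)}
    scored = []
    for doc in docs:
        score = sum(boosts.get(tag, 0) for tag in set(doc["tags"]))
        if score > 0:  # documents matching no priority tag are not returned
            scored.append((score, doc))
    # Sort by score descending, then by weight descending.
    scored.sort(key=lambda pair: (pair[0], pair[1]["weight"]), reverse=True)
    return [doc for _, doc in scored]


docs = [
    {"weight": 1, "tags": ["B", "C"]},
    {"weight": 2, "tags": ["A"]},
    {"weight": 3, "tags": ["B"]},
    {"weight": 4, "tags": ["A", "C"]},
    {"weight": 5, "tags": ["C"]},
]

ranked = rank_locally(docs, ["A", "C"])
print([d["weight"] for d in ranked])  # [4, 2, 5, 1]
```

One thing to keep in mind with additive boosts: with three or more priority tags, a document carrying several low-priority tags can tie with one carrying a single high-priority tag (for priority ["A", "B", "C"], B plus C scores 3, the same as A alone), so the boost values may need to be spaced further apart if strict priority matters.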