I'm building a cross-fields search with Elasticsearch 6.2, and I can't figure out how to handle partial matches for my search term.
My query:
{
  "index": "course",
  "type": "course",
  "body": {
    "query": {
      "bool": {
        "must": {
          "multi_match": {
            "query": "macroeconomics",
            "fields": [
              "course_name",
              "course_number",
              "university_name"
            ],
            "type": "cross_fields"
          }
        }
      }
    },
    "sort": [
      {
        "_score": "desc"
      },
      {
        "students": {
          "order": "desc"
        }
      }
    ],
    "from": 0,
    "size": 50
  }
}
The query returns decent results that exactly match the macroeconomics search term in the cross-fields mode.
The problem is that as soon as I change the search term to macro, I get a few results only for the macro term (exact matches), while my expected results would include:
- any results for the macro term (as an exact match), plus
- any results for the macro term (as a partial match), e.g. in "macroeconomics"
I'm aware that wildcard queries are performance-heavy, so I'd rather avoid them.
How do I adjust my query to get the expected results described above? I don't want to treat "macro" only as a prefix, but as a substring that may occur anywhere in other results.
Basically you will need to create a custom analyzer. For reference please check the link
If you just want to give it a go, set up the NGram tokenizer by declaring it as follows:
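As a minimal sketch, the index settings could look something like the body below, sent when creating the index. The tokenizer name my_tokenizer, the gram sizes of 3, and the lowercase filter are assumptions chosen for illustration, not required values; adjust min_gram and max_gram to your data.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "my_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 3,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  }
}
```

With both gram sizes set to 3, every word is indexed as overlapping trigrams, so "macroeconomics" yields "mac", "acr", "cro" and so on. Smaller grams match more loosely and larger ranges grow the index, so the sizes are a trade-off.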
"my_analyzer" is the analyzer’s name that we will use for the ngram field.
Then, for your mappings, you need to map the analyzer to the field:
Just add the analyzer to each field that you want to search this way.
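For example, assuming the index has a single mapping type named course (as in the question) and that all three searched fields are text fields, the mappings part of the index-creation body might look like this:

```json
{
  "mappings": {
    "course": {
      "properties": {
        "course_name": {
          "type": "text",
          "analyzer": "my_analyzer"
        },
        "course_number": {
          "type": "text",
          "analyzer": "my_analyzer"
        },
        "university_name": {
          "type": "text",
          "analyzer": "my_analyzer"
        }
      }
    }
  }
}
```

The analyzer of an existing field cannot be changed in place, so you will usually have to create a new index with the analyzer and these mappings and then reindex your documents. Unless a separate search_analyzer is set, the same analyzer is also applied to the query text, so the multi_match query from the question can stay as it is. If the analyzer produces trigrams, "macro" becomes "mac", "acr" and "cro", all of which are also indexed for "macroeconomics".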
UPDATE: Validate your analyzer
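One way to check it is the _analyze API. As a sketch, assuming the index is called course, you could send the following body to the course/_analyze endpoint and inspect the tokens that come back:

```json
{
  "analyzer": "my_analyzer",
  "text": "macroeconomics"
}
```

If the analyzer emits lowercase trigrams, the response should list tokens such as "mac", "acr" and "cro" for this input. If you only get the whole word back as a single token, the analyzer is not the one being applied.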
The other one I have seen a lot is,
But it all depends on your use case.