We have an Elasticsearch index with around 150 GB of primary data and roughly 120 million unique records, split across 5 shards. We are on Elasticsearch 6.3.2. We run the following query against various fields in the index:
{
  "from": 0,
  "size": 2000,
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "phoneNumber": {
              "query": "9496073027",
              "_name": "PHONEorMOBILE"
            }
          }
        },
        {
          "multi_match": {
            "query": "7fd3dd20c0a3c59c06eee38a94ca4",
            "fields": [
              "field1",
              "field2",
              "field3",
              "field4",
              "field5",
              "field6"
            ],
            "_name": "FIELD_COMB_NUM1"
          }
        },
        {
          "multi_match": {
            "query": "38dc80296cba834eb76ef6eee38a",
            "fields": [
              "field1",
              "field2",
              "field3",
              "field4",
              "field5",
              "field6"
            ],
            "_name": "FIELD_COMB_NUM2"
          }
        },
        {
          "bool": {
            "must": [
              {
                "match": {
                  "name": "Antony Chaplin"
                }
              },
              {
                "match": {
                  "dob": "1993-08-15"
                }
              },
              {
                "match": {
                  "pinCode": "682024"
                }
              }
            ],
            "_name": "NDP"
          }
        }
      ]
    }
  }
}
We have observed that this query takes a lot of time, in the range of 15 seconds, and that it also impacts other search queries running on the cluster.
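To narrow down where those 15 seconds go, one option is Elasticsearch's Profile API: sending the same body with "profile": true makes ES return per-shard, per-clause timing breakdowns. A minimal sketch in Python (only the first clause is reproduced for brevity; actually POSTing the body to the cluster is left out):

```python
import json

# Same query as above, with profiling switched on. Only the phoneNumber
# clause is shown here; the two multi_match clauses and the NDP bool
# clause would be appended to the "should" list unchanged.
query = {
    "profile": True,  # ES 6.x returns per-shard, per-clause timings
    "from": 0,
    "size": 2000,
    "query": {
        "bool": {
            "should": [
                {
                    "match": {
                        "phoneNumber": {
                            "query": "9496073027",
                            "_name": "PHONEorMOBILE",
                        }
                    }
                },
                # ... remaining should clauses go here ...
            ]
        }
    },
}

body = json.dumps(query)  # POST this to /<index>/_search
```

The "profile" section of the response shows which named clause (e.g. FIELD_COMB_NUM1 vs. NDP) dominates the time, which is usually the first thing to check before restructuring the query.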
Any suggestions for improving this query would be highly appreciated.
I am thankful for your responses so far. We got in touch with Elastic's experts through a licensed account, and after gathering some statistics they recommended reducing the heap of the ES cluster to half of the actual RAM on the ES server.
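For context, the standard guidance behind that recommendation is that the JVM heap should be at most half of physical RAM (leaving the rest to the OS filesystem cache that Lucene relies on), and below roughly 32 GB so compressed object pointers stay enabled. A sketch of the relevant config/jvm.options lines, assuming a 64 GB server:

```
# config/jvm.options sketch (assuming a 64 GB server)
# Equal min/max heap avoids resize pauses; ~31 GB keeps compressed oops on,
# and the remaining RAM stays available as filesystem cache.
-Xms31g
-Xmx31g
```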
We have also moved some of the historical data (which is not used at all in searching) to a different index and then closed that index, so searches now run against a much smaller subset. The original index now has fewer documents, around 10 million instead of the earlier 120 million.
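The archiving step above can be sketched with the Reindex API followed by a close: copy the old documents into an archive index, then close that index so it costs nothing at search time. A sketch in Python; the index names and the "createdAt" date field are placeholders for this illustration, not from our real mapping:

```python
def archive_reindex_body(source_index, dest_index, cutoff_date):
    """Build a Reindex API body that copies only documents older than cutoff_date."""
    return {
        "source": {
            "index": source_index,
            # "createdAt" is a placeholder field name for this sketch
            "query": {"range": {"createdAt": {"lt": cutoff_date}}},
        },
        "dest": {"index": dest_index},
    }

# With the elasticsearch-py client (if installed), roughly:
# from elasticsearch import Elasticsearch
# es = Elasticsearch(["http://localhost:9200"])
# es.reindex(body=archive_reindex_body("records", "records-archive", "2017-01-01"),
#            wait_for_completion=False)
# es.indices.close(index="records-archive")  # a closed index adds no search overhead
```

After verifying the copy, the old documents still need a delete-by-query (or a full reindex of the live data) on the source index so that its document count actually drops.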