Elasticsearch weird sort results


I have some data stored in Elasticsearch with Tire. Assume the data has two fields: customer_name and amount.

Now when I sort by amount, everything is OK. But when I sort by name, the results are unexpected.

These are the results of sorting by name desc:

Lukas Marcus
Visser
Visser
Meik Kalte
Meik Kalte
Kalte Meik
Meik Kalte
Meik Kalte
Cust Imp Mc
Cust Imp Mc
Cust Imp Mc
John Doe
John Doe
Born Joan
Born Joan
Born Joan
Card Image 
Card Image
Card Image 
Aelps_Iso 
Aelps_Iso 

Sorting asc:

Visser
Visser
Cust Imp Mc 
Card Image
Card Image
Cust Imp Mc
Cust Imp Mc
Aelps_Iso
Aelps_Iso
Born Joan
Born Joan
Born Joan
Card Image
John Doe
John Doe
Meik Kalte
Meik Kalte
Kalte Meik
Meik Kalte
Meik Kalte
Lukas Marcus

Note that 'Visser' is on top in both cases.

Query params:

@filters=[{:term=>{"user_id"=>605}}],
@sort=[{"customer_name"=>{:order=>"asc"}}]

Any hints?


I think the problem is that the string field is analyzed; an analyzed string field cannot be sorted on meaningfully. From the documentation:

When sorting, the relevant sorted field values are loaded into memory. This means that per shard, there should be enough memory to contain them. For string based types, the field sorted on should not be analyzed / tokenized. For numeric types, if possible, it is recommended to explicitly set the type to narrower types (like short, integer and float).

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-sort.html#_memory_considerations
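As a rough illustration of that advice, the mapping and sort clause might look something like the following, written as plain Ruby hashes that mirror the JSON Elasticsearch expects. This is only a sketch, assuming the string-typed mappings of the Elasticsearch versions this documentation describes; the type name `order` is hypothetical, and existing documents would have to be reindexed after the mapping changes.

```ruby
require 'json'

# Hypothetical mapping: "order" is an assumed type name, not taken from the question.
# "index" => "not_analyzed" keeps each name as a single term, so the sort
# compares whole names instead of the individual words inside them.
mapping = {
  "order" => {
    "properties" => {
      "customer_name" => { "type" => "string", "index" => "not_analyzed" }
    }
  }
}

# Sort clause equivalent to the @sort parameter shown in the question.
search_body = {
  "sort" => [{ "customer_name" => { "order" => "asc" } }]
}

puts JSON.pretty_generate(mapping)
puts JSON.pretty_generate(search_body)
```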


I don't know why everybody recommends setting a field as not_analyzed for sorting by name. A not_analyzed field sorts lexicographically on the raw value, so every uppercase letter sorts before every lowercase one, when you most likely want it sorted alphabetically with case ignored.
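To make the difference concrete, here is a small plain-Ruby comparison of the two orderings. It does not touch Elasticsearch at all; the mixed-case names are made up for illustration, since the names in the question all start with a capital letter.

```ruby
# Hypothetical names with mixed casing (not the exact data from the question).
names = ["Visser", "john doe", "Born Joan", "card image"]

# Lexicographic (byte-wise) order: all uppercase letters come before lowercase.
lexicographic = names.sort

# Case-insensitive order: compare on a lowercased copy of each name.
case_insensitive = names.sort_by(&:downcase)

puts lexicographic.inspect     # ["Born Joan", "Visser", "card image", "john doe"]
puts case_insensitive.inspect  # ["Born Joan", "card image", "john doe", "Visser"]
```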

What you really want is an analyzer with a keyword tokenizer and a lowercase filter.

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/sorting-collations.html
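A sketch of what such index settings could look like, again as a plain Ruby hash mirroring the JSON body. The analyzer name `case_insensitive_sort` and the type name `order` are placeholders I picked, and the string-typed mapping assumes an Elasticsearch version from the same era as the linked guide. The small `simulate_analyzer` helper is purely illustrative: it only mimics what a keyword tokenizer plus lowercase filter would emit, namely the whole value as one lowercased term.

```ruby
require 'json'

# Index settings plus mapping; names are assumptions, see the note above.
index_body = {
  "settings" => {
    "analysis" => {
      "analyzer" => {
        "case_insensitive_sort" => {
          "type"      => "custom",
          "tokenizer" => "keyword",     # emit the whole value as a single token
          "filter"    => ["lowercase"]  # then lowercase that token
        }
      }
    }
  },
  "mappings" => {
    "order" => {
      "properties" => {
        "customer_name" => { "type" => "string", "analyzer" => "case_insensitive_sort" }
      }
    }
  }
}

# Illustrative stand-in for the analyzer: one lowercased term per value.
def simulate_analyzer(value)
  [value.downcase]
end

puts JSON.pretty_generate(index_body)
puts simulate_analyzer("Meik Kalte").inspect  # ["meik kalte"]
```

With this in place the sort would compare "meik kalte" as a single term, rather than "meik" and "kalte" separately.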