Efficient way to get field counts for multipart query


I have a Solr collection with fields that I’ll call field_A and field_B (both string fields) and field_C and field_D (both numeric fields). field_A and field_B have a many-to-one relationship: field_A values are unique, but multiple field_A values can share the same field_B value.

Suppose I want to express the condition WHERE field_C > 0 AND field_D < 100. For each subcondition, I want the number of matching field_A values and the number of distinct field_B values among those matches, something like:

field_C > 0

  • Number of field_A matches
  • Number of unique field_B matches

field_D < 100

  • Number of field_A matches
  • Number of unique field_B matches

Through looking at the Solr docs and other StackOverflow posts, I’ve found several methods of doing so:

  1. A facet.query for each subcondition, plus facet.field=field_B.
  2. A group.query for each subcondition, plus facet.field=field_B.
  3. A facet or group query combined with stats.field={!count=true calcDistinct=true}field_B and stats.field={!count=true}field_A.
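
To make the first option concrete, here is a minimal sketch of the request parameters I have in mind, built in Python. The helper name build_facet_params, the q=*:* main query, and the facet.limit/facet.mincount settings are my own assumptions for illustration; the range syntax uses Solr's exclusive bounds ({0 TO *] for > 0 and [* TO 100} for < 100).

```python
from urllib.parse import urlencode


def build_facet_params(subconditions, distinct_field):
    """Build Solr query parameters for option 1 (hypothetical helper).

    Each subcondition becomes its own facet.query, so Solr reports one
    document count per subcondition. distinct_field is faceted so that
    its buckets can be inspected in the same response.
    """
    params = [
        ("q", "*:*"),       # assumption: count against the whole collection
        ("rows", "0"),      # only counts are needed, not documents
        ("facet", "true"),
    ]
    for query in subconditions:
        params.append(("facet.query", query))
    params.append(("facet.field", distinct_field))
    params.append(("facet.limit", "-1"))     # return every bucket
    params.append(("facet.mincount", "1"))   # skip values with no matches
    return params


subconditions = [
    "field_C:{0 TO *]",    # field_C > 0 (exclusive lower bound)
    "field_D:[* TO 100}",  # field_D < 100 (exclusive upper bound)
]
params = build_facet_params(subconditions, "field_B")
query_string = urlencode(params)
print(query_string)
```

The resulting string would be appended to the collection's /select handler. Note that facet.field here is computed over the main query, not per facet.query, which is part of why I am unsure this option gives the per-subcondition field_B numbers directly.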

I’ve heard calcDistinct can be very costly for fields with high cardinality, which is the case for my field_B (around 10k unique values in the collection). Would the other two approaches be more efficient?
