I have a Solr collection with fields that I’ll call field_A, field_B (both string fields), field_C and field_D (both numeric fields). field_A and field_B have a many-to-one relationship: field_A values are unique, but several field_A values can share the same field_B value.
Suppose I want to express the condition WHERE field_C > 0 AND field_D < 100. For each subcondition, I want the number of matching field_A values and the number of unique matching field_B values, something like:
field_C > 0
- Number of field_A matches
- Number of unique field_B matches
field_D < 100
- Number of field_A matches
- Number of unique field_B matches
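To make the desired output concrete, here is a small Python sketch that computes the same report in memory. The four documents and their values are made up purely for illustration; only the field names and the two subconditions come from my actual setup.

```python
# Hypothetical toy documents; the values are invented for illustration only.
docs = [
    {"field_A": "a1", "field_B": "b1", "field_C": 5, "field_D": 50},
    {"field_A": "a2", "field_B": "b1", "field_C": 0, "field_D": 150},
    {"field_A": "a3", "field_B": "b2", "field_C": 3, "field_D": 20},
    {"field_A": "a4", "field_B": "b2", "field_C": 7, "field_D": 200},
]

# Each subcondition is evaluated on its own, not as the combined AND.
subconditions = {
    "field_C > 0": lambda d: d["field_C"] > 0,
    "field_D < 100": lambda d: d["field_D"] < 100,
}


def summarize(documents, predicate):
    """Count matching field_A values and distinct field_B values."""
    matches = [d for d in documents if predicate(d)]
    return {
        # field_A is unique per document, so this equals the match count.
        "field_A_matches": len(matches),
        "unique_field_B_matches": len({d["field_B"] for d in matches}),
    }


report = {label: summarize(docs, pred) for label, pred in subconditions.items()}
for label, counts in report.items():
    print(label, counts)
```

With this toy data, field_C > 0 matches three field_A values across two field_B values, and field_D < 100 matches two field_A values across two field_B values.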
From the Solr docs and other Stack Overflow posts, I’ve found several ways to do this:
- facet.query for each subcondition and facet.field=field_B.
- group.query for each subcondition and facet.field=field_B.
- A facet or group query combined with stats.field={!count=true calcDistinct=true}field_B and stats.field={!count=true}field_A.
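For reference, this is roughly how I would build the request for the stats-based variant, as a minimal Python sketch. Several things here are my own assumptions rather than anything confirmed: the base URL and collection name are hypothetical, I issue one request per subcondition and scope it with fq, and I write the subconditions in Solr range syntax with exclusive bounds. The stats.field values are copied verbatim from the list above.

```python
from urllib.parse import urlencode

# Assumption: exclusive range bounds express "> 0" and "< 100".
SUBCONDITIONS = {
    "field_C > 0": "field_C:{0 TO *]",
    "field_D < 100": "field_D:[* TO 100}",
}


def stats_params(subcondition_fq):
    """Parameters for one request scoped to a single subcondition."""
    return [
        ("q", "*:*"),
        ("fq", subcondition_fq),   # assumption: scope the stats via fq
        ("rows", "0"),             # only the counts are needed, not documents
        ("stats", "true"),
        ("stats.field", "{!count=true calcDistinct=true}field_B"),
        ("stats.field", "{!count=true}field_A"),
    ]


def select_url(base_url, params):
    """URL-encode the parameters onto the /select handler."""
    return base_url + "/select?" + urlencode(params)


# Hypothetical host and collection name.
BASE_URL = "http://localhost:8983/solr/my_collection"

for label, fq in SUBCONDITIONS.items():
    print(label, "->", select_url(BASE_URL, stats_params(fq)))
```

This only builds the URLs and does not send them, so it says nothing about how the three approaches compare in cost.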
I’ve heard calcDistinct can be very costly for fields with high cardinality, which is the case for my field_B (around 10k unique values in the collection). Would the other two approaches be more efficient?