We have an index of e-books in which each book's metadata and each of its pages are stored as separate documents:
- Book 1
  - Page 1
  - Page 2
  - Page 3
  - ...
  - Page n
The parent document contains the field is_parent:true, an id and a doc_id. For parent documents, the id is the same as the doc_id; for child documents, the id is {doc_id}_{page_number}. The doc_id is always the same for a parent and its children (that's how we know which child belongs to which parent).
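To make the id scheme concrete, here is a minimal Python sketch of how one book and its pages would look as documents. The field names id, doc_id, is_parent, title, page_number and content come from our schema; the helper name build_docs and the sample values ("book1", the page texts) are made up for illustration, and the real documents carry more metadata fields than shown here.

```python
def build_docs(doc_id, title, pages):
    """Return the parent document followed by one child document per page."""
    # Parent: id equals doc_id and is flagged with is_parent.
    docs = [{"id": doc_id, "doc_id": doc_id, "is_parent": True, "title": title}]
    # Children: id is {doc_id}_{page_number}; doc_id links back to the parent.
    for page_number, content in enumerate(pages, start=1):
        docs.append({
            "id": f"{doc_id}_{page_number}",
            "doc_id": doc_id,
            "page_number": page_number,
            "content": content,
        })
    return docs


# Hypothetical sample data, only to show the resulting ids.
docs = build_docs("book1", "Example title", ["text of page 1", "text of page 2"])
for doc in docs:
    print(doc["id"], doc["doc_id"])
```

The parent comes first in this sketch only for readability; it is not meant to say anything about how the documents are laid out in the index.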
So the challenge now is to offer a full-text search that uses eDisMax to find the best results for a search term across the parent's metadata and the children's content, and that combines both into a single score value.
The current query looks like this:
'fl' => 'id,doc_id,main_type,title,subtitle,publishing_date,page_number,author,score',
'defType' => 'edismax',
'stopwords' => 'true',
'mm' => '1',
'qf' => 'id^10.0 title^50.0 subtitle^40.0 author^20.0',
'q' => 'TEST _query_:"{!parent which=is_parent:true score=Max}{!dismax qf=\'content\' v=\'TEST\'}"^20',
'sort' => 'score DESC'
But it seems the score of the _query_ clause is not added to the score of the main query. However, if I run this without the leading "TEST" term: 'q' => '_query_:"{!parent which=is_parent:true score=Max}{!dismax qf=\'content\' v=\'TEST\'}"^20', I do get a score value. So the score is there somehow, but I'm not sure how to use it properly.
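To make the difference between the two variants explicit, here is a small Python sketch that assembles both parameter sets from the same search term. The parameter values are the ones from our query above; the function names child_clause and build_params are only illustrative, and the sketch merely builds and prints the parameters without sending anything to Solr.

```python
from urllib.parse import urlencode


def child_clause(term, boost=20):
    """Block join clause that scores a parent by its best-matching page."""
    return ('_query_:"{!parent which=is_parent:true score=Max}'
            "{!dismax qf='content' v='%s'}\"^%d" % (term, boost))


def build_params(term, with_main_term=True):
    """Build the request parameters, with or without the leading main term."""
    clause = child_clause(term)
    q = f"{term} {clause}" if with_main_term else clause
    return {
        "fl": "id,doc_id,main_type,title,subtitle,publishing_date,"
              "page_number,author,score",
        "defType": "edismax",
        "stopwords": "true",
        "mm": "1",
        "qf": "id^10.0 title^50.0 subtitle^40.0 author^20.0",
        "q": q,
        "sort": "score DESC",
    }


combined = build_params("TEST")                          # variant 1: term + child clause
child_only = build_params("TEST", with_main_term=False)  # variant 2: child clause only

print(combined["q"])
print(child_only["q"])
print(urlencode(combined))  # the query string as it would be sent
```

The only difference between the two requests is the leading term in q; everything else, including qf and mm, is identical.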
Is there another option to do a nested search with combined scores?
Before the question comes up why we did not use Solr's nested documents feature: we created this index years ago with Solr 4.6, and the feature didn't exist at that time. We have since used the IndexUpgrader tools of Solr 5 and Solr 6 to upgrade our index to Solr 6.
We weren't able to build our full-text search in the past because Solr's block join feature did not support score calculation at all, but that changed in Solr 5, so we want to give it another try. Any hint in the right direction would be helpful.