Here is our MongoDB Atlas Search index definition:

{
  "mappings": {
    "dynamic": true,
    "fields": {
      "label": {
        "maxGrams": 7,
        "tokenization": "nGram",
        "type": "autocomplete"
      },
      "ptgc": {
        "type": "string"
      },
      "sort": {
        "type": "number"
      }
    }
  },
  "storedSource": true
}

I'm not sure the Node.js route is needed for this question, but here is what we are using in case it helps:

router.get('/live-search/type/:type/text/:text', function (req, res) { // atlas search    
    try {
        let type = req.params.type; // one of 'all', 'team', 'player', 'conference'
        let text = req.params.text;

        // define the atlas search terms
        let filterQuery = type === 'all' ? ['conference', 'player', 'team'] : type;
        let searchTerms = {
            index: 'app_search_index',
            returnStoredSource: true,
            'compound': {
                filter: {
                    text: {
                        "query": filterQuery,
                        "path": "type"
                    }
                },
                must: {
                    autocomplete: {
                        query: `${text}`,
                        path: 'label'
                    }
                }
            }
        };

        // define the pipeline stages: the search above, then add the score and sort
        let queryFilters = [
            { $search: searchTerms },
            { $addFields: { score: { $meta: "searchScore" } } },
            { $sort: { sort: 1 } } // "sort" is 1: conference, 2: team, 3: player based on "type"
        ];

        // run aggregate pipeline and return
        db.our_table
            .aggregate(queryFilters)
            .then(data => {
                console.log(data.length);
                console.log(data);
                res.json(data); 
            })
            .catch(err => res.status(400).json('Error: ' + err));
    } catch (error) {
        console.log('error: ', error);
        res.status(500).json({ statusCode: 500, message: error.message });
    }
});

In the collection we are searching, consider three documents with the labels Cameron Brink: Stanford, Cameron Boozer: AUM, and Haley Cameron: Eastern Ill. All three are type: player, so the filtering and sorting by type do not come into play in this example.

We are searching for the phrase cameron bri. In the results, both Cameron Brink: Stanford and Cameron Boozer: AUM get a search score of 5.1432, whereas Haley Cameron: Eastern Ill. gets a score of 5.22797.

We need (or at least strongly prefer) this search to return a higher score for Cameron Brink: Stanford than for the other two options. Logically, I believe that cameron bri is a closer match to Cameron Brink: Stanford than it is to the other two, but our index does not agree. It seems that the search gives no weight to the bri part of the query.

How can we modify our MongoDB Atlas Search index definition to better capture that Cameron Brink: Stanford is a better match for cameron bri than the other two options?

There is 1 solution below.

Canovice On

Looks like I found a workable solution with the help of the question "Multiple documents having equal search score in MongoDB Atlas Search". In particular, its suggestion to define the label as both autocomplete and text in the index definition seems to be helping.
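The updated index definition is not shown in this answer. As a rough sketch only, assuming the rest of the original index stays unchanged, mapping label to both types could look like the following. Atlas Search accepts an array of type definitions for a single field, and the text operator needs the string type on the path it queries. This shape has not been verified to reproduce the scores quoted below.

```json
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "label": [
        {
          "type": "autocomplete",
          "tokenization": "nGram",
          "maxGrams": 7
        },
        {
          "type": "string"
        }
      ],
      "ptgc": {
        "type": "string"
      },
      "sort": {
        "type": "number"
      }
    }
  },
  "storedSource": true
}
```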

The compound part of the search in the route has been updated to:

'compound': {
    filter: {
        text: {
            query: filterQuery,
            path: "ptgc"
        }
    },
    should: {
        text: {
            query: `${text}`,
            path: 'label',
            score: { boost: { value: 1 } } 
        }
    },
    must: {
        autocomplete: {
            query: `${text}`,
            path: 'label'
        }
    }
}

This update to the compound search differs from the recommendation there that both operators be should clauses, as I have left the autocomplete as a must clause.
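To make the must/should split easier to experiment with, here is a minimal sketch of a helper that builds the same compound clause. The function name buildCompound and the textBoost parameter are hypothetical, not part of the original route. Note that a boost value of 1 multiplies the clause's score by 1 and so leaves it unchanged; a larger value would weight the text match more heavily. Clauses are written as arrays here, which is the form the Atlas Search documentation shows.

```javascript
// Hypothetical helper (not from the original route): builds the compound
// clause with a configurable boost on the optional text match.
function buildCompound(text, filterQuery, textBoost = 1) {
    return {
        // filter: restricts the candidates without contributing to the score
        filter: [{ text: { query: filterQuery, path: 'ptgc' } }],
        // must: every result has to match the autocomplete clause
        must: [{ autocomplete: { query: text, path: 'label' } }],
        // should: optional; documents that also match it score higher
        should: [{
            text: {
                query: text,
                path: 'label',
                score: { boost: { value: textBoost } }
            }
        }]
    };
}

// Example: double the weight of the text match (the value 2 is an arbitrary choice)
const compound = buildCompound('cameron bri', ['conference', 'player', 'team'], 2);
console.log(JSON.stringify(compound, null, 2));
```

The result can be spread into the search definition, for example { index: 'app_search_index', returnStoredSource: true, compound }.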

Cameron Brink: Stanford gets a score of 8.795, second only to the option Cameron Williams: Bridgewater (VA) at 8.82, which is okay and makes sense given the match of bri to Bridgewater for this option.

Cameron Boozer: AUM keeps its previous score of 5.1432, and Haley Cameron: Eastern Ill. still scores 5.22797.
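One more thing that may matter: the route in the question sorts only on the sort field, so documents of the same type are not guaranteed to come back in score order. A minimal sketch, assuming the score should break ties within a type, adds it as a second sort key. The helper name buildPipeline is hypothetical.

```javascript
// Hypothetical helper: wraps a $search definition in the pipeline stages
// used by the route, with the score added as a secondary sort key.
function buildPipeline(searchTerms) {
    return [
        { $search: searchTerms },
        { $addFields: { score: { $meta: 'searchScore' } } },
        // type rank ascending first, then highest score first within each rank
        { $sort: { sort: 1, score: -1 } }
    ];
}

const pipeline = buildPipeline({ index: 'app_search_index' });
console.log(JSON.stringify(pipeline, null, 2));
```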