mongodb not using index unless field is specified in $match

62 Views Asked by At

compound index

{
  A: 1,
  B: 1
}

In this query, the compound index is not used:

db.user.aggregate([
  { $match: {
      B: { $gt: 100 }
  } },
  {
    $sort: {
      A: 1
    }
  },
  {
    $limit: 1000
  }
])

I need do this in order to get it to use the index:

db.user.aggregate([
  { $match: {
      A: { $exists: true }
      B: { $gt: 100 }
  } },
  {
    $sort: {
      A: 1
    }
  },
  {
    $limit: 1000
  }
])

Adding A: { $exists: true } is unnecessary because all documents will have this field. I thought that mongodb was smart enough to use an index to sort?

2

There are 2 best solutions below

2
user20042973 On BEST ANSWER

The claim as written seems incorrect or incomplete. When I attempt to reproduce this on version 6.0.1 I see the index being used* as opposed to a collection scan:

> db.version()
6.0.1
> db.user.createIndex({
...           "A": 1,
...           "B": 1
...         })
A_1_B_1
> db.user.aggregate([
...   {
.....     $match: {
.......       B: {
.........         $gt: 100
.........       }
.......     }
.....   },
...   {
.....     $sort: {
.......       A: 1
.......     }
.....   },
...   {
.....     $limit: 1000
.....   }
... ]).explain().queryPlanner.winningPlan
{
  stage: 'LIMIT',
  limitAmount: 1000,
  inputStage: {
    stage: 'FETCH',
    filter: { B: { '$gt': 100 } },
    inputStage: {
      stage: 'IXSCAN',
      keyPattern: { A: 1, B: 1 },
      indexName: 'A_1_B_1',
      isMultiKey: false,
      multiKeyPaths: { A: [], B: [] },
      isUnique: false,
      isSparse: false,
      isPartial: false,
      indexVersion: 2,
      direction: 'forward',
      indexBounds: { A: [ '[MinKey, MaxKey]' ], B: [ '[MinKey, MaxKey]' ] }
    }
  }
}

We can similarly see the index used in this mongoplayground demonstration.

So more specific details about your situation would be required in order to advise further on the question as written. Is collation involved perhaps? But the general answer to your question is that MongoDB can (and does) use the index in the situation that was described which aligns with the documentation referenced in the other answerv

*Notably the index is not used as efficiently as it could be, and so it is probably separately a good idea to include the $exists clause to improve performance anyway. It just shouldn't strictly be required in order to get the index to be used in the first place.


Edit

Based on your comment, it sounds like the corrected version of your problem statement and question is:

MongoDB uses the index with the aggregation as written, but it does not use it as efficiently as compared to when the A: { $exists: true } clause is added. Why is that?

This is related to the note at the end of my original answer. It appears to be a deficiency in the current querying system, specifically the one tracked here. Adding the $exists clause (or one logically equivalent) to the preceding field in the index is effectively the workaround.

2
Mads Hougesen On

The order of the index fields matter. Indexes are read left to right.

The order of the indexed fields impacts the effectiveness of a compound index. Compound indexes contain references to documents according to the order of the fields in the index. To create efficient compound indexes, follow the ESR (Equality, Sort, Range) rule.

Taken from MongoDB compound index documentation on field order.

$sort will use the index if is used in the first stage, or if the $sort stage is only preceded by a $match stage (documentation).

Creating a new index with the following order should fix your problem:

{
  "B": 1,
  "A": 1
}

This is of course assuming that your index is not already sorted correctly.