For example, I have documents like this in the collection:
    {
        "key": "key1",
        "time": 1000,
        "values": [] // this one is optional
    }
I need to update the collection from, say, a CSV file by modifying or removing the `values` column, where `key` and `time` are the filters.
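For concreteness, a minimal sketch of that CSV-to-filter mapping as a field-level update (rather than the full-document replacement tried below), assuming the pymongo driver and hypothetical CSV column, file, and collection names:

```python
import csv
from pymongo import MongoClient, UpdateOne

client = MongoClient()                     # assumes a local MongoDB instance
coll = client["test"]["docs"]              # hypothetical database/collection names

ops = []
with open("update.csv", newline="") as f:  # hypothetical CSV file
    for row in csv.DictReader(f):
        # key & time identify the target document (they form the unique index)
        flt = {"key": row["key"], "time": int(row["time"])}
        if row["values"]:
            # modify the optional values field
            ops.append(UpdateOne(flt, {"$set": {"values": row["values"].split(";")}}))
        else:
            # remove the optional values field
            ops.append(UpdateOne(flt, {"$unset": {"values": ""}}))
```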
What I've tried so far:

- DeleteMany with `or(and(key: key1), and(time: time2))` (plus ~276k more `or` arguments) + InsertMany with 276k documents => ~90 seconds
- Bulk ReplaceOne with `filter: and(key: key1, time: time2)` (see the sketch below) => ~40 seconds
- Splitting the huge bulk into several smaller batches (7,500 seems to be the most performant), but this one is not atomic as a single db operation => ~35 seconds
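For reference, the second attempt roughly corresponds to a single bulk call like this sketch, assuming pymongo and a hypothetical `new_docs` list of 276k replacement documents built from the CSV:

```python
from pymongo import ReplaceOne

requests = [
    ReplaceOne({"key": d["key"], "time": d["time"]}, d)  # filter on the unique key, replace the whole doc
    for d in new_docs                                    # new_docs: replacement documents from the CSV
]
result = coll.bulk_write(requests, ordered=False)        # coll as in the sketch above
print(result.matched_count, result.modified_count)
```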
Notes:

- All tests were run with `bulk.ordered = false` to improve performance.
- There is a unique index on `key: 1, time: -1`.
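The unique index from the second note, expressed in pymongo for completeness:

```python
# compound unique index: ascending key, descending time
coll.create_index([("key", 1), ("time", -1)], unique=True)
```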
Is there a way to optimize this kind of request? I know Mongo can burst to ~80k inserts/s, but what about replacements?
Bulk operations are not atomic as a group; only the individual operations are atomic. Note also that the driver automatically splits bulk operations into smaller batches if you submit more than a certain number (1,000 when encryption is not used), which is why huge batches tend to perform worse than batches of under one thousand.
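If you want to control the batch boundaries yourself instead of relying on the driver's automatic split, a sketch along these lines (pymongo assumed; `requests` is the full list of write models):

```python
BATCH = 1000  # stay at or under the split threshold mentioned above

for i in range(0, len(requests), BATCH):
    # each call is a separate round trip; only the individual writes are atomic
    coll.bulk_write(requests[i:i + BATCH], ordered=False)
```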
To answer your question on performance:
You are naturally going to see lower performance with SSD or magnetic-disk-backed storage than with in-memory storage; the point of the memory test is to ensure you are using the database as efficiently as possible.
Especially with a mixed read and write workload, switching from a magnetic disk to SSD storage should yield a noticeable performance gain.