Say I have the following MongoDB collection (I am using mongomock for this example so it's easy to reproduce):
import mongomock
collection = mongomock.MongoClient().db.collection
objects = [{'name': 'Alice', 'age': 21}, {'name': 'Bob', 'age': 20}]
collection.insert_many(objects)
I then would like to update my existing objects with the fields from some new objects:
new_objects = [{'name': 'Alice', 'height': 170}, {'name': 'Caroline', 'height': 160}]
The only way I could think of doing this is:
for record in new_objects:
    if collection.find_one({'name': record['name']}) is not None:
        collection.update_one({'name': record['name']}, {'$set': {'height': record['height']}})
    else:
        collection.insert_one(record)
However, if new_objects is very large, then this method becomes slow - is there a way to use update_many for this?
You can't use update_many(), because it requires a single filter, which would not work in your use case as each filter is different.

A simpler construct uses upsert=True to avoid the insert/update logic, and also sets all the fields specified in the record, which is less coding:

If it is slowing down with a larger number of updates, make sure you have an index on the name field using (in mongo shell):

You can squeeze a bit more performance out by using a bulk_write operation. Worked example:
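A minimal sketch of the bulk_write approach, assuming pymongo's UpdateOne with upsert=True and name as the unique key. The helper names build_upsert_specs, bulk_upsert and demo are hypothetical, made up for illustration, and are not part of pymongo or mongomock:

```python
def build_upsert_specs(records, key="name"):
    """Return one (filter, update) pair per record, matching on `key`."""
    return [({key: record[key]}, {"$set": dict(record)}) for record in records]


def bulk_upsert(collection, records, key="name"):
    """Send all upserts in a single bulk_write call; returns its result."""
    # Imported lazily so build_upsert_specs stays usable without pymongo.
    from pymongo import UpdateOne

    operations = [
        UpdateOne(filter_doc, update_doc, upsert=True)
        for filter_doc, update_doc in build_upsert_specs(records, key)
    ]
    if not operations:
        return None  # bulk_write raises InvalidOperation on an empty list
    # ordered=False lets the server continue past a failed operation.
    return collection.bulk_write(operations, ordered=False)


def demo():
    """Run the question's data through bulk_upsert using mongomock."""
    import mongomock

    collection = mongomock.MongoClient().db.collection
    collection.create_index("name")  # index the field used in every filter
    collection.insert_many([{"name": "Alice", "age": 21},
                            {"name": "Bob", "age": 20}])
    bulk_upsert(collection, [{"name": "Alice", "height": 170},
                             {"name": "Caroline", "height": 160}])
    # Exclude _id so the returned documents are easy to compare.
    return list(collection.find({}, {"_id": 0}))
```

Calling demo() needs both mongomock and pymongo installed. Against a real deployment, pass a pymongo collection to bulk_upsert instead. Each record is used whole as the $set document, so every field in the record is written, and the single bulk_write call replaces one round trip per record.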
Gives: