Can't replace mongo document


I am trying to save documents to a MongoDB cluster (sharded replica sets) and am running into a strange issue. I am using pymongo 2.7.2 and TokuMX 1.5 (MongoDB 2.4.10).

When I attempt to save (overwrite) an existing document, I get an exception suggesting that the document I am saving is too large:

doc = db.collection.find_one()
db.collection.save(doc)

pymongo.errors.OperationFailure: BSONObj size: 18798961 (0x71D91E01) is invalid. Size must be between 0 and 16793600(16MB) First element: op: "u"

However this works fine:

doc = db.collection.find_one()
db.collection.remove({'_id': doc['_id']})
db.collection.save(doc)

The document in question is about 9 MB, so it looks like replacing the document somehow doubles its size, which pushes it past the 16 MB limit.

Any ideas as to what could cause this behavior?


There are 2 best solutions below

Accepted answer

Apparently this is a known issue with TokuMX. Oplog entries are twice the size of the document, so replacing a 9 MB document results in an 18 MB oplog entry, which raises the exception.

The solution would be to limit document writes to less than 8 MB so that oplog entries never exceed 16 MB.
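The 8 MB rule above can be checked before each write. The following is a minimal sketch, not a tested fix: the factor of 2 is taken from this answer, the limit is the figure quoted in the error message, and HEADROOM, bson_size and fits_in_oplog are hypothetical names introduced here for illustration.

```python
# Assumption (from the answer above): a TokuMX replace writes an oplog entry
# about twice the size of the document being replaced.
OPLOG_FACTOR = 2

# Upper bound quoted in the error message: "16793600(16MB)".
MAX_BSON_OBJ_SIZE = 16793600

# Hypothetical allowance for the other oplog fields (op type, namespace, ...).
HEADROOM = 4096


def bson_size(doc):
    """Return the BSON-encoded size of doc in bytes.

    Uses the bson package that ships with PyMongo; imported lazily so the
    arithmetic helper below works without it.
    """
    from bson import BSON
    return len(BSON.encode(doc))


def fits_in_oplog(doc_size, factor=OPLOG_FACTOR,
                  limit=MAX_BSON_OBJ_SIZE, headroom=HEADROOM):
    """Return True if a replace of a doc_size-byte document should stay under the limit."""
    return factor * doc_size + headroom <= limit
```

A caller might guard the write with something like `if fits_in_oplog(bson_size(doc)): db.collection.save(doc)` and otherwise split or shrink the document first.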

Another answer

I think this is a side effect of how save is implemented in PyMongo.

Under the hood, if the document has an _id then save(doc) is turned into an update(doc, doc). That is where the doubling comes into play, since the query plus the update is 18 MB.

When you removed the _id you changed the save(doc) into an insert(doc) of a new document with a new _id. I don't think that is what you wanted.

Rather than using save, I would recommend constructing a query with just the _id field from the original document and making the update call manually. I would even go so far as to suggest filing a Jira ticket asking PyMongo to do this for you.

HTH, Rob.
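The manual update suggested in this answer could look roughly like the sketch below. It assumes the PyMongo 2.x Collection.update(spec, document, upsert=False) call (removed in PyMongo 4, where replace_one is the equivalent); replace_by_id is a hypothetical helper name. Whether this avoids the error depends on where the doubling actually happens, so treat it as something to try rather than a confirmed fix.

```python
def replace_by_id(collection, doc):
    """Replace the stored document whose _id matches doc['_id'].

    The query contains only the _id field, so the full document is sent
    once (as the replacement) rather than in both query and update.
    """
    if '_id' not in doc:
        raise KeyError("document has no _id; insert it instead of replacing")

    spec = {'_id': doc['_id']}  # query on the primary key only
    # upsert=False: do not create a new document if none matches.
    return collection.update(spec, doc, upsert=False)
```

With the collection from the question, the call would be `replace_by_id(db.collection, doc)` after `doc = db.collection.find_one()`.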