Iterating over a collection in MongoDB for updates

I'm iterating over a collection (using Moped as the Ruby driver), but how do I update one field in every document?

irb> session = Moped::Session.new(["127.0.0.1:27017"])
irb> session.use :demoapp
irb> users = session[:users]
irb> users.find.each {|u| u.update(age: rand(18..80))}

This doesn't update the field "age", while a simple

irb> users.find.each {|u| users.find(_id: u["_id"]).update(age: rand(18..80))}

does. But it doesn't seem very efficient to iterate over a collection and then look up each document by its id on every iteration. So how could I simplify that? I need a fast way to update millions of documents this way.

Regards, Chris

There is 1 answer below

You're treating Moped as if it were Mongoid. Moped is not an ODM -- it's a low-level MongoDB driver.

When you iterate users.find you get simple Moped::BSON::Document objects, which behave much more like Ruby Hash objects than anything else. So when you call update on one of them, you're only merging the new value into the local in-memory copy (which is what Hash#update does) and not touching the database.
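To make that concrete, here is a minimal sketch that needs no database at all. It assumes the documents Moped yields behave like a plain Ruby Hash, so an ordinary Hash stands in for one; the field values are made up for illustration.

```ruby
# A plain Hash stands in for the document Moped yields (an assumption for
# illustration); no MongoDB connection is involved anywhere in this snippet.
user = { "_id" => 1, "name" => "Chris" }

# Hash#update is an alias for Hash#merge!: it mutates this in-memory object
# and returns it. Nothing is sent to any server.
result = user.update("age" => 42)

puts user.inspect         # the local hash now carries the "age" key
puts result.equal?(user)  # update returned the very same object
```

The call succeeds quietly, which is why the loop in the question appears to run fine while the stored documents stay unchanged.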

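By contrast,

users.find(_id: u["_id"]).update("$set" => {age: rand(18..80)})

is not as bad as you think. Moped compiles this to a single update command -- it doesn't fetch the document, modify it, and then write it back. Note the "$set": a plain hash passed to update is treated as a replacement document, so without it the other fields of each user would be dropped.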
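The per-document update above could be wrapped up roughly as follows. This is only a sketch: randomize_ages is a hypothetical helper name, and users is assumed to be a Moped collection such as session[:users]. It asks for just the _id of each document, so the full documents are not pulled over the network, and it uses "$set" so that fields other than age are left alone.

```ruby
# Hypothetical helper, not part of Moped. `users` is assumed to be a Moped
# collection (e.g. session[:users]); it only needs to respond to #find.
def randomize_ages(users, range = 18..80)
  updated = 0
  # Project only _id: nothing else is needed to address each document.
  users.find.select(_id: 1).each do |u|
    # One update command per document; "$set" changes only the age field.
    users.find(_id: u["_id"]).update("$set" => { age: rand(range) })
    updated += 1
  end
  updated # number of update commands issued
end
```

It would be called as randomize_ages(session[:users]). Each document still costs one round trip, because every user gets a different random value.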

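For ease of development, you'll probably be happier actually using Mongoid, like this:

# Mongoid maps documents in the "users" collection to User objects.
class User
  include Mongoid::Document
  field :age, type: Integer
end

# Each save! writes the changed field back to the database.
User.all.each do |u|
  u.age = rand(18..80)
  u.save!
end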

But if performance is critical, Moped is faster. You might also benchmark the official 10gen Ruby driver. If you can port your code to JavaScript, you could run it on the MongoDB server itself, which would eliminate network delays, but be careful about locking up the whole database while it runs.