How would you go about processing a result set (length > 1) of a MongoDB query when using the rmongodb package and when your final object should be a list?
I'm trying to avoid the copying inefficiencies typical of R's "pass-by-value" semantics, which occur when simply appending to the list object while stepping through the result set. But in order to do that, I guess I would need to know how many "records" the query returned in total, wouldn't I? That way I could preallocate an empty list and just fill it while stepping through the result set, or, even better, use lapply() and the like.
Here's a little example
Example Content
The example is taken from the MongoDB website and implemented via rmongodb:
library(rmongodb)
mongo <- mongo.create(db="test")
ns <- "test.foo"
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, "x", 1)
mongo.bson.buffer.append(buf, "y", 1)
x.1 <- mongo.bson.from.buffer(buf)
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, "x", 2)
mongo.bson.buffer.append(buf, "y", "string")
x.2 <- mongo.bson.from.buffer(buf)
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, "x", 3)
mongo.bson.buffer.append(buf, "y", NULL)
x.3 <- mongo.bson.from.buffer(buf)
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, "x", 4)
x.4 <- mongo.bson.from.buffer(buf)
mongo.insert.batch(mongo, ns, list(x.1, x.2, x.3, x.4))
Query
cursor <- mongo.find(mongo, ns, query=list(y=NULL))
# Alternatively
buf <- mongo.bson.buffer.create()
mongo.bson.buffer.append(buf, "y", NULL)
query <- mongo.bson.from.buffer(buf)
cursor <- mongo.find(mongo, ns, query)
Processing Query Result
That's the best that I could come up with:
out <- NULL
while (mongo.cursor.next(cursor)) {
out <- c(out, list(mongo.bson.to.list(mongo.cursor.value(cursor))))
}
out
Yet what I'm looking for is something like as.list(cursor) or something like this:
# Say I could find out the length of the result set (hypothetical):
cursor.length <- length(cursor)
out <- lapply(seq_len(cursor.length), function(x) {
  mongo.cursor.value(cursor[[x]])
})
# Alternative:
out <- vector("list", cursor.length)
for (x in seq_len(cursor.length)) {
  out[[x]] <- mongo.cursor.value(cursor[[x]])
}
Is this possible? Unfortunately, I'm not really familiar with C/C++ pointers, which I think the package uses.
I have answered this question in the rmongodb FAQ at http://cnub.org/rmongodb.ashx#FAQ. It shows how to get arrays, lists and data frames from your queries.
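As a rough sketch of the preallocate-and-fill pattern the question asks about: it assumes the number of results is known before iterating. With rmongodb that number would presumably come from mongo.count() called with the same namespace and query, but treat that as an assumption and check the package documentation. The cursor below is a hypothetical pure-R mock (make_mock_cursor is not part of rmongodb), so the snippet runs without a database:

```r
# Hypothetical stand-in for a database cursor: a closure over a fixed list of
# records. It only mimics an advance/value protocol and is NOT rmongodb code.
make_mock_cursor <- function(records) {
  pos <- 0L
  list(
    count = function() length(records),
    advance = function() {
      # Move to the next record; return FALSE once the records are exhausted.
      if (pos >= length(records)) return(FALSE)
      pos <<- pos + 1L
      TRUE
    },
    value = function() records[[pos]]
  )
}

# Made-up records, each already converted to a plain R list.
records <- list(list(x = 3, y = NA), list(x = 4))
cursor <- make_mock_cursor(records)

# Preallocate the result list once, then fill it by index instead of
# growing it with c() on every iteration.
out <- vector("list", cursor$count())
i <- 0L
while (cursor$advance()) {
  i <- i + 1L
  out[[i]] <- cursor$value()
}
```

With a real cursor, the loop body would use mongo.cursor.next(cursor) and mongo.bson.to.list(mongo.cursor.value(cursor)) as in the question. The count and the cursor come from two separate queries, so the collection could change between them; a defensive version would trim or extend out after the loop.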