rmongodb: converting mongo bson values into dataframe

I am running a MongoDB query in R using rmongodb. I have already obtained a mongo.cursor object and need to convert the cursor values into an R data frame. However, my values contain some empty strings and some unwanted long strings, so I need to convert these into NA before building the data frame. Below is my code:

library(rmongodb)
mongo <- mongo.create(host="localhost")
# Collection namespace(s) in the database; this is passed to mongo.find() below
dbns <- mongo.get.database.collections(mongo, db="namedisambiguation")
# Only documents that have both a name and a username
query <- '{ "name": { "$exists": true }, "username": { "$exists": true } }'
fields <- '{ "username": 1, "name": 1, "location": 1}'
cur <- mongo.find(mongo, dbns, query=query, fields=fields)
username <- name <- location <- NULL
# Walk the cursor and append one row per document to each column
while (mongo.cursor.next(cur)) {
    value <- mongo.cursor.value(cur)
    username <- rbind(username, mongo.bson.value(value, 'username'))
    name <- rbind(name, mongo.bson.value(value, 'name'))
    location <- rbind(location, mongo.bson.value(value, 'location'))
}

data2 <- data.frame(username=username, name=name, location=location)

My location column yields the following output:

[9972,] "NA"                                        
[9973,] ""                                          
[9974,] ""                                          
[9975,] ""                                          
[9976,] ""                                          
[9977,] "Madrid"                                    
[9978,] ""                                          
[9979,] ""                                          
[9980,] "San Antonsdnndsjo\todurnv\tkckdn"                               
[9981,] ""                                          
[9982,] ""                                          
[9983,] ""                                          
[9984,] ""                                          
[9985,] ""     

How can I convert these empty values and long strings like "San Antonsdnndsjo\todurnv\tkckdn" into NA?

There are 1 best solutions below

I'm not sure I understand your question correctly, but wouldn't something along these lines do the job?

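maxlength <- 16

# Inside the while loop, in place of the rbind() line for location:
loc <- mongo.bson.value(value, 'location')
# nchar() counts characters; length() would be 1 for any single string.
# if/else is used because ifelse() would return only the first element.
is_bad <- is.null(loc) || nchar(loc) == 0 || nchar(loc) > maxlength
location <- rbind(location, if (is_bad) NA else loc)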

(This is if you insist on using rbind().)
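
As an alternative, the cleanup could also be done once after the loop on the whole column, rather than row by row. The following is only a minimal sketch under some assumptions: the sample vector stands in for the location values from the question, clean_location() is a hypothetical helper (not part of rmongodb), the cutoff of 16 characters is the same arbitrary choice as above, and treating the literal string "NA" as missing is a guess at what is wanted.

```r
# Hypothetical sample standing in for the 'location' column built in the loop
location <- c("NA", "", "Madrid", "San Antonsdnndsjo\todurnv\tkckdn", "")
maxlength <- 16  # assumed cutoff, same as in the snippet above

# Hypothetical helper: replace empty, too long, or literal "NA" strings with NA
clean_location <- function(x, maxlength) {
  x <- as.character(x)
  n <- nchar(x)
  # is.na(x) comes first so that already-missing entries stay NA
  x[is.na(x) | n == 0 | n > maxlength | x == "NA"] <- NA_character_
  x
}

cleaned <- clean_location(location, maxlength)
print(cleaned)
```

The cleaned vector can then go straight into data.frame(), for example data.frame(location = cleaned, stringsAsFactors = FALSE). Working on the whole vector avoids repeating the length check on every cursor step.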