Most efficient list to data.frame method, when the list is a list of rows

This question covers the case where I have a list of columns, and I wish to turn them into a data.frame. What if I have a list of rows, and I wish to turn them into a data.frame?

vectorList <- lapply(1:500000, function(x) sample(0:1, 300, replace = TRUE))

The naive way to solve this is using rbind and as.data.frame, but we can't even get past the rbind step:

>Data <- do.call(rbind,vectorList)
Error: cannot allocate vector of size 572.2 Mb

What is a more efficient way to do this?

There are 2 best solutions below

Best answer

It would probably be fastest / most efficient to unlist your list and fill a matrix:

> m <- matrix(unlist(vectorList), ncol=300, nrow=length(vectorList), byrow=TRUE)

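But the result is still large: 500,000 rows × 300 columns is 1.5e8 values, which is about 572 Mb as an integer matrix (the same size as the allocation that failed above) and about 1.1 Gb as a numeric matrix. The unlisted vector and the matrix each need that much on top of the original list, so peak memory use is a multiple of it.

> l <- integer(5e5*300)
> print(object.size(l),units="Mb")
572.2 Mb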
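
As a small-scale sketch of the approach above, assuming a toy list of 5 rows of 3 values in place of the full list (`smallList` and `df` are hypothetical names, not from the question): unlist() flattens the rows end to end, byrow = TRUE lays them back out as rows, and as.data.frame() then produces the data.frame the question asks for.

```r
# Toy stand-in for the large list (assumed data): 5 rows of 3 integers each.
smallList <- list(c(0L, 1L, 1L), c(1L, 0L, 0L), c(1L, 1L, 0L),
                  c(0L, 0L, 1L), c(1L, 0L, 1L))

# Flatten the rows end to end, then refill the matrix row by row.
m <- matrix(unlist(smallList), ncol = length(smallList[[1]]),
            nrow = length(smallList), byrow = TRUE)

# Convert to a data.frame; with no column names given, R uses V1, V2, V3.
df <- as.data.frame(m)

# Sanity check: row 2 of the matrix is the second list element.
stopifnot(identical(m[2, ], smallList[[2]]))
```

The same calls should apply unchanged to the full list, memory permitting; only the toy data is specific to this sketch.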

Another answer

Try direct coercion to a matrix. R stores matrices in column-major order, so byrow = TRUE is needed to lay each list element out as a row:

Data <- matrix(unlist(vectorList), ncol = length(vectorList[[1]]), byrow = TRUE)

If that also does not work, you do not have the resources to copy the whole thing at once, so consider creating the matrix first and populating it column by column.
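
A minimal sketch of that last suggestion, assuming a toy list in place of the real one (`toyList`, `nr`, and `nc` are hypothetical names): the matrix is allocated once, and each pass of the loop only builds one column's worth of temporary data. Whether R really avoids copying the matrix on each assignment depends on the R version and on nothing else referencing `m`, so treat this as an illustration rather than a guarantee.

```r
# Toy stand-in for the large list (assumed data): 4 rows of 3 integers each.
toyList <- list(c(0L, 1L, 1L), c(1L, 0L, 0L), c(1L, 1L, 0L), c(0L, 0L, 1L))

nr <- length(toyList)        # number of rows = number of list elements
nc <- length(toyList[[1]])   # number of columns = length of each element

# Allocate the full integer matrix once, up front.
m <- matrix(0L, nrow = nr, ncol = nc)

# Fill one column at a time: pull the j-th value out of every row vector.
for (j in seq_len(nc)) {
  m[, j] <- vapply(toyList, function(row) row[j], integer(1))
}
```

On the full data this loop would run 300 times, each time creating a temporary vector of 500,000 integers (about 2 Mb) instead of a second full-size copy.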