Output of parApply different from my input

397 Views Asked by At

I am still quite new to r (used to program in Matlab) and I am trying use the parallel package to speed up some calculations. Below is an example which I am trying to calculate the rolling standard deviation of a matrix (by column) with the use of zoo package, with and without parallelising the codes. However, the shape of the outputs came out to be different.

# load library
library('zoo')
library('parallel')
library('snow')

# Data
z <- matrix(runif(1000000,0,1),100,1000)

#This is what I want to calculate with timing
system.time(zz <- rollapply(z,10,sd,by.column=T, fill=NA))

# Trying to achieve the same output with parallel computing
cl<-makeSOCKcluster(4)
clusterEvalQ(cl, library(zoo))
system.time(yy <-parCapply(cl,z,function(x) rollapplyr(x,10,sd,fill=NA)))
stopCluster(cl)

My first output zz has the same dimensions as input z, whereas output yy is a vector rather than a matrix. I understand that I can do something like matrix(yy,nrow(z),ncol(z)) however I would like to know if I have done something wrong or if there is a better way of coding to improve this. Thank you.

1

There are 1 best solutions below

0
On BEST ANSWER

From the documentation:

parRapply and parCapply always return a vector. If FUN always returns a scalar result this will be of length the number of rows or columns: otherwise it will be the concatenation of the returned values.

And:

parRapply and parCapply are parallel row and column apply functions for a matrix x; they may be slightly more efficient than parApply but do less post-processing of the result.

So, I'd suggest you use parApply.