I am still quite new to r (used to program in Matlab) and I am trying use the parallel package to speed up some calculations. Below is an example which I am trying to calculate the rolling standard deviation of a matrix (by column) with the use of zoo package, with and without parallelising the codes. However, the shape of the outputs came out to be different.
# load library
library('zoo')
library('parallel')
library('snow')
# Data
z <- matrix(runif(1000000,0,1),100,1000)
#This is what I want to calculate with timing
system.time(zz <- rollapply(z,10,sd,by.column=T, fill=NA))
# Trying to achieve the same output with parallel computing
cl<-makeSOCKcluster(4)
clusterEvalQ(cl, library(zoo))
system.time(yy <-parCapply(cl,z,function(x) rollapplyr(x,10,sd,fill=NA)))
stopCluster(cl)
My first output zz has the same dimensions as input z, whereas output yy is a vector rather than a matrix. I understand that I can do something like matrix(yy,nrow(z),ncol(z)) however I would like to know if I have done something wrong or if there is a better way of coding to improve this. Thank you.
From the documentation:
And:
So, I'd suggest you use
parApply
.