Previously I used bigstatsr to store computed values (doubles).
These values were computed in parallel using the doParallel package.
Now I'm trying to use bigstatsr to store more complex objects, typically lists.
Only now have I noticed that a bigstatsr FBM does not support list objects, or even character values.
Are there any alternatives I can use to store list objects?
Below is a reproducible example.
Notice that bigMatrice_sepalLength captures the numerical values,
but bigMatrice_sepalLengthWidth fails to capture a list of numerical values, and bigMatrice_species cannot store character values.
In my actual use case, I need to parallelize some forecasting models.
Typically the outputs are lists of values (historical, forecast), lists of models, lists of model parameters, etc.
Previously I could use bigstatsr because I only stored the forecast values. This time, I want to capture the other information as well.
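To make the kind of output concrete, here is a minimal sketch of what I would like each parallel task to hand back. It skips the FBM entirely and relies on foreach returning a list with one element per iteration when no .combine is given. The fit_one helper and the lm call are hypothetical stand-ins for my real forecasting code, not something from my actual project, and I am not sure this is the right approach for large outputs.

```r
library(foreach)
library(doParallel)

# Hypothetical stand-in for a forecasting routine: fits a linear model on one
# species of iris and returns everything of interest as a named list.
fit_one <- function(species_name) {
  d <- iris[iris$Species == species_name, ]
  model <- lm(Sepal.Length ~ Sepal.Width, data = d)
  list(
    species    = species_name,          # character value
    historical = d$Sepal.Length,        # numeric vector
    fitted     = unname(fitted(model)), # numeric vector
    model      = model,                 # full model object
    params     = coef(model)            # named numeric vector
  )
}

cl <- makeCluster(2)
registerDoParallel(cl)

species_levels <- levels(iris$Species)

# Without .combine, foreach returns a list with one element per iteration,
# so each worker can hand back an arbitrary R object.
results <- foreach(s = species_levels) %dopar% fit_one(s)

stopCluster(cl)

names(results) <- species_levels
str(results$setosa, max.level = 1)
```

Here every result travels back to the main session in memory, which is the part I am unsure about for larger models.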
Thanks!
library(foreach)
library(doParallel)
library(dplyr)  # provides %>% and as_tibble()
bigMatrice_sepalLength <- bigstatsr::FBM(nrow=nrow(iris),ncol=1)
bigMatrice_sepalLengthWidth <- bigstatsr::FBM(nrow=nrow(iris),ncol=1)
bigMatrice_species <- bigstatsr::FBM(nrow=nrow(iris),ncol=1)
cl_start <- Sys.time()
# set nb core to use
#no_cores <- detectCores() - 3
no_cores <- 1
# create cluster
cl <- makeCluster(no_cores, outfile = "")
# prepare cluster
registerDoParallel(cl)
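# loop over every row of iris; load dplyr on the workers so %>% is available
foreach(i = seq_len(nrow(iris)), .combine = cbind, .packages = "dplyr") %dopar% {
  bigMatrice_sepalLength[i, ] <- iris[i, 1]                        # single numeric value
  bigMatrice_sepalLengthWidth[i, ] <- iris[i, c(1, 2)] %>% list()  # list of two numeric values
  bigMatrice_species[i, ] <- iris[i, 5]                            # species label
}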
# the cluster was created explicitly, so stopCluster() is enough to release it
stopCluster(cl)
cl_end <- Sys.time()
(cl_runtime <- cl_end - cl_start)
as_tibble(bigMatrice_sepalLength[])
as_tibble(bigMatrice_sepalLengthWidth[])
as_tibble(bigMatrice_species[])
Results
> as_tibble(bigMatrice_sepalLength[])
# A tibble: 150 x 1
value
<dbl>
1 5.1
2 4.9
3 4.7
4 4.6
5 5
6 5.4
7 4.6
8 5
9 4.4
10 4.9
# … with 140 more rows
> as_tibble(bigMatrice_sepalLengthWidth[])
# A tibble: 150 x 1
value
<dbl>
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
# … with 140 more rows
> as_tibble(bigMatrice_species[])
# A tibble: 150 x 1
value
<dbl>
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
# … with 140 more rows