I am trying to compute AUC for multiple gene expression combinations. I have ~2000 upregulated genes and ~1500 downregulated genes. I need to parallelize my code because it takes very long to run. I changed my code a little (below), but the output is not what I want.
Desired output:
{pos.Gene1}{neg.gene1}, {pos.gene2}{neg.gene2} x1 x2 x3 x4
{pos.Gene1}{neg.gene1}, {pos.gene2}{neg.gene3} x1 x2 x3 x4
....
Output I am getting:
{pos.Gene1}{neg.gene1}, {pos.gene2}{neg.gene2} x1 x2 x3 x4{pos.Gene2}{neg.gene4}, {pos.gene3}{neg.gene4} x1 x2 x3 x4
Basically, the rows are not in the order I am looking for, and they are not formatted properly: they run together on one line instead of one row per combination.
I am new to parallel computing using R. Can you help me fix my code?
Thanks!
library(foreach)
library(doParallel)
#setup parallel backend to use many processors
cores=detectCores()
cl <- makeCluster(cores[1]-1) #not to overload your computer
registerDoParallel(cl)
validation = readRDS("file with 5 lists each having sublists")
path = "XXX"
source(paste0(path,"Functions.R")) # importing functions to use in the script
genes = readRDS("list with 2 sublists")
pos.genes = genes$pos # list of upregulated genes
neg.genes = genes$neg # list of downregulated genes
mat = matrix(nrow = 1, ncol = 5) #creating empty matrix
fileConn = "validation_parallel.tsv"
# i and j stop one short of the end so that pos.genes[i+1] and k = j+1 stay in range
foreach(i = 1:(length(pos.genes)-1)) %:%
foreach(j = 1:(length(neg.genes)-1)) %:%
foreach(k = (j+1):length(neg.genes)) %dopar%
{
# score() returns a matrix of 1 row and 4 columns having numbers (body omitted)
score <- function(data){
######
}
s = score(data) # 'data' stands for the input that score() expects, which is not shown here
pair1 = paste0("{",c(pos.genes[i], pos.genes[i+1]),"}","{",c(neg.genes[j],neg.genes[k]),"}",collapse = ',')
# for each iteration in k, calculating score and continuously writing output to a file
mat[1,1] = pair1
mat[1,2] = s[1,1]
mat[1,3] = s[1,2]
mat[1,4] = s[1,3]
mat[1,5] = s[1,4]
cat(mat,file = fileConn, sep = "\t", append = TRUE)
cat(file = fileConn, sep = "\n", append = TRUE)
}
#stop cluster
stopCluster(cl)
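
For reference, here is a minimal sketch of one way to get one row per combination, in loop order. It rests on two observations. First, a cat() call with no objects to print, such as cat(file = fileConn, sep = "\n", append = TRUE), writes nothing, which would explain why the rows run together. Second, several workers appending to the same file at the same time gives no guarantee about row order. The sketch avoids both by returning a one-row data frame from each iteration, letting foreach stack the rows with .combine = rbind (the default .inorder = TRUE keeps loop order), and writing the file once at the end. The gene names, toy_score() and the temporary output path are hypothetical stand-ins, not the real data or the real score().

```r
library(foreach)
library(doParallel)

# Hypothetical stand-ins for genes$pos and genes$neg
pos.genes <- c("P1", "P2", "P3")
neg.genes <- c("N1", "N2", "N3")

# Hypothetical placeholder for the real score(): returns a 1 x 4 numeric matrix
toy_score <- function(i, j, k) matrix(c(i, j, k, i + j + k), nrow = 1)

cl <- makeCluster(2)  # two workers are enough for the illustration
registerDoParallel(cl)

n.pos <- length(pos.genes)
n.neg <- length(neg.genes)

# Each iteration returns a one-row data frame and rbind stacks the rows.
# Nothing is written to disk inside the workers.
res <- foreach(i = 1:(n.pos - 1), .combine = rbind) %:%
  foreach(j = 1:(n.neg - 1), .combine = rbind) %:%
  foreach(k = (j + 1):n.neg, .combine = rbind) %dopar% {
    s <- toy_score(i, j, k)
    pair <- paste0("{", c(pos.genes[i], pos.genes[i + 1]), "}",
                   "{", c(neg.genes[j], neg.genes[k]), "}", collapse = ", ")
    data.frame(pair = pair, x1 = s[1, 1], x2 = s[1, 2],
               x3 = s[1, 3], x4 = s[1, 4], stringsAsFactors = FALSE)
  }

stopCluster(cl)

# Write all rows in one go from the main process; write.table ends every row with a newline
out.file <- file.path(tempdir(), "validation_parallel.tsv")
write.table(res, file = out.file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
```

With ~2000 x ~1500 genes the combined result can get large, so it may be worth letting each worker return a whole block of rows (for example, everything for one value of i) rather than one row per task. That is a design choice this sketch does not explore.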