Is there a way in R to compute a pairwise, weighted correlation matrix?


I have a survey with many numeric variables (both continuous and binary dummies) and more than 800 observations. Most of the variables have missing data, at different rates. I need a weighted correlation table because some respondents represent more of the population than others. I also want to discard as few observations as possible, so each pair of variables should use every observation where both are present.

I know how to compute a pairwise correlation matrix (e.g., cor(data, use="pairwise.complete.obs")), and I know how to compute a weighted correlation matrix (e.g., cov.wt(data %>% select(-weight), wt=data$weight, cor=TRUE)). However, I have not yet found a way to combine the two. Is there a way to compute a pairwise, weighted correlation matrix in R? I would greatly appreciate any help or recommendations.

Best answer

Good question. Here is how I do it. It is not fast, but it is faster than looping.

df_correlation is a data frame containing only the variables whose correlations I want to compute, and newdf is my original data frame, which also holds the weight column (BalancingWeights) and other variables.

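    library(purrr)  # provides map() and map_dbl()

    # All unordered pairs of variable names, each with the weight column name appended
    data_list <- combn(names(df_correlation), 2, simplify = FALSE)
    data_list <- map(data_list, ~ c(., "BalancingWeights"))

    # Empty square matrix to hold the correlations
    dimension <- length(names(df_correlation))
    allcorr <- matrix(data = NA, nrow = dimension, ncol = dimension)
    rownames(allcorr) <- names(df_correlation)
    colnames(allcorr) <- names(df_correlation)

    # Weighted Pearson correlation of x and y, using only rows where both are observed
    myfunction <- function(data, x, y, weight) {
      indice <- !(is.na(data[[x]]) | is.na(data[[y]]))
      wCorr::weightedCorr(data[[x]][indice],
                          data[[y]][indice], method = "Pearson",
                          weights = data[[weight]][indice], ML = FALSE, fast = TRUE)
    }

    b <- map_dbl(data_list, ~ myfunction(newdf, .[1], .[2], .[3]))

    # combn() returns pairs in the same order as the column-major lower triangle,
    # so fill the lower triangle first and then mirror it into the upper triangle
    allcorr[lower.tri(allcorr, diag = FALSE)] <- b
    allcorr[upper.tri(allcorr, diag = FALSE)] <- t(allcorr)[upper.tri(allcorr, diag = FALSE)]
    View(allcorr)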
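If you would rather not depend on wCorr or purrr, the same idea can be sketched in base R. This is only a minimal illustration, not the method used above: the helper names weighted_cor and pairwise_weighted_cor and the toy data are made up for the example, and it uses an explicit loop, so it may be slower on wide data. It computes the usual weighted Pearson correlation on the rows where both variables (and the weight) are observed.

```r
# Weighted Pearson correlation of x and y with weights w,
# using only rows where x, y and w are all non-missing.
# (weighted_cor is a hypothetical helper name, not a library function.)
weighted_cor <- function(x, y, w) {
  ok <- !(is.na(x) | is.na(y) | is.na(w))
  x <- x[ok]; y <- y[ok]; w <- w[ok]
  mx <- sum(w * x) / sum(w)           # weighted mean of x
  my <- sum(w * y) / sum(w)           # weighted mean of y
  sxy <- sum(w * (x - mx) * (y - my)) # weighted cross-product
  sxx <- sum(w * (x - mx)^2)
  syy <- sum(w * (y - my)^2)
  sxy / sqrt(sxx * syy)
}

# Pairwise-complete weighted correlation matrix for all columns of df.
# (pairwise_weighted_cor is also a hypothetical helper name.)
pairwise_weighted_cor <- function(df, w) {
  vars <- names(df)
  p <- length(vars)
  out <- matrix(NA_real_, p, p, dimnames = list(vars, vars))
  for (i in seq_len(p)) {
    for (j in seq_len(i)) {
      # Fill both halves at once so the matrix is symmetric by construction
      out[i, j] <- out[j, i] <- weighted_cor(df[[i]], df[[j]], w)
    }
  }
  out
}

# Toy data, invented for illustration only
set.seed(1)
toy <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))
toy$a[c(3, 7)] <- NA
toy$b[10] <- NA
w <- runif(50, min = 0.5, max = 2)

pairwise_weighted_cor(toy, w)
```

A quick sanity check is to pass equal weights: the result should then agree with cor(toy, use = "pairwise.complete.obs").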