Making data format for sankey plot in R

33 Views Asked by PesKchan At 28 February 2024 at 21:09

So this is my overall data frame.

dput(gene_data)
structure(list(Addiction = c("PMC6202276", "PMC10542560", "PMC8357835", 
"PMC8463497", "PMC10497570", "PMC10256172", "PMC10616319", "PMC4224688", 
"PMC10247417"), Parkinson = c("PMC6868244", NA, NA, NA, NA, NA, 
NA, NA, NA), Autism = c("PMC7102905", "PMC9668748", NA, NA, NA, 
NA, NA, NA, NA), Aggression = c("PMC4896746", NA, NA, NA, NA, 
NA, NA, NA, NA), depression = c("PMC5734943", NA, NA, NA, NA, 
NA, NA, NA, NA), schizophrenia = c("PMC8761210", "PMC8938529", 
NA, NA, NA, NA, NA, NA, NA), reward = c("32059760", "37657442", 
NA, NA, NA, NA, NA, NA, NA)), class = "data.frame", row.names = c(NA, 
-9L))

Here each column is a disorder or trait, and the entries under it are the IDs (PMCIDs or PMIDs) of publications on it.
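Most Sankey functions want one row per link rather than one column per disorder, so a first step could be reshaping the wide table into long form. The following is only a sketch: `gene_small` is a hypothetical two-column subset of `gene_data` (used to keep the example short), and the column names `publication` and `disorder` are my own choice.

```r
# Hypothetical subset of gene_data: two disorders, padded with NA like the original.
gene_small <- data.frame(
  Autism        = c("PMC7102905", "PMC9668748", NA),
  schizophrenia = c("PMC8761210", "PMC8938529", NA),
  stringsAsFactors = FALSE
)

# stack() turns the wide table into two columns: values (the IDs) and ind (the column name).
long <- stack(gene_small)

# Drop the NA padding and give the columns descriptive names.
long <- long[!is.na(long$values), ]
names(long) <- c("publication", "disorder")
long$disorder <- as.character(long$disorder)
rownames(long) <- NULL

long
```

The same call should work on the full `gene_data`, since every column there is a character vector.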

As a small example, I will take schizophrenia = ("PMC8761210", "PMC8938529") and Autism = ("PMC7102905", "PMC9668748").

One of my objectives is to find out how many of the genes that were significant in our analysis overlap with the genes reported in similar studies.

The list of significant genes from our study is fixed at 910 genes.

For schizophrenia

PMC8761210: 1800 genes in total; 50 of them overlap with our gene list.
PMC8938529: 800 genes in total; 75 of them overlap.

For Autism

PMC7102905: 600 genes in total; 100 of them overlap.
PMC9668748: 200 genes in total; 150 of them overlap.
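The numbers above can be put into one long table with a row per publication. This is a sketch only: the column names are my own, and because "overlap percentage" could mean either the share of our 910 genes or the share of the publication's genes, both are computed.

```r
# Size of our fixed list of significant genes (from the question).
our_genes <- 910

# One row per disorder/publication pair, using the counts listed above.
overlap <- data.frame(
  disorder      = c("schizophrenia", "schizophrenia", "Autism", "Autism"),
  publication   = c("PMC8761210", "PMC8938529", "PMC7102905", "PMC9668748"),
  total_genes   = c(1800, 800, 600, 200),
  overlap_genes = c(50, 75, 100, 150),
  stringsAsFactors = FALSE
)

# Assumption 1: overlap as a percentage of our 910 genes.
overlap$pct_of_ours <- round(100 * overlap$overlap_genes / our_genes, 1)

# Assumption 2: overlap as a percentage of the publication's own gene list.
overlap$pct_of_study <- round(100 * overlap$overlap_genes / overlap$total_genes, 1)

overlap
```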

I followed this tutorial for a Sankey plot, but I can't work out how to structure the data frame so that it fits what the sankey function expects.

The first stage on the x axis should be the disorders (such as schizophrenia or Autism); each disorder should flow to its publications (PMCIDs), and each publication should then show the overlap percentage between our genes and that publication's genes.
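One possible layout for this three-stage flow (disorder, then publication, then overlap label) is a pair of node and link tables. This is a sketch under assumptions: it uses the `networkD3` package, which may not be the one from the tutorial; the percentage is taken relative to our 910 genes; the link width is the number of overlapping genes; and the label text and column names are my own.

```r
# Counts from the question; pct assumes the denominator is our 910 genes.
overlap <- data.frame(
  disorder      = c("schizophrenia", "schizophrenia", "Autism", "Autism"),
  publication   = c("PMC8761210", "PMC8938529", "PMC7102905", "PMC9668748"),
  overlap_genes = c(50, 75, 100, 150),
  stringsAsFactors = FALSE
)
overlap$pct <- round(100 * overlap$overlap_genes / 910, 1)
overlap$pct_label <- paste0(overlap$publication, ": ", overlap$pct, "% overlap")

# Stage 1 links: disorder -> publication. Stage 2 links: publication -> label.
links <- rbind(
  data.frame(source = overlap$disorder, target = overlap$publication,
             value = overlap$overlap_genes, stringsAsFactors = FALSE),
  data.frame(source = overlap$publication, target = overlap$pct_label,
             value = overlap$overlap_genes, stringsAsFactors = FALSE)
)

# Node table: every distinct name that appears as a source or a target.
nodes <- data.frame(name = unique(c(links$source, links$target)),
                    stringsAsFactors = FALSE)

# networkD3 refers to nodes by zero-based row index, hence the "- 1".
links$source_id <- match(links$source, nodes$name) - 1
links$target_id <- match(links$target, nodes$name) - 1

# Draw the diagram only if networkD3 is installed.
if (requireNamespace("networkD3", quietly = TRUE)) {
  p <- networkD3::sankeyNetwork(
    Links = links, Nodes = nodes,
    Source = "source_id", Target = "target_id",
    Value = "value", NodeID = "name", units = "genes"
  )
  # Print p in an interactive session to view the widget.
}
```

If the tutorial uses a ggplot2-based package instead, the same `links` table (columns `source`, `target`, `value`) is probably still a reasonable starting point, though the expected column names will differ.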

Any help or suggestions would be really appreciated.
