I'm trying to produce a heat map of gene expression from samples of different conditions, faceted by the conditions:
require(reshape2)
set.seed(1)
expression.mat <- matrix(rnorm(100*1000),nrow=100)
df <- reshape2::melt(expression.mat)
colnames(df) <- c("gene","sample","expression")
df$condition <- factor(c(rep("C1",2500),rep("C2",3500),rep("C3",3800),rep("C4",200)),levels=c("C1","C2","C3","C4"))
I'd like to color by expression range:
df$range <- cut(df$expression,breaks=6)
The width parameter in ggplot's aes is supposed to control the width of the different facets. My question is how to find the optimal width value such that the figure is not distorted.
I played around a bit with this plot command:
require(ggplot2)
ggplot(df, aes(x = sample, y = gene, fill = range, width = 100)) +
  facet_grid(~condition, scales = "free") +
  geom_tile(color = NA) +
  labs(x = "condition", y = "gene") +
  theme_bw()
Setting width to be below 100 leaves gaps in the last facet (the one with the lowest number of samples), and already at this value of 100 you can see that the right column in the first facet from the left is distorted (wider than the columns to its left).
So my question is how to fix this, or how to find a width that doesn't cause this.
Edit showing the issue with the sample variable faceted by condition
There is no C1 sample between 25 and 100, because those sample numbers belong to C2, C3 and C4. Here is an illustration for sample < 200.
The number of samples is not the same in all facets, and faceting on condition creates holes between the sample numbers of each condition.
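The gaps can be checked without plotting. The following is a minimal base R sketch: it rebuilds only the sample-to-condition mapping from the question, relying on the fact that the 10,000-element condition vector is recycled over the 100,000 rows of df. The names cond_cells, cond_all, sample_id and samples_by_condition are introduced here for illustration only.

```r
# Condition labels exactly as defined in the question (10,000 values).
cond_cells <- rep(c("C1", "C2", "C3", "C4"), times = c(2500, 3500, 3800, 200))

# Assigning them to the 100,000-row data frame recycles them 10 times.
cond_all <- rep(cond_cells, length.out = 100 * 1000)

# melt() lists all 100 genes of sample 1, then of sample 2, and so on.
sample_id <- rep(1:1000, each = 100)

# Distinct sample numbers per condition, restricted to sample < 200.
keep <- sample_id < 200
samples_by_condition <- lapply(split(sample_id[keep], cond_all[keep]), unique)

# C1 holds samples 1-25 and 101-125, with nothing in between.
print(samples_by_condition$C1)
```

Every condition shows the same pattern: short runs of consecutive sample numbers separated by numbers that belong to the other conditions.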
One way to get around this problem is to create a sample2 number that counts the samples consecutively within each condition. I use the dplyr package for this.
Then plot using sample2 as the x variable
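The code for these two steps is not shown here, so the following is only a sketch of how they might look. It assumes that sample2 is the rank of each sample within its condition (computed with dplyr's dense_rank), and it rebuilds the question's df so that it runs on its own. The names df2 and p and the x-axis label are illustrative choices.

```r
library(reshape2)
library(dplyr)
library(ggplot2)

# Rebuild the question's data so this block runs on its own.
set.seed(1)
expression.mat <- matrix(rnorm(100 * 1000), nrow = 100)
df <- reshape2::melt(expression.mat)
colnames(df) <- c("gene", "sample", "expression")
df$condition <- factor(
  c(rep("C1", 2500), rep("C2", 3500), rep("C3", 3800), rep("C4", 200)),
  levels = c("C1", "C2", "C3", "C4")
)
df$range <- cut(df$expression, breaks = 6)

# sample2 (assumed definition): rank of each sample within its condition,
# so every facet gets a gap-free index 1, 2, 3, ...
df2 <- df %>%
  group_by(condition) %>%
  mutate(sample2 = dense_rank(sample)) %>%
  ungroup()

# Same layers as in the question, with sample2 on the x axis
# and without the width aesthetic.
p <- ggplot(df2, aes(x = sample2, y = gene, fill = range)) +
  geom_tile(color = NA) +
  facet_grid(~condition) +
  labs(x = "sample within condition", y = "gene") +
  theme_bw()
print(p)
```

Because sample2 has no holes, geom_tile can fall back on its default tile width and no manual width value has to be tuned.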
Use the scales argument of the facet function to let the x-axis scale vary between facets.
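As a hedged illustration of the scales argument, here is a small sketch on hypothetical toy data (three conditions with 2, 5 and 10 samples; the names n_samples, toy and p_free are made up for this example). It also sets space = "free_x", which is an addition not mentioned above: it asks facet_grid to make each panel's width proportional to its x range, so tiles keep the same width in every facet.

```r
library(ggplot2)

# Hypothetical toy data: 3 conditions with 2, 5 and 10 samples, 4 genes each.
n_samples <- c(C1 = 2, C2 = 5, C3 = 10)
toy <- do.call(rbind, lapply(names(n_samples), function(cond) {
  expand.grid(gene = 1:4,
              sample2 = seq_len(n_samples[[cond]]),
              condition = cond)
}))
toy$condition <- factor(toy$condition, levels = names(n_samples))
set.seed(1)
toy$expression <- rnorm(nrow(toy))

# scales = "free_x": each facet gets its own x range.
# space  = "free_x": panel widths follow those ranges (optional addition).
p_free <- ggplot(toy, aes(x = sample2, y = gene, fill = expression)) +
  geom_tile(color = NA) +
  facet_grid(~condition, scales = "free_x", space = "free_x") +
  theme_bw()
print(p_free)
```

Dropping space = "free_x" keeps all panels equally wide, in which case the tiles in facets with fewer samples are stretched to fill the panel.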
Old answer using width
See for example this answer.
Adding a width aesthetic produces wider columns: