Using the data generated from here, I would like to have ggplot2 create a plot similar to the one below.
Desirable plot panels with undesirable labeling (facet_wrap
):
The objective in this plot is to allow visual comparison between the mean of the distribution of first column (Approach1) with the mean of density from each column. The script to create the density plots is as follows:
ggplot(surg_df, aes(x=op_tm, col=color_hex)) +
geom_density(aes(fill=color_hex), alpha=0.3) +
geom_density(data = base_comp_df, col='#a269ff', alpha=1, size=0.5) +
geom_vline(data=avg_surg_df, aes(xintercept= avg_op_tm),
size=1, col=avg_surg_df$color_hex, linetype="dashed") +
geom_vline(data=avg_comp_df, aes(xintercept= avg_op_tm+2),
size=1, colour='#a269ff', linetype="dashed") +
annotate(geom= "text",
label=paste("mean diff: ",
as.character(floor(avg_comp_df$avg_op_tm-avg_surg_df$avg_op_tm)),
sep=""),
col='black', size=4, x=100, y=0.006) +
geom_segment(aes(x = avg_op_tm, y=0.006, xend = avg_surg_df$avg_op_tm,
yend = 0.006, colour = "red") ,
size=1, data = avg_comp_df) +
facet_wrap(~surg_grp) +
theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
scale_colour_identity(guide="none", breaks=base_surg_df$color_hex) +
scale_fill_identity(guide="legend", breaks=base_surg_df$color_hex,
name="Surgical Approaches",
labels=base_surg_df$surg_apprch)
The above plot was desirably created using facet_wrap()
with grouping by variable surg_grp
. But when I decided to give it a less messy look by the advantage of labeling from facet_grid()
instead, things went out of control. What I managed to create so far by facet_grid()
was the following lousy collection of density plots that each comes with 3 ridiculous vertical line for each panel :)) The lines turn out to be the average values across each column picked from variable avg_op_tm
from avg_surg_df
calculated table of summaries.
Undesirable plot panels, desirable labeling (facet_grid
):
As you notice in the script below, unlike the previous plot, there is only one geom_vline and the three lines on each panel comes from that single line:
ggplot(surg_df)+
geom_density(aes(x=op_tm, col=color_hex, fill=color_hex), alpha=0.3) +
scale_fill_identity("Approaches", guide="legend", breaks=base_surg_df$color_hex,
labels=base_surg_df$surg_apprch,
aesthetics = "fill")+
scale_colour_identity(guide="none",breaks=base_surg_df$color_hex)+
geom_density(data = base_comp_df, aes(x=op_tm), alpha=1, col='#a269ff', size=0.5) +
geom_vline(data=avg_surg_df, aes(xintercept= avg_op_tm), size=1,
linetype="dashed")+
annotate(geom= "text",
label=paste("mean diff: ",
as.character(floor(avg_surg_df$avg_op_tm)), sep=""),
col='black', size=4, x=100, y=0.006)+
facet_grid(rows = vars(condition_grp), cols=vars(surg_apprch), scales = 'free')
The community is rich with similar Q&As around facet_wrap() + geom_vline()
but not so much helpful questions and answers about facet_grid() + geom_vline()
. How can I have geom_vline()
use two grouping parameters that have been fed to facet_wrap
(condition_grp
and surg_apprch
) and get it to map data correctly? What educational point have I failed to learn that made my approach with facet_grid()
fail?
Any help is highly appreciated.
Update
> head(avg_comp_df)
surg_grp avg_op_tm Cnt surg_apprch color_hex
1 A1 309.5494 74 Approach1 #a269ff
2 A2 309.5494 74 Approach2 #00CC00
3 A3 309.5494 74 Approach3 #FFAA93
4 A4 309.5494 74 Approach4 #5DD1FF
5 B1 263.0835 71 Approach1 #a269ff
6 B2 263.0835 71 Approach2 #00CC00
> head(surg_df) #used to create 12 different curves
surg_grp surg_apprch condition_grp op_tm color_hex
1 A1 Approach1 Benign-1 287.2103 #a269ff
2 A1 Approach1 Benign-1 261.2655 #a269ff
3 A1 Approach1 Benign-1 308.9267 #a269ff
4 A1 Approach1 Benign-1 257.9060 #a269ff
5 A1 Approach1 Benign-1 408.0310 #a269ff
6 A1 Approach1 Benign-1 405.4334 #a269ff
> head(avg_surg_df)
surg_grp avg_op_tm Cnt surg_apprch color_hex
1 A1 309.5494 74 Approach1 #a269ff
2 A2 378.4466 113 Approach2 #00CC00
3 A3 242.9890 101 Approach3 #FFAA93
4 A4 273.0774 71 Approach4 #5DD1FF
5 B1 263.0835 71 Approach1 #a269ff
6 B2 243.1910 85 Approach2 #00CC00
> head(base_comp_df) #to create similar orchid control distributions in each row
surg_grp surg_apprch condition_grp op_tm color_hex
1 A1 Approach1 Benign-1 287.2103 #a269ff
2 A1 Approach1 Benign-1 261.2655 #a269ff
3 A1 Approach1 Benign-1 308.9267 #a269ff
4 A1 Approach1 Benign-1 257.9060 #a269ff
5 A1 Approach1 Benign-1 408.0310 #a269ff
6 A1 Approach1 Benign-1 405.4334 #a269ff
> head(base_surg_df) #to make the legend
surg_apprch condition_grp surg_grp color_hex
1 Approach1 Benign-1 A1 #a269ff
2 Approach2 Benign-1 A2 #00CC00
3 Approach3 Benign-1 A3 #FFAA93
4 Approach4 Benign-1 A4 #5DD1FF
5 Approach1 Benign-2 B1 #a269ff
6 Approach2 Benign-2 B2 #00CC00
You need to precalculate the means in a separate dataframe that replicates the structure of your
facet_grid
in its groups. A rough and dirty example withmtcars
: