I used the code below to make the ggplot added as an image further down. The plot is a duration curve showing water discharge on the y-axis and percentage of time on the x-axis. Each line represents a single year of water discharge measurements, and there are 20 years in total, so 20 lines. I want to use gghighlight to highlight the average water discharge over time. How can I add the average water discharge?
sy2.1 %>%
  group_by(year(t)) %>%
  arrange(desc(WaterDis)) %>%
  mutate(t3 = 1:n()/n()*100) %>%
  ggplot(aes(t3, WaterDis, colour=year(t),
             group=year(t))) +
  geom_line(size=1) +
  scale_y_continuous(expand=c(0, 0)) +
  scale_x_continuous(expand=c(0.001, 0)) +
  labs(x="% of time", y="Water discharge (m3/s)", colour="Year") +
  theme_classic()
You can either summarize your data first and then plot the summarized data, or you can summarize directly within your plot code using stat_summary(). I'll show you the latter method below with an example dataset.

Here's the data and basic plot.
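A minimal sketch of such a dataset and base plot, assuming made-up data: the data frame df and its columns year, pct, and flow are hypothetical stand-ins for sy2.1, year(t), t3, and WaterDis, not the real measurements.

```r
library(ggplot2)

# Hypothetical example data: 5 "years", 20 observations each.
set.seed(123)
df <- data.frame(
  year = rep(2001:2005, each = 20),
  pct  = rep((1:20) / 20 * 100, times = 5)   # % of time, same grid every year
)

# Random discharge values, sorted in descending order within each year
# so that every line looks like a duration curve.
df$flow <- ave(rexp(nrow(df), rate = 1 / 50), df$year,
               FUN = function(v) sort(v, decreasing = TRUE))

# Basic plot: one grey line per year.
p <- ggplot(df, aes(pct, flow, group = year)) +
  geom_line(colour = "grey60") +
  labs(x = "% of time", y = "Water discharge (m3/s)") +
  theme_classic()
p
```

Every year here shares the same pct values, which is what makes averaging across lines at each x position straightforward.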
To find the average of the lines, you can use stat_summary() and tell it to use the mean() function.

Personally, I use both methods (the one shown here or summarizing beforehand), depending on the situation.
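Both approaches can be sketched on made-up data as follows. The data frame df, its columns, and the object names avg, p_stat, and p_pre are hypothetical; fun = mean assumes ggplot2 3.3.0 or later, and linewidth assumes 3.4.0 or later (use size on older versions).

```r
library(ggplot2)

# Hypothetical example data: 5 "years", 20 observations each.
set.seed(123)
df <- data.frame(
  year = rep(2001:2005, each = 20),
  pct  = rep((1:20) / 20 * 100, times = 5)
)
df$flow <- ave(rexp(nrow(df), rate = 1 / 50), df$year,
               FUN = function(v) sort(v, decreasing = TRUE))

# Method 1: summarize inside the plot.
# group = year is set only on geom_line(), so stat_summary() sees a
# single group and averages flow across all years at each pct value.
p_stat <- ggplot(df, aes(pct, flow)) +
  geom_line(aes(group = year), colour = "grey70") +
  stat_summary(fun = mean, geom = "line", colour = "red", linewidth = 1) +
  theme_classic()

# Method 2: summarize first, then add the result as its own layer.
avg <- aggregate(flow ~ pct, data = df, FUN = mean)
p_pre <- ggplot(df, aes(pct, flow)) +
  geom_line(aes(group = year), colour = "grey70") +
  geom_line(data = avg, colour = "red", linewidth = 1) +
  theme_classic()

p_stat
```

Note that stat_summary() averages y at each distinct x value, so this relies on every line sharing the same x positions. If your years differ in length (for example, leap years with daily data), the t3 values will not line up exactly, and stat_summary_bin() or interpolating onto a common grid before averaging may be a better fit.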
As a final note, your coloring scheme for each line is on a continuous scale, but the years in your example are really discrete categories. I would force ggplot2 to treat the year as a factor by referencing as.factor(year(t)) or factor(year(t)) instead of year(t).
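As a small sketch of that change on made-up data (df and its columns are hypothetical, with year standing in for year(t)):

```r
library(ggplot2)

# Hypothetical example data: 3 "years", 10 observations each.
set.seed(1)
df <- data.frame(
  year = rep(2001:2003, each = 10),
  pct  = rep((1:10) / 10 * 100, times = 3)
)
df$flow <- ave(rexp(nrow(df), rate = 1 / 50), df$year,
               FUN = function(v) sort(v, decreasing = TRUE))

# Wrapping the year in factor() gives a discrete colour scale with one
# legend entry per year instead of a continuous colour bar.
p_col <- ggplot(df, aes(pct, flow, colour = factor(year), group = year)) +
  geom_line() +
  labs(x = "% of time", y = "Water discharge (m3/s)", colour = "Year") +
  theme_classic()
p_col
```

In your own code, that would mean using colour = factor(year(t)) inside aes() while keeping group = year(t).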