Connecting geom_points (which are the means of different columns in a dataset) with a line


I am new to R, and I am sure there is a simple solution to this, but having tried multiple methods and watched quite a few videos, I wondered if someone could give me some advice.

I want to draw a line between two geom_point layers, each of which is the mean of a column in a table. I have tried the following, but it does not connect the points. Please see the screenshots and R script below.

ggplot()+
  geom_point(data = wsd, mapping = aes(x = "Mud intact", y = mean(Mud_intact)), colour = "blue", size = 3)+
  geom_point(data = wsd, mapping = aes(x = "Sand intact", y = mean(Sand_intact)), colour = "blue", size = 3)+
  geom_line()+
  geom_point(data = wsd, mapping = aes(x = "Mud hair cut", y = mean(Mud_hair.cut)), colour = "red", size = 3)+
  geom_point(data = wsd, mapping = aes(x = "Sand hair cut", y = mean(Sand_hair.cut)), colour = "red", size = 3)+
  geom_line()+
  labs(title = "Comparison of mean worm speed with/without hairs on different substrates", x = "Worm condition and Surface type", y = "Mean Speed")+
  theme_bw()

Which gives me the following. I would like to connect the red dots with a red line and the blue dots with a blue line.

Graph created with the above code

Any help much appreciated.

Please see the above. I have tried storing the mean values and then passing them to geom_line(), and copying the arguments used in geom_point() into geom_line(), but to no avail.

Answer by Murad Khalilov

In your case, you don't need to specify the data in every geom call. It is also better not to do calculations inside ggplot: build the final dataset first, then pass it to ggplot.

Below is my solution. I created a sample dataset with four columns (please adjust it to your own column names and dataset format).


library(ggplot2)

# Example data

wsd <- data.frame(mud_interact = c(1,2,3,4,5,6),
                  sand_interact = c(1,3,5,6,8,9),
                  mud_hair_cut = c(2,4,5,6,7,8),
                  sand_hair_cut = c(5,6,7,8,9,9))

Below, I create the final data for the plot by calculating the mean of each column.

# Create a data frame with column means
means <- data.frame(
  variable = names(wsd),
  mean_value = colMeans(wsd)
)
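
As a quick check of this step, here is a minimal sketch assuming the example wsd defined above. colMeans() returns a named numeric vector with one mean per column, in the same order as names(wsd), which is why the two can be paired in one data frame. The name col_means is only used here for illustration.

```r
# Same example data as above (illustrative values, not the asker's real data)
wsd <- data.frame(mud_interact = c(1, 2, 3, 4, 5, 6),
                  sand_interact = c(1, 3, 5, 6, 8, 9),
                  mud_hair_cut = c(2, 4, 5, 6, 7, 8),
                  sand_hair_cut = c(5, 6, 7, 8, 9, 9))

# One mean per column, returned as a named numeric vector
col_means <- colMeans(wsd)
print(round(col_means, 2))

# The names follow the column order, so they line up with names(wsd)
stopifnot(identical(names(col_means), names(wsd)))
```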

You can add a new column if you want to set the colors manually in the plot. Here I use grepl to assign the color name "red" to all interact columns and "blue" to the rest. Doing this up front also means less work inside the ggplot call.

# Determine colors based on variable names
means$color <- ifelse(grepl("interact", means$variable), "red", "blue")

I pass a single dataset (means) to ggplot and map the variable name to the x axis, the mean value to the y axis, and the color column of the final dataset to color. The intended colors also have to be declared in scale_color_manual.

For geom_line, the data is first split by color (so that the interact and hair cut columns are handled separately); after that, it is easy to set the group and the color of each line.

# Create a ggplot scatter plot
ggplot(means, aes(x = variable, y = mean_value, color = color)) +
  geom_point(size = 3) +
  geom_line(data = subset(means, color == "red"), aes(group = 1), linetype = "solid", color = "red") +
  geom_line(data = subset(means, color == "blue"), aes(group = 1), linetype = "solid", color = "blue") +
  scale_color_manual(values = c("red" = "red", "blue" = "blue")) +
  labs(title = "Column Means",
       x = "Variable",
       y = "Mean Value")
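
A possible simplification, sketched under the same assumptions (the example wsd and the color column built with grepl): instead of one geom_line per subset, the group aesthetic can be mapped to the color column, so a single geom_line draws one line per color. scale_color_identity() is swapped in here for scale_color_manual because the column already holds valid color names; this is an alternative, not the code from the answer above.

```r
library(ggplot2)

# Same example data as above (illustrative values only)
wsd <- data.frame(mud_interact = c(1, 2, 3, 4, 5, 6),
                  sand_interact = c(1, 3, 5, 6, 8, 9),
                  mud_hair_cut = c(2, 4, 5, 6, 7, 8),
                  sand_hair_cut = c(5, 6, 7, 8, 9, 9))

# Summarise before plotting: one row per column of wsd
means <- data.frame(variable = names(wsd),
                    mean_value = colMeans(wsd))
means$color <- ifelse(grepl("interact", means$variable), "red", "blue")

# group = color makes geom_line connect only points that share a color
p <- ggplot(means, aes(x = variable, y = mean_value,
                       color = color, group = color)) +
  geom_point(size = 3) +
  geom_line() +
  scale_color_identity() +  # use the color names stored in the column as-is
  labs(title = "Column Means", x = "Variable", y = "Mean Value")

print(p)
```

Because the x axis is discrete, the points are ordered alphabetically by variable name, and each line joins the two points of its own color.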