Stacked bar plot with points, but with different aesthetics lengths - ggplot2

I have a data frame that I reshaped with the melt function to get this (118 rows):

 record_id          value Values
1           8     int_to_out     20
2          14     int_to_out     32
3           5     int_to_out     22
4           6     int_to_out     19
5          31     int_to_out     15
6          48     int_to_out     20
7         100     int_to_out     30
...       ...        ...        ...
113        87 symptom_to_int      7
114        72 symptom_to_int      4
115        99 symptom_to_int      3
116       102 symptom_to_int     36
117       103 symptom_to_int     13
118       111 symptom_to_int      6

I made a stacked bar plot from this data.

The plot has 59 y elements, and I need to add points to them based on the original (non-melted) data. So I wrote this:

ggplot(t, aes(y=as.factor(record_id), x=Values, fill=value)) + 
    geom_bar(position=position_stack(reverse= TRUE), stat="identity") +
    geom_point(data = new_df, aes(x=sorolog, y = record_id), 
                colour = "#a81802", size = 4, shape = 1)

The sorolog column used for x has 59 values, one for each of the 59 IDs found in record_id.

But when I run it I get this:

    Error: Aesthetics must be either length 1 or the same as the data (59): fill
    Run `rlang::last_error()` to see where the error occurred.

I believe this is a conflict with the melted data, since its length is double that of the original data frame.

The question is: How can I add the points despite this difference in aesthetics length?

Another problem: How can I add a second legend to the plot?

I used this code:

ggplot() + 
    geom_bar(data=t, aes(y=as.factor(record_id), x=Values, fill=value), 
        position=position_stack(reverse= FALSE), stat="identity", width = 0.5) +
        scale_fill_manual(values = c("brown1","chocolate1"),name = "", 
            labels = c("Hospitalization to Discharge", "Symptom to Hospitalization")) +
    geom_point(data = new_df, aes(x=sorolog, y = as.factor(record_id)), 
                colour = "darkcyan", size = 5, shape = 1)+
    geom_point(data = new_df, aes(x=final, y = as.factor(record_id)), 
                colour = "darkred", size = 4, shape = 16)+

        theme_minimal()+
    labs(title="Patient timeline - from symptoms to hospitalization and discharge",
        x ="Days", y = "Patient ID")+
    theme(text = element_text(family = "Garamond", color = "grey20"))

This draws the bars and both sets of points, but I can't add a legend for the geom_point elements. How can I do it?

EDIT

With the edit from Dave Armstrong I got this:

[image]

BEST ANSWER

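Without access to the data you'll have to confirm, but if you remove the data and aesthetics from ggplot() and put them in geom_bar(), it should work. Aesthetics set in ggplot() are inherited by every layer, so geom_point() was also trying to use fill = value with the 59-row new_df: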

ggplot() + 
    geom_bar(data=t, aes(y=as.factor(record_id), x=Values, fill=value), 
        position=position_stack(reverse= TRUE), stat="identity") +
    # use as.factor() here too, so the points share the bars' discrete y scale
    geom_point(data = new_df, aes(x=sorolog, y = as.factor(record_id)), 
                colour = "#a81802", size = 4, shape = 1)
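
If you would rather keep the data and aesthetics in ggplot(), another option that should work is to tell the point layer not to inherit them, using inherit.aes = FALSE. Here is a minimal sketch of that. The data frames bars and pts are made-up stand-ins for t and new_df, and the numbers in them are invented for illustration only:

```r
library(ggplot2)

# Toy stand-ins for the melted data (t) and the original data (new_df);
# the values are invented for illustration only.
bars <- data.frame(
  record_id = rep(c(5, 8, 14), times = 2),
  value     = rep(c("int_to_out", "symptom_to_int"), each = 3),
  Values    = c(22, 20, 32, 7, 4, 3)
)
pts <- data.frame(record_id = c(5, 8, 14), sorolog = c(10, 12, 25))

p <- ggplot(bars, aes(y = as.factor(record_id), x = Values, fill = value)) +
  geom_bar(position = position_stack(reverse = TRUE), stat = "identity") +
  # inherit.aes = FALSE stops this layer from picking up fill = value,
  # which has no matching column in pts
  geom_point(data = pts, aes(x = sorolog, y = as.factor(record_id)),
             inherit.aes = FALSE,
             colour = "#a81802", size = 4, shape = 1)

print(p)
```

Either approach gives the same result. Moving the aesthetics into geom_bar() is simpler when only one layer needs them, while inherit.aes = FALSE is handy when most layers share the global mapping and only one layer differs.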

EDIT

I am adding an answer to the question about adding a color legend for the points. I also mapped size and shape for the points.

ggplot() + 
  geom_bar(data=t, aes(y=as.factor(record_id), x=Values, fill=value), 
           position=position_stack(reverse= FALSE), stat="identity", width = 0.5) +
  scale_fill_manual(values = c("brown1","chocolate1"),name = "", 
                    labels = c("Hospitalization to Discharge", "Symptom to Hospitalization")) +
  geom_point(data = new_df, aes(x=sorolog, y = as.factor(record_id), colour="Point Label 1",
                                size="Point Label 1", shape="Point Label 1")) +  
  geom_point(data = new_df, aes(x=final, y = as.factor(record_id), colour="Point Label 2", 
                                size="Point Label 2", shape="Point Label 2")) + 
  scale_colour_manual("points", values=c("Point Label 1" = "darkcyan", "Point Label 2" = "darkred"), 
                      labels= c("Point Label 1", "Point Label 2")) + 
  scale_shape_manual("points", values=c("Point Label 1" = 1, "Point Label 2" = 16), 
                      labels= c("Point Label 1", "Point Label 2")) + 
  scale_size_manual("points", values=c("Point Label 1" = 5, "Point Label 2" = 4), 
                     labels= c("Point Label 1", "Point Label 2")) + 
  theme_minimal()+
  labs(title="Patient timeline - from symptoms to hospitalization and discharge",
       x ="Days", y = "Patient ID")+
  theme(text = element_text(family = "Garamond", color = "grey20"))

The trick here is to map all of the point attributes (colour, size and shape) inside aes() to the same constant labels. The vectors supplied to values need to be named vectors whose names match those labels. Because the three scales share the same title and the same labels, ggplot2 merges them into a single legend. I found this post helpful in putting the pieces together.

The main idea is that you have to add a colour aesthetic to the points, but it doesn't have to come from a variable in the data frame. You can make it up on the fly.
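
To see that idea in isolation, here is a minimal sketch with only the point layers. The data frame pts and its numbers are invented for illustration. The columns sorolog and final only mimic the ones in the question:

```r
library(ggplot2)

# Invented data for illustration; sorolog and final mimic the question's columns
pts <- data.frame(record_id = c(5, 8, 14),
                  sorolog   = c(10, 12, 25),
                  final     = c(20, 18, 30))

p <- ggplot(pts, aes(y = as.factor(record_id))) +
  # The quoted strings inside aes() are constants, not column names;
  # each distinct string becomes one legend key
  geom_point(aes(x = sorolog, colour = "Point Label 1", shape = "Point Label 1"),
             size = 4) +
  geom_point(aes(x = final, colour = "Point Label 2", shape = "Point Label 2"),
             size = 4) +
  # Same title ("points") and same names on both scales, so the legends merge
  scale_colour_manual("points",
                      values = c("Point Label 1" = "darkcyan",
                                 "Point Label 2" = "darkred")) +
  scale_shape_manual("points",
                     values = c("Point Label 1" = 1, "Point Label 2" = 16)) +
  labs(x = "Days", y = "Patient ID")

print(p)
```

If the scale titles or labels differ between scale_colour_manual() and scale_shape_manual(), you should expect two separate legends instead of one combined legend.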