"re-start" color range in ggplot

In my scatterplot, I want to use two different point shapes depending on the "type" column, and a range of fill colors depending on the "probe" column. However, I don't want the color range to run continuously across all probes; I want it to restart for every "type". I also want this to be reflected in the legend. Here is what I have now:

library(data.table)
library(ggplot2)

mydat <- fread(
"N,type,probe,value
1,A,Abc,2
1,A,Ade,4
1,B,Bfg,3
2,A,Ade,4
2,B,Bhi,3
3,B,Bfg,3
3,A,Axy,2
4,A,Ade,5
5,B,Bfg,2
5,A,Ade,1
6,A,Abc,1
6,B,Bhi,4
  ")

ggplot(mydat,
       aes(x=N, y=value, fill=probe, shape=type, label=probe)) +
  geom_point(size=4, alpha=0.8) +
  scale_shape_manual(values=c(21,22)) +
  scale_fill_discrete(name='probe') +
  geom_text(vjust=-0.8) +
  guides(fill=guide_legend(override.aes=list(shape=24))) 

[image: sample plot]

And here is what I want: [image: desired view]

(In this example, "Abc" has the same color as "Bfg", and so on, but I don't actually care about the exact correspondence of colors; I only need the scale to restart for each new "type".)

Best answer

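This is a rather non-standard use of variable mapping, so you may want to rethink the design of your plot (I'm with @Chase here). But if that's really what you want, then you'll have to precompute the palette manually, like so. The unnamed palette is assigned to the probes in alphabetical order, so this assumes that sorting the probes alphabetically groups them by type in the same order in which the loop visits the types (true here, since every probe name starts with its type letter). You'll need additional ordering if that's not the case.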

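library(scales)
# Build the palette type by type, restarting the hue scale for each type
pal <- character(0)
for (tp in unique(mydat$type))
{
  # number of distinct probes within this type
  n <- length(unique(subset(mydat, type == tp)$probe))
  pal <- c(pal, hue_pal()(n))
}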

ggplot(mydat,
       aes(x=N, y=value, fill=probe, shape=type, label=probe)) +
  geom_point(size=4, alpha=0.8) +
  scale_shape_manual(values=c(21, 22)) +
  scale_fill_manual(name='probe', values = pal) +
  geom_text(vjust=-0.8) +
  guides(fill=guide_legend(override.aes=list(shape=24)))
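
If the probe names do not sort neatly by type, one way to avoid relying on ordering is to give the palette names, since scale_fill_manual() matches a named values vector to the levels by name. The following is only a sketch under that assumption: named_pal, probes and pal_list are names introduced here for illustration, and the sample data is re-created as a plain data.frame so the snippet runs on its own.

```r
library(scales)

# Sample data from the question, reduced to the two columns needed here
mydat <- data.frame(
  type  = c("A", "A", "B", "A", "B", "B", "A", "A", "B", "A", "A", "B"),
  probe = c("Abc", "Ade", "Bfg", "Ade", "Bhi", "Bfg",
            "Axy", "Ade", "Bfg", "Ade", "Abc", "Bhi"),
  stringsAsFactors = FALSE
)

# One row per distinct (type, probe) pair
probes <- unique(mydat[, c("type", "probe")])

# Restart the hue palette within each type and name each color by its probe
pal_list <- lapply(split(probes$probe, probes$type), function(p) {
  p <- sort(p)
  setNames(hue_pal()(length(p)), p)
})

# Flatten to a single named character vector (names are the probe names)
named_pal <- unlist(unname(pal_list))
```

Passing values = named_pal to scale_fill_manual() in the plot above should then color each probe by name, regardless of how the probe names sort.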

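Edit: for consistent colors across types (the k-th probe of every type gets the same color), use e.g. the following adjustment. Note that max_pal must be at least the largest number of probes in any single type, otherwise the palette will contain NA values.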

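pal <- character(0)
# size of the shared palette; must be >= the largest number of probes per type
max_pal <- 5
for (tp in unique(mydat$type))
{
  n <- length(unique(subset(mydat, type == tp)$probe))
  # take the first n colors of the same max_pal-color palette for every type
  pal <- c(pal, hue_pal()(max_pal)[1:n])
}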
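
Rather than hard-coding max_pal, it could plausibly be derived from the data as the largest number of distinct probes in any type. This is a minimal sketch, not part of the original answer: n_per_type is a name introduced here for illustration, the sample data is re-created as a plain data.frame, and the same alphabetical-ordering assumption as above still applies because the palette is unnamed.

```r
library(scales)

# Sample data from the question, reduced to the two columns needed here
mydat <- data.frame(
  type  = c("A", "A", "B", "A", "B", "B", "A", "A", "B", "A", "A", "B"),
  probe = c("Abc", "Ade", "Bfg", "Ade", "Bhi", "Bfg",
            "Axy", "Ade", "Bfg", "Ade", "Abc", "Bhi"),
  stringsAsFactors = FALSE
)

# Number of distinct probes in each type, as a named integer vector
n_per_type <- vapply(split(mydat$probe, mydat$type),
                     function(p) length(unique(p)), integer(1))

# The shared palette only needs as many colors as the largest type
max_pal <- max(n_per_type)

pal <- character(0)
for (tp in unique(mydat$type))
{
  n <- n_per_type[[tp]]
  # first n colors of the shared palette, so colors line up across types
  pal <- c(pal, hue_pal()(max_pal)[seq_len(n)])
}
```

With a smaller max_pal the hues are spread further apart than with a fixed value of 5, which may make the probes within a type easier to tell apart.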
