Fewer colors than specified by num_colors in choroplethr


num_colors often doesn't seem to be respected. A simple case of 9 states with 7 different values:

> df
      region value
1    alabama     1
2    wyoming     5
3    arizona     5
4   arkansas     5
5 california     8
6   colorado    15
7       iowa    22
8       ohio    29
9    florida    36
> dput(df)
structure(list(region = c("alabama", "wyoming", "arizona", "arkansas", 
"california", "colorado", "iowa", "ohio", "florida"), value = c(1, 
5, 5, 5, 8, 15, 22, 29, 36)), class = "data.frame", row.names = c(NA, 
-9L))

A simple map where num_colors is 9 using a brewer color scale yields a legend with a separate color for each of the 7 values (Alaska and Hawaii don't work with this method, but that's another issue):

library(choroplethr)
library(ggplot2)

g <- state_choropleth(df, num_colors = 9)
gg <- g + scale_fill_brewer(name="Count",palette="YlOrRd", drop=FALSE, na.value="grey")
gg

[Figure: 9 colors]

If I drop the number of colors to 7, the actual number of unique values in the data, the legend has only 5 colors. Two sets of values are binned, instead of none.

[Figure: 7 colors specified; 5 are used]

If I drop further to 5 colors, only 4 get used.

[Figure: 5 colors specified; 4 are used]

Specifying 6 colors results in 5, as 7 does, but binned differently from 7.

I can force it to use all 7 colors if I cut the data according to the values, in which case a lower num_colors value is ignored:

df$value <- cut(df$value, breaks = c(0, unique(sort(df$value))))

[Figure: number of colors forced by cuts]

My question, then, is why the specified number of colors isn't respected, and whether there is a way to force it.

TIA.

Answer by Ari

I think there are several things going on here.

The easiest issue for me to address is that Hawaii and Alaska are not getting your custom scale applied to them (and so appear black). This is because choroplethr uses ggplot2's "custom annotation" feature to render them separately and then manually places them where Mexico is. Custom annotations in ggplot2 work in such a way that your + scale_fill_brewer() call is applied only to the main image, not to the annotations.

The way to get a custom scale to apply to all 3 images (continental US, Alaska and Hawaii) simultaneously is to use choroplethr's object-oriented features.

To see how they work, first look at how state_choropleth actually works:


> state_choropleth # no parentheses

function(df, title="", legend="", num_colors=7, zoom=NULL, reference_map = FALSE)
{
  c = StateChoropleth$new(df)
  c$title  = title
  c$legend = legend
  c$set_num_colors(num_colors)
  c$set_zoom(zoom)
  if (reference_map) {
    if (is.null(zoom))
    {
      stop("Reference maps do not currently work with maps that have insets, such as maps of the 50 US States.")
    }
    c$render_with_reference_map()
  } else {
    c$render()
  }
}
<bytecode: 0x7fde42ffd910>
<environment: namespace:choroplethr>

And here is how to use those features to apply a custom scale to the annotations as well as the continental US:

c = StateChoropleth$new(df)
c$set_num_colors(9)
c$ggplot_scale = scale_fill_brewer(name="Count",palette="YlOrRd", drop=FALSE, na.value="grey")
c$render()


I'm sorry, but I don't exactly understand the other issue you are reporting. One thing I will say is that binning was one of the more complex features to implement (I am the author of choroplethr). It's not just the math of doing the binning; it's also making the labels look nice. You can see the code for it here.
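One plausible reading of the "fewer colors than requested" behavior, offered as an assumption rather than a confirmed description of choroplethr's internals: if numeric values are split into equal-count (quantile) bins, tied values such as the three 5s in the question produce duplicate breakpoints, and some requested bins end up merged or empty. The sketch below uses only base R (`quantile` and `cut`); `occupied_bins` is a hypothetical helper written for illustration and is not part of choroplethr, whose real binning code may differ in its details.

```r
# Values from the question's data frame (9 states, 7 distinct values).
value <- c(1, 5, 5, 5, 8, 15, 22, 29, 36)

# Hypothetical helper (NOT choroplethr code): request n_bins quantile bins
# and report how many of them actually contain data.
occupied_bins <- function(x, n_bins) {
  probs  <- seq(0, 1, length.out = n_bins + 1)
  breaks <- quantile(x, probs = probs, names = FALSE)
  breaks <- unique(breaks)  # tied values collapse into duplicate breakpoints
  bins   <- cut(x, breaks = breaks, include.lowest = TRUE)
  nlevels(droplevels(bins)) # count only bins that hold at least one value
}

requested <- c(5, 7, 9)
used <- sapply(requested, function(n) occupied_bins(value, n))
data.frame(requested = requested, used = used)
```

With this data the `used` column should come out below `requested` in every row, which is the same kind of shortfall the question describes. If that is indeed the mechanism, no value of num_colors can guarantee one color per distinct value, and treating the values as categories is the more direct route.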

One option in your case might be to convert the numbers to factors, and then feed them to choroplethr:

df$value = as.factor(df$value)
state_choropleth(df)
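As a quick sanity check before plotting, it can help to confirm how many categories each approach produces. This is a base R sketch using the values from the question. It only inspects the factor levels and makes no claim about how choroplethr renders them.

```r
# Values from the question's data frame.
value <- c(1, 5, 5, 5, 8, 15, 22, 29, 36)

# Approach from the answer: one level per distinct value.
as_factor <- as.factor(value)
levels(as_factor)
nlevels(as_factor)

# Approach from the question: one interval per distinct value,
# using 0 as an extra lower breakpoint.
via_cut <- cut(value, breaks = c(0, unique(sort(value))))
nlevels(via_cut)
table(via_cut)  # every interval should hold at least one value
```

Both approaches give one category per distinct value. The difference is in the legend labels: as.factor labels each category with the value itself, while cut labels it with an interval such as (1,5].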


I should also say that I cover issues like these in my online courses on choroplethr, all of which are now free. If you are interested in taking them, you can learn more here.