Descending order in ggplot geom_col


Below is the dataset:

# A tibble: 449 x 7
   `Country or Area` `Region 1`      Year   Rate MinCI MaxCI Average
   <chr>             <chr>           <chr> <dbl> <dbl> <dbl>   <dbl>
 1 Afghanistan       Southern Asia   2011    4.2   2.6   6.2    4.4 
 2 Afghanistan       Southern Asia   2016    5.5   3.4   8.1    5.75
 3 Aland Islands     Northern Europe NA     NA    NA    NA     NA   
 4 Albania           Southern Europe 2011   18.8  14.8  23     18.9 
 5 Albania           Southern Europe 2016   21.7  17    26.7   21.8 
 6 Algeria           Northern Africa 2011   24    19.9  28.4   24.2 
 7 Algeria           Northern Africa 2016   27.4  22.5  32.7   27.6 
 8 American Samoa    Polynesia       NA     NA    NA    NA     NA   
 9 Andorra           Southern Europe 2011   24.6  19.8  29.8   24.8 
10 Andorra           Southern Europe 2016   25.6  20.1  31.3   25.7

I need to draw a bar chart with geom_col() from the above dataset to compare the average obesity rate of each region. I also need the bars ordered from highest to lowest.

I have already calculated the average obesity rate (the Average column shown above).

Below is the code I used to generate the plot, but I am unable to figure out how to order the bars from highest to lowest.

region_plot <- ggplot(continent) + aes(x = continent$`Region 1`, y = continent$Average, fill = Average) +
  geom_col() +
  xlab("Region") + ylab("Average Obesity") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("Average obesity rate of each region")
region_plot

There are 2 best solutions below

Best answer

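ggplot2 orders a discrete axis by factor levels (alphabetically for a character column), so the problem can be solved by preprocessing the data: compute the mean of Average per region, sort the result by Average in descending order, then coerce `Region 1` to a factor whose levels follow that sorted order.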

library(ggplot2)
library(dplyr)

continent %>%
  group_by(`Region 1`) %>%
  summarise(Average = mean(Average, na.rm = TRUE)) %>%
  arrange(desc(Average)) %>% 
  mutate(`Region 1` = factor(`Region 1`, levels = unique(`Region 1`))) %>%
  ggplot(aes(x = `Region 1`, y = Average, fill = Average)) +
  geom_col() +
  xlab("Region") + ylab("Average Obesity") +
  ggtitle("Average obesity rate of each region") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) -> region_plot

region_plot
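
As a minimal sketch of why the factor step matters, the base R snippet below compares the default level order with the order of appearance. The `avg` data frame and its `region` column are names made up for this illustration; the three means are computed from the ten sample rows of the question.

```r
# Illustrative data: per-region means, already sorted from highest to lowest
avg <- data.frame(
  region  = c("Northern Africa", "Southern Europe", "Southern Asia"),
  Average = c(25.9, 22.8, 5.075),
  stringsAsFactors = FALSE
)

# Default factor(): levels are sorted alphabetically, so the bars would be too
default_levels <- levels(factor(avg$region))
default_levels
# "Northern Africa" "Southern Asia" "Southern Europe"

# Levels taken in order of appearance keep the descending-by-Average order
sorted_levels <- levels(factor(avg$region, levels = unique(avg$region)))
sorted_levels
# "Northern Africa" "Southern Europe" "Southern Asia"
```

Because the rows are sorted before the factor is built, `unique()` hands the levels to `factor()` in exactly the order the bars should appear.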


Data

continent <- read.table(text = "
 'Country or Area' 'Region 1'      Year   Rate MinCI MaxCI Average
 1 Afghanistan       'Southern Asia'   2011    4.2   2.6   6.2    4.4 
 2 Afghanistan       'Southern Asia'   2016    5.5   3.4   8.1    5.75
 3 'Aland Islands'     'Northern Europe' NA     NA    NA    NA     NA   
 4 Albania           'Southern Europe' 2011   18.8  14.8  23     18.9 
 5 Albania           'Southern Europe' 2016   21.7  17    26.7   21.8 
 6 Algeria           'Northern Africa' 2011   24    19.9  28.4   24.2 
 7 Algeria           'Northern Africa' 2016   27.4  22.5  32.7   27.6 
 8 'American Samoa'    Polynesia       NA     NA    NA    NA     NA   
 9 Andorra           'Southern Europe' 2011   24.6  19.8  29.8   24.8 
10 Andorra           'Southern Europe' 2016   25.6  20.1  31.3   25.7
", header = TRUE, check.names = FALSE)

Your data has several rows per region, so to show the average per region you must compute it first and then plot it. You can do that with dplyr using group_by() and summarise(). The shared sample is small and contains NA rows; regions with no data should not end up in the plot, which is what the filter() step takes care of. The code below uses part of the shared data, with the region column named Region, so be careful with column names when using your real data. The reorder() function arranges the bars, and negating the average gives descending order. Here is the code:

library(dplyr)
library(ggplot2)

# Average per region, dropping regions that have no data, then plot
df %>%
  group_by(Region) %>%
  summarise(Avg = mean(Average, na.rm = TRUE)) %>%
  filter(!is.na(Avg)) %>%
  # reorder() with -Avg sorts the bars from highest to lowest
  ggplot(aes(x = reorder(Region, -Avg), y = Avg, fill = Region)) +
  geom_col() +
  xlab("Region") + ylab("Average Obesity") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  ggtitle("Average obesity rate of each region")
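
To make the two less obvious steps concrete, here is a small base R sketch, separate from the plotting code. The `region` and `avg` vectors are hypothetical names for this illustration, with means computed from the sample rows.

```r
# Illustrative vectors: region names and their mean Average
region <- c("Southern Asia", "Northern Africa", "Southern Europe")
avg    <- c(5.075, 25.9, 22.8)

# reorder() returns a factor whose levels are sorted by increasing score;
# negating the score therefore puts the largest average first
desc_levels <- levels(reorder(region, -avg))
desc_levels
# "Northern Africa" "Southern Europe" "Southern Asia"

# A region whose values are all NA averages to NaN with na.rm = TRUE,
# and is.na() is TRUE for NaN, so filter(!is.na(Avg)) drops such regions
all_na_mean <- mean(c(NA_real_, NA_real_), na.rm = TRUE)
is.na(all_na_mean)
# TRUE
```

Dropping the minus sign in `reorder(region, avg)` would give the ascending order instead.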


Some data used:

#Data
df <- structure(list(Region = c("Southern Asia", "Southern Asia", "Northern Europe", 
"Southern Europe", "Southern Europe", "Northern Africa", "Northern Africa", 
"Polynesia", "Southern Europe", "Southern Europe"), Year = c(2011L, 
2016L, NA, 2011L, 2016L, 2011L, 2016L, NA, 2011L, 2016L), Rate = c(4.2, 
5.5, NA, 18.8, 21.7, 24, 27.4, NA, 24.6, 25.6), MinCI = c(2.6, 
3.4, NA, 14.8, 17, 19.9, 22.5, NA, 19.8, 20.1), MaxCI = c(6.2, 
8.1, NA, 23, 26.7, 28.4, 32.7, NA, 29.8, 31.3), Average = c(4.4, 
5.75, NA, 18.9, 21.8, 24.2, 27.6, NA, 24.8, 25.7)), row.names = c(NA, 
-10L), class = "data.frame")