Creating a population pyramid plot in R using ggplot2


I tried to follow the example given here:


n1 <- ggplot(nigeria, aes(x = Age, y = Population, fill = Gender)) + 
  geom_bar(subset = .(Gender == "Female"), stat = "identity") + 
  geom_bar(subset = .(Gender == "Male"), stat = "identity") + 
  scale_y_continuous(breaks = seq(-15000000, 15000000, 5000000), 
                     labels = paste0(as.character(c(seq(15, 0, -5), seq(5, 15, 5))), "m")) + 
  coord_flip() + 
  scale_fill_brewer(palette = "Set1") + 
  theme_bw()

n1

However, my data is set up a little differently: males and females are in separate columns. Is there a way to create a population pyramid using the following data?

pop_1950 = data.frame (age_groups = c("0-4", "5 - 12", "13 - 18", "19 - 24", "25 - 34", "35  - 44", "45 - 54", "55 - 64", "65 - 74", "75 - 84", "85 - 94", "95+"),
                       females = c(151272.13, 207176.23, 138778.36, 115109.06, 192698.05,  18232.01, 156810.06, 124283.91, 105981.35,  48945.70,  7273.47,   301.96),
                       males = c(158878.66, 215774.86, 148482.68, 123611.00, 194782.15,  19387.82, 163137.82, 126669.64, 104974.21,  46382.39,   5146.77,   170.79))


There are 4 best solutions below


You need to convert your data to long format: instead of separate females and males columns, you have one 'Sex' column and one 'Number' column.

library(tidyr)
library(ggplot2)

pop_1950 %>%
  pivot_longer(cols = c('females', 'males'), names_to = 'Sex', values_to = 'Number') %>%
  ggplot(...)

Converting data to long format is often useful for ggplot2 in general.
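
To make the reshaping step concrete, here is a minimal sketch that carries the long-format data through to a pyramid. The data is copied from the question; the names pop_long, Signed and pyramid are just illustrative, and putting males on the left (negative side) is an assumption, not something the question requires.

```r
library(tidyr)
library(ggplot2)

# Data copied from the question
pop_1950 <- data.frame(
  age_groups = c("0-4", "5 - 12", "13 - 18", "19 - 24", "25 - 34", "35  - 44",
                 "45 - 54", "55 - 64", "65 - 74", "75 - 84", "85 - 94", "95+"),
  females = c(151272.13, 207176.23, 138778.36, 115109.06, 192698.05, 18232.01,
              156810.06, 124283.91, 105981.35, 48945.70, 7273.47, 301.96),
  males = c(158878.66, 215774.86, 148482.68, 123611.00, 194782.15, 19387.82,
            163137.82, 126669.64, 104974.21, 46382.39, 5146.77, 170.79)
)

# Reshape to long format: one row per age group and sex
pop_long <- pivot_longer(pop_1950, cols = c(females, males),
                         names_to = "Sex", values_to = "Number")

# Keep age groups in their original order (character values would otherwise
# be sorted alphabetically on the axis)
pop_long$age_groups <- factor(pop_long$age_groups,
                              levels = unique(pop_long$age_groups))

# Assumption: males are drawn on the left, so their counts are negated
pop_long$Signed <- ifelse(pop_long$Sex == "males",
                          -pop_long$Number, pop_long$Number)

pyramid <- ggplot(pop_long, aes(x = age_groups, y = Signed, fill = Sex)) +
  geom_col() +
  # Label both sides of zero with absolute values, in thousands
  scale_y_continuous(labels = function(v) abs(v) / 1000) +
  coord_flip() +
  labs(x = "Age", y = "Population (000's)") +
  theme_minimal()

pyramid
```

With the data in long format, a single geom_col() layer and one fill mapping are enough; the legend labels come straight from the Sex column.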


We can also use gather() from tidyr (superseded by pivot_longer(), but it still works):

library(dplyr)
library(tidyr)

pop_1950 %>%
  gather(Sex, Number, females, males)

After some experimenting, I came up with the following solution that works with my data:

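library(ggplot2)

# Fix the factor levels so the age groups keep their original order
# instead of being sorted alphabetically
ggplot(pop_1950, aes(x = factor(age_groups, levels = age_groups))) +
  # Map fill to the group name so the legend labels match the bars
  geom_col(aes(y = females, fill = "Female")) +
  # Negate the male counts so their bars extend to the left of zero
  geom_col(aes(y = -males, fill = "Male")) +
  scale_y_continuous(breaks = seq(-200000, 200000, 50000),
                     labels = as.character(abs(seq(-200, 200, 50)))) +
  coord_flip() +
  scale_fill_discrete(name = "Gender") +
  xlab("Age") +
  ylab("Population (000's)") +
  theme_minimal()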
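
The axis breaks above are hard-coded to ±200,000, while the largest male count in the data is 215,774.86, so the longest bar runs past the last labelled tick. As a rough sketch, the breaks could instead be derived from the data. symmetric_breaks is a hypothetical helper written for this example, not a ggplot2 function.

```r
# Hypothetical helper: evenly spaced breaks that are symmetric around zero
# and wide enough to cover the largest absolute value
symmetric_breaks <- function(values, step) {
  # Round the largest absolute value up to the next multiple of `step`
  limit <- ceiling(max(abs(values)) / step) * step
  seq(-limit, limit, by = step)
}

# Largest female count and (negated) largest male count from the question
breaks <- symmetric_breaks(c(207176.23, -215774.86), step = 50000)

# Axis labels: absolute values, expressed in thousands
labels <- abs(breaks) / 1000
```

These vectors can then be passed on as scale_y_continuous(breaks = breaks, labels = labels), so the axis still covers the data if the counts change.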


You can try reshape2::melt() as shown here:

reshape2::melt(pop_1950,
  id.vars = "age_groups",
  value.name = "Number",
  variable.name = "Sex"
)

which gives

   age_groups     Sex    Number
1         0-4 females 151272.13
2      5 - 12 females 207176.23
3     13 - 18 females 138778.36
4     19 - 24 females 115109.06
5     25 - 34 females 192698.05
6    35  - 44 females  18232.01
7     45 - 54 females 156810.06
8     55 - 64 females 124283.91
9     65 - 74 females 105981.35
10    75 - 84 females  48945.70
11    85 - 94 females   7273.47
12        95+ females    301.96
13        0-4   males 158878.66
14     5 - 12   males 215774.86
15    13 - 18   males 148482.68
16    19 - 24   males 123611.00
17    25 - 34   males 194782.15
18   35  - 44   males  19387.82
19    45 - 54   males 163137.82
20    55 - 64   males 126669.64
21    65 - 74   males 104974.21
22    75 - 84   males  46382.39
23    85 - 94   males   5146.77
24        95+   males    170.79