I have data consisting of x,y-coordinates and a heading angle that I'd like to divide into 2D bins, in order to calculate the mean heading for each bin and plot it with ggplot's geom_spoke.
Here's an example of what I want to do, with bins created manually:
library(dplyr)
library(ggplot2)

# data
set.seed(1)
dat <- data.frame(x = runif(100, 0, 100), y = runif(100, 0, 100), angle = runif(100, 0, 2 * pi))

# manual binning (>= so that points lying exactly on x = 50 or y = 50 are not dropped)
bins <- rbind(
  # bottom left
  dat %>%
    filter(x < 50 & y < 50) %>%
    summarise(x = 25, y = 25, angle = mean(angle), n = n()),
  # bottom right
  dat %>%
    filter(x >= 50 & y < 50) %>%
    summarise(x = 75, y = 25, angle = mean(angle), n = n()),
  # top left
  dat %>%
    filter(x < 50 & y >= 50) %>%
    summarise(x = 25, y = 75, angle = mean(angle), n = n()),
  # top right
  dat %>%
    filter(x >= 50 & y >= 50) %>%
    summarise(x = 75, y = 75, angle = mean(angle), n = n())
)

# plot
ggplot(bins, aes(x, y)) +
  geom_point() +
  coord_equal() +
  scale_x_continuous(limits = c(0, 100)) +
  scale_y_continuous(limits = c(0, 100)) +
  geom_spoke(aes(angle = angle, radius = n / 2), arrow = arrow(length = unit(0.2, "cm")))
I know how to create 2D bins containing count data for each bin, e.g.:
# heatmap of x,y counts
p <- ggplot(dat, aes(x, y)) +
  geom_bin2d(binwidth = c(50, 50)) +
  coord_equal()
#ggplot_build(p)$data[[1]] #access binned data
But I can't seem to find a way to summarise other variables, such as heading, for each bin before passing them to geom_spoke. Without first binning, my plot looks like this instead: [figure: plot of the unbinned data]
Here's one approach. You only need to set the number and range of bins in each dimension (x and y) once; everything else is derived from those settings by the code:
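A minimal sketch of such a one-time setup, assuming the 0 to 100 ranges from the question and two bins per dimension. The names x.range, y.range, n.bins, x.breaks and x.mids are illustrative choices, not taken from the original answer:

```r
# Hypothetical settings: change these once and the derived values follow
x.range <- c(0, 100)  # assumed extent of x
y.range <- c(0, 100)  # assumed extent of y
n.bins  <- 2          # assumed number of bins per dimension

# Bin edges: n.bins equal-width intervals need n.bins + 1 break points
x.breaks <- seq(x.range[1], x.range[2], length.out = n.bins + 1)
y.breaks <- seq(y.range[1], y.range[2], length.out = n.bins + 1)

# Bin centres: left edge of each interval plus half its width
x.mids <- head(x.breaks, -1) + diff(x.breaks) / 2
y.mids <- head(y.breaks, -1) + diff(y.breaks) / 2
```

Deriving the break points from the range and bin count means a finer grid only requires changing n.bins.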
Automatically assign each row to a bin based on which x & y intervals it falls into:
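One way this assignment might be done is with base R's cut(), which maps each value to the index of the interval containing it. The sketch below recreates the question's data and assumes 50-unit bins; breaks, mids, x.bin and y.bin are illustrative names:

```r
# Data from the question
set.seed(1)
dat <- data.frame(x = runif(100, 0, 100),
                  y = runif(100, 0, 100),
                  angle = runif(100, 0, 2 * pi))

# Assumed bin edges and the matching bin centres
breaks <- seq(0, 100, by = 50)
mids   <- head(breaks, -1) + diff(breaks) / 2

# labels = FALSE makes cut() return integer interval indices;
# include.lowest = TRUE keeps a value equal to the first break in bin 1.
# Indexing mids by those integers labels each row with its bin centre.
dat$x.bin <- mids[cut(dat$x, breaks = breaks, include.lowest = TRUE, labels = FALSE)]
dat$y.bin <- mids[cut(dat$y, breaks = breaks, include.lowest = TRUE, labels = FALSE)]

head(dat)
```

Values outside the range of breaks would come back as NA, so the breaks should span the data.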
Calculate the mean values for each bin:
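A sketch of the per-bin summary using dplyr's group_by() and summarise(), again assuming 50-unit bins and illustrative column names. It computes the arithmetic mean used in the question and, as an optional extra that the question does not ask for, a circular mean, which is often preferred for angles because values near 0 and 2*pi describe almost the same direction:

```r
library(dplyr)

# Data from the question
set.seed(1)
dat <- data.frame(x = runif(100, 0, 100),
                  y = runif(100, 0, 100),
                  angle = runif(100, 0, 2 * pi))

# Assumed bin edges and centres
breaks <- seq(0, 100, by = 50)
mids   <- head(breaks, -1) + diff(breaks) / 2

bins <- dat %>%
  mutate(x.bin = mids[cut(x, breaks, include.lowest = TRUE, labels = FALSE)],
         y.bin = mids[cut(y, breaks, include.lowest = TRUE, labels = FALSE)]) %>%
  group_by(x.bin, y.bin) %>%
  summarise(
    mean.angle = mean(angle),  # arithmetic mean, as in the question
    # circular mean: average the unit vectors, then wrap into [0, 2*pi)
    circ.angle = atan2(mean(sin(angle)), mean(cos(angle))) %% (2 * pi),
    n = n(),
    .groups = "drop"
  )

bins
```

The summary columns get new names (mean.angle, circ.angle) rather than reusing angle, because later expressions inside summarise() see columns created earlier in the same call.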
Plot without hard-coding any bin number / bin width:
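A possible version of the plot, sketched under the same assumptions (50-unit bins, illustrative names). The tile size, axis limits and spoke length are all derived from breaks, so changing the bin settings should not require touching the plotting code. Using geom_tile() with the count as fill is a guess at the intended look, not the original answer's code:

```r
library(dplyr)
library(ggplot2)

# Data from the question
set.seed(1)
dat <- data.frame(x = runif(100, 0, 100),
                  y = runif(100, 0, 100),
                  angle = runif(100, 0, 2 * pi))

# Assumed bin edges, centres and width
breaks    <- seq(0, 100, by = 50)
mids      <- head(breaks, -1) + diff(breaks) / 2
bin.width <- diff(breaks)[1]

bins <- dat %>%
  mutate(x.bin = mids[cut(x, breaks, include.lowest = TRUE, labels = FALSE)],
         y.bin = mids[cut(y, breaks, include.lowest = TRUE, labels = FALSE)]) %>%
  group_by(x.bin, y.bin) %>%
  summarise(mean.angle = mean(angle), n = n(), .groups = "drop")

p <- ggplot(bins, aes(x.bin, y.bin)) +
  geom_tile(aes(fill = n), colour = "white") +
  # spoke length scaled so the fullest bin reaches half a bin width
  geom_spoke(aes(angle = mean.angle, radius = n / max(n) * bin.width / 2),
             arrow = arrow(length = unit(0.2, "cm"))) +
  coord_equal(xlim = range(breaks), ylim = range(breaks))

p
```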
Other details, such as the choice of fill palette, legend label, and plot title, can be tweaked afterwards.