I have a series of 2D Histograms that I created using the kde2d function of MASS in the following way:
# Loading libraries
library(MASS)
library(RColorBrewer)
# Loading data
data <- as.matrix(read.table('data.dat'))
# Create the 2dhist object
hist_2d <- kde2d(data[,1],data[,2],n = 60, lims=c(-180,180,-180,180))
# Define the color palette
rf <- colorRampPalette(rev(brewer.pal(11,'Spectral')))
r <- rf(60)
# Defining the axis
at_x = seq(-180,180,by=30)
at_y = seq(-180,180,by=30)
# Plot the 2DHistogram
image(hist_2d,col=r,cex.main=3,main='Q68L',axes=F)
axis(1,lwd.ticks=2,at=at_x,labels=T,cex.axis=2)
axis(2,lwd.ticks=2,at=at_y,labels=T,cex.axis=2)
The histogram generated looks like this. How can I identify all the zones with high density (which I marked with the white squares)? The ideal solution would be a function that returns an (x,y) range for every high-density zone, so that it can be applied to several datasets.
Thanks in advance and let me know if you need additional information
With the right representation of the data, this can be done with cluster analysis. Since you do not provide data, I will illustrate with the data used on the kde2d help page: the geyser data. This data gives a pretty clean separation of "high density" areas (like your example pictures), so I will just use a simple k-means clustering.

We need to find the "hot spots". In order to get an idea about what values should be considered "high", we can look at a boxplot of the density values.
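A minimal sketch of this first step, assuming the geyser data and the Sheather-Jones bandwidths (width.SJ) from one of the examples on the kde2d help page. The variable names f2 and z_values are only illustrative.

```r
library(MASS)

# Density estimate on the geyser data (assumed settings, taken from
# the examples on the kde2d help page).
f2 <- kde2d(geyser$duration, geyser$waiting, n = 50,
            lims = c(0.5, 6, 40, 100),
            h = c(width.SJ(geyser$duration), width.SJ(geyser$waiting)))
image(f2)

# f2$z is a 50 x 50 matrix of density values; flatten it and look at
# its distribution to judge what counts as "high".
z_values <- as.vector(f2$z)
boxplot(z_values)
summary(z_values)
```

Most grid cells have a density close to zero, so the interesting values show up as the upper outliers of the boxplot.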
Based on this, I will somewhat arbitrarily use points where the z-value is greater than 0.012. You will need to tune this for your particular problem.
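Selecting the grid points above the cut-off could look like the sketch below. It assumes the same geyser density estimate as on the kde2d help page; threshold, hot and hot_points are illustrative names.

```r
library(MASS)

# Assumed density estimate (geyser data, Sheather-Jones bandwidths).
f2 <- kde2d(geyser$duration, geyser$waiting, n = 50,
            lims = c(0.5, 6, 40, 100),
            h = c(width.SJ(geyser$duration), width.SJ(geyser$waiting)))

threshold <- 0.012  # somewhat arbitrary; tune this for your own data

# Row/column indices of the grid cells whose density exceeds the cut-off.
hot <- which(f2$z > threshold, arr.ind = TRUE)

# Rows of f2$z correspond to f2$x and columns to f2$y.
hot_points <- data.frame(x = f2$x[hot[, 1]], y = f2$y[hot[, 2]])
plot(hot_points, pch = 20, xlim = c(0.5, 6), ylim = c(40, 100))
```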
Now we need to cluster the points and find the x & y ranges for the clusters. First I do it simply and show that the results are reasonable.
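A sketch of the clustering step, again assuming the geyser density estimate and the 0.012 cut-off. The choice of three clusters, the fixed seed and nstart = 25 are assumptions made here so that the example is reproducible.

```r
library(MASS)

# Assumed density estimate and cut-off (geyser data).
f2 <- kde2d(geyser$duration, geyser$waiting, n = 50,
            lims = c(0.5, 6, 40, 100),
            h = c(width.SJ(geyser$duration), width.SJ(geyser$waiting)))
hot <- which(f2$z > 0.012, arr.ind = TRUE)
hot_points <- data.frame(x = f2$x[hot[, 1]], y = f2$y[hot[, 2]])

# k-means on the scaled coordinates, so that x and y contribute
# comparably despite their different units.
set.seed(1)
km <- kmeans(scale(hot_points), centers = 3, nstart = 25)

# Draw the bounding box of each cluster on top of the hot points.
plot(hot_points, pch = 20, xlim = c(0.5, 6), ylim = c(40, 100))
for (i in 1:3) {
  rx <- range(hot_points$x[km$cluster == i])
  ry <- range(hot_points$y[km$cluster == i])
  rect(rx[1], ry[1], rx[2], ry[2])
}
```

If the boxes enclose the visible blobs of hot points, the clustering is doing what we want.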
I am not sure how you want the results presented to you, but one way to get them all in one place is this:
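The sketch below is one possible way to do it, under the same assumptions as before (geyser data, cut-off 0.012, three clusters); the data frame name ranges and its column names are illustrative.

```r
library(MASS)

# Assumed density estimate, cut-off and clustering (geyser data).
f2 <- kde2d(geyser$duration, geyser$waiting, n = 50,
            lims = c(0.5, 6, 40, 100),
            h = c(width.SJ(geyser$duration), width.SJ(geyser$waiting)))
hot <- which(f2$z > 0.012, arr.ind = TRUE)
hot_points <- data.frame(x = f2$x[hot[, 1]], y = f2$y[hot[, 2]])
set.seed(1)
km <- kmeans(scale(hot_points), centers = 3, nstart = 25)

# One row per cluster with the x and y extent of its points.
ranges <- do.call(rbind, lapply(sort(unique(km$cluster)), function(i) {
  pts <- hot_points[km$cluster == i, ]
  data.frame(cluster = i,
             xmin = min(pts$x), xmax = max(pts$x),
             ymin = min(pts$y), ymax = max(pts$y))
}))
ranges
```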
This gives a min and max for x and y for each of the three clusters.
However, I made a few choices here and I wish to point out that I still left some work for you. What you still need to do:
1. You need to choose a cut-off point for how high the density needs to be to get a cluster.
2. Given the points above your cut-off, you will need to say how many clusters you want to generate.
The rest of the machinery is there.
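Since you asked for something reusable across datasets, the steps could be wrapped in a function along these lines. This is only a sketch: dense_zones is a hypothetical helper, not part of MASS, and the two choices listed above remain its arguments.

```r
library(MASS)

# Hypothetical helper: takes a kde2d result, a density cut-off and the
# number of clusters, and returns one bounding box per cluster.
dense_zones <- function(dens, threshold, k) {
  hot <- which(dens$z > threshold, arr.ind = TRUE)
  pts <- data.frame(x = dens$x[hot[, 1]], y = dens$y[hot[, 2]])
  if (nrow(pts) < k) {
    stop("fewer grid points above the threshold than clusters requested")
  }
  km <- kmeans(scale(pts), centers = k, nstart = 25)
  do.call(rbind, lapply(seq_len(k), function(i) {
    p <- pts[km$cluster == i, ]
    data.frame(cluster = i,
               xmin = min(p$x), xmax = max(p$x),
               ymin = min(p$y), ymax = max(p$y))
  }))
}

# Usage on the geyser data (assumed settings, as on the kde2d help page).
set.seed(1)
dens <- kde2d(geyser$duration, geyser$waiting, n = 50,
              lims = c(0.5, 6, 40, 100),
              h = c(width.SJ(geyser$duration), width.SJ(geyser$waiting)))
zones <- dense_zones(dens, threshold = 0.012, k = 3)
zones
```

For your data you would pass your own hist_2d object with a cut-off and a cluster count suited to each dataset.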