What do the main diagonal plots mean in scatter_matrix from pandas.plotting?


I am a bit confused about how scatter_matrix in the pandas.plotting module works. For example, see the plot below from https://www.geeksforgeeks.org/pair-plots-using-scatter-matrix-in-pandas/

The 3 plots along the main diagonal look like distributions. But the x and y axis labels indicate that each one plots a variable against itself, so shouldn't it be a straight line? Where did the distribution come from?

3x3 scatter matrix


There is 1 best solution below


By default pandas.plotting.scatter_matrix plots histograms on the diagonal. Each histogram shows the counts for just that column of data. Otherwise, as you mentioned, we'd only have (useless) straight lines on the diagonal.

There is a diagonal parameter to choose between a histogram or kernel density:

diagonal : Pick between 'kde' and 'hist' for either Kernel Density Estimation or Histogram plot in the diagonal.
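
As a minimal sketch of the behaviour described above, the snippet below builds a small made-up DataFrame (the column names a, b, c and the random data are assumptions for illustration only) and checks what ends up on the diagonal versus off the diagonal. The Agg backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to show the figure
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Illustrative data only: 200 rows, three independent normal columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["a", "b", "c"])

# diagonal="hist" is the default; hist_kwds is forwarded to the histogram call.
axes = scatter_matrix(df, diagonal="hist", hist_kwds={"bins": 20})

# Diagonal panel for column "a": histogram bars (one rectangle per bin).
diag_bars = axes[0, 0].patches
# Off-diagonal panel: an ordinary scatter plot of one column against another.
offdiag_points = axes[0, 1].collections

# The bar heights are counts, so they add up to the number of rows.
total_count = sum(bar.get_height() for bar in diag_bars)
print(len(diag_bars), total_count, len(offdiag_points))
```

Passing diagonal="kde" instead should draw a smoothed density curve in the same panels; as far as I know that option needs SciPy installed. The axis labels on the diagonal still name the column on both sides only because the panels share the grid's labelling, not because the column is plotted against itself.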