Mahalanobis distance between two vectors in MATLAB


I have the following two vectors and am trying to find the Mahalanobis distance between them:

A=[2,4,5,7];
B=[6,3,8,1];

For calculating the Mahalanobis distance, I did the following:

> mahal(A(:),B(:))

For that, I got the following results:

0.6466
0.0259
0.0259
0.6466

But how can I get a single value, as you do when calculating the Euclidean distance, for instance?

Thanks.

There are 1 best solutions below

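The Mahalanobis distance is the distance of a point from the mean of a distribution, scaled by that distribution's covariance. Without a reference distribution there is no covariance to scale by, so the measure becomes similar (not equal) to the Euclidean distance.

According to the MATLAB documentation: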

mahal(Y,X) computes the Mahalanobis distance (in squared units) of each observation in Y from the reference sample in matrix X. If Y is n-by-m, where n is the number of observations and m is the dimension of the data, d is n-by-1. X and Y must have the same number of columns, but can have different numbers of rows. X must have more rows than columns.
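To make that description concrete, here is a minimal sketch that rebuilds the same quantity by hand for the vectors in the question, using only base MATLAB functions. The variable names (mu, S, D, d) are chosen here for illustration. Because A(:) and B(:) are column vectors, they are treated as four one-dimensional observations measured against a one-dimensional reference sample, which is why four values come back rather than one.

```matlab
% Manual version of mahal(Y,X): squared distance of each row of Y
% from the mean of the reference sample X, scaled by the covariance of X.
A = [2,4,5,7];
B = [6,3,8,1];

Y = A(:);                     % 4 observations, 1 dimension each
X = B(:);                     % reference sample: 4 observations, 1 dimension

mu = mean(X,1);               % mean of the reference sample (1-by-m)
S  = cov(X);                  % covariance of the reference sample (m-by-m)
D  = bsxfun(@minus, Y, mu);   % center each observation on the reference mean
d  = sum((D / S) .* D, 2);    % (y-mu)*inv(S)*(y-mu)' for each row of Y

disp(d)                       % 0.6466  0.0259  0.0259  0.6466 (as a column)
```

These are the four numbers reported in the question: each element of A is compared with the mean of B (4.5) and divided by the variance of B (about 9.67).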

So you will have something like the following, where you can compare the Mahalanobis and Euclidean distances:

X = mvnrnd([0;0],[1 .9;.9 1],100);
Y = [1 1;1 -1;-1 1;-1 -1];

d1 = mahal(Y,X) % Mahalanobis (it still gives one value per observation in Y)
d1 =
    1.3592
   21.1013
   23.8086
    1.4727

d2 = sum((Y-repmat(mean(X),4,1)).^2, 2) % Squared Euclidean distance from mean(X)
d2 =
    1.9310
    1.8821
    2.1228
    2.0739

% Plotting the data makes the difference easier to understand
scatter(X(:,1),X(:,2))
hold on
scatter(Y(:,1),Y(:,2),100,d1,'*','LineWidth',2)
hb = colorbar;
ylabel(hb,'Mahalanobis Distance')
legend('X','Y','Location','NW')

[Figure: scatter plot of X, with the four points of Y drawn as stars and colored by their Mahalanobis distance]

Mahalanobis distance (or "generalized squared interpoint distance" for its squared value) can also be defined as a dissimilarity measure between two random vectors x and y of the same distribution with the covariance matrix S:

$$d(\vec{x},\vec{y}) = \sqrt{(\vec{x}-\vec{y})^{T}\, S^{-1}\, (\vec{x}-\vec{y})}$$

If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance. If the covariance matrix is diagonal, then the resulting distance measure is called a normalized Euclidean distance:

$$d(\vec{x},\vec{y}) = \sqrt{\sum_{i} \frac{(x_i - y_i)^2}{s_i^2}}$$

where s_i is the standard deviation of x_i and y_i over the sample set.
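To get a single value for the two vectors in the question, the two-vector definition, sqrt((x-y)*inv(S)*(x-y)'), can be evaluated directly once a covariance matrix S is available. The sketch below uses only base MATLAB. The standard deviations in s are hypothetical values invented for illustration, since the question supplies no distribution; in practice S would have to come from the data the vectors were drawn from.

```matlab
x = [2,4,5,7];
y = [6,3,8,1];
delta = x - y;                              % difference between the two vectors

% Hypothetical per-dimension standard deviations (NOT from the question).
s = [2 1 3 6];
S = diag(s.^2);                             % diagonal covariance matrix

dMahal = sqrt((delta / S) * delta');        % general form: sqrt(delta*inv(S)*delta')
dNorm  = sqrt(sum(delta.^2 ./ s.^2));       % normalized Euclidean form, same value
dEucl  = sqrt((delta / eye(4)) * delta');   % identity covariance: equals norm(x-y)

fprintf('%.4f  %.4f  %.4f\n', dMahal, dNorm, dEucl);
```

With a diagonal S the general and normalized forms agree, and with the identity matrix the result is the plain Euclidean distance norm(x-y). Using delta / S instead of delta * inv(S) avoids forming the inverse explicitly.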