My objective is to replicate the functionality of pdist() from SciPy in Julia. I tried using the Distances.jl package to compute pairwise distances between observations. However, the results are not the same, as seen in the example below.
Python Example:
from scipy.spatial.distance import pdist
a = [[1,2], [3,4], [5,6], [7,8]]
b = pdist(a)
print(b)
output --> array([2.82842712, 5.65685425, 8.48528137, 2.82842712, 5.65685425, 2.82842712])
Julia Example:
using Distances
a = [1 2; 3 4; 5 6; 7 8]
dist_function(x) = pairwise(Euclidean(), x, dims = 1)
dist_function(a)
output -->
4×4 Array{Float64,2}:
0.0 2.82843 5.65685 8.48528
2.82843 0.0 2.82843 5.65685
5.65685 2.82843 0.0 2.82843
8.48528 5.65685 2.82843 0.0
With reference to the above examples:
- Does pdist() from SciPy in Python have its metric set to Euclidean by default?
- How may I approach this problem to replicate the results in Julia?
Please suggest a solution to resolve this problem.
Documentation reference for pdist() :--> https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.pdist.html
Thanks in advance!!
According to the documentation page you linked, to get the same form as Julia from Python (yes, I know, this is the reverse of your question), you can pass the result to squareform. In your example, that means calling squareform(b).
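A minimal sketch of that step, reusing the variable names a and b from the question (the name square is only for illustration). It assumes SciPy is installed and shows the condensed vector from pdist being expanded into the full symmetric matrix:

```python
from scipy.spatial.distance import pdist, squareform

a = [[1, 2], [3, 4], [5, 6], [7, 8]]

# pdist returns the condensed form: one entry per unordered pair of rows
b = pdist(a)

# squareform expands it into a 4x4 symmetric matrix with zeros on the diagonal
square = squareform(b)
print(square)
```

The printed matrix should have the same layout as the Julia pairwise output in the question.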
Also, yes, from the same documentation page, you can see that the 'metric' parameter defaults to 'euclidean' if not explicitly defined.
For the reverse situation, note that the Python vector is simply all the elements above the diagonal of the distance matrix, read row by row (for a 'proper' distance metric, the resulting distance matrix is symmetric, so the elements below the diagonal are the same values).
So you can simply collect the elements above the diagonal, row by row, into a vector.
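To make the ordering concrete, here is a rough sketch in plain Python (standard library only, Python 3.8+ for math.dist). The helper name condensed_from_square is hypothetical and not part of SciPy or Distances.jl. It builds the full matrix and then collects the entries above the diagonal, row by row:

```python
from itertools import combinations
from math import dist

def condensed_from_square(d):
    """Collect the entries above the diagonal of a square matrix, row by row."""
    n = len(d)
    # combinations(range(n), 2) yields (i, j) with i < j in row-major order
    return [d[i][j] for i, j in combinations(range(n), 2)]

points = [[1, 2], [3, 4], [5, 6], [7, 8]]

# Full symmetric Euclidean distance matrix, like the Julia pairwise result
square = [[dist(p, q) for q in points] for p in points]

condensed = condensed_from_square(square)
print([round(v, 5) for v in condensed])
# [2.82843, 5.65685, 8.48528, 2.82843, 5.65685, 2.82843]
```

This matches the pdist output in the question. A Julia version would need to follow the same index pattern (all pairs i < j, row by row) over the matrix returned by pairwise.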