Calculating pairwise distances from external file

Question

Calculating pairwise distances from external file

521 Views Asked by kalle At 20 October 2024 at 03:54

I have data in a text file of the following format:

t  x      y      z
Pb 1.0000 1.5000 2.5000
Pb 1.1000 1.6000 2.6000
S  1.2000 1.4000 2.4000

I would like to be able to calculate pairwise distances for all of the coordinates I have, but somehow retain the atomic species identity (i.e. attach a string describing with t values were used in the calculation). The pairwise distance formula I am using is just the Euclidean distance matrix:

r_ij = abs(ri - rj)

Where ri/rj are the coordinates in 3D space.

I can easily find the values of r_ij using this method (and reformatting the data so it a 3xN numpy array of just coordinate data):

def r_ij_numpy(coords):
r = np.dot(coords, coords.T)
m = np.tile(np.diag(r), (N,1))
r = np.sqrt(m + m.T - 2*r)
r = np.triu(r).ravel()
return r[np.nonzero(r)]

But I can't seem to find a way to tag atom types with it (i.e. refactor the output to an array of tuples, with each tuple being the value of r_ij attached to a string of both atom types (i.e. 'Pb-S').

Thank you!

Original Q&A

There are 1 best solutions below

**Paddy Harrison** · Answer 1

I would use cdist to calculate pairwise distances and then add each pair to a dictionary of pairs, eg. 'Pb-Pb' and 'Pb-S'.

import pandas as pd
import numpy as np
from scipy.spatial.distance import cdist
from collections import defaultdict

# load data in question
df = pd.read_csv('atoms.txt', delimiter='\t')
# get atom strings
atoms = df['t'].values

# get xyzzy values as array
loc = df[['x', 'y', 'z']].values

# compute pairwise distances
dists = cdist(loc, loc)
# only keep one count of each
dists = np.triu(dists, k=1)

# get indices of non-self distances
indices = np.column_stack(np.nonzero(dists))

# add pairs to dictionary
pairs = defaultdict(list)

for i, j in indices:
    pair = f'{atoms[i]}-{atoms[j]}'
    print(f'{pair}, distance: {dists[i, j]}')
    pairs[pair].append(dists[i, j])

>>> Pb-Pb, distance: 0.1732050807568879
    Pb-S, distance: 0.24494897427831783
    Pb-S, distance: 0.30000000000000016

pairs
>>> {'Pb-Pb': [0.1732050807568879],
     'Pb-S': [0.24494897427831783, 0.30000000000000016]}

This way you can see all the atomic spacings between different chemicals. By keeping the array order dists you keep the atom order from your dataset.

Calculating pairwise distances from external file

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in NUMPY

Related Questions in EUCLIDEAN-DISTANCE

Related Questions in PAIRWISE-DISTANCE

Trending Questions

Popular # Hahtags

Popular Questions