Calculating distance/similarity with grouped features

29 Views Asked by Matt At 18 September 2023 at 00:41

I have tabular data in which every feature is a binary indicator.

Student	Group 1 - SubGroup1	Group 1 - SubGroup2	Group 2 - SubGroup2
A	1	0	0
B	0	1	0
C	0	0	1

"1" means the student belongs to that specific subgroup of a group, and "0" means the student does not belong to the corresponding group-subgroup. Note that a row is not necessarily a one-hot vector (a student may belong to several group-subgroup pairs).

I would like to calculate the pairwise distances (or similarities) between students. However, I want the distance between A and B to be smaller than the distance between A and C or between B and C, because A and B belong to the same group (although their subgroups differ), while C belongs to a completely different group. Is there a known distance or similarity measure that lets me adjust the "weight" attached to features depending on which group they belong to?
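To make the desired property concrete, here is a minimal sketch of one possible way to get it, not an established named metric: add one derived indicator per group (1 if the student is in any subgroup of that group) and sum the subgroup-level and group-level mismatches. The `col_group` mapping, the `group_aware_distance` function, and the `group_weight` parameter are all hypothetical names introduced for illustration.

```python
import numpy as np

# Rows: students A, B, C; columns follow the table above.
X = np.array([
    [1, 0, 0],  # A: Group 1 - SubGroup1
    [0, 1, 0],  # B: Group 1 - SubGroup2
    [0, 0, 1],  # C: Group 2 - SubGroup2
])

# Assumed mapping from each column to its parent group
# (0 = Group 1, 1 = Group 2); this is not part of the original data.
col_group = np.array([0, 0, 1])


def group_aware_distance(X, col_group, group_weight=1.0):
    """Subgroup-level mismatches plus weighted group-level mismatches."""
    n_groups = col_group.max() + 1
    # G[i, g] = 1 if student i belongs to any subgroup of group g.
    G = np.stack(
        [X[:, col_group == g].max(axis=1) for g in range(n_groups)], axis=1
    )
    # Pairwise count of differing subgroup indicators (Hamming count).
    sub = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    # Pairwise count of differing group indicators.
    grp = np.abs(G[:, None, :] - G[None, :, :]).sum(axis=2)
    return sub + group_weight * grp


D = group_aware_distance(X, col_group, group_weight=1.0)
print(D)
```

With `group_weight=1.0`, the A-B entry is 2 (two differing subgroups, same group) while A-C and B-C are 4, so the same-group pair ends up closer. Any positive `group_weight` preserves that ordering for this table; how large it should be is a modelling choice.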

I learned that Hamming distance is suitable for measuring the distance between two binary strings, but in this case it gives equal distances for all of A-B, B-C, and C-A (and so does Euclidean distance).
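The observation about equal distances can be checked directly on the three rows of the table. This is a small sketch using NumPy only; the variable names are chosen for illustration.

```python
from itertools import combinations

import numpy as np

names = ["A", "B", "C"]
X = np.array([
    [1, 0, 0],  # A
    [0, 1, 0],  # B
    [0, 0, 1],  # C
])

hamming = {}
euclidean = {}
for i, j in combinations(range(len(names)), 2):
    pair = f"{names[i]}-{names[j]}"
    # Hamming distance as a count of positions where the rows differ.
    hamming[pair] = int(np.sum(X[i] != X[j]))
    # Euclidean distance between the two binary rows.
    euclidean[pair] = float(np.linalg.norm(X[i] - X[j]))

print(hamming)
print(euclidean)
```

Every pair differs in exactly two positions, so the Hamming count is 2 and the Euclidean distance is the square root of 2 for all three pairs. Neither measure sees that A and B share a group.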
