Function for counting discordant pairs is not working

I am trying to count the number of discordant pairs. For example:

arg1 <- c("b", "c", "a", "d")
arg2 <- c("b", "c", "d", "a")

There is 1 discordant pair in the above (the pair "a" and "d").

But when I run:

require(asbio)
sum(ConDis.matrix(arg1, arg2) == -1, na.rm = TRUE)

The answer I receive is 5, instead of the expected answer of 1.

I also tried:

require(RankAggreg)
require(DescTools)
xy <- table(arg1,arg2)
cd <- ConDisPairs(xy)
cd$D

The answer is 5 again.

What am I missing?

There are 2 best solutions below


I think you are misunderstanding how ConDis.matrix works.

The pairs it refers to are pairs of observations (that is, pairs of indices), not pairs of values. For each pair of indices, the function checks whether the values move in the same direction in both vectors.

So your vectors do indeed contain 5 discordant pairs (treating the letters as ordered values, a < b < c < d):

  1. between obs1 and obs3 (arg1 goes down from "b" to "a", but arg2 goes up from "b" to "d")
  2. between obs1 and obs4 (arg1 goes up from "b" to "d", but arg2 goes down from "b" to "a")
  3. between obs2 and obs3 (arg1 goes down from "c" to "a", but arg2 goes up from "c" to "d")
  4. between obs2 and obs4 (arg1 goes up from "c" to "d", but arg2 goes down from "c" to "a")
  5. between obs3 and obs4 (arg1 goes up from "a" to "d", but arg2 goes down from "d" to "a")
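
The enumeration above can be reproduced in base R without any package. This is only a minimal sketch, assuming the letters are ranked alphabetically; `count_discordant` is a hypothetical helper written for illustration and is not part of asbio or DescTools.

```r
# Hypothetical helper (not from asbio/DescTools): counts discordant pairs of
# observations by comparing every pair of indices i < j.
count_discordant <- function(x, y) {
  stopifnot(length(x) == length(y))
  idx <- combn(length(x), 2)             # one column per index pair (i, j)
  dx <- sign(x[idx[1, ]] - x[idx[2, ]])  # direction of change in x
  dy <- sign(y[idx[1, ]] - y[idx[2, ]])  # direction of change in y
  sum(dx * dy < 0)                       # opposite directions => discordant
}

arg1 <- c("b", "c", "a", "d")
arg2 <- c("b", "c", "d", "a")

# Assumption: alphabetical ranks, a = 1, b = 2, c = 3, d = 4
r1 <- match(arg1, letters)
r2 <- match(arg2, letters)

count_discordant(r1, r2)  # 5
```

Only the pair (obs1, obs2) moves the same way in both vectors, which leaves the 5 discordant pairs listed above.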

Based on @Cath's initial comment, converting the character vectors into factors seems like it might provide a workaround by mapping the text values to integers that can then be used in the function. Edit: be aware that reordering the factor levels changes the final result. I don't know enough about the discordance function to say if this is the expected behavior.

# Original Character vectors
arg1 <- c("b","c","a","d")
arg2 <-  c("b","c","d","a")

# Translate character vectors into factors
# unique() takes one vector (its second argument is `incomparables`), so combine first
all_levels <- unique(c(arg1, arg2))
arg1 <- factor(arg1, levels = all_levels)
arg1
#> [1] b c a d
#> Levels: b c a d

arg2 <- factor(arg2, levels = all_levels)
arg2
#> [1] b c d a
#> Levels: b c a d

# This maps each text string to a number
as.numeric(arg1)
#> [1] 1 2 3 4
as.numeric(arg2)
#> [1] 1 2 4 3

# Use the underlying numeric data in the function
require(asbio)
sum(ConDis.matrix(as.numeric(arg1), as.numeric(arg2)) == -1, na.rm = TRUE)
#> [1] 1

Edit: sorting the factor levels changes the final output

arg1 <- c("b","c","a","d")
arg2 <- c("b","c","d","a")

all_levels <- sort(unique(c(arg1, arg2)))  # sorted

arg1 <- factor(arg1, levels = all_levels)
arg2 <- factor(arg2, levels = all_levels)

sum(ConDis.matrix(as.numeric(arg1), as.numeric(arg2)) == -1, na.rm = TRUE)
#> [1] 5
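
One possible reading of why the level order matters: when the levels follow the order of arg1, the factor codes are positions in arg1, so the count reflects pairs of items whose relative order differs between the two vectors. With sorted levels, the codes are alphabetical ranks, and the count reflects pairs of observations instead. If the first interpretation is what the question intends (an assumption on my part), the count can be sketched in base R by comparing item positions directly. The variable names below are illustrative only.

```r
arg1 <- c("b", "c", "a", "d")
arg2 <- c("b", "c", "d", "a")

# Assumption: both vectors are rankings of the same set of distinct items
items <- sort(unique(c(arg1, arg2)))
pos1 <- match(items, arg1)  # position of each item in arg1
pos2 <- match(items, arg2)  # position of each item in arg2

# Every pair of items; a pair is "flipped" if its order differs between rankings
pairs <- combn(length(items), 2)
flipped <- (pos1[pairs[1, ]] - pos1[pairs[2, ]]) *
           (pos2[pairs[1, ]] - pos2[pairs[2, ]]) < 0

n_flipped <- sum(flipped)
flipped_pairs <- paste(items[pairs[1, flipped]], items[pairs[2, flipped]])

n_flipped      # 1
flipped_pairs  # "a d"
```

Because this works on item positions rather than on factor codes, the result does not depend on how the levels are ordered.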