How to merge two input values of a contingency matrix?


I have this kind of xtabs object:

structure(c(1, 4, 7, 2, 5, 8, 3, 6, 9), .Dim = c(3L, 3L), 
.Dimnames = structure(list(Var1 = c("A", "B", "C"), Var2 = c("A", "B", "C")),
.Names = c("Var1", "Var2")), class = c("xtabs", "table"))

which gives:

    Var2
Var1 A B C
   A 1 2 3
   B 4 5 6
   C 7 8 9

I would like to merge two values of Var1 and Var2, e.g. "A" and "B", into a new value "D", keeping the contingency matrix properties.

For the example, the result would be:

     Var2
Var1  D C
   D 12 9 
   C 15 9 

Best answer

Reading the xtabs object into a, you can then:

b<-as.data.frame.table(a)
levels(b$Var1) <- list(D=c("A","B"),C="C")
levels(b$Var2) <- list(D=c("A","B"),C="C") #Thanks @Henrik!
xtabs(Freq ~ Var1 + Var2, b)

An alternate approach, which you could adapt for larger factors, works on copies of the two factors and merges the levels in place:

# Copy the factors so the originals stay untouched
b$Var1.merge <- b$Var1
b$Var2.merge <- b$Var2

# Add the new level for the factors
levels(b$Var1.merge) <- c(levels(b$Var1.merge), "D")
levels(b$Var2.merge) <- c(levels(b$Var2.merge), "D")

#Collapse the factors
b$Var1.merge[b$Var1%in%c("A","B")] <- "D"
b$Var2.merge[b$Var2%in%c("A","B")] <- "D"

xtabs(Freq ~ Var1.merge + Var2.merge, b, drop.unused.levels=TRUE)

Of course, depending on where your original xtabs object came from, you may be able to do this more readily on the original data.
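
If the same merge is needed on more than one table, the first approach can be wrapped in a small function. This is only a sketch under assumptions: merge_levels is a hypothetical name (not a base R function), the counts column is assumed to be called Freq (the default from as.data.frame.table), and the same levels are merged on every dimension of the table.

```r
# Rebuild the table from the question
a <- structure(c(1, 4, 7, 2, 5, 8, 3, 6, 9), .Dim = c(3L, 3L),
               .Dimnames = list(Var1 = c("A", "B", "C"), Var2 = c("A", "B", "C")),
               class = c("xtabs", "table"))

# Hypothetical helper: merge the levels in `old` into `new` on every dimension
merge_levels <- function(tab, old, new) {
  df <- as.data.frame.table(tab)        # long format: one factor per dimension + Freq
  vars <- setdiff(names(df), "Freq")    # names of the classifying factors
  for (v in vars) {
    lv <- levels(df[[v]])
    lv[lv %in% old] <- new              # repeated labels make R merge those levels
    levels(df[[v]]) <- lv
  }
  # Build Freq ~ Var1 + Var2 (+ ...) from the factor names and re-tabulate
  xtabs(reformulate(vars, response = "Freq"), data = df)
}

res <- merge_levels(a, old = c("A", "B"), new = "D")
print(res)
```

For the example this should give the 2 x 2 table asked for in the question, with "D" listed first because it takes the position of the first merged level.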

Another answer

This side-steps the issues that might arise in dealing with factors if one takes the data.frame route (tbl is the xtabs object from the question):

> tbl2 <- rbind(colSums(tbl[1:2, ]), tbl[3, ])
> tbl2
     A B C
[1,] 5 7 9
[2,] 7 8 9
> tbl3 <- cbind(rowSums(tbl2[, 1:2]), tbl2[, 3])
> tbl3
     [,1] [,2]
[1,]   12    9
[2,]   15    9

And this shows how to do it with a specified set of letter values to collapse on:

> tbl2 <- rbind(colSums(tbl[row.names(tbl) %in% c("A","B"), ]), 
                 tbl[!row.names(tbl) %in% c("A","B"), ])
> tbl2
     A B C
[1,] 5 7 9
[2,] 7 8 9
> tbl3 <- cbind(rowSums(tbl2[, colnames(tbl) %in% c("A","B")]), 
                 tbl2[,!colnames(tbl) %in% c("A","B")] )
> tbl3
     [,1] [,2]
[1,]   12    9
[2,]   15    9
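
The rbind()/cbind() result is a bare matrix, so the row and column labels are gone. A minimal sketch of putting them back follows; the names merged, new_name, kept_rows and kept_cols are illustrative only, and unclass() is used here (an addition to the answer's code) so that subsetting happens on a plain matrix.

```r
# Table from the question, stored in `tbl` as this answer assumes
tbl <- structure(c(1, 4, 7, 2, 5, 8, 3, 6, 9), .Dim = c(3L, 3L),
                 .Dimnames = list(Var1 = c("A", "B", "C"), Var2 = c("A", "B", "C")),
                 class = c("xtabs", "table"))

merged   <- c("A", "B")   # levels to collapse
new_name <- "D"           # label requested in the question
m <- unclass(tbl)         # plain numeric matrix with the same dimnames

# Same two steps as above: collapse the rows, then the columns
tbl2 <- rbind(colSums(m[rownames(m) %in% merged, ]),
              m[!rownames(m) %in% merged, ])
tbl3 <- cbind(rowSums(tbl2[, colnames(tbl2) %in% merged]),
              tbl2[, !colnames(tbl2) %in% merged])

# Restore the labels: merged level first, then the untouched levels
kept_rows <- rownames(m)[!rownames(m) %in% merged]
kept_cols <- colnames(m)[!colnames(m) %in% merged]
dimnames(tbl3) <- list(Var1 = c(new_name, kept_rows),
                       Var2 = c(new_name, kept_cols))
tbl3 <- as.table(tbl3)
print(tbl3)
```

The merged level is placed first because rbind() and cbind() put the summed row and column ahead of the remaining ones.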