Grouping of two variables in R

Question

299 Views Asked by cyrusjan At 17 August 2025 at 10:53

I have a data frame like below.

dat <- data.frame(v1=c("a","b","c","c","a","w","f"),
              v2=c("z","a","a","w","p","e","h"))

  v1 v2
1  a  z
2  b  a
3  c  a
4  c  w
5  a  p
6  w  e
7  f  h

I want to add a group column based on whether these letters appear in the same row.

  v1 v2 gp
1  a  z  1
2  b  a  1
3  c  a  1
4  c  w  1
5  a  p  1
6  w  e  1
7  f  h  2

My idea is to assign the first row to group 1, and then assign any other row in which v1 or v2 is "a" or "z" to group 1 as well.

There are also cases like rows 3 and 4: "c" is assigned to group 1 because v2 is "a" in row 3, and "w" is assigned to group 1 because v1 in row 4 is "c", which was already assigned to group 1. My list is very long, so I cannot keep checking all the "descendants" by hand.

Is there a way to group these letters and return a list of letters with their group numbers? Something like the table below would do.

**G5W** · Accepted Answer

One way to solve this is to treat the letters as vertices of a graph and each row as an edge linking its two letters. What you are asking for is then the connected components of that graph, which are easy to compute with the igraph package in R.

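library(igraph)
# Each row of dat becomes an undirected edge between its two letters
G = graph_from_edgelist(as.matrix(dat), directed=FALSE)
# Note: this assignment masks base R's built-in `letters` constant
letters = sort(unique(c(as.character(dat$v1), as.character(dat$v2))))
# Component id of each letter, listed in alphabetical order
(gp = components(G)$membership[letters])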
a b c e f h p w z 
1 1 1 1 2 2 1 1 1

If you want a data.frame containing this information:

(Groups = data.frame(letters, gp, row.names=NULL))
  letters gp
1       a  1
2       b  1
3       c  1
4       e  1
5       f  2
6       h  2
7       p  1
8       w  1
9       z  1
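
The question also asked for a group column on the original rows. Here is a minimal sketch of one way to get it. It assumes the same `dat` as in the question and relies on the membership vector being named by letter, as the output above shows. Both letters in a row always fall in the same component, so looking up `v1` alone should be enough.

```r
library(igraph)

# Same example data as in the question
dat <- data.frame(v1 = c("a", "b", "c", "c", "a", "w", "f"),
                  v2 = c("z", "a", "a", "w", "p", "e", "h"),
                  stringsAsFactors = FALSE)

# Each row is an undirected edge between its two letters
G <- graph_from_edgelist(as.matrix(dat), directed = FALSE)

# Named vector: letter -> component id
memb <- components(G)$membership

# v1 and v2 of a row are joined by an edge, so they share a component;
# indexing by v1 gives the group of the whole row
dat$gp <- unname(memb[as.character(dat$v1)])
dat
```

With the example data this should give group 1 for rows 1 to 6 and group 2 for row 7, matching the table in the question.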

To see why this works, it may help to look at the graph that was created, for example with plot(G), and think about how it represents your problem.
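
To make the connection to the "descendants" idea in the question concrete, here is a rough base R sketch that finds the same groups without igraph. It repeatedly merges the labels of letters that share a row until nothing changes. The helper name `group_letters` is hypothetical and not part of any package. The approach is plain label propagation, which is not necessarily how igraph computes components, and it will be slower on long lists.

```r
# Hypothetical helper (not from igraph): label connected components by
# repeatedly giving both letters in a row the smaller of their two labels.
group_letters <- function(v1, v2) {
  v1 <- as.character(v1)
  v2 <- as.character(v2)
  # All distinct letters, in order of first appearance row by row
  verts <- unique(c(rbind(v1, v2)))
  # Start with every letter in its own group
  label <- setNames(seq_along(verts), verts)
  repeat {
    changed <- FALSE
    for (i in seq_along(v1)) {
      lo <- min(label[v1[i]], label[v2[i]])
      if (label[v1[i]] != lo || label[v2[i]] != lo) {
        label[c(v1[i], v2[i])] <- lo
        changed <- TRUE
      }
    }
    # Stop once a full pass over the rows changes nothing
    if (!changed) break
  }
  # Renumber the surviving labels as 1, 2, ... in order of first appearance
  setNames(match(label, unique(label)), verts)
}

dat <- data.frame(v1 = c("a", "b", "c", "c", "a", "w", "f"),
                  v2 = c("z", "a", "a", "w", "p", "e", "h"),
                  stringsAsFactors = FALSE)

gp <- group_letters(dat$v1, dat$v2)
gp[sort(names(gp))]
```

Each pass of the loop is one round of "checking the descendants". The connected-components view is what guarantees that this process stops and that the final labels do not depend on the row order.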
