Enumerating equivalence classes

242 Views Asked by At

What I have: an equivalence relation on the set {1,...,n}, given as the full list of equivalent pairs (a,b) (this is already transitive; no need to compute a transitive closure). n is large (say, a few millions at most); the total number of equivalent pairs is at most O(n) (and usually much less), and individual equivalence classes are small (I don't have a figure for this, but say O(log n)).

What I want: a way to individually enumerate all equivalence classes, i.e. successively get the full set of elements of each equivalence class; with complexity at most O~(n) (all other computations performed in my code are O(n log(n)) and I definitely don't want to do anything quadratic on that size if possible).

Using an union-find structure would help in constructing the transitive closure (which I don't need) but, I believe, not in constructing the equivalence classes. I can also easily compute (say) the smallest representative in each class (just scan the full equivalence relation until I find it) and even store all of them in time O(n) (build an identity vector of 1...n then scan the full equivalence relation, replacing by the smallest value if possible); however, this also does not help much in enumerating the equivalence classes.

Does there exist a classical solution to this problem?

1

There are 1 best solutions below

0
On

You can think of this as a graph search problem. You’re given a list of all the edges that make up a graph, and you want to find its connected components. If you had an adjacency list for the graph, this could be done with a DFS or BFS in time O(n), since there are n nodes and O(n) total edges.

Fortunately, you can construct an adjacency list in time O(n). Make an array of n lists in time O(n). Then, iterate across the edges and for each edge (a, b), append b to list number a.

Overall, this lets you find all the equivalence classes (connected components) in time O(n). This works even if the relation isn’t transitive.