How can I best represent a 3-uniform hypergraph on n vertices, for purposes of finding the minimum clique cover or a close approximation of it?
This may sound like a math problem but it isn't; the question is about data structures for computation. Since there is math involved, I'll define some terms before sharing what I have so far.
A hypergraph is identical to a graph except that edges are redefined from being a pair of vertices to being a set of vertices.
A r-uniform hypergraph is a hypergraph where every edge is a set of exactly r vertices. E.g., a regular graph is identical to a 2-uniform hypergraph.
A clique in a 3-uniform hypergraph is a set S of vertices such that every subset of 3 vertices in S is an edge.
This is an np-hard problem. I'll be using probabilistic approaches to find the cliques I'm looking for. The main operation I need to support (many many times) is:
- Given a clique and a vertex, can the vertex be merged into the clique while preserving the clique property (i.e. it's still a clique with the new vertex)>
My current conceptual approach, again for a 3-uniform hypergraph H on n vertices:
- Associate each vertex v with an nxn grid where cell (i,j) is 1 if {v,i,j} is an edge of H, and 0 otherwise. We always set cells (v,i) and (j,v) to 1 for vertex v.
A clique is represented by an nxn grid as well, together with a 1xn bitvector where a 1 in position i means that vertex i can be merged into the clique. The grid associated with the clique is the AND of the grids associated with its vertices, and the 1xn bitvector is the AND of the rows of the clique-grid associated with its members.
For example, say our hypergraph includes vertices {0,1,2,3}, and edges {0,1,2}, {0,1,3}, {0,2,3} but not {1,2,3}
Vertex representations:
0) 1111 1) 1111 2) 1111 3) 1111
1111 1111 1110 1101
1111 1110 1111 1011
1111 1101 1011 1111
Now, any pair of vertices can be merged, so we only care about cliques that have at least 2 vertices. Say we start by merging 1 & 2. We should know that 3 can't be merged.
Merge(1,2):
step 1: take the AND of the two grids: 1111
1110
1110
1001
step 2: 1xn bitvector is the AND of rows 1 & 2: 1110
Observation: 3 cannot be part of this clique because the 3rd elt of the 1xn bitvector is zero.
On the other hand, after merging 0 & 1, we should still be able to merge either 2 or 3:
Merge(0,1):
step 1: take the AND of the two grids: 1111
1111
1110
1101
step 2: 1xn bitvector is the AND of rows 0 & 1: 1111
Observation: As expected, the 1xn bitvector still shows 2 & 3 as eligible for merging.
This is O(n^3) space, which I doubt we can avoid except possibly in special sparse cases because there are potentially choose(n,3) edges.
My plan for representing the data is to use bitvectors for everything so the bitwise operations are fast.
- nxn grids: represent with n bitvectors of length n
I'm trying to make this as efficient as possible, since that will determine the size of graph I can explore in a reasonable time. I'm not sure it matters, but in case it does, the language I intend to use is Rust.
Question
- Is there a better data structure to use, either in general, or for certain classes of hypergraph (e.g. sparse hypergraphs)?
If anyone looking at this has thoughts on an alternate approach to solving the problem, I'd welcome that as well.