A classic trick for optimizing Datalog programs is symmetry-breaking. Symmetry breaking can halve the number of facts you must compute in your database, since you only need to compute edge(0, 1)
and not also edge(1, 0)
.
Consider an n x m coordinate grid. If we want to write a relation describing undirected edges on this grid, a simple way to break symmetry is to say that edges only proceed from smaller to larger coordinates (e.g., (2, 3)
to (3, 3)
, but not vice versa). Here is a simple clingo program which models a 2x2 grid in precisely this way:
row(0..1).
col(0..1).
node(n(I, J)) :- row(I), col(J).
delta(0,1).
delta(1,0).
edge(e(N1, N2)) :-
node(N1), node(N2),
N1 = n(I1, J1),
N2 = n(I2, J2),
delta(I2-I1, J2-J1).
#show edge/1.
We can see that this generates four facts about edge, rather than eight if symmetry had not been broken:
edge(e(n(0,0),n(0,1)))
edge(e(n(1,0),n(1,1)))
edge(e(n(0,0),n(1,0)))
edge(e(n(0,1),n(1,1)))
Now consider the line graph induced by this coordinate grid. This line graph is also undirected, and we'd like to symmetry break in a similar manner. What is a compact way to break symmetry on this graph?
For reference, here is a short definition of edges on the line graph which does not break symmetry.
ledge(l(E1, E2)) :-
edge(E1), edge(E2),
E1 = e(N1, N2),
E2 = e(N1, N3; N3, N1; N3, N2; N2, N3),
N3 != N1, N2 != N1.
#show ledge/1.
We can see it does not break symmetry as it generates these two facts (among others):
ledge(l(e(n(0,0),n(1,0)),e(n(1,0),n(1,1))))
ledge(l(e(n(1,0),n(1,1)),e(n(0,0),n(1,0))))
To symmetry break, it is sufficient to define some total order on elements, so you can then simply say that p(A, B) only when A < B. Clingo defines lexicographic order on all elements, so the solution is in fact as simple as:
giving
as desired. There may be opportunities for other optimizations, though!