Does Prolog need GC when the occurs check is globally enabled?

426 Views Asked by At

As far as I can tell, with sound unification, SLD resolution should not create cyclic data structures (Is this correct?)

If so, one could, in theory, implement Prolog in such a way that it wouldn't need garbage collection (GC). But then again, one might not.

  • Is this true for WAM-based Prolog implementations?

  • Is this true for SWI-Prolog? (I believe it's not WAM-based) Is it safe to disable GC in SWI-Prolog when the occurs check is globally enabled?

Specifically:

:- set_prolog_flag(occurs_check, true).
:- set_prolog_flag(gc, false). /* is this safe? */
3

There are 3 best solutions below

0
On BEST ANSWER

The creation of cyclic terms is far from being the only operation that can create garbage (as in garbage collection) in Prolog (also worth noting that not all Prolog systems provide comprehensive support for cyclic terms but most of them support some form of garbage collection).

As an example, assume that you have in your code the following sequence of calls to go from a number to an atom:

...,
number_codes(Number, Codes),
atom_codes(Atom, Codes),
...

Here, Codes is a temporary list that should be garbage collected. Another example, assume that you're calling setof/3 to get an ordered list of results where you're only interested in the first two:

...,
setof(C, x(X), [X1, X2| _]),
...

You just created another temporary list. Or that you forgot about sub_atom/5 and decide to use atom_concat/3 to check the prefix of an atom:

...,
atom_concat(Prefix, _, Atom),
...

That second argument, the atom suffix that you don't care about (hence the anonymous variable), is a temporary atom that you just created. But not all Prolog systems provide an atom garbage collector, which can lead to trouble in long running applications.

But even when you think that you have carefully written your code to avoid the creation of temporary terms, the Prolog system may still be creating garbage when running your code. Prolog systems use different memory areas for different purposes, and operations may need to make temporary copies of segments of memory between different memory areas, depending on the implementation. The Prolog system may be written in a language, e.g. Java, that may eventually take care of that garbage. But most likely it's written in C or C++ and some sort of garbage collection is used internally. Not to mention that the Prolog system may be grabbing a big block of memory to be able to prove a query and then reclaiming that memory after the query terminates.

0
On

This here would be safe:

:- set_prolog_flag(gc, false).

But if your Prolog system has garbage collection, switching it off might be not a good idea, since even with occurs check set to true, there can be still garbage through temporary results. And having garbage continuously removed can improve locality of code, i.e. your memory gets less trashed by cache misses:

p(X,Y) :- q(X,Z), r(Z,Y).

The variable Z might point to some Prolog term which is only temporarily needed. Most modern Prolog systems can remove such Prolog terms through so called environment trimming.

But an occurs check opens up the path to a special kind of garbage collection. Namely since no more cyclic terms can appear, it is possible to use reference counting. An old Prolog system that had reference counting was this beast here:

xpProlog: High Performance Extended Pure Prolog - Lüdemann, 1988
https://open.library.ubc.ca/media/download/pdf/831/1.0051961/1

Also Jekejeke Prolog does still do reference counting. A problem with reference counting are attributed variables, which can create a cyclic term nevertheless, for example a freeze/2 as follows creates a cycle through the frozen goal back to the variable:

?- freeze(X, (write(X), nl)).

Edit 04.09.2021:
What also might demand garbage collection is setarg/3. It
can create cycles which cannot be so easily removed by
reference counting.

?- X = f(0), setarg(1,X,X).
X = f(X).

Since setarg/3 is backtrackable, the cycle would go away in
backtracking, at least I guess so. But the cycle might still bother
when we are deep in backtracking and running out of memory.

Or the cycle might not go away through backtracking since we
used the non-backtracking version nb_setarg/3.

1
On

Well, something has to free up multiply-referenced memory to which references exist that can be dropped at any step in the computation. This is the case independently of whether structures are cyclic or not.

Consider variables A and B naming the same structure in memory (they "name the same term"). The structure is referenced from 2 places. Suppose the predicate in which B is defined succeeds or fails. The Prolog Processor cannot just deallocate that structure: it is still referenced from A. This means you need at least reference counting to make sure you don't free memory too early. That's a reference counting garbage collector.

I don't know what kind of garbage collection is implemented in any specific Prolog implementation (there are many approaches, some better suited to Prolog than others ... in a not completely-unrelated context 25 years of Java have created all of these) but you need to use one, not necessarily a reference counting one.


(Cyclic structures are only special to garbage collection because reference counting garbage collection algorithms are unable to free up cyclic structures, as all the cells in a loop have a reference count of at least 1.)

(Also, an IMHO, never a trust a programming languages in which you have to call free yourself. There is probably a variation of Greenspun's tenth rule ("Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.") in that any program written in a programming language in which you have to call free yourself contains an ad hoc, informally-specified, bug-ridden, slow implementation of a garbage collection algorithm.")

(OTOH, Rust seems to take kind of a middle way, offloading some effort onto the developer but having the advantage of being able to decide whether to free memory when a variable goes out of scope. But Rust is not Prolog.)