One of the many issues with `finalize` methods in Java is the "object resurrection" issue (explained in this question): if an object is finalized, and it saves a copy of `this` somewhere globally reachable, the reference to the object "escapes" and you end up with a finalized but living object (that won't be finalized again, and otherwise is something of a problem).
In order to avoid the creation of resurrected objects, the normal advice (as, e.g., seen in this answer) is to create a fresh instance of the object, rather than save the object itself; this would typically be accomplished by copying all the object's fields into a fresh object. In most cases, this achieves the goal of allowing the original object to be deallocated, rather than resurrected.
However, the Java garbage collector supports garbage collection of reference cycles; this means that an object can be finalized while (directly or indirectly) containing a reference to itself, and two objects can be finalized while (directly or indirectly) containing references to each other. In this case, the "copy all the fields into a new object" advice doesn't actually solve the problem; although we discard the `this` reference once the finalizer finishes running, the partially finalized object will be resurrected via the reference from the field. So we end up with the object being resurrected anyway.
In the case where the object indirectly holds a reference to itself, it's possible to recursively look through all the fields of the object until we find the self-reference (in which case we can replace it with a reference to the new object we're constructing), thus preventing the resurrection. So that solves the issue in that case.
However, if two objects hold references to each other (and thus both get deallocated at the same time), and we're creating a new instance of each, then each of the new objects will be holding a reference to the old, finalized object (rather than the new object that's been constructed as a replacement). This is obviously an undesirable state of affairs, so one thing I've been looking into is attempting to use the same solution as in the single-object case: recursively scanning the fields of the (living, newly constructed) objects looking for finalized objects, and replacing them with the corresponding replacement objects.
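To make the two-object case concrete, here is a minimal sketch. The `Node` class and its `shallowCopy()` method are hypothetical names invented for this illustration, and the copying is invoked directly rather than from a real finalizer so that the outcome does not depend on garbage collector timing.

```java
// Hypothetical class, for illustration only.
final class Node {
    final String name;
    Node peer; // may lead back to this node, directly or indirectly

    Node(String name) {
        this.name = name;
    }

    // Field-by-field copy, as a finalizer following the
    // "create a fresh instance" advice would perform it.
    Node shallowCopy() {
        Node copy = new Node(name);
        copy.peer = this.peer; // still refers to the OLD peer
        return copy;
    }
}

final class CycleCopyDemo {
    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.peer = b;
        b.peer = a; // two-object reference cycle

        Node a2 = a.shallowCopy();
        Node b2 = b.shallowCopy();

        System.out.println(a2.peer == b);  // true: the copy points at the old b
        System.out.println(a2.peer == b2); // false: it does not point at b's replacement
    }
}
```

Each copy keeps the other old object reachable, which is exactly the resurrection the copying was meant to avoid.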
The problem is: how can I recognise a finalized/resurrected object, when I'm doing this? The obvious way to do this is to somehow record the identity of the finalized object in the finalizer, and then compare all the objects we find during the recursive scan with a list of finalized objects. The problem is, there doesn't seem to be a valid way to record the identity of the object in question:
- A regular (strong) reference would hold the object alive, effectively resurrecting it automatically, and gives no method via which to determine that the object is not in fact referenced. This would solve the problem of identifying the resurrected objects, but comes with a problem of its own: although the resurrected objects would never be used, except for their identities, there would be no means via which to deallocate them (e.g. you can't use a `PhantomReference` to detect that the object is now truly dead, like you normally would in Java, because the object is now strongly reachable and thus the phantom reference never clears). So this would effectively mean that the objects in question stay allocated forever, causing a memory leak.
- Using a weak reference was my first idea, but has the problem that at the time we construct the `WeakReference` object, the referenced object is not in fact strongly, softly, nor weakly reachable. As such, as soon as we store the `WeakReference` anywhere that's strongly reachable (to prevent the `WeakReference` itself being deallocated), the `WeakReference`'s target becomes weakly reachable and the reference automatically clears. So we can't store any information that way.
- Using a phantom reference has the problem that there's no way to compare a phantom reference with an object to see if that reference references that object. (Maybe there should be – unlike `get()`, which can resurrect an object, there's never any danger in this operation because we clearly have a reference to the object anyway – but it doesn't exist in the Java API. Likewise, `.equals()` on `PhantomReference` objects is `==`, not value equality, so you can't use it to determine whether two phantom references reference the same thing.)
- Using `System.identityHashCode()` to record a number corresponding to the object's identity almost works – deallocation of the object won't change the recorded number, the number won't prevent the object's deallocation, and resurrecting an object leaves the value the same – but unfortunately, being a `hashCode`, it's subject to collisions, so might have false positives in which an object appears to be resurrected when it isn't.
- One final possibility is to modify the object itself to mark it as finalized (and track the location of its replacement), meaning that observing this mark on a strongly reachable object would reveal it as a resurrected object, but this requires adding an additional field to any object that might be involved in a reference cycle.
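The last option in the list can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Tracked` class, its `finalized` and `replacement` fields, and the `replaceWithCopy()` and `redirectPeer()` methods are hypothetical, and the "finalizer" work is called directly so the behaviour is deterministic.

```java
// Hypothetical class; the two extra fields are the cost this approach imposes.
final class Tracked {
    Tracked peer;
    boolean finalized;   // mark left behind by the (simulated) finalizer
    Tracked replacement; // the fresh copy that supersedes this object

    // What a finalizer might do under this approach:
    // copy the fields, then mark the old object and record its replacement.
    Tracked replaceWithCopy() {
        Tracked copy = new Tracked();
        copy.peer = this.peer;
        this.finalized = true;
        this.replacement = copy;
        return copy;
    }

    // Scan step for a single field: if it still points at a finalized
    // object, redirect it to that object's replacement.
    void redirectPeer() {
        if (peer != null && peer.finalized) {
            peer = peer.replacement;
        }
    }
}

final class MarkerFieldDemo {
    public static void main(String[] args) {
        Tracked a = new Tracked();
        Tracked b = new Tracked();
        a.peer = b;
        b.peer = a;

        Tracked a2 = a.replaceWithCopy();
        Tracked b2 = b.replaceWithCopy();
        a2.redirectPeer();
        b2.redirectPeer();

        System.out.println(a2.peer == b2); // true
        System.out.println(b2.peer == a2); // true
    }
}
```

After the redirect, nothing reachable from the copies refers to the old objects any more, but only because every class that can take part in a cycle carries the extra fields.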
As a summary, my underlying problem is "given an object that's currently being finalized, safely create a copy of it, without accidentally resurrecting any objects that may be in a reference cycle of it in the process". The approach I've been trying to use is "when an object that might potentially be involved in a cycle is finalized, keep track of that object's identity so that it can subsequently be replaced with its copy if it turns out to be reachable from another finalized object"; but none of the five approaches mentioned above seems satisfactory.
Is there some other way to keep track of finalized objects, so that they can be recognised if accidentally redirected? Is there an entirely different solution to the original problem, of safely making a copy of an object during its finalization?
This is not the “normal advice”, and not even the linked answer claims that it is. The linked answer starts with “If you absolutely must resurrect objects, …”, which makes it pretty clear that it is not advice on how “to avoid the creation of resurrected objects”.
The approach described in that answer is itself an object resurrection and, ironically, it is precisely the scenario you describe as the problem you want to solve: a resurrection of objects (those referenced via the copied fields) by another object’s finalizer.
This keeps all but one of the problems associated with finalizers and with object resurrection. The only problem it solves is that a finalized object won’t get finalized again, which is the smallest problem.
When an application abandons an object, it doesn’t have to be in a valid state. Objects only need to be kept in a valid state when they are intended to be used again. E.g. it is normal for an application to invoke `close()` on objects representing resources when done with them. But it’s also reasonable to abandon an object in the middle of an operation when an error occurs. The erroneous result state can be represented by a different object and the other, now-inconsistent object is not used.
A finalizer would have to deal with all these possible object states and, even worse, with unusable object states caused by finalizers. As you recognized yourself, object graphs may get collected as a whole and all their finalizers get executed in an arbitrary order or even concurrently. So it doesn’t take reference cycles or resurrection attempts to get into trouble. When object A has a reference to object B and both have finalizers, an attempt to clean up A may fail when it needs B in the process, as B may already be finalized or even be in the middle of a concurrent finalization.
In short, finalization is not even suitable for the cleanup it was originally intended for. That’s why the `finalize()` method has been deprecated since Java 9.
Your attempt to reuse field values of an object under finalization is just adding fuel to the flames. Just think about the A→B scenario above. When A’s finalizer copies the field values to another object, that includes copying the reference to B, and it doesn’t take an attempt by B’s finalizer to do the same to cause trouble. It’s already enough if B’s finalizer does what it is intended for, cleaning up associated resources, thus leaving B in an unusable state.
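The A→B scenario can be simulated without involving the garbage collector at all. The following is only a sketch: `ResourceB`, `HolderA`, `cleanup()` and `copyFields()` are hypothetical names, and the methods stand in for what the two finalizers would do.

```java
// Hypothetical stand-in for B: an object owning some resource.
final class ResourceB {
    private boolean released;

    // What B's finalizer is intended to do: release the resource.
    void cleanup() {
        released = true;
    }

    String read() {
        if (released) {
            throw new IllegalStateException("B has already been cleaned up");
        }
        return "data";
    }
}

// Hypothetical stand-in for A, which holds a reference to B.
final class HolderA {
    final ResourceB b;

    HolderA(ResourceB b) {
        this.b = b;
    }

    // What A's finalizer would do when "copying all fields into a fresh object".
    HolderA copyFields() {
        return new HolderA(b);
    }
}

final class AToBDemo {
    public static void main(String[] args) {
        HolderA a = new HolderA(new ResourceB());

        HolderA copy = a.copyFields(); // A's finalizer runs
        a.b.cleanup();                 // B's finalizer runs, in whatever order

        try {
            copy.b.read();
        } catch (IllegalStateException e) {
            System.out.println("copy holds an unusable B: " + e.getMessage());
        }
    }
}
```

The copy of A is a perfectly fresh object, yet it is broken from the start because it shares the cleaned-up B.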
As explained, “an object that’s currently being finalized” and “safely” are a contradiction in terms. It doesn’t take mutual attempts at reuse to break it. Even when looking only at your original, narrow problem statement, none of your approaches even attempts to prevent the problem. They all only try to detect it at some arbitrary later time, after the fact.
That said, there is no problem in comparing the referent of a `WeakReference` with some other strong reference, like `weakReference.get() == someStrongReference`. A weak reference only gets cleared when the referent has been garbage collected, which implies that it is impossible for the strong reference to point to it, so the answer `false` for comparing a `null` reference with `someStrongReference` would be the right answer then.
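As a small sketch of that comparison, the helper below wraps it in a method. The class `WeakIdentityCheck` and the method `pointsTo` are hypothetical names chosen for this example; only `WeakReference`, `get()` and `clear()` are part of the Java API.

```java
import java.lang.ref.WeakReference;

final class WeakIdentityCheck {
    // True only if the weak reference still refers to exactly this object.
    // A cleared reference yields null from get(), so the result is false.
    static boolean pointsTo(WeakReference<?> ref, Object candidate) {
        return candidate != null && ref.get() == candidate;
    }

    public static void main(String[] args) {
        Object tracked = new Object();
        WeakReference<Object> ref = new WeakReference<>(tracked);

        System.out.println(pointsTo(ref, tracked));      // true
        System.out.println(pointsTo(ref, new Object())); // false

        ref.clear(); // simulate what the collector does once the referent is gone
        System.out.println(pointsTo(ref, tracked));      // false
    }
}
```

On Java 16 and later, `Reference.refersTo(T)` performs the same identity check without creating a temporary strong reference, and it is also available on `PhantomReference`.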