One of the many issues with finalize methods in Java is the "object resurrection" issue (explained in this question): if an object is finalized, and it saves a copy of this somewhere globally reachable, the reference to the object "escapes" and you end up with a finalized but living object (that won't be finalized again, and otherwise is something of a problem).
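
For readers who haven't seen resurrection in action, here is a minimal sketch. The class name `Zombie` and the static slot `saved` are made up for illustration. `System.gc()` is only a hint, so the finalizer may or may not run within the loop's time budget, and `finalize()` itself is deprecated on current JDKs.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical class whose finalizer lets `this` escape. */
final class Zombie {
    static volatile Zombie saved; // globally reachable slot
    static final AtomicInteger finalizeCalls = new AtomicInteger();

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        finalizeCalls.incrementAndGet();
        saved = this; // the reference escapes: the object is resurrected
    }

    /** Returns true if a resurrected instance was observed within the time budget. */
    static boolean demo() {
        new Zombie(); // unreachable as soon as the constructor returns
        pollGc(50, true);
        boolean resurrected = saved != null;
        saved = null; // drop the resurrected object again
        pollGc(10, false); // the JVM will not call finalize() a second time
        return resurrected;
    }

    private static void pollGc(int rounds, boolean stopWhenSaved) {
        for (int i = 0; i < rounds; i++) {
            if (stopWhenSaved && saved != null) {
                return;
            }
            System.gc(); // a request, not a guarantee
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

If `demo()` returns true, the counter stays at one even after the object is dropped a second time, which is the "won't be finalized again" behaviour mentioned above.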

In order to avoid the creation of resurrected objects, the normal advice (as, e.g., seen in this answer) is to create a fresh instance of the object, rather than save the object itself; this would typically be accomplished by copying all the object's fields into a fresh object. In most cases, this achieves the goal of allowing the original object to be deallocated, rather than resurrected.

However, the Java garbage collector supports garbage collection of reference cycles; this means that an object can be finalized while (directly or indirectly) containing a reference to itself, and two objects can be finalized while (directly or indirectly) containing references to each other. In this case, the "copy all the fields into a new object" advice doesn't actually solve the problem; although we discard the this reference once the finalizer finishes running, the partially finalized object will be resurrected via the reference from the field. So we end up with the object being resurrected anyway.

In the case where the object indirectly holds a reference to itself, it's possible to recursively look through all the fields of the object until we find the self-reference (in which case we can replace it with a reference to the new object we're constructing), thus preventing the resurrection. So that solves the issue in that case.

However, if two objects hold references to each other (and thus both get deallocated at the same time), and we're creating a new instance of each, then each of the new objects will be holding a reference to the old, finalized object (rather than the new object that's been constructed as a replacement). This is obviously an undesirable state of affairs, so one thing I've been looking into is attempting to use the same solution as in the single-object case: recursively scanning the fields of the (living, newly constructed) objects looking for finalized objects, and replacing them with the corresponding replacement objects.
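
The shape of the problem, and of the scan-and-replace idea, can be sketched without involving the garbage collector at all. `Node`, `copyFields` and `copyCycle` are hypothetical names. Note that the map used here holds strong references to the old objects, so it only illustrates the redirection step; it does not answer how to track the old objects' identities safely.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical two-object cycle: each node refers to its peer. */
final class Node {
    final String label;
    Node peer;

    Node(String label) {
        this.label = label;
    }

    /** "Copy all the fields into a new object", taken literally. */
    Node copyFields() {
        Node copy = new Node(label);
        copy.peer = peer; // still points at the OLD peer
        return copy;
    }
}

final class CycleCopy {
    /** Copies both nodes, then redirects references to old nodes to their replacements. */
    static Node[] copyCycle(Node a, Node b) {
        Map<Node, Node> replacements = new IdentityHashMap<>();
        replacements.put(a, a.copyFields());
        replacements.put(b, b.copyFields());
        for (Node copy : replacements.values()) {
            Node replacement = replacements.get(copy.peer);
            if (replacement != null) {
                copy.peer = replacement; // old object found: swap in its copy
            }
        }
        return new Node[] { replacements.get(a), replacements.get(b) };
    }
}
```

Before the redirection step, each copy's `peer` is the old object; afterwards the two copies refer only to each other.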

The problem is: how can I recognise a finalized/resurrected object, when I'm doing this? The obvious way to do this is to somehow record the identity of the finalized object in the finalizer, and then compare all the objects we find during the recursive scan with a list of finalized objects. The problem is, there doesn't seem to be a valid way to record the identity of the object in question:

  • A regular (strong) reference would hold the object alive, effectively resurrecting it automatically, and gives no method via which to determine that the object is not in fact referenced. This would solve the problem of identifying the resurrected objects, but comes with a problem of its own: although the resurrected objects would never be used, except for their identities, there would be no means via which to deallocate them (e.g. you can't use a PhantomReference to detect that the object is now truly dead, like you normally would in Java, because the object is now strongly reachable and thus the phantom reference never clears). So this would effectively mean that the objects in question stay allocated forever, causing a memory leak.
  • Using a weak reference was my first idea, but has the problem that at the time we construct the WeakReference object, the referenced object is in fact neither strongly, softly, nor weakly reachable. As such, as soon as we store the WeakReference anywhere that's strongly reachable (to prevent the WeakReference itself being deallocated), the WeakReference's target becomes weakly reachable and the reference automatically clears. So we can't store any information that way.
  • Using a phantom reference has the problem that there's no way to compare a phantom reference with an object to see if that reference references that object. (Maybe there should be – unlike get(), which can resurrect an object, there's never any danger in this operation because we clearly have a reference to the object anyway – but it doesn't exist in the Java API. Likewise, .equals() on PhantomReference objects is ==, not value equality, so you can't use it to determine whether two phantom references reference the same thing.)
  • Using System.identityHashCode() to record a number corresponding to the object's identity almost works – deallocation of the object won't change the recorded number, the number won't prevent the object's deallocation, and resurrecting an object leaves the value the same – but unfortunately, being a hashCode, it's subject to collisions, so might have false positives in which an object appears to be resurrected when it isn't.
  • One final possibility is to modify the object itself to mark it as finalized (and track the location of its replacement), meaning that observing this mark on a strongly reachable object would reveal it as a resurrected object, but this requires adding an additional field to any object that might be involved in a reference cycle.
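
The phantom-reference behaviour described in the list can be checked with a small sketch. `PhantomIdentity` and `probe` are hypothetical names. One caveat on the "doesn't exist in the Java API" remark: Java 16 added `Reference.refersTo(T)`, which performs this comparison without exposing the referent, so the sketch assumes a Java 16+ JDK. Whether that is enough to make the overall scheme safe is not something this sketch establishes.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

final class PhantomIdentity {
    /** Returns { get() is null, p1.equals(p2), refersTo(target), refersTo(other) }. */
    static boolean[] probe() {
        Object target = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> p1 = new PhantomReference<>(target, queue);
        PhantomReference<Object> p2 = new PhantomReference<>(target, queue);
        return new boolean[] {
            p1.get() == null,            // get() on a phantom reference always returns null
            p1.equals(p2),               // Reference does not override equals: identity only
            p1.refersTo(target),         // Java 16+: compares the referent without exposing it
            p1.refersTo(new Object())    // a different object is not the referent
        };
    }
}
```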

As a summary, my underlying problem is "given an object that's currently being finalized, safely create a copy of it, without accidentally resurrecting any objects that may be in a reference cycle of it in the process". The approach I've been trying to use is "when an object that might potentially be involved in a cycle is finalized, keep track of that object's identity so that it can subsequently be replaced with its copy if it turns out to be reachable from another finalized object"; but none of the five approaches mentioned above seems satisfactory.

Is there some other way to keep track of finalized objects, so that they can be recognised if accidentally resurrected? Is there an entirely different solution to the original problem, of safely making a copy of an object during its finalization?


There are 2 best solutions below


In order to avoid the creation of resurrected objects, the normal advice (as, e.g., seen in this answer) is to create a fresh instance of the object, rather than save the object itself; this would typically be accomplished by copying all the object's fields into a fresh object.

This is not the “normal advice”; not even the linked answer claims that. The linked answer starts with “If you absolutely must resurrect objects, …”, which makes it pretty clear that it is not advice on how “to avoid the creation of resurrected objects”.

The approach described in that answer is itself an object resurrection and, ironically, it’s precisely the scenario you describe as the problem you want to solve: a resurrection of objects (those referenced via the copied fields) by another object’s finalizer.

This keeps all but one of the problems associated with finalizers and with object resurrection. The only problem it solves is that a finalized object won’t get finalized again, which is the smallest problem.

When an application abandons an object, it doesn’t have to be in a valid state. Objects only need to be kept in a valid state when they are intended to be used again. E.g. it is normal for an application to invoke close() on objects representing resources when done with them. But it’s also reasonable to abandon an object in the middle of an operation when an error occurs. The erroneous result state can be represented by a different object and the other, now-inconsistent object is not used.

A finalizer would have to deal with all these possible object states and, even worse, with unusable object states caused by other finalizers. As you recognized yourself, object graphs may get collected as a whole, and all their finalizers get executed in an arbitrary order or even concurrently. So it doesn’t take loops or resurrection attempts to get into trouble. When object A has a reference to object B and both have finalizers, an attempt to clean up A may fail if it needs B in the process, since B may already be finalized or even be in the middle of a concurrent finalization.

In short, finalization is not even suitable for the cleanup it was originally intended for. That’s why the finalize() method has been deprecated since Java 9.

Your attempt to reuse field values of an object under finalization is just adding fuel to the flames. Just think about the A→B scenario above. When A’s finalizer copies the field values to another object, that includes the reference to B, and things go wrong even if B’s finalizer never attempts the same trick. It is enough for B’s finalizer to do what it is intended for, cleaning up associated resources, which leaves B in an unusable state.
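
The A→B scenario can be simulated deterministically by calling the cleanup steps by hand instead of waiting for the collector. All names here (`ResourceB`, `HolderA`, `OrderingDemo`) are hypothetical, and `release()` merely stands in for whatever B’s finalizer would do.

```java
/** Stands in for B: usable only until its cleanup has run. */
final class ResourceB {
    private boolean released;

    void release() { // stand-in for the body of B's finalizer
        released = true;
    }

    String read() {
        if (released) {
            throw new IllegalStateException("B has already been cleaned up");
        }
        return "data";
    }
}

/** Stands in for A: holds a reference to B. */
final class HolderA {
    final ResourceB b;

    HolderA(ResourceB b) {
        this.b = b;
    }

    HolderA copyFields() { // stand-in for A's finalizer copying its fields
        return new HolderA(b);
    }
}

final class OrderingDemo {
    /** Returns whether the copy made by A's "finalizer" is still usable afterwards. */
    static boolean copyIsUsable() {
        ResourceB b = new ResourceB();
        HolderA a = new HolderA(b);
        HolderA survivor = a.copyFields(); // A's finalizer runs
        b.release();                       // B's finalizer runs, in whatever order
        try {
            survivor.b.read();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

The copy survives, but it carries a reference to a B that has already been cleaned up, regardless of which of the two steps happens first.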

As a summary, my underlying problem is "given an object that's currently being finalized, safely create a copy of it, without accidentally resurrecting any objects that may be in a reference cycle of it in the process".

As explained, “an object that’s currently being finalized” and “safely” is a contradiction in itself. It doesn’t take mutual attempts at reuse to break it. Even when looking only at your original, narrow problem statement, all of your approaches have the problem that they do not even attempt to prevent the problem. They all only try to detect it at some arbitrary later time, after the fact.

That said, there is no problem in comparing the referent of a WeakReference with some other strong reference, like weakReference.get() == someStrongReference. A weak reference only gets cleared when the referent has been garbage collected, which implies that it is impossible for the strong reference to point to it, so the answer false for comparing a null reference with someStrongReference would be the right answer then.
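
A minimal sketch of that comparison follows; the helper name `sameObject` is an assumption for illustration. The explicit null check only ensures that a cleared reference never matches a null candidate.

```java
import java.lang.ref.WeakReference;

final class WeakIdentity {
    /** True if the weak reference still refers to exactly this candidate object. */
    static boolean sameObject(WeakReference<?> ref, Object candidate) {
        // A cleared reference yields null from get(), which cannot equal a live candidate.
        return candidate != null && ref.get() == candidate;
    }
}
```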


As the other answers indicate, trying to solve the underlying problem in this way is something that can't be accomplished, and something of a wider rethink is needed when trying to solve this sort of problem. This post describes the solution that I used for my problem, and how I got there.

Assuming that the goal is "keep track of what an object looked like at the time it became unreferenced", this can only safely be accomplished when the object itself has no finalizer (otherwise, there are a number of hard-to-solve problems, as described in the question, its comments, and the other answer). The only reason we actually need a finalizer here is that we can't otherwise get at the object after it's become unreferenced.

It's clearly a bad idea to allow the object to become unreferenced and then revive it from its finalizer. However, "reviving" an object with no finalizer is much less of a problem (as this is equivalent to the object never being deallocated at all – it doesn't end up "partially finalized" like an object with a finalizer would). This can be accomplished by creating a separate object with a finalizer, and intentionally creating a reference loop between the original object and the separate, finalizer-bearing object (which has just a finalizer and a reference to the original object, nothing else); when the object becomes otherwise unreferenced, the finalizer on the new object will run, but the original object won't be deallocated and won't end up in any awkward finalization-related state.

The finalizer will, of course, have to break the loop (removing itself from the original object), in order to avoid resurrecting itself; if a new strong reference to the original object is created during finalization (cancelling its deallocation), the finalization object will therefore have to replace itself with a new finalization object (but this is easy to do, because it doesn't carry state, there's only one reference to it, and we know where that object is).
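
A rough sketch of this arrangement, under a few assumptions: `Tracked`, `Guardian`, `LAST_STATES` and `ownerDropped` are names invented here, the "state" is reduced to a single string, and the decision about whether the original object gets a new strong reference is passed in as a flag. The work is kept in an ordinary method so it can also be called directly, and a guard makes sure each guardian acts at most once.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical payload class. It deliberately declares no finalizer. */
final class Tracked {
    final String state;
    volatile Guardian guardian; // closes the deliberate reference loop

    Tracked(String state) {
        this.state = state;
        this.guardian = new Guardian(this);
    }
}

/** Carries nothing but the back-reference; only this class has a finalizer. */
final class Guardian {
    /** Hypothetical sink for "what the object looked like when it was dropped". */
    static final List<String> LAST_STATES = new CopyOnWriteArrayList<>();

    private final Tracked owner;
    private final AtomicBoolean fired = new AtomicBoolean();

    Guardian(Tracked owner) {
        this.owner = owner;
    }

    void ownerDropped(boolean ownerIsRepublished) {
        if (!fired.compareAndSet(false, true)) {
            return; // each guardian acts at most once
        }
        owner.guardian = null;        // break the loop so this guardian is not resurrected
        LAST_STATES.add(owner.state); // the owner was never finalized, so it is intact
        if (ownerIsRepublished) {
            owner.guardian = new Guardian(owner); // fresh guardian for the next time
        }
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() {
        ownerDropped(false); // in this sketch the owner is only recorded, not kept
    }
}
```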

In conclusion: there is no safe way to keep an object alive during its own finalization, not even if you copy all its fields elsewhere. Instead, you need to ensure that the object has no finalizer, and keep it alive using some other object's finalization.