Java PhantomReferences: A better choice than finalize()

We’ve talked about soft and weak references, and how they differ from strong references. But what about phantom references? What are they useful for?

Starting with Java 9, the intended usage of PhantomReference objects is to replace any usage of Object.finalize() (which was deprecated in Java 9), so that this sort of object clean-up code runs in a more predictable manner (as designated by the programmer), rather than being subject to the constraints and implementation details of the garbage collector.

How to use them

The basic usage is to create a PhantomReference that wraps some object reference. However, the get() method will always return null for a PhantomReference, so what can this object even be used for?

Firstly, creating a phantom reference does not make the object phantom-reachable. (The same applies to weak and soft references.) The object must first lose all strong references to it (and any soft or weak ones), and then some time after that, the JVM will determine that it’s phantom-reachable.
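
To make the last two points concrete, here is a minimal sketch (the class and method names are my own, purely for illustration). It shows that get() on a phantom reference returns null even while the object is still strongly reachable, whereas a weak reference to the same object still hands it back, and that nothing is enqueued while a strong reference exists.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class ReachabilitySketch {
  // Hypothetical helper: wraps the referent in a phantom reference and asks for it back.
  static Object getViaPhantom(final Object referent, final ReferenceQueue<Object> queue) {
    return new PhantomReference<>(referent, queue).get();
  }

  // Hypothetical helper: the same, but with a weak reference.
  static Object getViaWeak(final Object referent) {
    return new WeakReference<>(referent).get();
  }

  public static void main(final String[] args) {
    final Object referent = new Object();
    final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    // Nothing can be enqueued yet: the referent is still strongly reachable.
    System.out.println("Queue is empty: " + (queue.poll() == null));
    // A phantom reference never hands out its referent.
    System.out.println("Phantom get() is null: " + (getViaPhantom(referent, queue) == null));
    // A weak reference does, for as long as it has not been cleared.
    System.out.println("Weak get() is the referent: " + (getViaWeak(referent) == referent));
  }
}
```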

This is why you must register a phantom reference with a ReferenceQueue in order for it to be useful. This is indicated by the constructor signature, and the accompanying Javadoc:

“It is possible to create a phantom reference with a null queue, but such a reference is completely useless: Its get method will always return null and, since it does not have a queue, it will never be enqueued.”

The phantom reference will then be enqueued in the reference queue once the JVM determines that the referent is only phantom-reachable (that is, it has no strong, soft, or weak references). This may happen at that moment or at some later time, and it serves as a notification of the reachability change.
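
Here is a rough sketch of that notification in isolation. The awaitEnqueued helper is hypothetical (it is not part of the JDK); it simply keeps requesting a GC and polling the queue until a reference arrives or a timeout elapses. Since System.gc() is only a request, whether and when the reference shows up depends on the JVM.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

class QueueNotificationSketch {
  // Hypothetical helper: keeps requesting a GC until some reference shows up on
  // the queue or the timeout elapses. Returns null if nothing arrived in time.
  static Reference<?> awaitEnqueued(final ReferenceQueue<?> queue, final long timeoutMillis) {
    final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    try {
      while (System.nanoTime() < deadline) {
        System.gc(); // Only a request; the JVM may ignore it.
        final Reference<?> ref = queue.remove(50); // Wait up to 50 ms.
        if (ref != null) {
          return ref;
        }
      }
    } catch (final InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return null;
  }

  public static void main(final String[] args) {
    final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    Object referent = new Object();
    final PhantomReference<Object> phantom = new PhantomReference<>(referent, queue);

    System.out.println("Enqueued while strongly reachable: " + (queue.poll() != null));

    referent = null; // Drop the only strong reference.
    final Reference<?> dequeued = awaitEnqueued(queue, 5_000);
    System.out.println("Dequeued our phantom reference: " + (dequeued == phantom));
  }
}
```

Note that what comes off the queue is the reference object itself, not the referent. That is the whole notification.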

Alternative to finalize

Using a PhantomReference along with a ReferenceQueue can allow you to be notified when an object has been finalized by the GC, and thus allow you to perform any necessary clean-up action.

In fact, starting with Java 9, Object.finalize() has been deprecated, recognizing what many Java developers have known for some time: implementing finalize() can lead to error-prone code. The thread which calls finalize() is chosen by the JVM rather than by your application (the Java language makes no guarantee about which thread it will be), which introduces concurrency concerns, and an improper finalize() method could leak a reference to the object itself, preventing it from being GC’d.

Before Java 9, PhantomReference didn’t fully address all of these concerns, but there were changes made to bridge this gap. Check out the notable differences in the Java 8 vs Java 9 docs:

Java 8 PhantomReference:
– Phantom references are most often used for scheduling pre-mortem cleanup actions in a more flexible way than is possible with the Java finalization mechanism.
– The Javadoc states: “Unlike soft and weak references, phantom references are not automatically cleared by the garbage collector as they are enqueued. An object that is reachable via phantom references will remain so until all such references are cleared or themselves become unreachable.”

Java 9 PhantomReference:
– Phantom references are most often used to schedule post-mortem cleanup actions.
– There’s no mention of them not being automatically cleared by the garbage collector.
– The Javadoc states: “Suppose the garbage collector determines at a certain point in time that an object is phantom reachable. At that time it will atomically clear all phantom references to that object and all phantom references to any other phantom-reachable objects from which that object is reachable. At the same time or at some later time it will enqueue those newly-cleared phantom references that are registered with reference queues.”

This means that in Java 9, PhantomReference objects are enqueued at a later stage (the change from pre-mortem to post-mortem) and are cleared automatically, so the PhantomReference itself should not prevent garbage collection of the object and thus create a resource leak. Previously, this was not the case.

A simple example

Let’s take a look at a simple and contrived example, my favourite kind of example. (An example is also at: https://ideone.com/I8f4U3)

package net.unitstep.examples.references;

import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * @author Peter Chng
 */
public class PhantomReferenceExample {
  private static final Logger LOGGER = LoggerFactory.getLogger(PhantomReferenceExample.class);

  // Just so we have a non-primitive, non-interned object that will be GC'd.
  public static class SampleObject<T> {
    private final T value;

    public SampleObject(final T value) {
      this.value = value;
    }

    @Override
    public String toString() {
      return String.valueOf(this.value);
    }
  }

  public static class PhantomReferenceMetadata<T, M> extends PhantomReference<T> {
    // Some metadata stored about the object that will be used during some cleanup actions.
    private final M metadata;

    public PhantomReferenceMetadata(final T referent, final ReferenceQueue<? super T> q,
        final M metadata) {
      super(referent, q);
      this.metadata = metadata;
    }

    public M getMetadata() {
      return this.metadata;
    }
  }

  public static void main(final String[] args) {
    // The object whose GC lifecycle we want to track.
    SampleObject<String> helloObject = new SampleObject<>("Hello");

    // Reference queue that the phantom references will be registered to.
    // They will be enqueued here when the appropriate reachability changes are detected by the JVM.
    final ReferenceQueue<SampleObject<String>> refQueue = new ReferenceQueue<>();

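    // In this case, the metadata we associate with the object is some name.
    // The reference object itself must stay reachable until it is dequeued: a registered reference
    // that becomes unreachable before being enqueued is never enqueued.
    final PhantomReferenceMetadata<SampleObject<String>, String> helloPhantomReference = new PhantomReferenceMetadata<>(
        helloObject, refQueue, "helloObject");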

    new Thread(() -> {
      LOGGER.info("Starting ReferenceQueue consumer thread.");
      final int numToDequeue = 1;
      int numDequed = 0;
      while (numDequed < numToDequeue) {
        // Unfortunately, need to downcast to the appropriate type.
        try {
          @SuppressWarnings("unchecked")
          final PhantomReferenceMetadata<SampleObject<String>, String> reference = (PhantomReferenceMetadata<SampleObject<String>, String>) refQueue
              .remove();

          // At this point, we know the object referred to by the PhantomReference has been finalized.
          // So, we can do any other clean-up that might be allowed, such as cleaning up some temporary files
          // associated with the object.
          // The metadata stored in PhantomReferenceMetadata could be used to determine which temporary files
          // should be cleaned up.
          // You probably shouldn't rely on this as the ONLY method to clean up those temporary files, however.
          LOGGER.info("{} has been finalized.", reference.getMetadata());
        } catch (final InterruptedException e) {
          // Just for the purpose of this example.
          break;
        }
        ++numDequed;
      }
      LOGGER.info("Finished ReferenceQueue consumer thread.");
    }).start();

    // Lose the strong reference to the object.
    helloObject = null;

    // Attempt to trigger a GC. This is only a request; the JVM is not required to act on it.
    System.gc();
  }
}

In this straightforward example, we:

1. Create an object.
2. Create a ReferenceQueue for the JVM to use.
3. Wrap that object in a PhantomReference-derived class, attach some metadata to it, and register it with the ReferenceQueue.
4. When the JVM detects the appropriate reachability changes (i.e., there’s no longer a strong reference to helloObject, and it’s only phantom-reachable), it will enqueue the phantom reference object into the reference queue.
5. We create a separate thread to monitor the reference queue, and could do some clean-up associated with the object here.

Note that when using a separate thread to dequeue from the reference queue, as in step (5), in the general case we have to be careful:

1. If you get a reference to the object via Reference.get(), you could leak it outside of the thread, causing a resource leak.
2. You might also be invoking a method on the object while it is being finalized, (since the GC runs in a separate thread), introducing concurrency issues.

Neither of these issues arises when using a PhantomReference, since its get() method always returns null (though you could use reflection to get access to the object reference, a horrible idea), and PhantomReferences are enqueued only after the object has been finalized.

However, with soft and weak references, get() can still return a reference to the object for as long as the reference has not been cleared. (Soft and weak references are cleared before they are enqueued, so by the time you dequeue one from the ReferenceQueue, get() returns null as well.) Either way, I would not build clean-up code around get(), for the reasons above.

Instead, extend the appropriate reference class and attach some additional metadata (in the form of an object that does not have a reference to the original object) that can be used to perform the clean-up, as in the example above.

For example, you could extend PhantomReference to store the name of some temporary file that could be removed when the object is no longer reachable. Then, you could retrieve this file name from the PhantomReference child class and use that, rather than getting it directly from the original object.
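
A minimal sketch of that temporary-file idea might look like the following. All of the names here (TempFileCleanupSketch, TempFileReference, track, drain) are hypothetical. The sketch also assumes you poll the queue yourself by calling drain(), rather than running a dedicated consumer thread as in the earlier example.

```java
import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class TempFileCleanupSketch {
  // Hypothetical reference subclass: remembers the temp file's path, never the owner itself.
  static final class TempFileReference extends PhantomReference<Object> {
    private final Path tempFile;

    TempFileReference(final Object owner, final ReferenceQueue<Object> queue, final Path tempFile) {
      super(owner, queue);
      this.tempFile = tempFile;
    }

    Path tempFile() {
      return this.tempFile;
    }
  }

  private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
  // Keeps the reference objects strongly reachable; otherwise they may never be enqueued.
  private final Set<TempFileReference> pending = ConcurrentHashMap.newKeySet();

  TempFileReference track(final Object owner, final Path tempFile) {
    final TempFileReference ref = new TempFileReference(owner, this.queue, tempFile);
    this.pending.add(ref);
    return ref;
  }

  // Deletes the file of every reference currently on the queue; returns how many were processed.
  int drain() {
    int processed = 0;
    Reference<?> ref;
    while ((ref = this.queue.poll()) != null) {
      final TempFileReference tempRef = (TempFileReference) ref;
      this.pending.remove(tempRef);
      try {
        Files.deleteIfExists(tempRef.tempFile());
      } catch (final IOException e) {
        System.err.println("Could not delete " + tempRef.tempFile() + ": " + e);
      }
      ++processed;
    }
    return processed;
  }

  public static void main(final String[] args) throws Exception {
    final TempFileCleanupSketch cleanup = new TempFileCleanupSketch();
    final Path file = Files.createTempFile("phantom-sketch", ".tmp");
    Object owner = new Object();
    cleanup.track(owner, file);

    owner = null; // Lose the strong reference to the owner.
    System.gc();  // Only a request; the JVM may ignore it.
    Thread.sleep(200);

    System.out.println("References processed: " + cleanup.drain());
    System.out.println("File still exists: " + Files.exists(file));
  }
}
```

The clean-up path only ever touches the stored Path, so there is nothing in it that could resurrect or leak the original object.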

Avoid complexity

In general, having your clean-up code rely on some finalization mechanism (whether it be the finalize() method, or using PhantomReferences) should not be your first option. This is because you’d be relying on certain JVM GC behaviour in order to enforce your application logic, which isn’t a good idea in my mind. Relying on clean-up to take place when some object becomes unreachable is a fragile linkage.

Prefer using an explicit (i.e. within your own application’s logic) clean-up mechanism, if possible. This is likely to be simpler. This is reinforced by the deprecation note in Java 9 for Object.finalize(): “Classes whose instances hold non-heap resources should provide a method to enable explicit release of those resources, and they should also implement AutoCloseable if appropriate”.
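
As a rough illustration of what explicit clean-up could look like, here is a small sketch of a class that owns a temporary file and releases it in close(). The ScratchFile class is hypothetical; the point is only that try-with-resources makes the clean-up happen at a known place in your code, with no help from the GC.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical resource holder: owns a temporary file and deletes it on close().
class ScratchFile implements AutoCloseable {
  private final Path path;
  private boolean closed;

  ScratchFile() {
    final Path created;
    try {
      created = Files.createTempFile("scratch", ".tmp");
    } catch (final IOException e) {
      throw new UncheckedIOException(e);
    }
    this.path = created;
  }

  Path path() {
    return this.path;
  }

  @Override
  public void close() {
    if (this.closed) {
      return; // Closing twice is harmless.
    }
    this.closed = true;
    try {
      Files.deleteIfExists(this.path);
    } catch (final IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(final String[] args) {
    final Path path;
    try (ScratchFile scratch = new ScratchFile()) {
      path = scratch.path();
      System.out.println("Exists inside the block: " + Files.exists(path));
    }
    // close() has already run by the time we get here.
    System.out.println("Exists after the block: " + Files.exists(path));
  }
}
```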

I’d only use the above approach (PhantomReference/ReferenceQueue) as a fallback to complement an explicit clean-up mechanism, and would not rely on it unless absolutely necessary – as you can tell, it’s a little complicated. Prefer simplicity.
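
If you do want that kind of fallback and are on Java 9 or later, one option is java.lang.ref.Cleaner, which runs a registered action once an object becomes phantom-reachable and manages the reference queue for you. The sketch below is only one possible arrangement, with hypothetical class names: close() is the primary, explicit path, and the Cleaner only steps in if close() was never called. The counter is there just to make the behaviour observable.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical resource: explicit close() first, Cleaner as the safety net.
class TrackedResource implements AutoCloseable {
  private static final Cleaner CLEANER = Cleaner.create();

  // The clean-up action must not hold a reference to the TrackedResource itself,
  // otherwise the resource could never become phantom-reachable.
  private static final class ReleaseAction implements Runnable {
    private final String name;
    private final AtomicInteger releaseCount;

    ReleaseAction(final String name, final AtomicInteger releaseCount) {
      this.name = name;
      this.releaseCount = releaseCount;
    }

    @Override
    public void run() {
      this.releaseCount.incrementAndGet();
      System.out.println("Released " + this.name);
    }
  }

  private final Cleaner.Cleanable cleanable;

  TrackedResource(final String name, final AtomicInteger releaseCount) {
    this.cleanable = CLEANER.register(this, new ReleaseAction(name, releaseCount));
  }

  @Override
  public void close() {
    // Runs the action now; Cleanable.clean() invokes it at most once.
    this.cleanable.clean();
  }

  public static void main(final String[] args) {
    final AtomicInteger releases = new AtomicInteger();
    try (TrackedResource resource = new TrackedResource("demo", releases)) {
      System.out.println("Releases while in use: " + releases.get());
    }
    System.out.println("Releases after close(): " + releases.get());
  }
}
```

Because the action runs at most once, calling close() and later having the object collected does not release the resource twice.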
