Java and Soft References

In a previous post, I described what weak references are, how the JVM interacts with them, and how they contrast with strong references. In summary, a strong reference is what we typically think of as a reference in Java: when we instantiate an object and assign it to a variable, that variable holds a strong reference. Such an object will not be reclaimed by the garbage collector (GC) as long as it remains reachable through at least one strong reference.

By contrast, a weak reference doesn’t prevent the garbage collector from reclaiming the referred object. That is, if the only references that remain to an object are weak references, the object is treated as if there were no references to it at all, and it becomes eligible for collection the next time the GC runs. Weak references are exposed mainly through the WeakReference “wrapper” and WeakHashMap. The latter holds only weak references to its keys, so once a key is no longer strongly reachable, the WeakHashMap automatically drops the corresponding entry, which can also make the value object eligible for GC as long as there are no other references to it.

But how do these references contrast with a soft reference?

Soft References vs. Weak References

The basic difference between a soft reference and a weak reference is how aggressively the garbage collector will attempt to clear them. An object that has only weak references is treated by the GC no differently than an object with no references at all; that is, the GC would clear these objects and reclaim the memory (through the object life cycle) as soon as it sees fit.

A soft reference is treated slightly differently. According to the Javadoc, a soft reference is cleared at the discretion of the garbage collector in response to memory demand. This implies that some sort of algorithm decides when soft references should be cleared, but the Javadoc is a little vague on the details. The only hard requirement is that all soft references to softly reachable objects are cleared before the VM throws an OutOfMemoryError. Beyond that, there is only a suggestion that VMs clear out older soft references before newer ones:

Virtual machine implementations are, however, encouraged to bias against clearing recently-created or recently-used soft references.

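In practice, at least the Sun/Oracle JVM (HotSpot) clears soft references in a roughly global LRU order in response to memory demand. How long an unused soft reference survives relative to the amount of free heap can be tuned with the -XX:SoftRefLRUPolicyMSPerMB flag.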
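To make the contrast concrete, here is a minimal sketch that creates one weak and one soft reference to otherwise unreachable objects and then asks for a collection. The class name SoftVsWeakDemo and the helper isCleared are made up for this illustration. Since System.gc() is only a hint and the clearing policy is up to the VM, the printed result is not guaranteed.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

public class SoftVsWeakDemo {

    /** Returns true once the reference no longer yields its referent. */
    public static boolean isCleared(Reference<?> ref) {
        return ref.get() == null;
    }

    public static void main(String[] args) {
        // Two separate referents, each reachable only through its own reference.
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024]);
        SoftReference<byte[]> soft = new SoftReference<>(new byte[1024]);

        System.gc(); // only a hint; the JVM is free to ignore it

        // Typical on HotSpot with plenty of free heap: the weak reference is
        // cleared and the soft reference is not. Neither outcome is guaranteed.
        System.out.println("weak cleared: " + isCleared(weak));
        System.out.println("soft cleared: " + isCleared(soft));
    }
}
```

Whatever the VM decides, code that reads a soft or weak reference must always be prepared for get() to return null.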

Caches

Both weak references and soft references can be used to build caches, although the utility of such a cache (as opposed to a more traditional size-based cache) is debatable.

A weak-reference based cache is seen in WeakHashMap. A typical use case would be:

1. You have a lot of large objects that you want to associate with a certain key, and hence store in a map.
2. You’d like these large objects to be automatically removed/GC’d when the key is no longer reachable, rather than having to make an explicit remove() call.

A WeakHashMap can accomplish this; when the keys are no longer reachable, the associated entry will be removed, and the value object will be GC’d, provided there are no other references to it.
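The use case above could be sketched roughly as follows. ThumbnailRegistry and its methods are hypothetical names chosen for this example; only WeakHashMap itself comes from the JDK.

```java
import java.util.Map;
import java.util.WeakHashMap;

public class ThumbnailRegistry {

    // Keys are held weakly; values are held strongly by the map entry.
    private final Map<Object, byte[]> thumbnails = new WeakHashMap<>();

    /** Associates a (potentially large) thumbnail with its owner object. */
    public synchronized void attach(Object owner, byte[] thumbnail) {
        thumbnails.put(owner, thumbnail);
    }

    /** Returns the thumbnail for the owner, or null if none is present. */
    public synchronized byte[] lookup(Object owner) {
        return thumbnails.get(owner);
    }

    public synchronized int size() {
        return thumbnails.size();
    }

    public static void main(String[] args) {
        ThumbnailRegistry registry = new ThumbnailRegistry();
        Object document = new Object();
        registry.attach(document, new byte[1024 * 1024]);

        // The entry is present while the document is strongly held.
        System.out.println(registry.lookup(document) != null);

        document = null; // drop the only strong reference to the key
        System.gc();     // only a hint

        // Often 0 afterwards on HotSpot, but this is not guaranteed.
        System.out.println(registry.size());
    }
}
```

One caveat from the WeakHashMap documentation applies here: the value must not strongly refer back to its own key, otherwise the key never becomes unreachable and the entry is never dropped.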

Soft references, on the other hand, won’t be cleared until there is memory demand. This can be useful for building a cache whose entries are automatically expired in response to memory pressure. Guava’s CacheBuilder offers this strategy through its softValues() option.
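For illustration, a hand-rolled cache along these lines might look like the sketch below. SoftCache and its loader function are assumptions made for this example, not an API from the JDK or Guava, and the sketch leaves out details a production cache would need.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class SoftCache<K, V> {

    // Keys are held strongly; only the values are softly referenced.
    private final ConcurrentHashMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    public SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if it was never loaded or has been cleared. */
    public V get(K key) {
        SoftReference<V> ref = map.get(key);
        // Copy the referent into a strong local so it cannot vanish between check and use.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            map.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

Two simplifications are worth noting. Under contention, two threads may load the same key at the same time. Also, keys whose values have been cleared stay in the map until they are requested again; purging them eagerly would require registering the references with a ReferenceQueue.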

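However, both the weak and soft reference-based approaches to caching are less predictable than a traditional size-based cache, as the expiration policy is now governed by GC behaviour. You will also have to make sure that the objects you wish to cache are compared via object identity, and not object equality, since the garbage collector works on object identity: an entry lives only as long as the specific instance it refers to, no matter how many equal objects are still in use elsewhere. Guava, for instance, compares keys with == when weakKeys() is enabled.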

Because of these complexities, when in doubt, using a size-based cache (like that offered by Guava) or a simple LRU cache based on LinkedHashMap is probably a simpler and better option.
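A simple LRU cache on top of LinkedHashMap can be sketched in a few lines. The class name LruCache and the maxEntries parameter are made up for this example; the access-order constructor and the removeEldestEntry hook are standard LinkedHashMap features.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int maxEntries;

    public LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by put() after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", since it is the least recently used entry. Like LinkedHashMap itself, this sketch is not thread-safe; wrapping it with Collections.synchronizedMap is one way to share it between threads.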

Conclusion

Soft and weak references are similar in that if an object only has a soft/weak reference pointing at it, this alone won’t prevent it from being garbage collected. However, objects with only weak references will be collected much like objects with no references at all, while objects with only soft references will be collected at the JVM’s discretion, usually in response to memory demand.

Because of the additional factor of GC behaviour, I would not recommend using a cache based on soft references unless you have a very specific use case for it. Instead, a typical maximum-size cache is probably a more sensible default if you need a cache. (See Guava’s CacheBuilder for examples of this.)

In a following article, we’ll look at PhantomReference and how it might be a better choice than overriding finalize() to schedule cleanup actions.
