[jmm-dev] finalization (was: ECOOP, JVMLS)

Tue Aug 5 22:45:27 UTC 2014

Native resource allocation makes for good examples here, but not all use
cases involve native memory allocation.  If I want to attach additional
fields f to a small subset of objects of type T, it makes sense to use an
IdentityHashMap to maintain the extra values, rather than adding a field to
the class T.  Again, in the general case, it seems I need a finalizer to
clean up the IdentityHashMap entries when a T object is no longer needed.
This has the same problem.  The IdentityHashMap entry may be reclaimed by
the finalizer while I'm still accessing it.  And this doesn't sound to me
like it should be that advanced a use case.  (The results are likely to be
a much more debuggable null pointer exception or the like.  But that's
still not good.)
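
A minimal sketch of the side-table case, under stated assumptions: the
class names, the handle field, and the map name are all hypothetical, and
the table is keyed by a small handle object owned by T rather than by T
itself (keying on T directly would keep T strongly reachable, so its
finalizer would never run).  The racy reader shows where the collector is
permitted to treat 'this' as dead; the guarded reader uses the empty
synchronized block mentioned in the quoted message below.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;

public class SideTableSketch {
    // Hypothetical side table holding the extra field f, keyed by handle identity.
    static final Map<Object, Integer> EXTRA_F =
            Collections.synchronizedMap(new IdentityHashMap<Object, Integer>());

    public static final class T {
        private final Object handle = new Object();

        public T(int f) {
            EXTRA_F.put(handle, f);
        }

        // Racy: once 'handle' has been loaded, 'this' is no longer needed,
        // so the finalizer may remove the entry before get() runs and the
        // caller may see null.
        public Integer readFRacy() {
            Object h = handle;
            return EXTRA_F.get(h);
        }

        // Guarded: 'this' is used after the lookup, so it cannot be
        // finalized while the lookup is in progress.
        public Integer readFGuarded() {
            Integer f = EXTRA_F.get(handle);
            synchronized (this) { }
            return f;
        }

        @Override
        protected void finalize() {
            EXTRA_F.remove(handle);
        }
    }

    public static void main(String[] args) {
        T t = new T(42);
        System.out.println(t.readFGuarded());
    }
}
```

Nothing in readFRacy looks wrong on inspection, which is the point: the
only difference between the two readers is a statement that appears to do
nothing.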

I do not see how to hide this behind some sort of native pointer wrapper
class.  Somehow the programmer still needs to specify that a particular
object shouldn't be finalized until a particular point in the code.  And
the need for that annotation seems extremely surprising to most people,
even experts.

I also don't see how to usefully factor BigInteger to separate out this
problem.  It is still the case that when I call multiply, some function
between the one I call and the native one has to explicitly ensure that the
referenced Java BigInteger objects stay around.  That seems to unavoidably
require substantial amounts of boilerplate code, at least in the absence of
something like AspectJ.
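
To make the boilerplate concrete, here is a rough sketch, not a real
BigInteger implementation.  The "native library" is simulated with a Java
map of handles, and every name (Big, nativeAlloc, nativeMultiply,
nativeFree) is hypothetical.  The keep-alive idiom is the empty
synchronized block from the quoted message below; note that a binary
operation has to keep both operands alive, not just 'this'.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class NativeBigSketch {
    // Stand-in for a native library: handles map to "native" values.
    static final Map<Long, Long> HEAP = new ConcurrentHashMap<>();
    static final AtomicLong NEXT_HANDLE = new AtomicLong(1);

    static long nativeAlloc(long value) {
        long h = NEXT_HANDLE.getAndIncrement();
        HEAP.put(h, value);
        return h;
    }

    static long nativeMultiply(long a, long b) {
        // Throws NullPointerException if either handle was already freed.
        return nativeAlloc(HEAP.get(a) * HEAP.get(b));
    }

    static void nativeFree(long h) {
        HEAP.remove(h);
    }

    public static final class Big {
        private final long handle;

        private Big(long handle) {
            this.handle = handle;
        }

        public static Big of(long value) {
            return new Big(nativeAlloc(value));
        }

        public Big multiply(Big other) {
            Big result = new Big(nativeMultiply(handle, other.handle));
            // Boilerplate: keep both operands reachable until the
            // "native" call has returned.
            synchronized (this) { }
            synchronized (other) { }
            return result;
        }

        public long value() {
            long v = HEAP.get(handle);
            synchronized (this) { }   // same boilerplate again
            return v;
        }

        @Override
        protected void finalize() {
            nativeFree(handle);
        }
    }

    public static void main(String[] args) {
        System.out.println(Big.of(6).multiply(Big.of(7)).value());
    }
}
```

Every method that passes a handle to the native side needs the same
trailing statements, and omitting one of them produces no compile-time
or test-time signal.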

Again, this does not just impact finalizers.  It also impacts
java.lang.ref.  BigInteger could be using reference queues rather than
finalizers.  Thus there doesn't appear to be an easy way for an IDE or
compiler to tell whether an object is "finalizable", and hence needs
attention.  And the "this" object is not really different from other
parameters or local variables.
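
A sketch of the reference-queue variant, with hypothetical names
throughout and a Java map standing in for native memory.  The class below
has no finalize() method at all, so nothing in its declaration marks it
as needing attention, yet read() has the same hazard: once the handle has
been loaded, the owner may be enqueued and drain() may free the resource.
The empty synchronized block is used as the keep-alive here on the
assumption that it extends reachability the same way it does for
finalizers.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class QueueCleanupSketch {
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();
    // Keeps each PhantomReference reachable and remembers which handle it guards.
    static final Map<Reference<?>, Long> PENDING = new ConcurrentHashMap<>();
    // Stand-in for native memory.
    static final Map<Long, String> NATIVE = new ConcurrentHashMap<>();

    public static final class Resource {
        private final long handle;

        public Resource(long handle, String payload) {
            this.handle = handle;
            NATIVE.put(handle, payload);
            PENDING.put(new PhantomReference<Object>(this, QUEUE), handle);
        }

        public String read() {
            String s = NATIVE.get(handle);
            synchronized (this) { }   // keep the owner reachable during the lookup
            return s;
        }
    }

    // Frees the resources of every owner the collector has enqueued so far
    // and returns how many were freed.
    public static int drain() {
        int freed = 0;
        Reference<?> r;
        while ((r = QUEUE.poll()) != null) {
            Long h = PENDING.remove(r);
            if (h != null) {
                NATIVE.remove(h);
                freed++;
            }
        }
        return freed;
    }

    public static void main(String[] args) {
        Resource r = new Resource(1L, "payload");
        System.out.println(r.read());
    }
}
```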

It seems to me that this is another case, like OOTA values, where there are
no easy solutions.  My sense is that in both cases, we're better off going
with the performance hit than the questionable/unusable semantics that we
now have.  In the finalizer case, we can punt it back to the programmer,
but I think we've seen in the last ten years that we will not magically get
correct code if we do.  This problem isn't just overlooked by average
programmers.   All of the code in which I've been finding these problems
was written by experts, and has been examined by many people.

Hans

On Tue, Aug 5, 2014 at 12:15 PM, John Rose <john.r.rose at oracle.com> wrote:

> On Aug 5, 2014, at 11:34 AM, Hans Boehm <boehm at acm.org> wrote:
>
> I think a good specific example to think about is a, not entirely
> hypothetical, implementation of java.math.BigInteger that just forwards
> calls to an underlying native library that actually manages the memory for
> the number representations.  The BigInteger finalizer deallocates native
> memory.  Currently every BigInteger operation needs a synchronized(this){}
> at the end of the method, in order to prevent the underlying native
> representation from getting deallocated by a finalizer while the native
> arithmetic operation is still running, but the enclosing Java object is
> potentially no longer needed.
>
> You may not like this particular example.  But, as far as I can tell, this
> is a fairly common pattern.  "Rewrite it in Java" may or may not be good
> advice here, but it's certainly not universally applicable.   And it's a
> case for which I don't have a good solution except along the lines that
> Jeremy proposed.  It's also a case in which premature finalization has
> particularly nasty consequences.
>
>
> I like the example and the pattern.  Maybe the root problem is that domain
> experts (multi-precision math programmers) feel the need to confront native
> resource allocation problems directly.
>
> There are things we could do to patch the problem without entangling LVTs
> or burdening the majority of Java classes that don't use finalizers.  A
> class with a finalizer can provide (by fiat) extra liveness on 'this' to
> the end of every method.  More generally, references of a finalizable type
> (but not Object) can be declared to be live until reassignment or the next
> return instruction.  This will require tweaks to the JITs which will be
> error-prone, but less error-prone than hacking LVTs.
>
> But (if the source code complexity is not too bad) I think it would be
> better to require an explicit fence in source code to stretch liveness,
> even of 'this' in BigInteger.  Warnings from IDEs or FindBugs could help
> the programmer place fences.
>
> Adding native pointers to a Java data structure is the work of an expert
> library designer (L3 in Scala taxonomy).  We need to give them good tools
> (see Project Panama) for native programming, but it will always be a task
> requiring special knowledge.  ("Always", or at least as long as Java adds
> value by imposing a stronger set of safety invariants than native code,
> requiring programmers at the interface to take extra care.)
>
> In short, finalizers are for expert library designers, and there's a book
> somewhere you must read if you want to use one.
>
> Even a class like BigInteger could be factored into two parts, the
> arithmetic part and the native-resource-management part.  A key part of
> Project Panama is creating "safe native pointer" APIs which can manage
> native memory resources without requiring expertise from their users.  The
> BigInteger could have a reference to a wrapped native pointer with all the
> tricky fences and finalizers needed.  That way an L3 expert in native
> interfaces can handle one tricky aspect of BigInteger, while an L2 expert
> in arithmetic can handle the others.
>
> (And there are other design patterns for resource management which don't
> rely on global reachability; these should be used when possible.  The C++
> RAII pattern is broadly applicable and can be expressed in Java using
> try-with-resources.  But I agree BigInteger might need long-lived native
> resources managed by reachability tests.)
>
> — John
>
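
For contrast, a minimal sketch of the scoped pattern John mentions, RAII
expressed with try-with-resources.  NativeBuffer and sumOfSquares are
hypothetical names, and a Java array stands in for native memory.  The
resource is released at a point fixed by the program text, so no
reachability-based cleanup is involved; the cost is that the resource
cannot outlive the block, which is why this does not obviously cover the
long-lived BigInteger case.

```java
public class ScopedNativeSketch {
    public static final class NativeBuffer implements AutoCloseable {
        private long[] storage;   // stand-in for native memory

        public NativeBuffer(int n) {
            storage = new long[n];
        }

        public void set(int i, long v) {
            checkOpen();
            storage[i] = v;
        }

        public long get(int i) {
            checkOpen();
            return storage[i];
        }

        public boolean isOpen() {
            return storage != null;
        }

        private void checkOpen() {
            if (storage == null) {
                throw new IllegalStateException("buffer already closed");
            }
        }

        @Override
        public void close() {
            storage = null;   // deterministic release at the end of the try block
        }
    }

    public static long sumOfSquares(int n) {
        try (NativeBuffer buf = new NativeBuffer(n)) {
            for (int i = 0; i < n; i++) {
                buf.set(i, (long) i * i);
            }
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += buf.get(i);
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4));
    }
}
```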