[jmm-dev] finalization (was: ECOOP, JVMLS)

Thu Aug 14 00:59:02 UTC 2014

On Aug 8, 2014, at 4:54 PM, Hans Boehm <boehm at acm.org> wrote:

> 
> On Wed, Aug 6, 2014 at 2:45 PM, John Rose <john.r.rose at oracle.com> wrote:
> >
> > On Aug 5, 2014, at 3:45 PM, Hans Boehm <boehm at acm.org> wrote:
> >
> > > Native resource allocation makes for good examples here, but not all use cases involve native memory allocation.  If I want to attach additional fields f to a small subset of objects of type T, it makes sense to use an IdentityHashMap to maintain the extra values, rather than adding a field to the class T.  Again, in the general case, it seems I need a finalizer to clean up the IdentityHashMap entries when a T object is no longer needed.  This has the same problem.  The IdentityHashMap entry may be reclaimed by the finalizer while I'm still accessing it.  And this doesn't sound to me like it should be that advanced a use case.  (The results are likely to be a much more debuggable null pointer exception or the like.  But that's still not good.)
> >
> > OK, that's a bad one.
> >
> > It cannot be a common case, partly because it will fail if stress-tested.  Also, the author of class T has to bypass two mental warning signs in order to get into that bad spot.  First, define some complicated IdentityHashMap stuff outside of T, consciously choosing to do that instead of the more obvious and natural route of defining T.extensionSlotTable.  Second, fix the storage leak by defining T.finalize(), which should set off alarm bells, since finalizers are mysterious.  Note that the user has to have write access to T's definition in order to do the second step, so the first step could have been done more naturally.
> 
> [HB] I'm not sure what you're proposing here.  You want to put the extensionSlotTable reference in each object?

Yes.  It's a better trade-off than overriding Object.finalize, because that kind of finalizer is likely to be more expensive than an extra word per object.  But (as you say) something from java.lang.ref would be cheaper.
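For concreteness, a minimal sketch of the per-object slot, assuming a lazily allocated table. The method names (putExtension, getExtension, hasExtensions) and the String-keyed map are illustrative assumptions, not anything proposed in the thread; only T and extensionSlotTable come from the discussion above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class T that carries its own extension table.
// Cost: one extra reference word per instance, but no finalizer and
// no external IdentityHashMap whose entries need cleaning up.
final class T {
    // Stays null for the (common) objects that never get an extra field.
    private Map<String, Object> extensionSlotTable;

    // Illustrative accessor: attach an optional field to this object.
    void putExtension(String name, Object value) {
        if (extensionSlotTable == null) {
            extensionSlotTable = new HashMap<>();
        }
        extensionSlotTable.put(name, value);
    }

    // Illustrative accessor: returns null if the field was never attached.
    Object getExtension(String name) {
        return extensionSlotTable == null ? null : extensionSlotTable.get(name);
    }

    boolean hasExtensions() {
        return extensionSlotTable != null;
    }
}
```

The table becomes unreachable together with its T, so there is no separate entry that can be reclaimed while the object is still in use.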

> That already adds to the size of all objects, when I only want to add the optional field to a few.
> ...
> [HB] But if I use java.lang.ref "finalization", I can avoid the finalization overhead on objects without the field.  Another one of those finalization deficiencies that java.lang.ref avoids, at the not-strictly-necessary cost of making potentially finalizable objects difficult to identify.

Yes.  (Maybe we could try a marker interface or annotation...?)
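The java.lang.ref alternative might look roughly like the following: an identity-keyed side table whose stale entries are expunged through a ReferenceQueue, so objects without the extra field pay nothing. SideTable and IdKey are hypothetical names. The sketch does nothing about the liveness problem discussed in this thread: an entry can still be dropped once the key is no longer considered reachable.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical side table: weak keys compared by identity.
final class SideTable<K, V> {

    // Weak key that hashes and compares by referent identity.
    private static final class IdKey<K> extends WeakReference<K> {
        private final int hash;

        IdKey(K referent, ReferenceQueue<? super K> queue) {
            super(referent, queue);
            this.hash = System.identityHashCode(referent);
        }

        @Override public int hashCode() { return hash; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;          // lets cleared keys be removed
            if (!(o instanceof IdKey)) return false;
            Object mine = get();
            return mine != null && mine == ((IdKey<?>) o).get();
        }
    }

    private final ReferenceQueue<K> queue = new ReferenceQueue<>();
    private final Map<IdKey<K>, V> map = new HashMap<>();

    synchronized void put(K key, V value) {
        expunge();
        map.put(new IdKey<>(key, queue), value);
    }

    synchronized V get(K key) {
        expunge();
        return map.get(new IdKey<>(key, null)); // probe key, not enqueued
    }

    synchronized int size() {
        expunge();
        return map.size();
    }

    // Drop entries whose keys the collector has already cleared.
    private void expunge() {
        Reference<? extends K> dead;
        while ((dead = queue.poll()) != null) {
            map.remove(dead);
        }
    }
}
```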

> ...
> > Put all accesses to the native data inside the wrapper object W, and have the wrapper object (written by a finalizer expert) include suitable fences so that W.this does not go dead until the native access (read or write) is done.  The BigInteger has a W field.  The BigInteger can go dead, but the W will stay live until it finishes its access.  If there is another access to do, the W will be live as a local inside the BigInteger method, even if the BigInteger is dead.  The user of the BigInteger doesn't have to know anything about finalizers or native pointers.
> 
> [HB] My concern is that if all native accesses go inside W, then W has to wrap all arithmetic operations etc. on BigIntegers.  W becomes essentially what BigInteger is now in such implementations.  The wrapper W is very BigInteger specific and has to know about all BigInteger operations.

That's not so bad.  Just the reads and writes would go inside of W.  Think of W as working like a C "int*" pointer.  The last use of W would produce or consume a native value, and W could go dead then.
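A rough sketch of that shape, under several assumptions: the "native" memory is simulated by a plain array (FakeNativeHeap), NativeBackedInt stands in for a BigInteger-like client, and the fence is Reference.reachabilityFence, which was only added in Java 9 and so postdates this thread. Freeing the native cell when W becomes unreachable is deliberately left out.

```java
import java.lang.ref.Reference;

// Stand-in for native memory: a hypothetical heap indexed by "address".
final class FakeNativeHeap {
    static final long[] CELLS = new long[16];
    private FakeNativeHeap() {}
}

// W works like a C "int*": it only reads and writes one native cell.
final class W {
    private final int address; // hypothetical native address

    W(int address) { this.address = address; }

    long read() {
        try {
            return FakeNativeHeap.CELLS[address];
        } finally {
            // Keep W.this reachable until the access has completed.
            Reference.reachabilityFence(this);
        }
    }

    void write(long value) {
        try {
            FakeNativeHeap.CELLS[address] = value;
        } finally {
            Reference.reachabilityFence(this);
        }
    }
}

// Hypothetical client: the arithmetic stays out here, and only the raw
// reads and writes go through W.
final class NativeBackedInt {
    private final W cell;

    NativeBackedInt(int address, long initial) {
        this.cell = new W(address);
        this.cell.write(initial);
    }

    NativeBackedInt add(NativeBackedInt other, int resultAddress) {
        // The W locals stay usable even if the outer objects go dead.
        W mine = this.cell;
        W theirs = other.cell;
        long sum = mine.read() + theirs.read();
        return new NativeBackedInt(resultAddress, sum);
    }

    long value() { return cell.read(); }
}
```

Here W knows nothing about addition; the client pulls values out through W and does the arithmetic on ordinary Java longs.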

> 
> >
> > > Again, this does not just impact finalizers.  It also impacts java.lang.ref.  BigInteger could be using reference queues rather than finalizers.  Thus there doesn't appear to be an easy way for an IDE or compiler to tell whether an object is "finalizable", and hence needs attention.
> >
> > Yuck.  More "drop-sensitive" types for the heuristic would be any of the java.lang.ref types.
> >
> > My patch begins to fall apart, since drop-sensitivity might arguably depend recursively on drop-sensitivity of child objects (e.g., a WeakReference field of the parent object).  And of course you can create scenarios where a drop-sensitive object is obscured by a normal static type like Object.
> 
> [HB] Right.  Creating a WeakReference to an object may make the constructor argument "drop-sensitive".  Which means we need something like a whole program flow analysis to identify "drop-sensitivity".  I think in practice everything whose lifetime can't be effectively bounded by escape analysis has to be considered "drop-sensitive".

That leaves me with half-baked ideas like this: Localize the analysis by requiring @DropSensitive on various declarations.  (Roughly like @NotNull or @SafeVarargs.)  Demonstrate that almost every object that is "finalized" (in the broad sense) is declared drop-sensitive, and every drop-sensitive object is live to the end of its scope.  Flag gaps in the demonstration with errors or warnings.  This seems unworkable to me given the large number of types that would have to be marked @DropSensitive.  Sigh.
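To make the annotation idea a bit more concrete, here is a minimal sketch covering only the "flag gaps" part, and only for one narrow case: a class that holds a field of a @DropSensitive type without being marked itself. Runtime retention, the reflective check, and the names NativeHandle, PlainHolder and DropSensitivityCheck are all assumptions for illustration; a real checker would presumably run in the compiler or an IDE and would also need the liveness analysis, which this does not attempt.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical marker; RUNTIME retention is assumed only so the
// sketch can use reflection.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.FIELD})
@interface DropSensitive {}

// A type declared drop-sensitive by its author.
@DropSensitive
final class NativeHandle {
    long address;
}

// An unmarked type that nevertheless holds a drop-sensitive child.
final class PlainHolder {
    NativeHandle handle;
    Object other;
}

final class DropSensitivityCheck {
    private DropSensitivityCheck() {}

    // Reports fields of a drop-sensitive type inside an unmarked class.
    static List<String> gaps(Class<?> owner) {
        List<String> out = new ArrayList<>();
        if (owner.isAnnotationPresent(DropSensitive.class)) {
            return out; // the owner is already declared drop-sensitive
        }
        for (Field f : owner.getDeclaredFields()) {
            if (f.getType().isAnnotationPresent(DropSensitive.class)) {
                out.add(owner.getSimpleName() + "." + f.getName());
            }
        }
        return out;
    }
}
```

A field declared as plain Object (like `other` above) slips through this check, which is the "obscured by a normal static type" scenario mentioned earlier.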

— John