[jmm-dev] Non-core-model issues status

Sun Oct 5 14:24:59 UTC 2014

On 10/03/2014 08:20 PM, Hans Boehm wrote:
> Let me try restating the proposed finalization rules and moving the annotation
> from a field to the containing class.  As far as I can tell, the annotation
> we've been discussing really only impacts the treatment of the containing class,
> so that seems to be a more logical place for it.

Thanks. Seeing this fleshed out leads me again to prefer a simpler
(for us) option: Defining reachabilityFence() and introducing but not
requiring support for your annotation @ReachabilitySensitive (modulo
names for these).  Backing up to explain why...

Finalization reveals a mismatch between JVM-level lifetimes and
source-code-level block scopes. Most programmers implicitly expect that
reachability continues to the end of the enclosing block. But the only
Java syntactic constructs guaranteed to preserve this entail locking.
Probably the best default advice is to always use synchronized
blocks/methods (or other j.u.c locks in try-finally constructions)
when accessing finalizable resources (as well as in the finalize
methods themselves). But when people can't or don't want to use
locking (not even the hacky and possibly slow but effective trailing
"synchronized(x){}") they should be able to use try {...}  finally
{reachabilityFence(x);} (which acts as an RCU-ish read-lock with
respect to GC). Or, in some cases (e.g., ThreadPoolExecutor), do
nothing, because nothing bad can happen anyway.
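
A minimal sketch of the try-finally form, assuming the fence has the
shape later shipped as java.lang.ref.Reference.reachabilityFence(Object)
in JDK 9. ExternalResource, TABLE and slot are hypothetical names used
only for illustration; they are not from the proposal.

```java
import java.lang.ref.Reference;

// Hypothetical wrapper around a slot in an external table (names are illustrative).
final class ExternalResource {
    private static final int[] TABLE = new int[8]; // stands in for native state
    private final int slot;                        // meaningless once finalize() has run

    ExternalResource(int slot, int value) {
        this.slot = slot;
        TABLE[slot] = value;
    }

    int get() {
        int i = slot; // after this read the JVM no longer needs "this"
        try {
            return TABLE[i];
        } finally {
            // Keeps "this" strongly reachable until here, so finalize()
            // cannot clear the slot while it is being read.
            Reference.reachabilityFence(this);
        }
    }

    @SuppressWarnings({"deprecation", "removal"}) // finalize() is deprecated in current JDKs
    @Override
    protected void finalize() {
        TABLE[slot] = -1; // release the slot
    }
}
```

Without the fence, nothing in get() refers to "this" after the read of
slot, so the object could in principle be finalized before the table
access completes. The trailing "synchronized(x){}" form would sit in
the same finally position.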

Automated placement of any of these forms of lifetime control must be
done at source level, because scope information is not guaranteed to
be preserved in bytecode.  (Although live-range debugging information
in class files might suffice in many cases.)  Different languages
running on JVMs might have different scoping constructs and rules.  To
nail them down, languages would need to add C++-destructor-like rules
as Hans sketched out.  But even if they do, in Java and probably all
JVM-hosted languages, a front-end compiler's ability to automate
placement of reachability fences is limited by static type
information.  For example, a reference of nominal type Object might be
a FileHandle, and even a call to an Object method like hashCode()
might require reachability protection.  Insisting on the elimination
of all possible lifetime mismatches requires compilers to always
insert reachability fences unless provably not necessary, either
because the object is known to be of a type without a finalizer, or
one of the above manual techniques already occurs in the source code.
Even though it is rarely necessary to insert reachability fences, most
proofs require analyses that front-end compilers do not perform.
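
To make the static-type limitation concrete, here is a small sketch.
FileHandle and StaticTypeLimit are hypothetical classes invented for
this example, and the fence is again assumed to be JDK 9's
Reference.reachabilityFence. A compiler that keys fence placement on
the declared type sees only Object in hashOf(), so the fence has to be
written by hand.

```java
import java.lang.ref.Reference;

// Hypothetical finalizable type whose hashCode() reads state that finalize() invalidates.
final class FileHandle {
    private int fd;

    FileHandle(int fd) { this.fd = fd; }

    @Override
    public int hashCode() { return fd; }

    @Override
    public boolean equals(Object o) {
        return o instanceof FileHandle && ((FileHandle) o).fd == fd;
    }

    @SuppressWarnings({"deprecation", "removal"}) // finalize() is deprecated in current JDKs
    @Override
    protected void finalize() { fd = -1; }
}

final class StaticTypeLimit {
    // The nominal type here is Object, so nothing in the signature says
    // that the argument may be finalizable.
    static int hashOf(Object o) {
        try {
            return o.hashCode();
        } finally {
            Reference.reachabilityFence(o); // manual protection across the call
        }
    }
}
```

A call such as StaticTypeLimit.hashOf(new FileHandle(7)) passes a
finalizable object through a reference whose static type gives no hint
of that.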

So, guaranteeing lack of scope-lifetime surprise doesn't look like a
promising option. Instead we can require something like Hans sketched
out that is good but imperfect, or fall back to introducing annotation
@ReachabilitySensitive (implicitly applied to classes with non-default
finalize()), but not requiring any compile-time or run-time
properties. And then encouraging tool/IDE developers to support it to
help programmers avoid bugs.  These might include points-to analyses
to minimize impact, warnings about upcasts from
@ReachabilitySensitive, suggested alternatives to using finalize(),
and so on. If people using finalization came to rely on such tools,
I'd expect the tools would become better at this than we could ever
mandate as part of language/VM spec.
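
As a rough sketch of what tool support could start from: the
declaration below is one possible shape for the annotation, and
isReachabilitySensitive() is a hypothetical helper showing the
"implicitly applied to classes with non-default finalize()" rule as a
reflective check. The retention policy, target, and all class names
are assumptions for illustration, not part of any proposal text.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marker only: no compile-time or run-time behavior is attached to it.
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ReachabilitySensitive {}

// Example types (hypothetical).
@ReachabilitySensitive
final class QueueCleaned {}   // e.g. cleaned up through a reference queue

final class Plain {}

final class Finalized {
    @SuppressWarnings({"deprecation", "removal"}) // finalize() is deprecated in current JDKs
    @Override
    protected void finalize() {}
}

final class SensitivityCheck {
    // True if the class or a superclass (other than Object) is annotated
    // or declares its own finalize().
    static boolean isReachabilitySensitive(Class<?> c) {
        for (Class<?> k = c; k != null && k != Object.class; k = k.getSuperclass()) {
            if (k.isAnnotationPresent(ReachabilitySensitive.class)) {
                return true;
            }
            try {
                k.getDeclaredMethod("finalize");
                return true;
            } catch (NoSuchMethodException e) {
                // no finalize() declared at this level; keep walking up
            }
        }
        return false;
    }
}
```

An IDE check for upcasts or missing fences could use a predicate like
this as its starting point, with points-to analysis layered on top.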

As always, I realize that finding a middle ground between deprecating
and "fixing" finalize() can be a tough sell to both sides.  Other
ideas welcome.

-Doug

>
> 1. A class is reachability-sensitive if it has a non-default finalizer or is
> suitably annotated (e.g. because its instances are cleaned up by a finalizer
> guardian or through a reference queue).
>
> 2. A reference variable is reachability-sensitive if its static type is a
> reachability-sensitive class (Q1: or array of such?).
>
> 3. The end of the scope of a reachability-sensitive variable synchronizes with
> the invocation of the finalize() method on the object to which it last
> referred.  For this purpose, the "this" reference is treated as an implicit
> parameter to member functions.  (Q2: The treatment of expression temporary
> reachability-sensitive references is unclear.  Do we treat them as though they
> had a scope that lasts through the end of the full expression, as for C++
> destructors?)
>
> I think this is implementable at modest performance cost, though non-trivial
> compiler engineering cost.  To enforce (3) the compiler must either treat the
> end of such a scope as a release operation (though without the actual fence) or
> refrain from moving operations across the next GC safe-point.  The GC should
> then guarantee sufficient synchronization.  Hard real-time GCs may have other
> issues, but I would be surprised if this became expensive to enforce.
>
> We don't enforce a corresponding property for references whose static type is
> not reachability-sensitive while their dynamic type is.  I think that's
> generally OK, since the fields being cleaned up by a finalizer can't be accessed
> in such a context without entering another one in which the static
> reachability-sensitive type is visible.  There's usually a similar argument for
> indirect references to reachability-sensitive objects.
>
> It seems to me that this would implicitly make most code that either naively
> used finalizers (instead of finalizer guardians or java.lang.ref) or correctly
> annotated finalizable classes correct.
>
> Q3: Where does this leave the current conditions in
> http://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.6.2 ?  Can
> we just drop them?
>
> Hans
>
> On Wed, Sep 17, 2014 at 2:45 PM, Hans Boehm <boehm at acm.org
> <mailto:boehm at acm.org>> wrote:
>
>     On Tue, Sep 9, 2014 at 4:26 AM, Doug Lea <dl at cs.oswego.edu
>     <mailto:dl at cs.oswego.edu>> wrote:
>     >
>     > On 09/07/2014 01:55 AM, Hans Boehm wrote:
>     >>
>     >> we seem to have lots of evidence that essentially everyone currently gets
>     >> this code wrong, introducing premature deallocations.
>     >
>     >
>     > Continuing in vacillation mode.... What do you think of
>     > the following concrete version of Jeremy's original
>     > suggestion:
>     >
>     > 1. Introduce reachabilityFence (or whatever name;
>     > How about "keepReachable"?)
>     >
>     > 2. Spec at syntax level that every declaration and
>     > use of a local (or param) ref to an object of a class F
>     > defining finalize() is translated from:
>     >   F f;
>     >   ...
>     >   <end of scope>
>     > to:
>     >   F f;
>     >   try {
>     >     ...
>     >     <end of scope>
>     >   } finally {
>     >      if (f != null) reachabilityFence(f);
>     >   }
>     >
>     > It might be a little challenging to spec this to cover multiple
>     > refs while keeping blocked scopes, but it seems feasible.
>
>     I think that's similar to the earlier proposal with annotations, except that
>     we're back to "defining finalize()" as the criterion for when to apply it.
>     I think that's the wrong criterion.  Counterexamples:
>
>     - Any class that uses a finalizer guardian (recommended practice in places,
>     I think) to perform finalization on its behalf.
>
>     - Any object that's "finalized" by enqueuing it on a reference queue instead
>     of using a "finalize()" method.  (Recommended practice to deal with the lack
>     of finalize() ordering.)
>
>     I think we need a field annotation A that basically says: "This field
>     becomes unusable once the enclosing object becomes unreachable".  That's
>     only loosely correlated with defining a finalize() method.  We then
>     essentially guarantee that every method that uses a field x.f, where f is
>     annotated with A keeps x reachable as long as that would be expected by the
>     naive semantics, by performing the transformation you suggest.
>
>     I'm not sure, but I think we might be able to avoid dealing with transitive
>     reachability issues.  If field f is annotated with A, and I have
>
>     T' x = ...;
>     {
>         T y = x.a;
>         {
>             foo(y.f);
>         }
>     }
>
>     where x is finalizable with a finalizer that invalidates f, I should
>     annotate both a and f, and the right things should happen.  And that
>     actually makes sense, I think.
>
>     For things like new T'().a.foo(), where foo uses f, to work correctly, we
>     probably need something like the C++ rule in the "naive semantics", that
>     references stay around until the end of the enclosing full expression.
>
>     This requires more thought, but this annotation seems to be significantly
>     easier and cleaner to use than reachabilityFence.
>
>     Hans
>
>     >
>     > This seems to completely address premature finalization,
>     > at the expense of lack of control in the other direction.
>     >
>     > The only middle ground I see is the one you like least,
>     > of telling people to write their own try-finally blocks.
>     >
>     > -Doug
>     >
>     >
>     >
>
>