RFR: 8241605: Shenandoah: More aggressive reference discovery

Thu Mar 26 12:20:21 UTC 2020

Hi Roman,

First of all, interesting idea. Thanks for sharing your thoughts. I 
would however like to share
some conceptual concerns. The patch has some flaws that needs further 
consideration.
The core assumption that is wrong is that if there are no stores/returns 
exposing the referent,
then it is safe to elide the keep-alive barrier. This is not true.

Listing a selection of concerns/bugs here:

1) What the compiler proves can't happen, can always suddenly happen, 
due to class redefinition.
So any idea utilizing compiler proof of what paths can be taken, always 
need to be backed up
by fall-back paths when the optimization eventually gets proven wrong. 
Let's have a look at
this trivial example:

void A(List<Reference> references) {
   for (var ref: references) {
     Object local = ref.get();
     for (int i = 0; i < 10000; ++i) (
       B(local);
       // safepoint poll with local being live
     }
   }
}

void B(Object referent) {
   if (referent == System.out) {
     System.out.println("I like fish. They swim. Blub blub.");
   }
}

Assume that we compile A, inlining B. The compiler can now prove that 
the referent doesn't escape.
However, we then hit a safepoint in A, and in that safepoint, B is 
redefined to do this:

void B(Object referent) {
   globalList.add(referent);
}

What will happen now is that the nmethod of A (and B inlined) gets 
deoptimized. Once execution
starts, the deopt handler will run, transferring the state (including 
the local) over to the
interpreter, and in the interpreted version of B, the referent illegally 
escapes, eventually
causing heap corruption, as subsequent remarks will not see the escaped 
local.

2) Even without class redefinition changing the logic that the compiler 
proved had no escaping
referents through stores, stack walkers can expose locals at any 
safepoint poll. So in the above
example, a stack walker can expose the local to other Java code, which 
attaches it to the object
graph, without ever marking it, again causing heap corruption.

Because of this, stack walkers and deoptimization logic needs to keep 
everything they see in the
stack alive, or things will break in subtle ways that are not fun to debug.

3) The analysis needs to also consider not just exposing of the 
referent, but exposure of anything
transitively reachable from the object graph under the referent. If you 
load the referent, which is a
wrapper object, then load the contents of the wrapper, and expose that 
to the object graph, then that
will similarly break, even though the referent itself was never exposed. 
I can't see that your analysis
catches that (LoadP with an AddP with the referent as base, and 
transitively performing the analysis on that)

4) Even then, there are more problems, involving class unloading. It is 
not just the object (and its
transitive closure) you need to consider w.r.t. whether to keep it alive 
or not, but implicitly
transitively also its class, and the classes of everything transitively 
reachable from the referent,
which is now ignored. For example, if someone calls 
referent.(chain_of_objects).getClass(), then you can
expose classes that have not been kept alive, as your strong native 
barriers do not keep things alive, AFAICT.

5) Classes are implicitly used when code is run from its class holder, 
and there are a few ways in
which this could go wrong. The nmethod entry barriers shields you 
against many of them (including
speculative inlining of bytecodes from otherwise not kept alive Klass), 
but not all.
For example, if such a transitive class under the referent is acquired, 
and passed as an argument
to the interpreter, it can reflectively call methods without holding its 
class alive, causing various
forms of failure modes crashing the JVM.

6) Sometimes the use of classes is even more subtle and implicit. Here 
is one example I can think of:
Imagine that your referent performs a virtual call. It is found throuch 
CHA that there exists a single
unique selected method (callee). Then it is inlined straight into the 
code, without any conditional checking
that the klass pointer was right. An entry is created into the 
dependency context. But the class tracked
in the dependency context is not the selected class (i.e. the one you 
inlined), but the base class,
because that is the class that will get poked during class loading, to 
check when the CHA assumption
fails in the future. So the dependency context does not track the callee 
method(). The oop finalization
will not find any explicit mentions of this klass. It is just implicitly 
used. The only scenario in which
we explicitly track inlined callees classes, is with the following code 
for class redefinition:

   // Always register dependence if JVMTI is enabled, because
   // either breakpoint setting or hotswapping of methods may
   // cause deoptimization.
   if (C->env()->jvmti_can_hotswap_or_post_breakpoint()) {
      C->dependencies()->assert_evol_method(method());
   }

This dependency will create a metadata entry in the nmethod which is 
caught by the oop finalizer, and
hence by your nmethod entry barriers. When 
jvmti_can_hotswap_or_post_breakpoint() is disabled, the callee holder
is not tracked anywhere, because it is assumed that the receiver of such 
an inlined method call is live,
which seems like a reasonable assumption. Now that this elision of 
keep-alive barriers is used, we enter a
gateway to hell. In bytecodes of this not kept alive class loader, you 
can do lots of fun things, like
allocating an object of a classin that class loader, or materialize 
constants from its static finals,
that will not be caught by the GC.This results in objects with broken 
class pointers being allocated,
and dead objects being attached to the object graph. From there on, it 
is game over.

7) I can imagine code involving method handles going wrong too. You can 
perform method handle calls through
a referent, and the whole method handle call may get inlined. It tracks 
the holder through a MemberName.
It is assumed that if you have a reference to the MemberName, it has a 
reference to the class holder, and
hence it should be safe to execute its bytecodes. But when we start 
relaxing things such that this has not
been kept alive, I can imagine such code blowing up in unexpected and 
obscure ways. It looks like nothing
transitively reachable from the referent is being exposed in any way, 
and it looks like there are no calls,
but inlined use of bytecodes of reachable metadata is an implicit side 
effect that exposes not the referent
but just references to its transitively reachable class loaders, without 
there ever being an obvious store
or side effect involving the referent or its transitive closure of 
objects or uses.

This aside, I think there are many other fun assumptions made over the 
years about things being live, not
listed above, implyingvarious subtle assumptions in our compilers. I 
would be quite wary of messing with
that without a very careful lookinto all the things that can go 
wrong.Catching such bugs will be a nightmare.

So given these various issues, are you still sure you want to continue 
down this path? If I were you, I
think I would probably not perform this change.It gives me a migraine to 
think about the things that can
go wrong, and I feel like I have just scratched the surface of it. I 
would be quite afraid of destabilizing.

/Erik

On 2020-03-25 18:29, Roman Kennke wrote:
> Shenandoah uses SATB for concurrent marking, and thus needs to track
> stores and explicitely make previous values grey. An exception to this
> are (soft/weak/final) references: the reachability of referents can be
> changed (from weak to strong) when a referent is accessed. For this
> reason we require a keep-alive barrier which also makes referents grey
> whenever they are accessed via Reference.get(). The downside of this is
> if a workload churns weak-references (i.e. accesses them often) they
> might never get a chance to be reclaimed.
>
> The key insight for improving the situation is that the change of
> reachability of the referent is only relevant when that referent is
> stored somewhere else. We can elide the keep-alive barrier, if the
> compiler can prove that this doesn't happen. (We also need to check if
> the referent leaves the scope of the method via method-call or return,
> because it may be stored outside of the method.) The caveat here is that
> we must do another scan of the threads' stacks at final-mark to catch
> referents that are in a local variable at final-mark - we must not loose
> those. However, according to my measurements, this is only a minor (if
> any) regression in final-mark-latency.
>
> Issue:
> https://bugs.openjdk.java.net/browse/JDK-8241605
> Webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8241605/webrev.00/
>
> Testing: hotspot_gc_shenandoah (including new testcase that verifies the
> improved behaviour), manual tests, specjbb and specjvm runs
>
> I'd like to push this to shenandoah/jdk first, to give it a few more
> rounds of testing before hitting mainline JDK.
>
> Ok?
>
> Roman
>