RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls

Wed Jul 17 10:26:31 UTC 2019

Revised fix:
   http://cr.openjdk.java.net/~vlivanov/8227260/webrev.00/

It turned out the problem is not specific to i2c2i: fast class 
initialization barriers on nmethod entry trigger the assert as well.

JNI upcalls (CallStatic<type>Method) don't have class initialization 
checks, so it's possible to initiate a JNI upcall from a 
non-initializing thread and JVM should let it complete.

It leads to a busy loop (asserts in debug) between nmethod entry barrier 
& SharedRuntime::handle_wrong_method until holder class is initialized 
(possibly infinite if it blocks class initialization).

Proposed fix is to keep using c2i, but jump over class initialization 
barrier right to the argument shuffling logic on verified entry when 
coming from SharedRuntime::handle_wrong_method.

Improved regression test reliably reproduces the problem.

Testing: regression test, hs-precheckin-comp, tier1-6

Best regards,
Vladimir Ivanov

On 04/07/2019 18:02, Erik Österlund wrote:
> Hi,
> 
> The i2c adapter sets a thread-local "callee_target" Method*, which is 
> caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c 
> call is "bad" (e.g. not_entrant). This error handler forwards execution 
> to the callee c2i entry. If the SharedRuntime::handle_wrong_method 
> method is called again due to the i2c2i call being still bad, then we 
> will crash the VM in the following guarantee in 
> SharedRuntime::handle_wrong_method:
> 
> Method* callee = thread->callee_target();
> guarantee(callee != NULL && callee->is_method(), "bad handshake");
> 
> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits the 
> new class initialization entry barrier of the c2i adapter.
> The solution is to simply not clear the thread-local "callee_target" 
> after handling the first failure, as we can't really know there won't be 
> another one. There is no reason to clear this value as nobody else reads 
> it than the SharedRuntime::handle_wrong_method handler (and we really do 
> want it to be able to read the value as many times as it takes until the 
> call goes through). I found some confused clearing of this callee_target 
> in JavaThread::oops_do(), with a comment saying this is a methodOop that 
> we need to clear to make GC happy or something. Seems like old traces of 
> perm gen. So I deleted that too.
> 
> I caught this in ZGC where the timing window for hitting this issue 
> seems to be wider due to concurrent code cache unloading. But it is 
> equally problematic for all GCs.
> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8227260
> 
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/
> 
> Thanks,
> /Erik