RFR[13]: 8227260: Can't deal with SharedRuntime::handle_wrong_method triggering more than once for interpreter calls
Erik Österlund
erik.osterlund at oracle.com
Mon Jul 8 10:07:52 UTC 2019
Hi Dean and Vladimir,
The callee->is_method() check in the guarantee is probably there to
catch corrupt memory.
The problem occurs specifically when performing upcalls from JNI. The
call wrapper tries to "quack like an interpreter" and performs i2c
calls, which fail because the nmethod is not entrant. The subsequent
c2i attempt then fails again because of clinit barriers. For template
interpreter calls, the clinit barriers have already been taken, but in
the JNI upcall path, we don't perform that barrier.
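To make the double failure concrete, here is a toy model of the handshake, not HotSpot code. ToyMethod, ToyThread, toy_handle_wrong_method and toy_upcall are all made-up names for illustration, and an exception stands in for the guarantee that would crash the VM. It only shows what happens when the handler reads the thread-local callee more than once, with and without clearing it after the first read.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins, not HotSpot types.
struct ToyMethod { int id; };

struct ToyThread {
    const ToyMethod* callee_target = nullptr;  // models the thread-local handshake value
};

// Models the error handler: reads the handshake value and optionally clears it.
// Throws where the real handler would hit the "bad handshake" guarantee.
const ToyMethod* toy_handle_wrong_method(ToyThread& t, bool clear_after_read) {
    const ToyMethod* callee = t.callee_target;
    if (callee == nullptr) {
        throw std::runtime_error("bad handshake");
    }
    if (clear_after_read) {
        t.callee_target = nullptr;
    }
    return callee;
}

// Simulates one upcall that goes wrong `failures` times in a row
// (e.g. once for a not-entrant nmethod, once more for a clinit barrier).
// Returns true if every failure could be handled, false if the handshake broke.
bool toy_upcall(int failures, bool clear_after_read) {
    ToyMethod m{1};
    ToyThread t;
    t.callee_target = &m;  // set once by the i2c-like step
    try {
        for (int i = 0; i < failures; ++i) {
            toy_handle_wrong_method(t, clear_after_read);
        }
    } catch (const std::runtime_error&) {
        return false;
    }
    return true;
}
```

With clearing, toy_upcall(1, true) succeeds but toy_upcall(2, true) does not. Without clearing, any number of failures is handled.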
Since our current i2c calls can't deal with blocking at all (nor with
safepoints), the right solution seems to be adding clinit barriers to
the JavaCalls API, so that by the time the call is performed, we know
the clinit barrier won't be hit.
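A rough sketch of that idea, again as a toy model and not the actual JavaCalls code: ToyClass, ToyMethod, ensure_initialized and toy_call are hypothetical names, and incrementing a counter stands in for running the class initializer. The point is only the ordering: the potentially blocking initialization happens before the call path is entered, so the entry barrier has nothing left to reject.

```cpp
#include <cassert>

// Hypothetical stand-ins, not HotSpot types.
enum class InitState { NotInitialized, Initialized };

struct ToyClass {
    InitState state = InitState::NotInitialized;
    int init_runs = 0;  // counts how often the initializer stand-in ran
};

struct ToyMethod {
    ToyClass* holder;
    int (*entry)(int);
};

// Up-front barrier: the place where blocking would be acceptable.
void ensure_initialized(ToyClass& k) {
    if (k.state != InitState::Initialized) {
        ++k.init_runs;  // stands in for running the class initializer
        k.state = InitState::Initialized;
    }
}

// Models the entry barrier on the call path: it cannot block, only reject.
bool entry_barrier_passes(const ToyMethod& m) {
    return m.holder->state == InitState::Initialized;
}

// Returns true and stores the result if the call went through;
// returns false if the entry barrier redirected the call.
bool toy_call(const ToyMethod& m, int arg, bool barrier_up_front, int& result) {
    if (barrier_up_front) {
        ensure_initialized(*m.holder);
    }
    if (!entry_barrier_passes(m)) {
        return false;
    }
    result = m.entry(arg);
    return true;
}

// Example callee used for illustration.
int add_one(int x) { return x + 1; }
```

Calling toy_call with barrier_up_front set to false on an uninitialized holder is rejected. With the flag set to true, the same call goes through and the initializer stand-in runs exactly once.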
I still think that allowing only one thing to go wrong across an i2c2i
call is pretty scary, and I'd love to remove that restriction.
Anyway, Vladimir offered to find the right place to put the clinit
barrier, so I'm handing this one over. :)
Thanks,
/Erik
On 2019-07-05 23:46, dean.long at oracle.com wrote:
> What is callee->is_method() doing? Like Vladimir, I'm concerned about
> pointers to stale metadata.
>
> dl
>
> On 7/4/19 8:02 AM, Erik Österlund wrote:
>> Hi,
>>
>> The i2c adapter sets a thread-local "callee_target" Method*, which is
>> caught (and cleared) by SharedRuntime::handle_wrong_method if the i2c
>> call is "bad" (e.g. not_entrant). This error handler forwards
>> execution to the callee c2i entry. If the
>> SharedRuntime::handle_wrong_method method is called again due to the
>> i2c2i call being still bad, then we will crash the VM in the
>> following guarantee in SharedRuntime::handle_wrong_method:
>>
>> Method* callee = thread->callee_target();
>> guarantee(callee != NULL && callee->is_method(), "bad handshake");
>>
>> Unfortunately, the c2i entry can indeed fail again if it, e.g., hits
>> the new class initialization entry barrier of the c2i adapter.
>> The solution is to simply not clear the thread-local "callee_target"
>> after handling the first failure, as we can't really know there won't
>> be another one. There is no reason to clear this value, as nothing
>> other than the SharedRuntime::handle_wrong_method handler reads it (and
>> we really do want it to be able to read the value as many times as it
>> takes until the call goes through). I found some confused clearing of
>> this callee_target in JavaThread::oops_do(), with a comment saying
>> this is a methodOop that we need to clear to make GC happy or
>> something. Seems like old traces of perm gen. So I deleted that too.
>>
>> I caught this in ZGC where the timing window for hitting this issue
>> seems to be wider due to concurrent code cache unloading. But it is
>> equally problematic for all GCs.
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8227260
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8227260/webrev.00/
>>
>> Thanks,
>> /Erik
>