C1 code installation and JNIHandle::deleted_handle() oop
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Tue Nov 14 17:21:46 UTC 2017
On 11/14/17 8:00 PM, Roman Kennke wrote:
> On 14.11.2017 at 17:04, Vladimir Ivanov wrote:
>> Thanks, now I see that as well: OopRecorder::find_index() can delegate
>> to ObjectLookup::find_index() which does resolve the handle w/o
>> transitioning to VM.
>>
>> But I don't believe you hit that path: ObjectLookup was added as part
>> of JVMCI and is guarded by a flag (deduplicate) which is turned on
>> only for JVMCI.
> Ah ok. Didn't know that.
>
> However, as Aleksey pointed out, we hit JNIHandles::resolve() in the
> product path, JVMCI or not, and this touches the naked oop by comparing
> it with another oop. This doesn't sound like a reliable thing to do.
Can you double-check you observe the crash with product binaries as
well? My current understanding is that it happens only with debug builds.
Best regards,
Vladimir Ivanov
> This simple change seems to fix it:
> https://paste.fedoraproject.org/paste/poQ5caTCuN6jHSGbK1n0iQ
>
> Doing more testing...
>
> Roman
>
>> Anyway, I'll file a bug to investigate ObjectLookup::find_index().
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> On 11/14/17 6:45 PM, Roman Kennke wrote:
>>> The code below the assert also unwraps the oop and does lookups with
>>> it. I'm not on my computer but I can dig out the relevant parts when
>>> I'm back at work...
>>>
>>> Roman
>>>
>>>
>>> On 14 November 2017 16:36:10 CET, Vladimir Ivanov
>>> <vladimir.x.ivanov at oracle.com> wrote:
>>>
>>> Aleksey,
>>>
>>> I agree with your and Roman's analysis: compilers shouldn't touch
>>> naked oops unless the thread is in _thread_in_vm mode.
>>>
>>> Looking at the crash log, the problematic code is under assert:
>>>
>>> void ConstantOopWriteValue::write_on(DebugInfoWriteStream* stream) {
>>>   assert(JNIHandles::resolve(value()) == NULL ||
>>>          Universe::heap()->is_in_reserved(JNIHandles::resolve(value())),
>>>          "Should be in heap");
>>>   stream->write_int(CONSTANT_OOP_CODE);
>>>   stream->write_handle(value());
>>> }
>>>
>>> So, the proper fix would be to make the verification code more
>>> robust.
>>>
>>> Best regards,
>>> Vladimir Ivanov
>>>
>>> On 11/14/17 5:16 PM, Aleksey Shipilev wrote:
>>>
>>> Hi,
>>>
>>> In some of our aggressive test configurations for Shenandoah, we
>>> sometimes see the following failure:
>>> http://cr.openjdk.java.net/~shade/shenandoah/c1-race-fail-hs_err.log
>>>
>>> It seems to happen when C1 code installation is happening during
>>> Full GC.
>>>
>>> The actual failure is caused by touching the
>>> JNIHandles::deleted_handle() oop in
>>> JNIHandles::guard_value() during JNIHandles::resolve() against
>>> the constant oop handle when we are
>>> recording the debugging information for a C1-generated Java call:
>>> http://hg.openjdk.java.net/jdk/hs/file/5caa1d5f74c1/src/hotspot/share/runtime/jniHandles.hpp#l220
>>>
>>>
>>> The C1 thread is in _thread_in_native state, and so the runtime
>>> thinks the thread is at a safepoint,
>>> but the thread touches the deleted_handle() oop. When Shenandoah
>>> dives into Full GC and moves that
>>> object at the same time, everything crashes and burns.
>>>
>>> Is C1 (and any other compiler thread) supposed to transition to
>>> _vm_state when touching the naked oops,
>>> and thus coordinate with safepoints? I see VM_ENTRY_MARK all
>>> over ci* that seems to transition there
>>> before accessing the heap. Does that mean we need the same
>>> everywhere around JNIHandles::resolve too?
>>>
>>> Or is there some other mechanism that is supposed to get
>>> compiler threads to coordinate with GC?
>>>
>>> Thanks,
>>> -Aleksey
>>>
>>>
>>> --
>>> This message was sent from my Android device with K-9 Mail.
>
>
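The rule discussed in this thread (only dereference a handle while the thread is in the VM state, and return to the previous state afterwards) can be sketched as a toy model. This is not HotSpot code: ToyThread, ScopedVmEntry, ToyHandle, try_resolve and verified_in_heap are hypothetical names made up for illustration, and the "heap" is just an int array. ScopedVmEntry only plays the role that VM_ENTRY_MARK is described as playing in the ci* code above.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only: none of these types are HotSpot's real API.
enum class ThreadState { in_native, in_vm };

struct ToyThread {
    ThreadState state = ThreadState::in_native;  // compiler threads start "in native"
};

// Hypothetical RAII helper: enter the VM state for the current scope and
// restore whatever state the thread had before when the scope ends.
class ScopedVmEntry {
public:
    explicit ScopedVmEntry(ToyThread& t) : thread_(t), saved_(t.state) {
        thread_.state = ThreadState::in_vm;
    }
    ~ScopedVmEntry() { thread_.state = saved_; }
    ScopedVmEntry(const ScopedVmEntry&) = delete;
    ScopedVmEntry& operator=(const ScopedVmEntry&) = delete;
private:
    ToyThread& thread_;
    ThreadState saved_;
};

// A handle is an indirection to an object; here the "object" is an int slot.
struct ToyHandle {
    const int* slot;
};

// Reads the handle only when the thread is in the VM state. Otherwise it
// reports failure and leaves *out untouched, so nothing "naked" escapes.
bool try_resolve(const ToyThread& t, const ToyHandle& h, const int** out) {
    if (t.state != ThreadState::in_vm) {
        return false;
    }
    *out = h.slot;
    return true;
}

// Verification in the spirit of the "Should be in heap" assert: the handle is
// resolved inside a scoped VM entry, then checked against the heap bounds.
bool verified_in_heap(ToyThread& t, const ToyHandle& h,
                      const int* heap_begin, const int* heap_end) {
    ScopedVmEntry mark(t);
    const int* obj = nullptr;
    if (!try_resolve(t, h, &obj)) {
        return false;
    }
    return obj == nullptr || (obj >= heap_begin && obj < heap_end);
}
```

In this sketch a thread left in the in_native state gets false back from try_resolve, while verified_in_heap wraps the same lookup in a scoped transition and leaves the thread in its original state on return. Whether the real fix should add such a transition or instead avoid resolving the handle in the assert is exactly the open question in the thread.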