RFR: JDK-8203172: Primitive heap access for interpreter BarrierSetAssembler/aarch64
Erik Österlund
erik.osterlund at oracle.com
Mon Jun 4 16:43:47 UTC 2018
Hi Roman,
On 2018-06-04 17:24, Roman Kennke wrote:
> Ok, right. Very good catch!
>
> This should do it, right? Sorry, I couldn't easily make an incremental diff:
>
> http://cr.openjdk.java.net/~rkennke/JDK-8203172/webrev.01/
Unfortunately, I think there is one more problem for you.
The signal handler is supposed to catch SIGSEGV caused by speculative
loads shot from the fantastic jni fast get field code. But it currently
expects an exact PC match:
address JNI_FastGetField::find_slowcase_pc(address pc) {
  for (int i=0; i<count; i++) {
    if (speculative_load_pclist[i] == pc) {
      return slowcase_entry_pclist[i];
    }
  }
  return (address)-1;
}
This means that the way this is written now, speculative_load_pclist
registers the __ pc() right before the access_load_at call. This puts
constraints on whatever is done inside of access_load_at to only
speculatively load on the first assembled instruction.
If you imagine a scenario where you have a GC with Brooks pointers that
also uncommits memory (like Shenandoah I presume), then I imagine you
would need something more here. If you start with a forwarding pointer
load, then that can trap (which is probably caught by the exact PC
match). But then there will be a subsequent load of the value in the
to-space object, which will not be protected. But this is also loaded
speculatively (as the subsequent safepoint counter check could
invalidate the result), and could therefore crash the VM unless
protected, as the signal handler code fails to recognize this is a
speculative load from jni fast get field.
I imagine the solution to this would be to let speculative_load_pclist
specify a range for fuzzy SIGSEGV matching in the signal handler, rather
than an exact PC (i.e. speculative_load_pclist_start and
speculative_load_pclist_end). That would give you enough freedom to use
Brooks pointers in there. Sometimes I wonder if the lengths we go to in
order to maintain jni fast get field are *really* worth it.
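
To make the range idea concrete, here is a minimal standalone sketch of what
such a lookup could look like. It is not HotSpot code: the table capacity, the
register_speculative_range helper and the half-open [start, end) convention are
assumptions made for illustration; only the _start/_end naming and the
(address)-1 "not found" value come from the discussion above.

```cpp
#include <cassert>

typedef unsigned char* address;

// Assumed capacity for this sketch; the real table size is not stated here.
const int LIST_CAPACITY = 40;

// One [start, end) PC range per generated fast accessor, plus its slow path.
static address speculative_load_pclist_start[LIST_CAPACITY];
static address speculative_load_pclist_end[LIST_CAPACITY];
static address slowcase_entry_pclist[LIST_CAPACITY];
static int count = 0;

// Hypothetical helper: record the PC before and after the access_load_at
// expansion, together with the slow-case entry to resume at on a fault.
void register_speculative_range(address start, address end, address slowcase) {
  assert(count < LIST_CAPACITY);
  speculative_load_pclist_start[count] = start;
  speculative_load_pclist_end[count]   = end;
  slowcase_entry_pclist[count]         = slowcase;
  count++;
}

// Range-based variant of find_slowcase_pc: any faulting PC inside a
// registered range is treated as a speculative load.
address find_slowcase_pc(address pc) {
  for (int i = 0; i < count; i++) {
    if (speculative_load_pclist_start[i] <= pc &&
        pc < speculative_load_pclist_end[i]) {
      return slowcase_entry_pclist[i];
    }
  }
  return (address)-1;
}
```

With this shape, every load emitted between the two recorded PCs (the
forwarding pointer load as well as the load from the to-space object) would be
redirected to the slow case, at the cost of also catching any other faulting
instruction in that range.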
> Unfortunately, I cannot really test it because of:
> http://mail.openjdk.java.net/pipermail/aarch64-port-dev/2018-May/005843.html
That is unfortunate. If I were you, I would not dare to change anything
in jni fast get field without testing it - it is very error prone.
Thanks,
/Erik
> Roman
>
>
>> Hi Roman,
>>
>> Oh man, I was hoping I would never have to look at jni fast get field
>> again. Here goes...
>>
>>   speculative_load_pclist[count] = __ pc();   // Used by the segfault handler
>>   __ access_load_at(type, IN_HEAP, noreg /* tos: r0/v0 */,
>>                     Address(robj, roffset), noreg, noreg);
>>
>> I see that here you load straight to tos, which is r0 for integral
>> types. But r0 is also c_rarg0. So it seems like if after loading the
>> primitive to r0, the subsequent safepoint counter check fails, then the
>> code will revert back to a slowpath call, but this time with c_rarg0
>> clobbered, leading to a broken JNI env pointer being passed in to the
>> slow path C function. That does not seem right to me.
>>
>> This JNI fast get field code is so error prone. :(
>>
>> Unfortunately, the proposed API cannot load floating point numbers to
>> anything but ToS, which seems like a problem in the jni fast get field
>> code.
>> I think to make this work properly, you need to load integral types to
>> result and not ToS, so that you do not clobber r0, and rely on ToS being
>> v0 for floating point types, which does not clobber r0. That way we can
>> dance around the issue for now I suppose.
>>
>> Thanks,
>> /Erik
>>
>> On 2018-05-14 22:23, Roman Kennke wrote:
>>> Similar to x86
>>> (http://mail.openjdk.java.net/pipermail/hotspot-dev/2018-May/032114.html)
>>> here comes the primitive heap access changes for aarch64:
>>>
>>> http://cr.openjdk.java.net/~rkennke/JDK-8203172/webrev.00/
>>>
>>> Some notes:
>>> - array access used to compute base_obj + index, and then use indexed
>>> addressing with base_offset. This means we cannot get base_obj in the
>>> BarrierSetAssembler API, but we need that, e.g. for resolving the target
>>> object via forwarding pointer. I changed (base_obj+index)+base_offset to
>>> base_obj+(index+base_offset) in all the relevant places.
>>>
>>> - in jniFastGetField_aarch64.cpp, we are using a trick to ensure correct
>>> ordering of the field load with the load of the safepoint counter: we make
>>> them address dependent. For float and double loads this meant loading the
>>> value as int/long, and then later moving it into v0. This doesn't
>>> work when going through the BarrierSetAssembler API: it loads straight
>>> to v0. Instead I am inserting a LoadLoad membar for float/double (which
>>> should be rare enough anyway).
>>>
>>> Other than that it's pretty much analogous to x86.
>>>
>>> Testing: no regressions in hotspot/tier1
>>>
>>> Can I please get a review?
>>>
>>> Thanks, Roman
>>>
>