RFR: JDK-8203172: Primitive heap access for interpreter BarrierSetAssembler/aarch64

Mon Jun 4 20:49:45 UTC 2018

Am 04.06.2018 um 22:16 schrieb Erik Österlund:
> Hi Roman,
> 
> On 2018-06-04 21:42, Roman Kennke wrote:
>> Am 04.06.2018 um 18:43 schrieb Erik Österlund:
>>> Hi Roman,
>>>
>>> On 2018-06-04 17:24, Roman Kennke wrote:
>>>> Ok, right. Very good catch!
>>>>
>>>> This should do it, right? Sorry, I couldn't easily make an incremental
>>>> diff:
>>>>
>>>> http://cr.openjdk.java.net/~rkennke/JDK-8203172/webrev.01/
>>> Unfortunately, I think there is one more problem for you.
>>> The signal handler is supposed to catch SIGSEGV caused by speculative
>>> loads shot from the fantastic jni fast get field code. But it currently
>>> expects an exact PC match:
>>>
>>> address JNI_FastGetField::find_slowcase_pc(address pc) {
>>>    for (int i=0; i<count; i++) {
>>>      if (speculative_load_pclist[i] == pc) {
>>>        return slowcase_entry_pclist[i];
>>>      }
>>>    }
>>>    return (address)-1;
>>> }
>>>
>>> This means that the way this is written now, speculative_load_pclist
>>> registers the __ pc() right before the access_load_at call. This puts
>>> constraints on whatever is done inside of access_load_at to only
>>> speculatively load on the first assembled instruction.
>>>
>>> If you imagine a scenario where you have a GC with Brooks pointers that
>>> also uncommits memory (like Shenandoah I presume), then I imagine you
>>> would need something more here. If you start with a forwarding pointer
>>> load, then that can trap (which is probably caught by the exact PC
>>> match). But then there will be a subsequent load of the value in the
>>> to-space object, which will not be protected. But this is also loaded
>>> speculatively (as the subsequent safepoint counter check could
>>> invalidate the result), and could therefore crash the VM unless
>>> protected, as the signal handler code fails to recognize this is a
>>> speculative load from jni fast get field.
>>>
>>> I imagine the solution to this would be to let speculative_load_pclist
>>> specify a range for fuzzy SIGSEGV matching in the signal handler, rather
>>> than an exact PC (i.e. speculative_load_pclist_start and
>>> speculative_load_pclist_end). That would give you enough freedom to use
>>> Brooks pointers in there. Sometimes I wonder if the lengths we go to
>>> maintain jni fast get field is *really* worth it.
>> I are probably right in general. But I also think we are fine with
>> Shenandoah. Both the fwd ptr load and the field load are constructed
>> with the same base operand. If the oop is NULL (or invalid memory) it
>> will blow up on fwdptr load just the same as it would blow up on field
>> load. We maintain an invariant that the fwd ptr of a valid oop results
>> in a valid (and equivalent) oop. I therefore think we are fine for now.
>> Should a GC ever need anything else here, I'd worry about it then. Until
>> this happens, let's just hope to never need to touch this code again ;-)
> 
> No I'm afraid that is not safe. After loading the forwarding pointer,
> the thread could be preempted, then any number of GC cycles could pass,
> which means that the address that the at some point read forwarding
> pointer points to, could be uncommitted memory. In fact it is unsafe
> even without uncommitted memory. Because after resolving the jobject to
> some address in the heap, the thread could get preempted, and any number
> of GC cycles could pass, causing the forwarding pointer to be read from
> some address in the heap that no longer is the forwarding pointer of an
> object, but rather a random integer. This causes the second load to blow
> up, even without uncommitting memory.
> 
> Here is an attempt at showing different things that can go wrong:
> 
> obj = *jobject
> // preempted for N GC cycles, meaning obj might 1) be a valid pointer to
> an object, or 2) be a random pointer inside of the heap or outside of
> the heap
> 
> forward_pointer = *obj // may 1) crash with SIGSEGV, 2) read a random
> pointer, no longer representing the forwarding pointer, or 3) read a
> consistent forwarding pointer
> 
> // preempted for N GC cycles, causing forward_pointer to point at pretty
> much anything
> 
> result = *(forward_pointer + offset) // may 1) read a valid primitive
> value, if previous two loads were not messed up, or 2) read some random
> value that no longer corresponds to the object field, or 3) crash
> because either the forwarding pointer did point at something valid that
> subsequently got relocated and uncommitted before the load hits, or
> because the forwarding pointer never pointed to anything valid in the
> first place, because the forwarding pointer load read a random pointer
> due to the object relocating after the jobject was resolved.
> 
> The summary is that both loads need protection due to how the thread in
> native state runs freely without necessarily caring about the GC running
> any number of GC cycles concurrently, making the memory super slippery,
> which risks crashing the VM without the proper protection.

AWW WTF!? We are in native state in this code?

It might be easier to just call bsa->resolve_for_read() (which emits the
fwd ptr load), then issue another:

speculative_load_pclist[count] = __ pc();

need to juggle with the counter and double-emit slowcase_entry_pclist,
and all this conditionally for Shenandoah. Gaa.

Or just FLAG_SET_DEFAULT(UseFastJNIAccessors,false) in Shenandoah.

Funny how we had this code in Shenandoah literally for years, and
nobody's ever tripped over it.

It's one of those cases where I almost suspect it's been done in Java1.0
when lots of JNI code was in use because some stuff couldn't be done in
fast in Java, but nowadays doesn't really make a difference. *Sigh*

>>>> Unfortunately, I cannot really test it because of:
>>>> http://mail.openjdk.java.net/pipermail/aarch64-port-dev/2018-May/005843.html
>>>>
>>>>
>>> That is unfortunate. If I were you, I would not dare to change anything
>>> in jni fast get field without testing it - it is very error prone.
>>
>> Yeah. I guess I'll just wait with testing until this is resolved. Or
>> else resolve it myself.
> 
> Yeah.
> 
>> Can I consider this change reviewed by you?
> 
> I think we should agree about the safety of doing this for Shenandoah in
> particular first. I still think we need the PC range as opposed to exact
> PC to be caught in the signal handler for this to be safe for your GC
> algorithm.

Yeah, I agree. I need to think this through a little bit.

Thanks for pointing out this bug. I can already see nightly builds
suddenly starting to fail over it, now that it's known :-)

Roman