RFR: JDK-8203172: Primitive heap access for interpreter BarrierSetAssembler/aarch64

Fri Jun 8 20:48:38 UTC 2018

Hi Roman,

Looks good.

Thanks,
/Erik

On 2018-06-08 22:19, Roman Kennke wrote:
> Ping?
>
>> As mentioned in another thread, we in Shenandoah have decided to skip
>> JNI fast getfield stuff for now. We'll probably address it and implement
>> the extended range speculative PC thing later, in a separate RFE. I
>> ripped out the jniFastGetField changes from the patch:
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8203172/webrev.02/
>>
>> Is it good now to push?
>>
>> Roman
>>
>>> Hi Roman,
>>>
>>> On 2018-06-04 22:49, Roman Kennke wrote:
>>>> Am 04.06.2018 um 22:16 schrieb Erik Österlund:
>>>>> Hi Roman,
>>>>>
>>>>> On 2018-06-04 21:42, Roman Kennke wrote:
>>>>>> Am 04.06.2018 um 18:43 schrieb Erik Österlund:
>>>>>>> Hi Roman,
>>>>>>>
>>>>>>> On 2018-06-04 17:24, Roman Kennke wrote:
>>>>>>>> Ok, right. Very good catch!
>>>>>>>>
>>>>>>>> This should do it, right? Sorry, I couldn't easily make an
>>>>>>>> incremental
>>>>>>>> diff:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~rkennke/JDK-8203172/webrev.01/
>>>>>>> Unfortunately, I think there is one more problem for you.
>>>>>>> The signal handler is supposed to catch SIGSEGV caused by speculative
>>>>>>> loads shot from the fantastic jni fast get field code. But it
>>>>>>> currently
>>>>>>> expects an exact PC match:
>>>>>>>
>>>>>>> address JNI_FastGetField::find_slowcase_pc(address pc) {
>>>>>>>      for (int i=0; i<count; i++) {
>>>>>>>        if (speculative_load_pclist[i] == pc) {
>>>>>>>          return slowcase_entry_pclist[i];
>>>>>>>        }
>>>>>>>      }
>>>>>>>      return (address)-1;
>>>>>>> }
>>>>>>>
>>>>>>> This means that the way this is written now, speculative_load_pclist
>>>>>>> registers the __ pc() right before the access_load_at call. This puts
>>>>>>> constraints on whatever is done inside of access_load_at to only
>>>>>>> speculatively load on the first assembled instruction.
>>>>>>>
>>>>>>> If you imagine a scenario where you have a GC with Brooks pointers
>>>>>>> that
>>>>>>> also uncommits memory (like Shenandoah I presume), then I imagine you
>>>>>>> would need something more here. If you start with a forwarding pointer
>>>>>>> load, then that can trap (which is probably caught by the exact PC
>>>>>>> match). But then there will be a subsequent load of the value in the
>>>>>>> to-space object, which will not be protected. But this is also loaded
>>>>>>> speculatively (as the subsequent safepoint counter check could
>>>>>>> invalidate the result), and could therefore crash the VM unless
>>>>>>> protected, as the signal handler code fails to recognize this is a
>>>>>>> speculative load from jni fast get field.
>>>>>>>
>>>>>>> I imagine the solution to this would be to let speculative_load_pclist
>>>>>>> specify a range for fuzzy SIGSEGV matching in the signal handler,
>>>>>>> rather
>>>>>>> than an exact PC (i.e. speculative_load_pclist_start and
>>>>>>> speculative_load_pclist_end). That would give you enough freedom to
>>>>>>> use
>>>>>>> Brooks pointers in there. Sometimes I wonder if the lengths we go to
>>>>>>> maintain jni fast get field is *really* worth it.
>>>>>> I are probably right in general. But I also think we are fine with
>>>>>> Shenandoah. Both the fwd ptr load and the field load are constructed
>>>>>> with the same base operand. If the oop is NULL (or invalid memory) it
>>>>>> will blow up on fwdptr load just the same as it would blow up on field
>>>>>> load. We maintain an invariant that the fwd ptr of a valid oop results
>>>>>> in a valid (and equivalent) oop. I therefore think we are fine for now.
>>>>>> Should a GC ever need anything else here, I'd worry about it then.
>>>>>> Until
>>>>>> this happens, let's just hope to never need to touch this code again
>>>>>> ;-)
>>>>> No I'm afraid that is not safe. After loading the forwarding pointer,
>>>>> the thread could be preempted, then any number of GC cycles could pass,
>>>>> which means that the address that the at some point read forwarding
>>>>> pointer points to, could be uncommitted memory. In fact it is unsafe
>>>>> even without uncommitted memory. Because after resolving the jobject to
>>>>> some address in the heap, the thread could get preempted, and any number
>>>>> of GC cycles could pass, causing the forwarding pointer to be read from
>>>>> some address in the heap that no longer is the forwarding pointer of an
>>>>> object, but rather a random integer. This causes the second load to blow
>>>>> up, even without uncommitting memory.
>>>>>
>>>>> Here is an attempt at showing different things that can go wrong:
>>>>>
>>>>> obj = *jobject
>>>>> // preempted for N GC cycles, meaning obj might 1) be a valid pointer to
>>>>> an object, or 2) be a random pointer inside of the heap or outside of
>>>>> the heap
>>>>>
>>>>> forward_pointer = *obj // may 1) crash with SIGSEGV, 2) read a random
>>>>> pointer, no longer representing the forwarding pointer, or 3) read a
>>>>> consistent forwarding pointer
>>>>>
>>>>> // preempted for N GC cycles, causing forward_pointer to point at pretty
>>>>> much anything
>>>>>
>>>>> result = *(forward_pointer + offset) // may 1) read a valid primitive
>>>>> value, if previous two loads were not messed up, or 2) read some random
>>>>> value that no longer corresponds to the object field, or 3) crash
>>>>> because either the forwarding pointer did point at something valid that
>>>>> subsequently got relocated and uncommitted before the load hits, or
>>>>> because the forwarding pointer never pointed to anything valid in the
>>>>> first place, because the forwarding pointer load read a random pointer
>>>>> due to the object relocating after the jobject was resolved.
>>>>>
>>>>> The summary is that both loads need protection due to how the thread in
>>>>> native state runs freely without necessarily caring about the GC running
>>>>> any number of GC cycles concurrently, making the memory super slippery,
>>>>> which risks crashing the VM without the proper protection.
>>>> AWW WTF!? We are in native state in this code?
>>> Yes. This is one of the most dangerous code paths we have in the VM I
>>> think.
>>>
>>>> It might be easier to just call bsa->resolve_for_read() (which emits the
>>>> fwd ptr load), then issue another:
>>>>
>>>> speculative_load_pclist[count] = __ pc();
>>>>
>>>> need to juggle with the counter and double-emit slowcase_entry_pclist,
>>>> and all this conditionally for Shenandoah. Gaa.
>>> I think that by just having the speculative load PC list take a range as
>>> opposed to a precise PC, and check that a given PC is in that range, and
>>> not just exactly equal to a PC, the problem is solved for everyone.
>>>
>>>> Or just FLAG_SET_DEFAULT(UseFastJNIAccessors,false) in Shenandoah.
>>> Yeah, sometimes you wonder if it's really worth the maintenance to keep
>>> this thing.
>>>
>>>> Funny how we had this code in Shenandoah literally for years, and
>>>> nobody's ever tripped over it.
>>> Yeah it is a rather nasty race to detect.
>>>
>>>> It's one of those cases where I almost suspect it's been done in Java1.0
>>>> when lots of JNI code was in use because some stuff couldn't be done in
>>>> fast in Java, but nowadays doesn't really make a difference. *Sigh*
>>> :)
>>>
>>>>>>>> Unfortunately, I cannot really test it because of:
>>>>>>>> http://mail.openjdk.java.net/pipermail/aarch64-port-dev/2018-May/005843.html
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>> That is unfortunate. If I were you, I would not dare to change
>>>>>>> anything
>>>>>>> in jni fast get field without testing it - it is very error prone.
>>>>>> Yeah. I guess I'll just wait with testing until this is resolved. Or
>>>>>> else resolve it myself.
>>>>> Yeah.
>>>>>
>>>>>> Can I consider this change reviewed by you?
>>>>> I think we should agree about the safety of doing this for Shenandoah in
>>>>> particular first. I still think we need the PC range as opposed to exact
>>>>> PC to be caught in the signal handler for this to be safe for your GC
>>>>> algorithm.
>>>> Yeah, I agree. I need to think this through a little bit.
>>> Yeah. Still think the PC range check solution should do the trick.
>>>
>>>> Thanks for pointing out this bug. I can already see nightly builds
>>>> suddenly starting to fail over it, now that it's known :-)
>>> No problem!
>>>
>>> Thanks,
>>> /Erik
>>>
>>>> Roman
>>>>
>>>>
>>
>