RFR: Make object pinning safepoint aware

Sun Oct 15 12:57:15 UTC 2017

On 15.10.2017 at 13:56, Zhengyu Gu wrote:
>
>
> On 10/15/2017 04:17 AM, Roman Kennke wrote:
>> On 15.10.2017 at 04:28, Zhengyu Gu wrote:
>>> A thread running native code does not usually participate in safepoints,
>>> but the write barrier in the object pinning code can potentially interfere
>>> with safepoints and mutate the heap, so we need to make it safepoint
>>> aware.
>>>
>>> Webrev: 
>>> http://cr.openjdk.java.net/~zgu/shenandoah/obj_pin_safepoint/webrev.00/index.html 
>>>
>>>
>>> Test:
>>>   hotspot_gc_shenandoah fastdebug + release on x64 and aarch64.
>>>   + new stress test.
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>
>> This doesn't seem right.
>>
>> Native code (i.e. *outside* a JNI call) does not participate in 
>> safepoints. But as soon as we are *inside* a JNI call, i.e. 
>> GetPrimitiveArrayCritical, the thread is in VM state. This means it 
>> does participate in safepoints in that it cannot reach a safepoint
>> unless we explicitly tell it to. This should be good for us: we will
>> not get any nasty surprises (e.g. change of evac-in-progress flag on 
>> the fly) in our write barrier for example.
>>
>> Now you introduce a VM->java->VM state transition inside the JNI 
>> function. This seems wrong. And what does it help? The thread can now 
>> reach a safepoint in the middle of GetPrimitiveArrayCritical: ugh. I 
>> doubt very much that this is the right fix. (And besides, there are 
>> scoped objects to do that...)
> yes, you are right! in_vm should be enough.
>
> Then there is a strange crash:
> #
> #  Internal Error (/home/zhengyu/jdk10/hotspot/src/share/vm/gc/shenandoah/shenandoahHeapRegion.cpp:176), pid=17356, tid=17359
> #  fatal error: Disallowed transition from Collection Set to Pinned
> #
>
> Link: https://paste.fedoraproject.org/paste/jaB5Q6QTDJ-Q64Yg0OM5oA
>
>
OK, this is strange indeed. The write barrier should ensure that we get no
cset objects in pinning. The only logical explanation is that evac
failed because of OOM (you should see that in hs_err: it will tell you
if we are in cancelled state). However, we must still pin the
object/region. I guess we must ensure that we wait in oom_during_evacuation()
until all GC threads have settled (we probably do the right thing already),
and then carry on pinning the cset object (i.e. we need an exception in
the checking code for when evac has been cancelled). Then we'd slide
into full-gc, where we would handle pinning correctly (unless the region
isn't pinned anymore when we get there).
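
To make the idea of "an exception in the checking code" concrete, here is a
minimal standalone sketch. It is not the actual ShenandoahHeapRegion code:
the Region struct, the RegionState values (including PinnedCollectionSet)
and the evac_cancelled parameter are hypothetical names invented for
illustration, and the real fix may well look different.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical, simplified region states; not the real HotSpot enum.
enum class RegionState { Regular, CollectionSet, Pinned, PinnedCollectionSet };

// Hypothetical stand-in for a heap region that tracks pinning.
struct Region {
    RegionState state = RegionState::Regular;
    int pin_count = 0;

    // Pin the region. A collection-set region may only be pinned when
    // evacuation has been cancelled (e.g. after OOM during evacuation).
    void pin(bool evac_cancelled) {
        switch (state) {
        case RegionState::Regular:
            state = RegionState::Pinned;
            break;
        case RegionState::Pinned:
        case RegionState::PinnedCollectionSet:
            break;  // already pinned, only the count changes
        case RegionState::CollectionSet:
            if (!evac_cancelled) {
                // Mirrors the fatal error quoted in the crash above.
                throw std::logic_error(
                    "Disallowed transition from Collection Set to Pinned");
            }
            state = RegionState::PinnedCollectionSet;
            break;
        }
        ++pin_count;
    }

    // Unpin the region; the last unpin restores the pre-pin state.
    void unpin() {
        assert(pin_count > 0);
        if (--pin_count == 0) {
            state = (state == RegionState::PinnedCollectionSet)
                        ? RegionState::CollectionSet
                        : RegionState::Regular;
        }
    }
};
```

In this sketch, pinning a cset region still fails while evacuation is
running normally, so the existing check keeps catching real bugs. Only the
cancelled case is let through, and the region returns to its cset state
once the last pin is released.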

Roman