RFR(m): 8220351: Cross-modifying code

Robbin Ehn robbin.ehn at oracle.com
Wed Mar 27 12:02:34 UTC 2019


Hi Erik,

On 3/27/19 12:04 PM, Erik Österlund wrote:
> Hi Robbin,
> 
> In runtime/safepointMechanism.cpp:
> 110     OrderAccess::storeload();
> 
> This should be OrderAccess::fence() as it must also act as a loadload() between 
> the local poll and the global poll; if they reorder, we could miss a safepoint 
> request that we disarmed, and forget to arm again, causing potentially causing a 
> deadlock in the whole system. The only thing interleaving the two now is a 
> storeload and a release (comprising loadstore and storestore), in other words no 
> loadload. Naturally, all known storeload fences are also loadload fences, so it 
> will work anyway, but it is nevertheless good for the semantics to be explicit 
> about it.

A question, in this case we have:
A: load poll
loadstore|storestore (release)
B: store poll
storeload
C: load state

So B is ordered after A and C is ordered after B.
Which I thought implied that C also was ordered after A ?

That is the reason for having a storeload and not a fence.

> 
> Otherwise, it looks good. Don't need another webrev. Thanks for sorting this out.

Thanks!

/Robbin

> 
> Thanks,
> /Erik
> 
> On 2019-03-26 13:51, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> This is v4, which is just an rebase on-top of "RFR: 8221207: Redo JDK-8218446 -
>> SuspendAtExit hangs". The commit simplified S/R code a bit, making this patch
>> simpler.
>>
>> Full:
>> http://cr.openjdk.java.net/~rehn/8220351/v4/webrev/
>> Inc (not complete since v3 did not apply proper, one of the reject wasn't
>> needed):
>> http://cr.openjdk.java.net/~rehn/8220351/v4/inc/webrev/
>>
>> To summarize, before 8221207 we had 4 safe states:
>> *Off threads_list (_thread_new)
>> *Blocked          (_thread_blocked)
>> *In native        (_thread_in_native)
>> *Suspend          (*ANY*)
>> *After handshake  (X)
>>
>> After 8221207 we have:
>> *Off threads_list (_thread_new) (far from cross-mod code)
>> *Blocked          (_thread_blocked)
>> *In native        (_thread_in_native)
>> *After handshake  (X)
>>
>> -We have one instruction barrier for when going from _thread_new.
>> -We have four instruction barrier for when going from blocked:
>>      - ThreadBlockInVM
>>      - ThreadBlockInVMWithDeadlockCheck
>>      - The manual blocked transition for suspended blocked case.
>>      - Blocked in SS:block() uses the barrier in block_if_requested_slow.
>> -We let threads doing their own handshake hit the barrier in
>> block_if_requested_slow.
>> -We let threads in native stay armed so they hit the barrier in
>> block_if_requested_slow.
>>
>> Not disarming threads in native works because, false positives already happens
>> today and are handled. Both handshakes and safepoints disarms in correct order,
>> so a JavaThread can notice that is should be armed, but with false positives.
>> Before first check of handshake/safepoint the JavaThread is in a unsafe state.
>> After that check a new handshake/safepoint cannot start/progress since the
>> JavaThread is unsafe. Thus re-arming and continue to next polling site is okay.
>>
>> Patch only adds barrier for cross-modifying code that happens during a
>> handshake/safepoint and implements this barrier for those platforms which have a
>> simple implementaion for accomplish it. sparc/arm32/aarch64 needs additional
>> work.
>>
>> The deprecated global poll path does not contain proper barriers, and will be
>> obsoleted in JDK 14.
>>
>> Thanks, Robbin
>>
>> On 2019-03-08 16:24, Robbin Ehn wrote:
>>> Hi all, please review.
>>>
>>> Issue:
>>> https://bugs.openjdk.java.net/browse/JDK-8220351
>>> Changeset:
>>> http://cr.openjdk.java.net/~rehn/8220351/webrev/
>>>
>>> After a JavaThread have been in a safepoint/(handshake) safe state it can start
>>> executing updated code. E.g. an oop in the instruction stream can have been
>>> updated.
>>>
>>> Most hardware's require a barrier or that the code cross modified is far away to
>>> guarantee that the thread executing the updated instruction stream sees the
>>> modification.
>>>
>>> What far away is and how far an update instruction stream is from a safepoint
>>> safe state is not clear.
>>>
>>> To be compliant with those architectures an instruction stream barrier must be
>>> added when leaving the safepoint safe state.
>>>
>>> There may be crashes today due to this missing barrier.
>>> A new CPU with deeper pipeline or changes to the VM which moves a safepoint safe
>>> state closer to a nmethod can increase changes of a crash.
>>>
>>> A few benchmarks will see a performance regression with the extra serializing,
>>> such as Octane-Crypto -5% (x86).
>>>
>>> With a much more elaborate fix, such as changing the local poll to have more
>>> than two states (armed/disarmed), it would be possible to only emit such
>>> instruction stream barrier when the poll have been armed while the thread was
>>> safepoint safe.
>>>
>>> Passes t1-3 on our platforms, and test built what I can.
>>>
>>> Thanks, Robbin


More information about the hotspot-dev mailing list