RFR(m): 8220351: Cross-modifying code

Wed Mar 27 09:58:57 UTC 2019

Hi Robbin,

On 26/03/2019 10:51 pm, Robbin Ehn wrote:
> Hi all, please review.
> 
> This is v4, which is just an rebase on-top of "RFR: 8221207: Redo 
> JDK-8218446 -
> SuspendAtExit hangs". The commit simplified S/R code a bit, making this 
> patch simpler.

That's good :)

> Full:
> http://cr.openjdk.java.net/~rehn/8220351/v4/webrev/
> Inc (not complete since v3 did not apply proper, one of the reject wasn't
> needed):
> http://cr.openjdk.java.net/~rehn/8220351/v4/inc/webrev/
> 
> To summarize, before 8221207 we had 4 safe states:
> *Off threads_list (_thread_new)
> *Blocked          (_thread_blocked)
> *In native        (_thread_in_native)
> *Suspend          (*ANY*)
> *After handshake  (X)
> 
> After 8221207 we have:
> *Off threads_list (_thread_new) (far from cross-mod code)
> *Blocked          (_thread_blocked)
> *In native        (_thread_in_native)
> *After handshake  (X)
> 
> -We have one instruction barrier for when going from _thread_new.
> -We have four instruction barrier for when going from blocked:
>      - ThreadBlockInVM
>      - ThreadBlockInVMWithDeadlockCheck
>      - The manual blocked transition for suspended blocked case.
>      - Blocked in SS:block() uses the barrier in block_if_requested_slow.

I finally realized how this translates into covering all the 
return-to-Java paths, without having as many unnecessary calls to 
cross_modify_barrier. If we return to Java after a safepoint (or action 
that could have installed modified code) we need the barrier. If a 
safepoint has occurred then we must have become safepoint-safe at some 
point, which means we either were in _thread_blocked (by one means or 
another) or _thread_in_native. So we guard the transitions out of those 
states.

> -We let threads doing their own handshake hit the barrier in
> block_if_requested_slow.
> -We let threads in native stay armed so they hit the barrier in
> block_if_requested_slow.
> 
> Not disarming threads in native works because, false positives
> already happens today and are handled. Both handshakes and safepoints
> disarms in correct order, so a JavaThread can notice that is should
> be armed, but with false positives. Before first check of
> handshake/safepoint the JavaThread is in a unsafe state. After that
> check a new handshake/safepoint cannot start/progress since the 
> JavaThread is unsafe. Thus re-arming and continue to next polling
> site is okay.

I'll take your word for it. I can't claim sufficient knowledge of the 
handshake mechanism to truly comment.

> Patch only adds barrier for cross-modifying code that happens during
> a handshake/safepoint and implements this barrier for those
> platforms which have a simple implementaion for accomplish it.
> sparc/arm32/aarch64 needs additional work.

I suggest creating subtasks for those platforms so this additional work 
isn't forgotten.

> The deprecated global poll path does not contain proper barriers, and 
> will be obsoleted in JDK 14.

Ok.

A couple of minor typos:

src/hotspot/share/runtime/safepointMechanism.inline.hpp

+     // JavaThread will disarm it self  ...

s/it self/itself/

---
src/hotspot/share/runtime/thread.cpp

+   // from here is probably far enought, ...

s/enought/enough/

+   // Since we are not using a regular thread-state tranisition helper 
here,

s/tranisition/transition/

Thanks,
David
-----

> Thanks, Robbin
> 
> On 2019-03-08 16:24, Robbin Ehn wrote:
>> Hi all, please review.
>>
>> Issue:
>> https://bugs.openjdk.java.net/browse/JDK-8220351
>> Changeset:
>> http://cr.openjdk.java.net/~rehn/8220351/webrev/
>>
>> After a JavaThread have been in a safepoint/(handshake) safe state it 
>> can start
>> executing updated code. E.g. an oop in the instruction stream can have 
>> been
>> updated.
>>
>> Most hardware's require a barrier or that the code cross modified is 
>> far away to
>> guarantee that the thread executing the updated instruction stream 
>> sees the
>> modification.
>>
>> What far away is and how far an update instruction stream is from a 
>> safepoint
>> safe state is not clear.
>>
>> To be compliant with those architectures an instruction stream barrier 
>> must be
>> added when leaving the safepoint safe state.
>>
>> There may be crashes today due to this missing barrier.
>> A new CPU with deeper pipeline or changes to the VM which moves a 
>> safepoint safe
>> state closer to a nmethod can increase changes of a crash.
>>
>> A few benchmarks will see a performance regression with the extra 
>> serializing,
>> such as Octane-Crypto -5% (x86).
>>
>> With a much more elaborate fix, such as changing the local poll to 
>> have more
>> than two states (armed/disarmed), it would be possible to only emit such
>> instruction stream barrier when the poll have been armed while the 
>> thread was
>> safepoint safe.
>>
>> Passes t1-3 on our platforms, and test built what I can.
>>
>> Thanks, Robbin