RFR(m): 8220351: Cross-modifying code

Tue Mar 12 12:56:17 UTC 2019

Hi Martin,

On 3/12/19 12:47 PM, Doerr, Martin wrote:
> Hi Robbin,
> 
> for PPC, isync came into my mind first (also mentioned by David). However, isync is only strictly required for a thread which performs modifications and wants to execute these modified instructions. We already have isync at the patch sites.
> 
> The normal "fence" = "sync" does the job for cross-modification on PPC, so I don't think we need to execute any additional instructions.
> The existing barriers look fine for PPC.
> 
> I believe other non-x86 platforms are also fine with existing fences (s390, haven't checked others).
> 
> For x86, why don't we just "upgrade" the few existing barriers to include instruction stream synchronization?
> E.g.
> sharedRuntime_x86_64.cpp: 2564
> templateInterpreterGenerator_x86.cpp: 1094
> safepoint.cpp: 822

Since at the moment we don't have a way to ask for multiple fences in OrderAccess,
it's only available as a separate method for now.

In the transition we have a StoreLoad between storing the unsafe thread state,
e.g. native_trans, and loading the poll. But the instruction barrier must
happen after the poll load. Otherwise an oop can be updated, and the poll
disarmed, after the StoreLoad+InstructionPipeline barrier but before the poll
check:

JavaThread:                  | VMThread
StoreLoad+InstructionPipeline|
                             | update <immediate oop>
                             | disarm
load thread_poll             |
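
A rough sketch of the ordering that avoids the interleaving above, with the
instruction barrier placed after the poll load. All names here (SketchThread,
transition_from_native, instruction_pipeline_barrier) are made up for
illustration and are not the HotSpot code, and the barrier is only a counting
stub standing in for a real serializing instruction such as cpuid on x86:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical thread states, loosely modelled on the ones named above.
enum ThreadState { in_native, native_trans, in_java };

// Hypothetical per-thread data; not HotSpot's JavaThread.
struct SketchThread {
    std::atomic<int>  state{in_native};
    std::atomic<bool> poll_armed{false};
    int instruction_barriers = 0;  // counts stub barrier calls
};

// Stand-in for an instruction stream synchronization. A real x86
// implementation would execute a serializing instruction here; this
// stub only counts calls so the sketch stays portable.
inline void instruction_pipeline_barrier(SketchThread& t) {
    ++t.instruction_barriers;
}

// Returns true if the thread may continue into Java code,
// false if it has to take the slow path and block.
inline bool transition_from_native(SketchThread& t) {
    // Publish the unsafe (transitional) state.
    t.state.store(native_trans, std::memory_order_relaxed);
    // StoreLoad: the state store must be visible before the poll is read.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    // Load the poll.
    if (t.poll_armed.load(std::memory_order_acquire)) {
        return false;  // armed: block for the safepoint/handshake
    }
    // Poll seen disarmed: code may have been patched before the disarm,
    // so synchronize the instruction stream only now, after the poll load.
    instruction_pipeline_barrier(t);
    t.state.store(in_java, std::memory_order_release);
    return true;
}
```

With the barrier after the poll load, any update done before the disarm is
covered; with the barrier before the load, the update and disarm can slip in
between as in the table above.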

Thanks, Robbin

> ...
> 
> Am I missing anything?
> 
> Best regards,
> Martin
> 
> 
> -----Original Message-----
> From: Robbin Ehn <robbin.ehn at oracle.com>
> Sent: Tuesday, 12 March 2019 10:44
> To: Doerr, Martin <martin.doerr at sap.com>; Andrew Haley <aph at redhat.com>; hotspot-dev at openjdk.java.net
> Subject: Re: RFR(m): 8220351: Cross-modifying code
> 
> Hi Martin,
> 
> On 3/11/19 6:43 PM, Doerr, Martin wrote:
>> Hi Robbin,
>>
>>> They are only changed during a GC safepoint on archs with
>>> mustIterateImmediateOopsInCode set to true, AFAIK.
>>
>> Only x86 uses Oops directly encoded into the instruction stream.
>> But all platforms have compressed Oops encoded: loadConN nodes in ad files.
> 
> Sorry, forgot about that. Was just looking at the immediate ones.
> 
>>
>> I wonder if there are already barriers in place which have the required effect for non-x86 CPUs.
>> For example fence() and acquire() already include instruction stream synchronization on PPC.
>> Is that why you implemented the new barrier empty for non-x86 CPUs?
> 
> Partly because I wasn't sure that e.g. isync was what you wanted.
> And maybe someone would choose to gamble and not take the performance hit.
> Or there may be other considerations for that arch which I'm not aware of.
> (I wasn't aware of Arm's need to sync the I-cache with the D-cache; I thought ISB was enough.)
> 
> Thanks, Robbin
> 
>>
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: hotspot-dev <hotspot-dev-bounces at openjdk.java.net> On Behalf Of Robbin Ehn
>> Sent: Monday, 11 March 2019 13:56
>> To: Andrew Haley <aph at redhat.com>; hotspot-dev at openjdk.java.net
>> Subject: Re: RFR(m): 8220351: Cross-modifying code
>>
>> On 2019-03-11 11:26, Andrew Haley wrote:
>>> On 3/8/19 3:24 PM, Robbin Ehn wrote:
>>>
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8220351
>>>> Changeset:
>>>> http://cr.openjdk.java.net/~rehn/8220351/webrev/
>>>>
>>>> After a JavaThread has been in a safepoint/(handshake) safe state
>>>> it can start executing updated code. E.g. an oop in the instruction
>>>> stream can have been updated.
>>>
>>> Hmm, interesting. I think it works on AArch64 at present because
>>> the segfault trap we take is effectively a full synchronization
>>> operation.
>>>
>>> An AArch64 ISB (instruction synchronization barrier) invalidates the
>>> pipeline. It doesn't invalidate the instruction cache, which we'd need
>>> to do to see a relocated OOP. If an instruction is modified by some
>>> other thread we need to flush the local icache, but to do that we need
>>> to know which instructions have been changed. We could create a
>>> modification queue, but that seems rather elaborate.
>>>
>>> What is it that changes OOPs in the instructions during a handshake?
>>> Is it just garbage collection, or does it happen at other times too?
>>
>> They are only changed during a GC safepoint on archs with
>> mustIterateImmediateOopsInCode set to true, AFAIK.
>> Arm and aarch64 have it false.
>>
>> /Robbin
>>
>>>
>>> We could simply move OOPs (and class metadata pointers?) into the
>>> constant pool. That would have some performance impact on in-order
>>> CPUs, but hopefully not much on out-of-order ones.
>>>
>>
>>