RFR(m): 8220351: Cross-modifying code

Wed Mar 20 22:23:43 UTC 2019

Hi Martin,

On 21/03/2019 3:42 am, Doerr, Martin wrote:
> Hi David and Robbin,
> 
>> As I wrote earlier I'd like to see the stat-space enumerated
>> here to show why pushing this down to the "safe states" is sufficient
> I was also struggling a bit with the placement of the cross_modify_fence.
> But there's the idea to replace the suspend flags by handshakes.
> After that, it seems to me that there will be very few cross_modify_fence usages left which look nicely placed to me.
> So if this is the plan, I should be happy with it ��

I'm really not sure how using handshakes for suspend might simplify the 
current problem.

>> I don't understand at all what your changes to the safepoint/handshake
>> disarming logic are doing or how they relate to all this.
> That was basically my suggestion.
> The first proposal was to introduce CPUID in native wrapper which is costly from performance point of view.
> Every Java thread returning from native would have to execute it even if there was no safepoint. And there was no way to detect if there was one.
> So my idea was to change the disarming logic such that every thread returning from native must detect if there was a safepoint.
> This is done by moving the task of disarming from the VM thread to the thread returning from native.

Doesn't the safepoint counter provide a means to detect if there has 
been a safepoint?

I worry about making this kind of tweak to the safepoint mechanism. I 
now have to convince myself that a self-disarm can't race with a re-arm 
from the VMThread.

Thanks,
David
-----

> I guess Robbin will also add a description.
> 
> Best regards,
> Martin
> 
> 
> -----Original Message-----
> From: David Holmes <david.holmes at oracle.com>
> Sent: Mittwoch, 20. März 2019 11:25
> To: Robbin Ehn <robbin.ehn at oracle.com>; hotspot-dev at openjdk.java.net; Doerr, Martin <martin.doerr at sap.com>
> Cc: aph at redhat.com; Erik Österlund <erik.osterlund at oracle.com>; Reingruber, Richard <richard.reingruber at sap.com>
> Subject: Re: RFR(m): 8220351: Cross-modifying code
> 
> On 20/03/2019 7:31 pm, Robbin Ehn wrote:
>> Hi, v3 updated with Martin's comments.
>>
>> Inc:
>> http://cr.openjdk.java.net/~rehn/8220351/v3/inc/
>> Full:
>> http://cr.openjdk.java.net/~rehn/8220351/v3/
>>
>> David, are you okay with this model which does cross-mod fence on
>> leaving a safe state? With the other model, always cross-mod fence
>> going to java would mean we cannot disarm any thread since we don't
>> know which state it's going to to next, if we want the same
>> performance.
> 
> Sorry Robbin but I can't follow the new approach just from the code
> changes. As I wrote earlier I'd like to see the stat-space enumerated
> here to show why pushing this down to the "safe states" is sufficient
> (and not introducing redundant fences) - e.g we fence when going from
> _thread_blocked to _thread_in_vm which covers the case if we then return
> to _thread_in_java; but if we're returning to native then we don't need
> the fence.
> 
> I don't understand at all what your changes to the safepoint/handshake
> disarming logic are doing or how they relate to all this. Can you write
> up the design in the bug report please.
> 
> Thanks,
> David
> -----
> 
> 
>>
>> Thanks, Robbin
>>
>> On 3/19/19 5:43 PM, Robbin Ehn wrote:
>>> Hi all, here is v2:
>>>
>>> Full:
>>> http://cr.openjdk.java.net/~rehn/8220351/v2/webrev/
>>> Inc:
>>> http://cr.openjdk.java.net/~rehn/8220351/v2/inc/webrev/
>>>
>>> The deprecated non-TLH path is missing the correct cross-mod fence,
>>> but since 13
>>> will be the last release containing that code path I think that is okay.
>>>
>>> We just leave native/native trans armed, and let the thread disarm on the
>>> transition. If it disarms a new safepoint, we re-arm and stop at next
>>> polling
>>> site, that avoids recursion.
>>>
>>> Passes stress and hs-t1-5.
>>> As in the previous mail, performance numbers look good.
>>>
>>> Thanks, Robbin
>>>
>>> On 3/8/19 4:24 PM, Robbin Ehn wrote:
>>>> Hi all, please review.
>>>>
>>>> Issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8220351
>>>> Changeset:
>>>> http://cr.openjdk.java.net/~rehn/8220351/webrev/
>>>>
>>>> After a JavaThread have been in a safepoint/(handshake) safe state it
>>>> can start
>>>> executing updated code. E.g. an oop in the instruction stream can
>>>> have been
>>>> updated.
>>>>
>>>> Most hardware's require a barrier or that the code cross modified is
>>>> far away to
>>>> guarantee that the thread executing the updated instruction stream
>>>> sees the
>>>> modification.
>>>>
>>>> What far away is and how far an update instruction stream is from a
>>>> safepoint
>>>> safe state is not clear.
>>>>
>>>> To be compliant with those architectures an instruction stream
>>>> barrier must be
>>>> added when leaving the safepoint safe state.
>>>>
>>>> There may be crashes today due to this missing barrier.
>>>> A new CPU with deeper pipeline or changes to the VM which moves a
>>>> safepoint safe
>>>> state closer to a nmethod can increase changes of a crash.
>>>>
>>>> A few benchmarks will see a performance regression with the extra
>>>> serializing,
>>>> such as Octane-Crypto -5% (x86).
>>>>
>>>> With a much more elaborate fix, such as changing the local poll to
>>>> have more
>>>> than two states (armed/disarmed), it would be possible to only emit such
>>>> instruction stream barrier when the poll have been armed while the
>>>> thread was
>>>> safepoint safe.
>>>>
>>>> Passes t1-3 on our platforms, and test built what I can.
>>>>
>>>> Thanks, Robbin