RFR(m): 8220351: Cross-modifying code

Thu Mar 21 08:04:25 UTC 2019

Hi,

> 
> I'm really not sure how using handshakes for suspend might simplify the current 
> problem.

What I have in my head is leaving the thread on an active handshake blocked on
SR_lock (some new features needed in handshakes code). So the suspend case would
get the instruction barrier from the normal blocked on lock transition.
So I think that would leave only 2 instruction barriers for blocked and with 
lazy disarm 1 in the should block which handles the native case.

> 
>>> I don't understand at all what your changes to the safepoint/handshake
>>> disarming logic are doing or how they relate to all this.
>> That was basically my suggestion.
>> The first proposal was to introduce CPUID in native wrapper which is costly 
>> from performance point of view.
>> Every Java thread returning from native would have to execute it even if there 
>> was no safepoint. And there was no way to detect if there was one.
>> So my idea was to change the disarming logic such that every thread returning 
>> from native must detect if there was a safepoint.
>> This is done by moving the task of disarming from the VM thread to the thread 
>> returning from native.
> 
> Doesn't the safepoint counter provide a means to detect if there has been a 
> safepoint?

The problem is that we also need them for handshakes.
Having a per thread handshake counter would work. (I think also Haley said this 
or similar)
And update handshake counter on safepoint. (or just do the safepoint as a
handshake all operation)
But it would require updates to all assembly code adding more checks.

> 
> I worry about making this kind of tweak to the safepoint mechanism. I now have 
> to convince myself that a self-disarm can't race with a re-arm from the VMThread.

In safepoint case we first store global state and then arm:

  319   // We are synchronizing
  320   OrderAccess::storestore(); // Ordered with _safepoint_counter
  321   _state = _synchronizing;
  322
  323   if (SafepointMechanism::uses_thread_local_poll()) {
  324     // Arming the per thread poll while having _state != _not_synchronized 
means safepointing
  325     log_trace(safepoint)("Setting thread local yield flag for threads");
  326     OrderAccess::storestore(); // storestore, global state -> local state
  327     for (JavaThreadIteratorWithHandle jtiwh; JavaThread *cur = 
jtiwh.next(); ) {
  328       // Make sure the threads start polling, it is time to yield.
  329       SafepointMechanism::arm_local_poll(cur);

In handshake case:

void HandshakeState::set_operation(JavaThread* target, HandshakeOperation* op) { 

   _operation = op;
   SafepointMechanism::arm_local_poll_release(target);
}

So in both cases _operation/_state are always stored before the arming.
If we disarm, recheck with:
  if (uses_thread_local_poll() && local_poll_armed(thread)) {
  108     disarm_local_poll_release(thread);
  109     // We might have disarmed next safepoint/handshake
  110     OrderAccess::storeload();
  111     if (global_poll() || thread->has_handshake()) {
  112       arm_local_poll(thread);
  113     }
  114   }

We might re-arm an already armed poll from VMThread, that is okey.
We might re-arm a disarm from VMThread and hit an extra poll site, that is okey.

Since we are in an unsafe state running this code, after re-arming we can
continue to execute and stop at next polling site.
Instead of rearming we can just call block_if_requested_slow again, this have 
the upside of faster re-safepointing and might be better.
I did not find any issues with the recursion (but generally avoiding recursion 
is good).

Hopefully there is no error in my logic :)

/Robbin

> 
> Thanks,
> David
> -----
> 
>> I guess Robbin will also add a description.
>>
>> Best regards,
>> Martin
>>
>>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Mittwoch, 20. März 2019 11:25
>> To: Robbin Ehn <robbin.ehn at oracle.com>; hotspot-dev at openjdk.java.net; Doerr, 
>> Martin <martin.doerr at sap.com>
>> Cc: aph at redhat.com; Erik Österlund <erik.osterlund at oracle.com>; Reingruber, 
>> Richard <richard.reingruber at sap.com>
>> Subject: Re: RFR(m): 8220351: Cross-modifying code
>>
>> On 20/03/2019 7:31 pm, Robbin Ehn wrote:
>>> Hi, v3 updated with Martin's comments.
>>>
>>> Inc:
>>> http://cr.openjdk.java.net/~rehn/8220351/v3/inc/
>>> Full:
>>> http://cr.openjdk.java.net/~rehn/8220351/v3/
>>>
>>> David, are you okay with this model which does cross-mod fence on
>>> leaving a safe state? With the other model, always cross-mod fence
>>> going to java would mean we cannot disarm any thread since we don't
>>> know which state it's going to to next, if we want the same
>>> performance.
>>
>> Sorry Robbin but I can't follow the new approach just from the code
>> changes. As I wrote earlier I'd like to see the stat-space enumerated
>> here to show why pushing this down to the "safe states" is sufficient
>> (and not introducing redundant fences) - e.g we fence when going from
>> _thread_blocked to _thread_in_vm which covers the case if we then return
>> to _thread_in_java; but if we're returning to native then we don't need
>> the fence.
>>
>> I don't understand at all what your changes to the safepoint/handshake
>> disarming logic are doing or how they relate to all this. Can you write
>> up the design in the bug report please.
>>
>> Thanks,
>> David
>> -----
>>
>>
>>>
>>> Thanks, Robbin
>>>
>>> On 3/19/19 5:43 PM, Robbin Ehn wrote:
>>>> Hi all, here is v2:
>>>>
>>>> Full:
>>>> http://cr.openjdk.java.net/~rehn/8220351/v2/webrev/
>>>> Inc:
>>>> http://cr.openjdk.java.net/~rehn/8220351/v2/inc/webrev/
>>>>
>>>> The deprecated non-TLH path is missing the correct cross-mod fence,
>>>> but since 13
>>>> will be the last release containing that code path I think that is okay.
>>>>
>>>> We just leave native/native trans armed, and let the thread disarm on the
>>>> transition. If it disarms a new safepoint, we re-arm and stop at next
>>>> polling
>>>> site, that avoids recursion.
>>>>
>>>> Passes stress and hs-t1-5.
>>>> As in the previous mail, performance numbers look good.
>>>>
>>>> Thanks, Robbin
>>>>
>>>> On 3/8/19 4:24 PM, Robbin Ehn wrote:
>>>>> Hi all, please review.
>>>>>
>>>>> Issue:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8220351
>>>>> Changeset:
>>>>> http://cr.openjdk.java.net/~rehn/8220351/webrev/
>>>>>
>>>>> After a JavaThread have been in a safepoint/(handshake) safe state it
>>>>> can start
>>>>> executing updated code. E.g. an oop in the instruction stream can
>>>>> have been
>>>>> updated.
>>>>>
>>>>> Most hardware's require a barrier or that the code cross modified is
>>>>> far away to
>>>>> guarantee that the thread executing the updated instruction stream
>>>>> sees the
>>>>> modification.
>>>>>
>>>>> What far away is and how far an update instruction stream is from a
>>>>> safepoint
>>>>> safe state is not clear.
>>>>>
>>>>> To be compliant with those architectures an instruction stream
>>>>> barrier must be
>>>>> added when leaving the safepoint safe state.
>>>>>
>>>>> There may be crashes today due to this missing barrier.
>>>>> A new CPU with deeper pipeline or changes to the VM which moves a
>>>>> safepoint safe
>>>>> state closer to a nmethod can increase changes of a crash.
>>>>>
>>>>> A few benchmarks will see a performance regression with the extra
>>>>> serializing,
>>>>> such as Octane-Crypto -5% (x86).
>>>>>
>>>>> With a much more elaborate fix, such as changing the local poll to
>>>>> have more
>>>>> than two states (armed/disarmed), it would be possible to only emit such
>>>>> instruction stream barrier when the poll have been armed while the
>>>>> thread was
>>>>> safepoint safe.
>>>>>
>>>>> Passes t1-3 on our platforms, and test built what I can.
>>>>>
>>>>> Thanks, Robbin