RFR 8240918: [REDO] Allow direct handshakes without VMThread intervention

Fri Apr 3 14:33:20 UTC 2020

Hi Martin,

On 4/3/20 11:20 AM, Doerr, Martin wrote:
> Hi David,
>
> the acquire belongs to the load of _pending_threads in is_completed() (handshake.cpp:270).
> It's moved below the loop because we don't need to repeat the barrier.
>
> The release barrier is integrated in Atomic::dec(&_pending_threads) (handshake.cpp:234).
> Should we better use Atomic::dec(&_pending_threads, memory_order_release) to make it precise and explicit?
As you stated in your other email too we can specify explicit memory 
ordering to Atomic::dec() which I forgot. I agree with explicitly 
changing it to Atomic::dec(&_pending_threads, memory_order_release).

> Release / acquire semantics are exactly what I was missing in the earlier version.
>
>
> I don't understand why "add_target_count" is called after the handshakes were activated (handshake.cpp:171).
> Couldn't it happen that the first handshake completes before that (it would set _pending_threads to 0)?
Yes, _pending_threads can even go negative for the 
VM_HandshakeAllThreads case. But we only check _pending_threads in the 
VMThread so for the Handshakees it doesn't matter when they decrement 
the counter what's the actual value.

Thanks,
Patricio
> Best regards,
> Martin
>
>
>> -----Original Message-----
>> From: David Holmes <david.holmes at oracle.com>
>> Sent: Freitag, 3. April 2020 15:22
>> To: Patricio Chilano <patricio.chilano.mateo at oracle.com>; Doerr, Martin
>> <martin.doerr at sap.com>; Robbin Ehn <robbin.ehn at oracle.com>; hotspot-
>> runtime-dev at openjdk.java.net
>> Subject: Re: RFR 8240918: [REDO] Allow direct handshakes without
>> VMThread intervention
>>
>> Hi Patricio,
>>
>> On 3/04/2020 6:00 pm, Patricio Chilano wrote:
>>> Hi Martin,
>>>
>>> On 4/2/20 5:24 PM, Doerr, Martin wrote:
>>>> Hi Patricio,
>>>>
>>>> right, you got my point. Sorry for not explaining it in greater detail
>>>> at the beginning. I was busy with other things.
>>>>
>>>> There should be a memory barrier ordering read of _pending_threads
>>>> (which is observed as 0) and whatever gets done when returning from
>>>> execute_direct.
>>>> I don't like relying on barriers which are hidden somewhere and may be
>>>> subject to change.
>>> Yes, I agree with that. I added an explicit acquire fence at the end of
>>> Handshake::execute_direct() with a comment. Thanks for noticing it.
>> This is unclear to me:
>>
>> 281   // Prevent future loads from floating above the load of
>> _pending_threads
>>    282   // in is_completed(). This prevents reading stale data modified
>> in the
>>    283   // handshake closure by the Handshakee.
>>    284   OrderAccess::acquire();
>>    285
>>    286   return op.executed();
>>
>> An acquire() by itself has no meaning, it has to be associated with a
>> release() and both have to be associated with the load, respectively,
>> store, of the same variable. From the comment it sounds like you just
>> want a loadload() barrier here.
>>
>> Other updates seem fine.
>>
>> Thanks,
>> David
>> -----
>>
>>>> I'd also prefer explicit barriers on the writers side over the
>>>> implicit ones from Atomic::add, because explicit barriers with
>>>> comments help a lot understanding the code.
>>>> (Not sure if you want to change that. Just my opinion.)
>>> I added a comment on top of Atomic::dec() on why we need a memory
>>> barrier there and the fact that we are already using a full memory fence.
>>> If you still think I should add an explicit barrier for code readability
>>> I can add it though.
>>>
>>>
>>> Here is v3 which also addresses Dan's comments.
>>>
>>> Full:
>>> http://cr.openjdk.java.net/~pchilanomate/8240918/v3/webrev/
>>> Inc:
>>> http://cr.openjdk.java.net/~pchilanomate/8240918/v3/inc/webrev/
>>>
>>>
>>> Thanks!
>>> Patricio
>>>> Best regards,
>>>> Martin
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Patricio Chilano <patricio.chilano.mateo at oracle.com>
>>>>> Sent: Donnerstag, 2. April 2020 19:56
>>>>> To: Robbin Ehn <robbin.ehn at oracle.com>; Doerr, Martin
>>>>> <martin.doerr at sap.com>; hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR 8240918: [REDO] Allow direct handshakes without
>>>>> VMThread intervention
>>>>>
>>>>>
>>>>> On 4/2/20 1:16 PM, Patricio Chilano wrote:
>>>>>> Hi Martin,
>>>>>>
>>>>>> Thanks for looking at this.
>>>>>>
>>>>>> On 4/2/20 10:15 AM, Robbin Ehn wrote:
>>>>>>> Hi Martin,
>>>>>>>
>>>>>>> On 2020-04-02 14:55, Doerr, Martin wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> first of all, thanks, Patricio, for doing this.
>>>>>>>>
>>>>>>>> One question:
>>>>>>>> _pending_threads is used for synchronizing threads.
>>>>>>>> Don't we need release / acquire semantics?
>>>>>>>> I guess otherwise handshaker can read stale data from handshakee
>>>>>>>> after observing is_completed.
>>>>>>>> Or did I miss anything?
>>>>>>> Reading stale should be fine in this case, since the counter goes
>>>>>>> from n->0, thus you can only get false negatives from "bool
>>>>>>> is_completed()" ?
>>>>>> Right, as Robbin points out at most we will just have to loop again.
>>>>>> Also, when the handshakee processes the handshake, right after
>>>>>> updating _pending_threads, it will either signal the _processing_sem
>>>>>> if there are no direct operations, or it will signal the
>>>>>> _handshake_turn_sem, so that will provide the release already. The
>>>>>> next read in the loop should read the updated value. Are you okay with
>>>>>> that?
>>>>> I realize you might be thinking of a different case now. You mean the
>>>>> handshaker could execute loads from code after
>>>>> Handshake::execute_direct() while it hasn't yet read _pending_threads
>>>>> from is_completed()? Although there is a load_acquire() in
>>>>> ~ThreadsListHandle() that will prevent those issues you might argue that
>>>>> we can't rely on code which might change.
>>>>> I think (not counting that load_acquire from ~ThreadsListHandle) the
>>>>> issue could appear for example if you created a ThreadsListHandle first,
>>>>> then called Handshake::execute_direct(), and when returning you try to
>>>>> read some field from the handshakee (you might want to assert
>> something
>>>>> for example). If there wasn't any ThreadsListHandle created before
>>>>> calling execute_direct() I think you are not allowed to access other
>>>>> JavaThread's data anyway since the thread might have exited already,
>>>>> unless you create a ThreadsListHandle which will already provide the
>>>>> acquire. So I think you are right and we might need a fence before
>>>>> returning from execute_direct() or add a comment that we are relying
>> on
>>>>> the load_acquire from ~ThreadsListHandle(). Is that what you were
>>>>> thinking?
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>> Thanks!
>>>>>>
>>>>>> Patricio
>>>>>>> Thanks, Robbin
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Martin
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: hotspot-runtime-dev <hotspot-runtime-dev-
>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Robbin Ehn
>>>>>>>>> Sent: Donnerstag, 2. April 2020 13:36
>>>>>>>>> To: Patricio Chilano <patricio.chilano.mateo at oracle.com>;
>>>>>>>>> hotspot-runtime-
>>>>>>>>> dev at openjdk.java.net
>>>>>>>>> Subject: Re: RFR 8240918: [REDO] Allow direct handshakes without
>>>>>>>>> VMThread intervention
>>>>>>>>>
>>>>>>>>> Hi Patricio,
>>>>>>>>>
>>>>>>>>> Thanks for fixing, looks good.
>>>>>>>>>
>>>>>>>>> /Robbin
>>>>>>>>>
>>>>>>>>> On 2020-04-01 19:45, Patricio Chilano wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Please review this redo for 8230594. Since the reason behind the
>>>>>>>>>> Windows
>>>>>>>>>> timeouts turned out to be 8240902, this patch is almost identical
>>>>>>>>>> to the
>>>>>>>>>> reverse of the backout modulo the rebase.
>>>>>>>>>>
>>>>>>>>>> There is only one small fix for an issue not identified in the
>>>>>>>>>> original
>>>>>>>>>> patch which I explained in the comments. To avoid this possible
>>>>>>>>>> issue of
>>>>>>>>>> deleting a JavaThread that called Thread::destroy_vm() while
>> there
>>>>>>>>>> are
>>>>>>>>>> handshakers blocked in the handshake_turn_sem semaphore, I
>>>>> added a
>>>>>>>>> check
>>>>>>>>>> to verify the JavaThread in question is not protected by some
>>>>>>>>>> ThreadsList reference before attempting the delete it. In case
>>>>>>>>>> it is
>>>>>>>>>> protected, we need to at least clear the value for _thread_key to
>>>>>>>>>> avoid
>>>>>>>>>> possible infinite loops at thread exit (this can happen in
>>>>>>>>>> Solaris).
>>>>>>>>>>
>>>>>>>>>> The only other change is that I removed the check for local
>>>>>>>>>> polls in
>>>>>>>>>> Handshake::execute_direct() since after the backout, thread-local
>>>>>>>>>> handshakes was implemented for arm32 which was the only
>>>>> remaining
>>>>>>>>> platform.
>>>>>>>>>> I tested it several times in mach5 tiers 1-6 and once in t7 and
>>>>>>>>>> saw no
>>>>>>>>>> failures.
>>>>>>>>>>
>>>>>>>>>> Bug:
>>>>>>>>>>        https://bugs.openjdk.java.net/browse/JDK-8240918
>>>>>>>>>>
>>>>>>>>>> Webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8240918/v1/webrev/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Patricio