RFR 8240918: [REDO] Allow direct handshakes without VMThread intervention

David Holmes david.holmes at oracle.com
Sun Apr 5 22:18:19 UTC 2020


Hi Patricio,

On 4/04/2020 12:12 am, Patricio Chilano wrote:
> Hi David,
> 
> On 4/3/20 10:21 AM, David Holmes wrote:
>> Hi Patricio,
>>
>> On 3/04/2020 6:00 pm, Patricio Chilano wrote:
>>> Hi Martin,
>>>
>>> On 4/2/20 5:24 PM, Doerr, Martin wrote:
>>>> Hi Patricio,
>>>>
>>>> right, you got my point. Sorry for not explaining it in greater 
>>>> detail at the beginning. I was busy with other things.
>>>>
>>>> There should be a memory barrier ordering the read of _pending_threads 
>>>> (which is observed as 0) and whatever gets done when returning from 
>>>> execute_direct.
>>>> I don't like relying on barriers which are hidden somewhere and may 
>>>> be subject to change.
>>> Yes, I agree with that. I added an explicit acquire fence at the end 
>>> of Handshake::execute_direct() with a comment. Thanks for noticing it.
>>
>> This is unclear to me:
>>
>> 281   // Prevent future loads from floating above the load of _pending_threads
>> 282   // in is_completed(). This prevents reading stale data modified in the
>> 283   // handshake closure by the Handshakee.
>> 284   OrderAccess::acquire();
>> 285
>> 286   return op.executed();
>>
>> An acquire() by itself has no meaning: it has to be paired with a 
>> release(), and the two have to be associated with a load, respectively 
>> a store, of the same variable. From the comment it sounds like you just 
>> want a loadload() barrier here.
> The acquire is associated with _pending_threads, but since we only 
> really need it when the value we read is zero, instead of always 
> loading it with acquire semantics I just added a barrier at the end of 
> the method. The release store to _pending_threads is given by 
> Atomic::dec().

Okay that makes sense - seeing how these things pair up is often tricky.
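The pairing being discussed can be sketched with C++11 atomics (a simplified model, not the actual HotSpot code; `pending` stands in for _pending_threads and `payload` for data written by the handshake closure):

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Simplified model of the completion protocol. `payload` stands in for
// data written inside the handshake closure, `pending` for _pending_threads.
static int payload = 0;
static std::atomic<int> pending{1};

// Handshakee: run the closure, then publish completion. The fetch_sub
// with release ordering plays the role of Atomic::dec(), whose full
// fence supplies the release side in the real code.
void handshakee() {
    payload = 42;                                    // closure side effect
    pending.fetch_sub(1, std::memory_order_release); // n -> 0
}

// Handshaker: spin until completion is observed, then issue the acquire.
// The fence pairs with the release above, in the same way as the
// OrderAccess::acquire() at the end of Handshake::execute_direct().
int handshaker() {
    while (pending.load(std::memory_order_relaxed) != 0) { /* spin */ }
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload; // guaranteed to observe the closure's writes
}
```

Running `handshakee` on a second thread, `handshaker` is guaranteed to return 42: the acquire fence after observing zero orders the read of `payload` after the release decrement.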

> Unless I'm missing something I think we can use a loadload() instead. 
> The loading of _pending_threads is in a conditional so there cannot be 
> writes visible to the handshakee until we checked the operation is done, 
> meaning the handshake closure has already executed.

I'm not clear what writes you are referring to here, but based on later 
emails you aren't taking this path anyway ... I think.
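For comparison, the loadload()/acquire-load alternative discussed above would, in C++11 terms, fold the ordering into the completion check itself, so no separate fence is needed after the wait loop. A sketch under the same simplified model, not actual HotSpot code:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

static std::atomic<int> done{0};  // 0 = pending, 1 = completed
static int closure_data = 0;

// Writer side: the closure's write, then a release store publishing it.
void complete_op() {
    closure_data = 7;
    done.store(1, std::memory_order_release);
}

// Reader side: an acquire load on every iteration of the check; once the
// completed state is observed, later reads are already ordered after it.
int wait_with_acquire_load() {
    while (done.load(std::memory_order_acquire) == 0) { /* spin */ }
    return closure_data; // ordered after the load that observed completion
}
```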

Thanks,
David
-----


> Patricio
>> Other updates seem fine.
>>
>> Thanks,
>> David
>> -----
>>
>>>> I'd also prefer explicit barriers on the writers side over the 
>>>> implicit ones from Atomic::add, because explicit barriers with 
>>>> comments help a lot understanding the code.
>>>> (Not sure if you want to change that. Just my opinion.)
>>> I added a comment on top of Atomic::dec() explaining why we need a 
>>> memory barrier there and noting that we are already using a full 
>>> memory fence.
>>> If you still think I should add an explicit barrier for code 
>>> readability, though, I can add it.
>>>
>>>
>>> Here is v3 which also addresses Dan's comments.
>>>
>>> Full:
>>> http://cr.openjdk.java.net/~pchilanomate/8240918/v3/webrev/
>>> Inc:
>>> http://cr.openjdk.java.net/~pchilanomate/8240918/v3/inc/webrev/
>>>
>>>
>>> Thanks!
>>> Patricio
>>>> Best regards,
>>>> Martin
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Patricio Chilano <patricio.chilano.mateo at oracle.com>
>>>>> Sent: Thursday, April 2, 2020 19:56
>>>>> To: Robbin Ehn <robbin.ehn at oracle.com>; Doerr, Martin
>>>>> <martin.doerr at sap.com>; hotspot-runtime-dev at openjdk.java.net
>>>>> Subject: Re: RFR 8240918: [REDO] Allow direct handshakes without
>>>>> VMThread intervention
>>>>>
>>>>>
>>>>> On 4/2/20 1:16 PM, Patricio Chilano wrote:
>>>>>> Hi Martin,
>>>>>>
>>>>>> Thanks for looking at this.
>>>>>>
>>>>>> On 4/2/20 10:15 AM, Robbin Ehn wrote:
>>>>>>> Hi Martin,
>>>>>>>
>>>>>>> On 2020-04-02 14:55, Doerr, Martin wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> first of all, thanks, Patricio, for doing this.
>>>>>>>>
>>>>>>>> One question:
>>>>>>>> _pending_threads is used for synchronizing threads.
>>>>>>>> Don't we need release / acquire semantics?
>>>>>>>> I guess otherwise handshaker can read stale data from handshakee
>>>>>>>> after observing is_completed.
>>>>>>>> Or did I miss anything?
>>>>>>> Reading stale should be fine in this case, since the counter goes
>>>>>>> from n->0, thus you can only get false negatives from "bool
>>>>>>> is_completed()" ?
>>>>>> Right, as Robbin points out at most we will just have to loop again.
>>>>>> Also, when the handshakee processes the handshake, right after
>>>>>> updating _pending_threads, it will either signal the _processing_sem
>>>>>> if there are no direct operations, or it will signal the
>>>>>> _handshake_turn_sem, so that will provide the release already. The
>>>>>> next read in the loop should read the updated value. Are you okay 
>>>>>> with
>>>>>> that?
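Robbin's observation, that a stale read of a counter which only ever decreases to zero can produce nothing worse than a false negative, can be sketched as follows (illustrative code, not the actual HotSpot types):

```cpp
#include <atomic>
#include <cassert>

// Illustrative stand-in for the operation's _pending_threads counter.
struct Op {
    std::atomic<int> pending;
    explicit Op(int n) : pending(n) {}

    // A relaxed read: a stale value can only be too large, never too
    // small, since the counter is only ever decremented. A stale read
    // therefore makes the caller loop again; it never reports
    // completion early.
    bool is_completed() const {
        return pending.load(std::memory_order_relaxed) == 0;
    }
};
```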
>>>>> I realize you might be thinking of a different case now. You mean the
>>>>> handshaker could execute loads from code after
>>>>> Handshake::execute_direct() while it hasn't yet read _pending_threads
>>>>> from is_completed()? Although there is a load_acquire() in
>>>>> ~ThreadsListHandle() that will prevent those issues, you might argue 
>>>>> that we can't rely on code which might change.
>>>>> I think (not counting that load_acquire from ~ThreadsListHandle) the
>>>>> issue could appear for example if you created a ThreadsListHandle 
>>>>> first,
>>>>> then called Handshake::execute_direct(), and when returning you try to
>>>>> read some field from the handshakee (you might want to assert 
>>>>> something
>>>>> for example). If there wasn't any ThreadsListHandle created before
>>>>> calling execute_direct() I think you are not allowed to access other
>>>>> JavaThread's data anyway since the thread might have exited already,
>>>>> unless you create a ThreadsListHandle which will already provide the
>>>>> acquire. So I think you are right and we might need a fence before
>>>>> returning from execute_direct() or add a comment that we are 
>>>>> relying on
>>>>> the load_acquire from ~ThreadsListHandle(). Is that what you were 
>>>>> thinking?
>>>>>
>>>>> Thanks,
>>>>> Patricio
>>>>>> Thanks!
>>>>>>
>>>>>> Patricio
>>>>>>> Thanks, Robbin
>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Martin
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: hotspot-runtime-dev <hotspot-runtime-dev-
>>>>>>>>> bounces at openjdk.java.net> On Behalf Of Robbin Ehn
>>>>>>>>> Sent: Thursday, April 2, 2020 13:36
>>>>>>>>> To: Patricio Chilano <patricio.chilano.mateo at oracle.com>;
>>>>>>>>> hotspot-runtime-
>>>>>>>>> dev at openjdk.java.net
>>>>>>>>> Subject: Re: RFR 8240918: [REDO] Allow direct handshakes without
>>>>>>>>> VMThread intervention
>>>>>>>>>
>>>>>>>>> Hi Patricio,
>>>>>>>>>
>>>>>>>>> Thanks for fixing, looks good.
>>>>>>>>>
>>>>>>>>> /Robbin
>>>>>>>>>
>>>>>>>>> On 2020-04-01 19:45, Patricio Chilano wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> Please review this redo for 8230594. Since the reason behind the
>>>>>>>>>> Windows
>>>>>>>>>> timeouts turned out to be 8240902, this patch is almost identical
>>>>>>>>>> to the
>>>>>>>>>> reverse of the backout modulo the rebase.
>>>>>>>>>>
>>>>>>>>>> There is only one small fix for an issue not identified in the
>>>>>>>>>> original
>>>>>>>>>> patch which I explained in the comments. To avoid this possible
>>>>>>>>>> issue of
>>>>>>>>>> deleting a JavaThread that called Thread::destroy_vm() while 
>>>>>>>>>> there
>>>>>>>>>> are
>>>>>>>>>> handshakers blocked in the handshake_turn_sem semaphore, I
>>>>> added a
>>>>>>>>> check
>>>>>>>>>> to verify the JavaThread in question is not protected by some
>>>>>>>>>> ThreadsList reference before attempting to delete it. In case 
>>>>>>>>>> it is
>>>>>>>>>> protected, we need to at least clear the value for _thread_key to
>>>>>>>>>> avoid
>>>>>>>>>> possible infinite loops at thread exit (this can happen on 
>>>>>>>>>> Solaris).
>>>>>>>>>>
>>>>>>>>>> The only other change is that I removed the check for local 
>>>>>>>>>> polls in
>>>>>>>>>> Handshake::execute_direct() since after the backout, thread-local
>>>>>>>>>> handshakes were implemented for arm32, which was the only
>>>>> remaining
>>>>>>>>> platform.
>>>>>>>>>> I tested it several times in mach5 tiers 1-6 and once in t7 and
>>>>>>>>>> saw no
>>>>>>>>>> failures.
>>>>>>>>>>
>>>>>>>>>> Bug:
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8240918
>>>>>>>>>>
>>>>>>>>>> Webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~pchilanomate/8240918/v1/webrev/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Patricio
>>>
> 
