RFR 8240918: [REDO] Allow direct handshakes without VMThread intervention
Doerr, Martin
martin.doerr at sap.com
Thu Apr 2 20:24:01 UTC 2020
Hi Patricio,
right, you got my point. Sorry for not explaining it in greater detail at the beginning. I was busy with other things.
There should be a memory barrier ordering read of _pending_threads (which is observed as 0) and whatever gets done when returning from execute_direct.
I don't like relying on barriers which are hidden somewhere and may be subject to change.
I'd also prefer explicit barriers on the writers side over the implicit ones from Atomic::add, because explicit barriers with comments help a lot understanding the code.
(Not sure if you want to change that. Just my opinion.)
Best regards,
Martin
> -----Original Message-----
> From: Patricio Chilano <patricio.chilano.mateo at oracle.com>
> Sent: Donnerstag, 2. April 2020 19:56
> To: Robbin Ehn <robbin.ehn at oracle.com>; Doerr, Martin
> <martin.doerr at sap.com>; hotspot-runtime-dev at openjdk.java.net
> Subject: Re: RFR 8240918: [REDO] Allow direct handshakes without
> VMThread intervention
>
>
> On 4/2/20 1:16 PM, Patricio Chilano wrote:
> > Hi Martin,
> >
> > Thanks for looking at this.
> >
> > On 4/2/20 10:15 AM, Robbin Ehn wrote:
> >> Hi Martin,
> >>
> >> On 2020-04-02 14:55, Doerr, Martin wrote:
> >>> Hi all,
> >>>
> >>> first of all, thanks, Patricio, for doing this.
> >>>
> >>> One question:
> >>> _pending_threads is used for synchronizing threads.
> >>> Don't we need release / acquire semantics?
> >>> I guess otherwise handshaker can read stale data from handshakee
> >>> after observing is_completed.
> >>> Or did I miss anything?
> >>
> >> Reading stale should be fine in this case, since the counter goes
> >> from n->0, thus you can only get false negatives from "bool
> >> is_completed()" ?
> > Right, as Robbin points out at most we will just have to loop again.
> > Also, when the handshakee processes the handshake, right after
> > updating _pending_threads, it will either signal the _processing_sem
> > if there are no direct operations, or it will signal the
> > _handshake_turn_sem, so that will provide the release already. The
> > next read in the loop should read the updated value. Are you okay with
> > that?
> I realize you might be thinking of a different case now. You mean the
> handshaker could execute loads from code after
> Handshake::execute_direct() while it hasn't yet read _pending_threads
> from is_completed()? Although there is a load_acquire() in
> ~ThreadsListHandle() that will prevent those issues you might argue that
> we can't rely on code which might change.
> I think (not counting that load_acquire from ~ThreadsListHandle) the
> issue could appear for example if you created a ThreadsListHandle first,
> then called Handshake::execute_direct(), and when returning you try to
> read some field from the handshakee (you might want to assert something
> for example). If there wasn't any ThreadsListHandle created before
> calling execute_direct() I think you are not allowed to access other
> JavaThread's data anyway since the thread might have exited already,
> unless you create a ThreadsListHandle which will already provide the
> acquire. So I think you are right and we might need a fence before
> returning from execute_direct() or add a comment that we are relying on
> the load_acquire from ~ThreadsListHandle(). Is that what you were thinking?
>
> Thanks,
> Patricio
> > Thanks!
> >
> > Patricio
> >> Thanks, Robbin
> >>
> >>>
> >>> Best regards,
> >>> Martin
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: hotspot-runtime-dev <hotspot-runtime-dev-
> >>>> bounces at openjdk.java.net> On Behalf Of Robbin Ehn
> >>>> Sent: Donnerstag, 2. April 2020 13:36
> >>>> To: Patricio Chilano <patricio.chilano.mateo at oracle.com>;
> >>>> hotspot-runtime-
> >>>> dev at openjdk.java.net
> >>>> Subject: Re: RFR 8240918: [REDO] Allow direct handshakes without
> >>>> VMThread intervention
> >>>>
> >>>> Hi Patricio,
> >>>>
> >>>> Thanks for fixing, looks good.
> >>>>
> >>>> /Robbin
> >>>>
> >>>> On 2020-04-01 19:45, Patricio Chilano wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> Please review this redo for 8230594. Since the reason behind the
> >>>>> Windows
> >>>>> timeouts turned out to be 8240902, this patch is almost identical
> >>>>> to the
> >>>>> reverse of the backout modulo the rebase.
> >>>>>
> >>>>> There is only one small fix for an issue not identified in the
> >>>>> original
> >>>>> patch which I explained in the comments. To avoid this possible
> >>>>> issue of
> >>>>> deleting a JavaThread that called Thread::destroy_vm() while there
> >>>>> are
> >>>>> handshakers blocked in the handshake_turn_sem semaphore, I
> added a
> >>>> check
> >>>>> to verify the JavaThread in question is not protected by some
> >>>>> ThreadsList reference before attempting the delete it. In case it is
> >>>>> protected, we need to at least clear the value for _thread_key to
> >>>>> avoid
> >>>>> possible infinite loops at thread exit (this can happen in Solaris).
> >>>>>
> >>>>> The only other change is that I removed the check for local polls in
> >>>>> Handshake::execute_direct() since after the backout, thread-local
> >>>>> handshakes was implemented for arm32 which was the only
> remaining
> >>>> platform.
> >>>>>
> >>>>> I tested it several times in mach5 tiers 1-6 and once in t7 and
> >>>>> saw no
> >>>>> failures.
> >>>>>
> >>>>> Bug:
> >>>>> https://bugs.openjdk.java.net/browse/JDK-8240918
> >>>>>
> >>>>> Webrev:
> >>>>> http://cr.openjdk.java.net/~pchilanomate/8240918/v1/webrev/
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>> Patricio
> >
More information about the hotspot-runtime-dev
mailing list