RFR (XS): 8188877: Improper synchronization in offer_termination

Tue Nov 28 21:11:03 UTC 2017

Hi Andrew and Andrew,

Thanks for the discussion on load-acquire. This has been informative.

Related to timely notifications, one thing about offer_termination is that it's not doing a classic spin-wait. Classic spin-waits (as seen in mutex.cpp, objectMonitor.cpp, safepoint.cpp, and synchronizer.cpp, for example) test the termination condition as part of the loop.

offer_termination just has a simple for-loop that delays for some number of cycles. At up to 4k iterations * 140 cycles (per SpinPause() on x86), that could be 573,000 cycles or so. For this case, especially since the termination test is a simple load, I think we should test _offered_termination in the spin-wait. This should have low overhead on the spinning thread and no impact on other threads.
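
To make the idea concrete, here is a rough standalone sketch of a spin-wait that tests the condition on every iteration. This is not the actual HotSpot code: spin_pause(), spin_until_terminated(), and the parameter names are made up for illustration, and the acquire load is just one possible choice given the earlier discussion.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for HotSpot's SpinPause(); a no-op in this sketch.
static inline void spin_pause() {}

// Spins for at most max_iterations, but returns true as soon as the
// number of threads that have offered termination reaches n_threads.
// Returns false if the spin budget runs out first.
static bool spin_until_terminated(const std::atomic<unsigned>& offered,
                                  unsigned n_threads,
                                  std::size_t max_iterations) {
  for (std::size_t i = 0; i < max_iterations; i++) {
    // The termination test is a single load, so checking it on each
    // iteration adds little work for the spinning thread.
    if (offered.load(std::memory_order_acquire) == n_threads) {
      return true;
    }
    spin_pause();
  }
  return false;
}
```

With a loop like this, a spinning thread can notice termination after roughly one pause instead of waiting out the whole delay, which at 4096 iterations * 140 cycles would be 573,440 cycles.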

Unless there's disagreement I'll create an enhancement request for this. I'll add a note about the cleanups that Kim mentioned also.

 - Derek

> -----Original Message-----
> From: Andrew Haley [mailto:aph at redhat.com]
> Sent: Monday, November 27, 2017 10:57 AM
> To: Andrew Dinn <adinn at redhat.com>; White, Derek
> <Derek.White at cavium.com>; Thomas Schatzl
> <thomas.schatzl at oracle.com>; Kim Barrett <kim.barrett at oracle.com>
> Cc: hotspot-gc-dev at openjdk.java.net
> Subject: Re: RFR (XS): 8188877: Improper synchronization in
> offer_termination
> 
> On 27/11/17 15:44, Andrew Dinn wrote:
> > On 27/11/17 14:53, Andrew Haley wrote:
> >> On 27/11/17 12:30, Andrew Dinn wrote:
> > <snip>
> >>> That's what happens when the reader executes a read barrier. The
> >>> interesting question is what happens when the reader does not
> >>> execute a read barrier.
> >>
> >> The invalidate messages still arrive at the reader, but they sit in
> >> the invalidate queue and aren't acted upon immediately.  Eventually
> >> they must be processed, either lazily or because the reader's
> >> invalidate queue fills up.
> >
> > Hmm, that explanation assumes there will be other invalidate messages.
> 
> No, not at all.  By "lazily" I mean that while a core has nothing else to do it
> might as well process its invalidate queue, and AFAIK that is what happens.
> 
> > But at a STW pause that's not necessarily going to be the case. In the
> > worst case all other threads may be spinning on a barrier count
> > while this one core/thread has a single invalidate message in its queue.
> 
> That could be, but there are other things going on.  There are other threads
> active, and invalidate messages get sent to everyone.
> 
> In practice, I've never seen more than a few microseconds of delay.
> 
> Bear in mind that the interpreter changes we just made mean that
> interpreted code won't necessarily see safepoint status changes for about
> 100 microseconds, so the lack of an acquiring load in our code is really not
> the biggest issue.
> 
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671