RFR (XS): 8188877: Improper synchronization in offer_termination

Mon Nov 27 15:44:57 UTC 2017

On 27/11/17 14:53, Andrew Haley wrote:
> On 27/11/17 12:30, Andrew Dinn wrote:
<snip>
>> That's what happens when the reader executes a read barrier. The
>> interesting question is what happens when the reader does not execute a
>> read barrier.
> 
> The invalidate messages still arrive at the reader, but they sit in
> the invalidate queue and aren't acted upon immediately.  Eventually
> they must be processed, either lazily or because the reader's
> invalidate queue fills up.

Hmm, that explanation assumes there will be other invalidate messages.
But at a STW pause that's not necessarily going to be the case. In the
worst case all other threads may be spinning on a barrier count
while this one core/thread has a single invalidate message in its queue.

>> I understand that you tested this and found that it took no longer than
>> a few hundred microseconds. However, I really have to ask what precisely
>> the reader was doing during the test?
> 
> Nothing except spinning and loading, and that's a few microseconds'
> delay rather than a few hundred.

Ok, so it's not as if the reader noticed the invalidate because it was
writing memory or performing some other 'active' interaction with the
memory system that forced the invalidate queue to be processed (assuming
that invalidate detection is not a sneaky side-effect of a nop :-).
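
To make the scenario concrete, here is a minimal sketch of that kind of
test: a reader that does nothing but spin on plain loads while a writer
stores once. It is not the test that was actually run. The names
(flag, now_ns, spin_until_seen) are hypothetical, and it assumes a
compiler that maps memory_order_relaxed loads to a plain LDR on AArch64
(no LDAR, no barrier). The C++ standard only says such a store should
become visible "within a reasonable amount of time", which is exactly
the property in question.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical names for illustration; none of this is HotSpot code.
static std::atomic<int> flag{0};

static long long now_ns() {
    using namespace std::chrono;
    return duration_cast<nanoseconds>(
        steady_clock::now().time_since_epoch()).count();
}

// The reader spins using relaxed loads only. Returns the number of
// nanoseconds between the writer taking its timestamp (just before the
// store) and the reader first observing the new value.
long long spin_until_seen() {
    flag.store(0, std::memory_order_relaxed);
    std::atomic<long long> written_at{0};

    std::thread writer([&written_at] {
        // Give the reader time to settle into its spin loop.
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        written_at.store(now_ns(), std::memory_order_relaxed);
        flag.store(1, std::memory_order_relaxed);
    });

    // Nothing but loads: no stores, no fences, no acquire semantics.
    while (flag.load(std::memory_order_relaxed) == 0) {
    }
    long long seen_at = now_ns();

    writer.join();  // join also makes written_at safely readable here
    return seen_at - written_at.load(std::memory_order_relaxed);
}
```

Calling spin_until_seen() repeatedly and looking at the distribution of
the returned delays is what would show whether the figure is a few
microseconds or a few hundred. The measured delay includes the clock
reads and the writer's own store latency, so treat it as an upper bound
on the propagation time rather than an exact figure.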

>> Specifically, does the time taken to 'eventually' notice a write to the
>> LDRed location depend upon what other instructions are executed between
>> successive LDRs?
> 
> It's really hard to be definite about that.  In practice it may well
> be that back-to-back local cache accesses saturate the CPU<->cache
> interconnect so much that they delay the processing of invalidate
> queue entries, but that's my speculation and it's secret sauce anyway.
 . . .
> And one final caveat: I'm talking about MESI, but there are more
> elaborate and sophisticated ways of making this stuff work.

Well yes, which is fine so long as any such elaborate sophistricat^H^H^H
improvement of current behaviour doesn't assume that a core can ignore
the invalidate queue absent an ldar (or, maybe, absent ldar union some
other subset of the available memory ops).

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd