[jmm-dev] Store completion query - general and ARM

David Holmes david.holmes at oracle.com
Thu Nov 10 20:52:51 UTC 2016

On 10/11/2016 7:31 PM, Andrew Haley wrote:
> On 10/11/16 09:20, David Holmes wrote:
>> On 10/11/2016 7:07 PM, Andrew Haley wrote:
>>> On 10/11/16 00:06, David Holmes wrote:
>>>> Does any part of the JMM require actual visibility/completion of
>>>> volatile stores or is it only order that is defined (with an assumption
>>>> that all stores will complete in a finite time)?
>>> Ordering is really all that we've got: all that memory fences can do
>>> is ensure that visibility of loads and stores is ordered in some way.
>> If we establish some global order of loads and stores, yes. That can in
>> turn require that a store become visible prior to a given load.
> I agree.
>>>> In relation to ARM specifically, Dekker style algorithms require
>>>> visibility/completion of the store before the subsequent load, yet the
>>>> example in "A Tutorial Introduction to the ARM and POWER Relaxed Memory
>>>> Models" shows the use of DMB, not DSB.
>>> DMB is fine for that.  Dekker doesn't need a store to be forced out of
>>> the caches, only that the store be made visible to other processors
>>> before any operations later in program order.
>> Again it is far from obvious to me that DMB causes the store to be
>> visible before any operations later in program order. I find the Group A
>> / Group B formulation (and even the definition of "observe") to be quite
>> obscure and hard to map to actual code behaviour.
> Indeed.  The real problem is that ARM are trying to describe the
> memory model in an abstract way that does not overly constrain

I just wish they had included the word "complete" or "visible" in that 
abstract description. :)
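
For concreteness, here is a minimal sketch of the store-then-load shape under
discussion, written in Java with VarHandle.fullFence() standing in for the
barrier. The class and method names are hypothetical, and the idea that
HotSpot on AArch64 compiles the fence to a DMB-class barrier rather than a
DSB is an assumption here, not something this thread establishes. The fields
are deliberately plain, so any ordering comes from the fence alone.

```java
import java.lang.invoke.VarHandle;

// Hypothetical store-buffering (Dekker-style) litmus test using an explicit fence.
class FencedStoreBuffering {
    // Plain (non-volatile) fields: the racy accesses are intentional.
    static int x, y;
    // Results; read by the main thread only after join().
    static int r1, r2;

    // Runs the litmus test repeatedly and counts runs where both loads saw 0.
    static int countBothZero(int iterations) {
        int bothZero = 0;
        for (int i = 0; i < iterations; i++) {
            x = 0;
            y = 0;
            Thread t1 = new Thread(() -> {
                x = 1;                  // store
                VarHandle.fullFence();  // order the store before the later load
                r1 = y;                 // load
            });
            Thread t2 = new Thread(() -> {
                y = 1;
                VarHandle.fullFence();
                r2 = x;
            });
            t1.start();
            t2.start();
            try {
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
            if (r1 == 0 && r2 == 0) {
                bothZero++;             // the outcome Dekker must rule out
            }
        }
        return bothZero;
    }

    public static void main(String[] args) {
        System.out.println("both-zero outcomes: " + countBothZero(1000));
    }
}
```

If the fence orders each store before the following load in the sense Andrew
describes, the both-zero outcome should never be counted. Removing the two
fences makes that outcome permitted, although a short run may not exhibit it.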

> implementations.  But a DMB really is sufficient to ensure that prior
> stores are visible.  (Mind you, we don't need DMB to get sequentially-
> consistent behaviour that's enough for Java volatiles.)

I was going to ask how that can be true, but then saw this in the paper 
Peter referenced:

"According to the ARM ARM, store-release is multicopy-
atomic when observed by load-acquires, a strong property
that conventional release-acquire semantics does not imply. Furthermore,
despite their names, these instructions are intended to be
used to implement the C11 sequentially consistent load and store."

That is new information to me, and somewhat surprising.
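
At the Java level the same Dekker-style entry check can be sketched with
volatile fields only, with no explicit fence. The names below are
hypothetical. The JMM forbids both threads reading 0 here, because volatile
accesses are totally ordered consistently with program order. Whether a
given JVM implements these accesses on AArch64 with store-release and
load-acquire, as the quoted passage suggests is possible, or with DMBs, is
an implementation choice that this sketch does not depend on.

```java
// Hypothetical Dekker-style entry check using only volatile fields.
class VolatileDekkerEntry {
    static volatile int flag0, flag1;
    // Written by the worker threads, read by the main thread after join().
    static boolean entered0, entered1;

    // Counts runs in which both threads believed they could enter.
    static int countDoubleEntries(int iterations) {
        int doubleEntries = 0;
        for (int i = 0; i < iterations; i++) {
            flag0 = 0;
            flag1 = 0;
            Thread t0 = new Thread(() -> {
                flag0 = 1;                 // announce intent (volatile store)
                entered0 = (flag1 == 0);   // check the other side (volatile load)
            });
            Thread t1 = new Thread(() -> {
                flag1 = 1;
                entered1 = (flag0 == 0);
            });
            t0.start();
            t1.start();
            try {
                t0.join();
                t1.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
            if (entered0 && entered1) {
                doubleEntries++;           // would be a mutual-exclusion failure
            }
        }
        return doubleEntries;
    }

    public static void main(String[] args) {
        System.out.println("double entries: " + countDoubleEntries(1000));
    }
}
```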


>>>> Yet AFAICS DMB says nothing about completion whereas DSB does. ??
>>>> (To be honest I find the Group A/B description of DMB properties
>>>> extremely hard to actually interpret wrt code like Dekker.)
>>> DSB is only really needed if there are multiple caches of the same
>>> address, i.e. Icache and Dcache: it's necessary to force a store out
>>> into main memory in order to refresh the Icache.
>> I thought only ISB had an effect relative to instructions/i-cache ??
> It does: you need DSB to ensure the visibility of the data cleaned
> from the Dcache, then ISB to synchronize the fetched instruction
> stream.
> Andrew.
