OrderAccess Refactoring

Mon Jan 12 14:18:51 UTC 2015

Hi David,

Thank you for the feedback! :)

On 10/01/15 11:11, David Holmes wrote:
> acquire/release semantics are problematic because there is no single
> common accepted meaning. Even talking about acquire() and release() as
> independent operations is somewhat meaningless - it is only when they
> are paired, and fixed in place that they start to give meaning to things.

I agree it is a bit vague. The description in the comments about 
preventing all memory accesses from floating up and down is the one I 
recognize and it is conceptually good. But then it is later defined in a 
table as acquire = Load(Store|Load) and release = (Store|Load)Store. The 
reason is understandable - even though the description of 
acquire/release is quite general, they are only intended and designed to 
be used in certain contexts like release store and load acquire (with 
variants). Outside such similar contexts, it would not give the 
guarantees described.

This gets more confusing when you go on reading that fence is 
conceptually acquire and release. It is true on the conceptual level 
using the conceptual descriptions of acquire/release, but the actual 
guarantees of release + acquire is StoreStore, LoadLoad and LoadStore 
according to the tables below, while fence() also provides StoreLoad 
constraints. I don't know about you but I find that a bit misleading.

I mean, it might be quite obvious for anyone using it to only use 
acquire/release in the intended contexts where it makes sense and then 
there is no trouble, but I guess it doesn't hurt to add some comments 
making it explicit to avoid any confusion. We don't want confusion I think.

> load_acquire/store_release don't necessarily map cleanly to
> acquire()/release() as currently described in orderAccess.hpp; nor do
> they necessarily map cleanly to the same named operations on eg ARMv8;
> nor to similar named entities in C++11. As Paul stated these were
> introduced to map to the ia64 architecture - which I believe had
> somewhat different semantics.

They do map cleanly after I change acquire() and release() to compiler 
barriers on TSO platforms (and fencing on PPC). Except some exceptions 
like load acquire on PPC that was specialized for speed using twi-isync 
and x86 on windows that don't need the stronger barriers since volatile 
is more accurate anyway.

> This whole area is a minefield because there are different concepts of
> what "acquire" and "release" may mean - so you need to start with
> detailed definitions. Personally, and I know I am not alone, I abhor the
> acquire/release terminology and usage, I much prefer the unambiguous
> storeload, storestore etc. :)

Yes I think we need a better concrete guarantee/contract. The current 
conceptual description of acquire/release is nice I think and useful 
also for understanding when to use it, but I want to make it explicit 
that release is only used in conjunction with writes and acquire only in 
conjunction with loads (and swapping instructions etc), and therefore do 
not give the same guarantees as described as the more general conceptual 
description promises.

The concrete guarantee users can rely on is acquire = Load(Store|Load) 
and release = (Store|Load)Store, and that's consistent with the 
implementation and reasonable intended usage. Hence, it's fine to use 
them in the way acquire/release is described now, iff these barriers 
suffice to guarantee it, otherwise acquire/release should not be used in 
the given context. Does that make sense?

> The current code often expresses things in a semantically incorrect way
> eg that:
>
> store_release(addr, val) == store(addr, val); release();
>
> but in practice the end code provides what is required (often providing
> a stronger barrier because that's all that is available on a platform).

I'm not sure I get what you mean by this being semantically incorrect. I 
get it's suboptimal and might be slower triggering compiler barriers 
that won't be necessary when joined, but semantically surely it should 
be the same, right?

> And I agree with Karen - tackle correctness first then refactoring. IT
> will be too hard to track correctness otherwise. Personally, given I
> work in this code quite a bit, I like to see the complete set of
> definitions for a platform in one place. The duplication is something I
> can live with when it is mostly one-liners.

Okay no problem. I built the general variant first (because it was 
easier since I don't have access to other platforms except mine haha). 
But I fully understand yours and Karen's motivation so I will get back 
to you when I manage to do it the code duplicated way so it's easier to 
follow the evolution.

Oh and speaking of which, I find it very strange that store_fence and 
release_store_fence is duplicated on x86 because we don't want to cast 
away the volatile. This whole non-volatile thing seems a bit off to me - 
surely all calls to OrderedAccess should be expected to be volatile, 
nothing else would make any sense? And if so, why is store_fence the one 
exception that is non-volatile while none of the others are? Unless 
anyone knows there is a good reason for this, I'd like to make it 
consistent with the rest of the code and make it volatile as well (and 
hence get rid of some code duplication while at it). Permission granted? ;)

Thank you David for a nice discussion about this! :)

/Erik