RFR (M): 8184334: Generalizing Atomic with templates

Fri Jul 21 16:08:02 UTC 2017

Hi Andrew,

On 2017-07-21 15:43, Andrew Haley wrote:
> Hi,
>
> On 20/07/17 14:45, Erik Österlund wrote:
>
>> It seems like the aliasing problem has multiple levels at which we can
>> debate. I will try to put my thoughts down in a reasonably structured
>> way... let's see how that goes.
>>
>> These are the core questions we are debating (and I will answer them in
>> order):
>>
>> 1) What does the standard say about aliasing rules?
>> 2) How did concrete implementations like GCC implement the aliasing rules?
>> 3) What are the implications of those aliasing rules being violated?
>> Does it matter?
>> 4) Have these changes actually changed anything relating to those
>> aliasing rules?
>> 5) How do we proceed?
>>
>> Regarding #1: We seem to agree that there are various possible
>> interpretations of the standard (wouldn't be the first time...): at
>> least one interpretation that forbids treating intptr_t as
>> compatible with A* and at least one interpretation that permits
>> it. I do not know if there is any one true interpretation of the C++
>> standard, so I will assume that either one of us could be right
>> here, and hence that a compiler writer might write a compiler in
>> which such aliasing is either permitted or not.
>>
>> Regarding #2: The simplest contrived example of what we are arguing
>> about is not flagged as a strict aliasing violation even by the
>> strictest strict aliasing checks available in GCC, so I feel
>> dubious. But I will take your word for it that intptr_t and A* are
>> not compatible types, and for the rest of the discussion I will
>> assume this is true.
> OK.  For the sake of the discussion I'll go along with this.
>
>> Regarding #3: We already inherently rely on passed in intptr_t or even
>> jlong aliasing A*, possibly today enforced through -fno-strict-aliasing.
>> I do not think the JVM will ever be able to turn such optimizations on.
> I completely agree.  However, if we are using -fno-strict-aliasing,
> then I do not believe that there is any need for the complexity of the
> casting that your patch does.  We should be able to do the operations
> that we need with no type casting at all, with suitable use of
> template functions.
>
>> Our C++ runtime lives in the border lands between Java and C++ that I
>> will refer to as "shady land".
> Yep.
>
>> Regarding #4: I have made changes in the connection between the
>> frontend and the backend. But I would like to maintain that the
>> backends retain the same behaviour as they did before. For example,
>> your xchg_ptr for void* casts (C-style) the exchange_value to
>> intptr_t and uses the intptr_t variant of xchg_ptr that passes
>> intptr_t (a.k.a. int64_t on your architecture) into
>> __sync_lock_test_and_set. So whatever hypothetical aliasing problems
>> the new Atomic implementation has were already there before these
>> changes. I have not introduced any of those aliasing issues, only
>> improved the mediation between the frontend and the
>> platform-dependent backend.
> True.  But this is new code, and IMO should do better, and also IMO
> does not need to be significantly more complex than the already
> complex code we have.  I assume that the complexity is due to some
> compilers I don't know about, though, so I admit that I am at
> something of a disadvantage.  I suspect that none of it is necessary
> for a complete and correct implementation on GCC.  By "the complexity"
> I am referring to a 4k line patch.  I fear that not only is it
> complex, it may also inhibit useful compiler optimizations.
>
>> Regarding #5: Now as for how to move on with this, I hope we can
>> agree that for code in "shady land", the compiler simply has to
>> treat intptr_t as a possible alias of A* and that if it does not,
>> then we need to tame it to do so. The C++ compiler simply does not
>> know the full story of what is going on in our program and cannot
>> safely perform any meaningful points-to analysis or data flow
>> optimizations across dynamically generated JIT-compiled machine
>> code of a different language.
>>
>> I propose three different solutions that we can discuss:
>>
>> Solution A: Do nothing. With this solution we recognize that a) we
>> have actually not introduced any undefined behaviour that was not
>> already there - the backend used intptr_t before and continues to do
>> so, and b) doing so is safe (and inherently has to be safe) due to
>> other levels of safeguarding such as turning off strict aliasing and
>> using volatile.
>>
>> Solution B: If intptr_t does indeed not alias A*, we could find a
>> pointer sized type, let's call it CanonicalPointer, that does indeed
>> alias a generic A*, and pass pointer types from the frontend as
>> CanonicalPointer to the backend. For example, char* is explicitly
>> described in the standard as an exception to the rules that
>> explicitly aliases a generic A* (hopefully we agree about
>> that).
> We don't.  A char* can point to the bytes of any type, but this rule
> does not mean that an object of type char* can be accessed by an
> lvalue of some other pointer type.  They aren't the same thing at all.
>
> The questions "what can a char* point to?" and "what can point to a
> char* ?"  are quite different.  An int** can't point to a char*.  The
> only permission we have is that any pointer can be cast to a character
> pointer, and that character pointer points to the first byte of the
> object.

I may not have been clear enough, pardon me if so. The loophole I was
referring to is in section 3.10 of the standard, which says:

================================================
If a program attempts to access the stored value of an object through
a glvalue of other than one of the following types the behavior is
undefined:

— the dynamic type of the object,

— a cv-qualified version of the dynamic type of the object,

— a type similar (as defined in 4.4) to the dynamic type of the object,

— a type that is the signed or unsigned type corresponding to the
   dynamic type of the object,

— a type that is the signed or unsigned type corresponding to a
   cv-qualified version of the dynamic type of the object,

— an aggregate or union type that includes one of the aforementioned
   types among its elements or non-static data members (including,
   recursively, an element or non-static data member of a subaggregate
   or contained union),

— a type that is a (possibly cv-qualified) base class type of the
   dynamic type of the object,

— a char or unsigned char type.
================================================
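
To make the last bullet concrete, here is a minimal sketch of the one
access that the list above clearly permits: reading the bytes of a
pointer-sized slot through unsigned char lvalues. The struct A and the
helper load_via_bytes are hypothetical names for illustration only, the
copy is not atomic, and the sketch takes no side on whether an object of
type char* may be accessed as an A*.

```cpp
#include <cstddef>
#include <cstring>

struct A { int field; };  // hypothetical payload type, not from HotSpot

// Reads the object representation of a pointer slot byte by byte through
// unsigned char lvalues (the last bullet of 3.10), then rebuilds the
// pointer value from those bytes. Not atomic; illustration only.
inline A* load_via_bytes(A* const* slot) {
  const unsigned char* bytes = reinterpret_cast<const unsigned char*>(slot);
  unsigned char copy[sizeof(A*)];
  for (std::size_t i = 0; i < sizeof(A*); ++i) {
    copy[i] = bytes[i];  // each access is through an unsigned char glvalue
  }
  A* result;
  std::memcpy(&result, copy, sizeof(result));
  return result;
}
```

Given "A a; A* p = &a;", the call load_via_bytes(&p) yields &a again.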

I will illustrate with a store/load pair on an A*, split into three relevant cases:
1) A non-Atomic store is observed by an Atomic load
2) An Atomic store is observed by a non-Atomic load
3) An Atomic store is observed by an Atomic load

Case 1: A non-Atomic store is observed by an Atomic load
A non-Atomic store (possibly initialization code) stores an A* that is
observed as a char* and subsequently converted back to A* with
reinterpret_cast. I claim that A* and char* are compatible types in
this context.
It is not undefined behaviour w.r.t. 3.10 if the type behind the pointer
in a pointer load is char, regardless of the dynamic type it actually
points at. The load then has to assume conservatively that it does not
know what those bytes refer to: they might be anything (including
the dynamic type behind the stored pointer: A). In other words, char
aliases all dynamic types that the pointer could point at. It seems like
we both agree this case is fine.

Case 2: An Atomic store is observed by a non-Atomic load
Conversely, if the Atomic API were to store something as a char* that
actually has a dynamic type of A*, and a normal load observes this as an
A*, that is also fine as the dynamic type is still A* despite being
stored as char*. In other words, w.r.t. 3.10, the load does not depend
on what type the store used, as long as the load's type matches the
dynamic type. And it does.
It seems to me that this is the case you had in mind when you pointed
out that storing as char* does not mean that a non-Atomic load of A*
will understand the aliasing. My point is that this is still not
undefined behaviour w.r.t. 3.10, as the non-Atomic load is true to the
dynamic type. I hope we agree here.

But then apart from the standard there is the problem of aliasing
optimizations of specific compilers, like -fstrict-aliasing on GCC.
Let's remember that in C++03, "strict aliasing" is not a thing. And a
concrete points-to analysis might not know what the true dynamic type of
a pointer is, so depending on how conservative the data flow analysis,
points-to analysis and type-based aliasing analysis are, an aggressive
compiler might make assumptions beyond what the standard outlines
as undefined behaviour, and as a result mess up the program. But then
again, we turn such optimizations off, and they are arguably not
supported by the standard. If we were to support them some day, perhaps
CanonicalPointer would be better off as the union of A* and char*. But
as I mentioned, that is not enough, because C++ programs can send,
via JNI, a pointer as a jlong into Java, which will try to perform a CAS
through VarHandles and Unsafe, and ultimately end up in a possible
runtime call to Atomic performing that CAS with Atomic::cmpxchg of
jlong. Therefore, we must inherently be able to deal with the dynamic
type being destroyed in "shady land" long before it reaches Atomic. It
will seemingly just never work reliably with -fstrict-aliasing, as far
as I can tell.
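
As a rough sketch of what "the dynamic type being destroyed" means for
the jlong path: the pointer value survives the round trip through a
64-bit integer, but nothing in that integer tells the C++ side that it
once was an A*. The names jlong_t, pointer_to_jlong and jlong_to_pointer
are hypothetical stand-ins (jlong_t is assumed to be a 64-bit signed
integer, like JNI's jlong), not existing HotSpot or JNI functions.

```cpp
#include <stdint.h>

typedef int64_t jlong_t;   // assumption: stand-in for JNI's 64-bit jlong
struct A { int field; };   // hypothetical payload type

static_assert(sizeof(A*) <= sizeof(jlong_t), "pointer must fit in a jlong");

// What native code does before handing the pointer to Java: from here on
// the value is just an integer with no record of the type A.
inline jlong_t pointer_to_jlong(A* p) {
  return static_cast<jlong_t>(reinterpret_cast<intptr_t>(p));
}

// What native code does when the value comes back. The pointer value is
// preserved, but only the caller's knowledge says it is an A*.
inline A* jlong_to_pointer(jlong_t v) {
  return reinterpret_cast<A*>(static_cast<intptr_t>(v));
}
```

Any CAS that Java performs in between operates on the jlong, so the
runtime only ever sees a 64-bit integer at that point.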

Case 3: An Atomic store is observed by an Atomic load
This seems trivially compatible, as the store and the load communicate
through the same type, char*, so the two accesses are compatible and aliased.

> With regard to a solution, I feel there may be a much simpler answer.
> But in order to sketch out that simpler answer, I need to know exactly
> what problem is being solved.  I presume that there is a need to turn
> a call to a generalized function such as
>
>     template <typename T, typename U>
>     inline static void store(T store_value, volatile U* dest);
>
> into a call to one of a number of specialized functions such as
>
>     template <>
>     inline void Atomic::specialized_store<int64_t>(int64_t store_value, volatile int64_t* dest) {
>       _Atomic_move_long(&store_value, dest);
>     }

Yes, that's about it, without messing things up, of course, when T and U
differ in signedness, size or type.
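
To show the shape of that dispatch, here is a minimal sketch for
integral types only. It is not the code in the patch: CanonicalInt and
the free functions store and specialized_store are hypothetical names,
the backend is a plain volatile write rather than a real platform
primitive, and C++11 static_assert is used for brevity. Note that the
reinterpret_cast on the destination is exactly the kind of cast we are
debating; for a signed/unsigned pair of the same width it is covered by
the signed/unsigned bullets of 3.10.

```cpp
#include <stdint.h>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a platform backend; a real backend would use
// the platform's atomic store primitive instead of a volatile write.
template <typename T>
inline void specialized_store(T store_value, volatile T* dest) {
  *dest = store_value;
}

// Maps a size in bytes to the canonical integer type the backend handles.
template <std::size_t N> struct CanonicalInt;
template <> struct CanonicalInt<1> { typedef int8_t  type; };
template <> struct CanonicalInt<2> { typedef int16_t type; };
template <> struct CanonicalInt<4> { typedef int32_t type; };
template <> struct CanonicalInt<8> { typedef int64_t type; };

// Generalized frontend: checks T against U, then forwards the value to
// the backend specialization selected by the size of the destination.
template <typename T, typename U>
inline void store(T store_value, volatile U* dest) {
  static_assert(std::is_integral<T>::value && std::is_integral<U>::value,
                "this sketch handles integral types only");
  static_assert(sizeof(T) <= sizeof(U), "store value must not be truncated");
  typedef typename CanonicalInt<sizeof(U)>::type Canonical;
  // Convert to the destination type first, so that widening follows the
  // usual integral conversion rules for T.
  U converted = static_cast<U>(store_value);
  Canonical raw;
  std::memcpy(&raw, &converted, sizeof(raw));  // same size, bits preserved
  specialized_store<Canonical>(raw,
                               reinterpret_cast<volatile Canonical*>(dest));
}
```

For example, store(int32_t(-1), &d) with a volatile int64_t d leaves d
equal to -1, since the value is sign-extended before it reaches the
64-bit backend.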

> and also perform some suitable magic for floating-point types.  But do
> we ever use atomic floats for anything?  Will we ever?
>

That is a good point. I am not certain. But the current API supports it, 
so it seemed like a good idea to continue supporting it.

Thanks,
/Erik