RFR (M): 8184334: Generalizing Atomic with templates

Tue Jul 18 09:38:27 UTC 2017

Hi Andrew,

On 2017-07-17 17:59, Andrew Haley wrote:
> Hi,
>
> On 17/07/17 16:00, Erik Österlund wrote:
>
>> I am glad that it seems to work. Thank you for reviewing.
>>
>> As for the over engineered casting mechanism, the IntegerTypes class in
>> metaprogramming/integerTypes.hpp has lots of commentary about what the
>> different casting mechanisms do. Hope that helps.
> It doesn't, because most HotSpot programmers know what this stuff
> does, but not why you're doing it.

I am not sure what 'this stuff' is in this context, so I am afraid I can 
not answer why I am doing it at this point.

>> To explain the reasoning, I will go through an example. Let's take
>> atomic::cmpxchg.
>> Each platform has a specialization for different supported cmpxchg
>> implementations in terms of int32_t and int64_t.
> OK, I get that.  Some platforms presumably requires such
> implementations, but would it not make more sense to restrict the
> complexity to just those platforms?  None of the GCC targets need it.

The Atomic API in hotspot has a frontend and backend part. The frontend 
is shared, and mediates the atomic operations to the platform-specific 
backends. What I have worked on is the frontend. The backend is platform 
specific. Some platforms might want to use GCC intrinsics, while others 
might want to do something else.

Are you proposing that the shared mediator to the backend should have a 
GCC-only path?

>> So for example, we need to be able to perform the following conversions:
>> 1) Floating point to integer without changing bit representaiton. It
>> seems like the only way of doing this without violating the C++03
>> standard is to use memcpy. (no, the union trick in e.g. jfloat_cast is
>> not standards compliant)
> I'm not entirely convinced that memcpy does it either for "truly
> portable" C++, but perhaps it's as good as we're going to get.

I am curious what you mean here.

>> 2) Pointer to integer
>> 3) Anything with not matching CV qualifiers to matching CV qualifiers
>> 4) Integer of wrong signedness to matching signedness
>> 5) Integer with same signedness but declared differently, e.g. signed
>> char is not the same as char, and will miss that specialization. For
>> example on Solaris, int8_t is char, but jbyte is signed char, and those
>> types are different. Sometimes even intptr_t is neither int32_t or
>> int64_t as I have seen on clang.
> If you have something like
>
> long n;
> long *p = &n;
>
> The you can access the stored value in p with any type compatible with
> long*.  However, this:
>
>    Atomic::cmpxchg_ptr(0, &p, &n);
>
> casts &n to (signed) intptr_t: this is guaranteed to work.

Okay, so far so good then.

> Unfortunately, it then calls generic_cmpxchg<long>(0, &p, &n).  This
> accesses the stored pointer in p as a long, which is undefined
> behaviour.

As generic_cmpxchg is in the backend of Linux AArch64, I assume you are 
referring to the code in that backend, which should not come as a surprise.
Further on, I am assuming you are referring to how for a pointer, 
previously intptr_t would be sent into generic_cmpxchg, and now we send 
in int64_t. But you will find that the following assert is true with GCC 
on AArch64 (the backend you are interested in):

#include <metaprogramming/isSame.hpp>

STATIC_ASSERT(IsSame<intptr_t, int64_t>::value);

That is, intptr_t and int64_t is the exact same type by definition, and 
hence the type of the instantiated generic_cmpxchg is exactly the same.
So this particular backend we would previously send in intptr_t, and now 
send in a type that IsSame as intptr_t. The behaviour looks identical to me.

So as long as intptr_t and T* are compatible types (which they are by 
definition), I do not see how this is undefined behaviour.

> It works on GCC if we compile everything with -fno-strict-aliasing (at
> least if intptr_t has the same representation in memory as every
> pointer type, which every machine we care about does) but in that case
> there's little point casting pointers back and forth to integer types.

I hope my answer above convinces you that this is not a problem.

>> Due to these issues, the IntegerTypes class was created to solve the
>> casting problems in one single place while remaining type safe. So it
>> solves the problem of canonicalizing primitive types and casting them
>> into exactly the same canonical integer type that can be safely passed
>> into code that specializes on that canonical integer type, without
>> changing the bit pattern or slipping in type safety.
> It is not type safe.
>
> ------------------------------------------------------------------------
> 3.10, Lvalues and rvalues
>
> If a program attempts to access the stored value of an object through
> a glvalue of other than one of the following types the behavior is
> undefined:
>
> — the dynamic type of the object,
>
> — a cv-qualified version of the dynamic type of the object,
>
> — a type similar (as defined in 4.4) to the dynamic type of the object,
>
> — a type that is the signed or unsigned type corresponding to the
>    dynamic type of the object,
>
> — a type that is the signed or unsigned type corresponding to a
>    cv-qualified version of the dynamic type of the object,
>
> — an aggregate or union type that includes one of the aforementioned
>    types among its elements or non- static data members (including,
>    recursively, an element or non-static data member of a subaggregate
>    or contained union),
>
> — a type that is a (possibly cv-qualified) base class type of the
>    dynamic type of the object,
>
> — a char or unsigned char type.
> ------------------------------------------------------------------------
>
> You only have permission to convert pointers to intptr_t and back: you
> do not have permission to access the stored value of a pointer an an
> intptr_t.

I would say the scenario you describe goes under "the dynamic type of 
the object" or "a type that is the signed or unsigned type corresponding 
to the dynamic type of the object", in the quoted section 3.10 of the 
standard, depending on specific use case.
The problem that type aliasing is aimed at is if you store an A* and 
then load it as a B*, then the dynamic type is A*, yet it is loaded as 
B*, where B is not compatible with A. This is invariant of whether the 
value of the store is indirectly performed through intptr_t or loaded 
indirectly through intptr_t, as the dynamic type of the stored value is 
still A*.

If you now perform a store of A* through Atomic that casts the pointer 
to intptr_t, and stores it, then the dynamic type is still A*. If a 
subsequent load of A* observes that store, then the type of that loads 
needs to be compatible with the dynamic type, which is A*. In this 
scenario, the observed value is the dynamic type A* and is loaded as A*, 
so that is all good.

Conversely, if you perform a normal store of A*, and observe that value 
inside of Atomic, then the dynamic type is still A*, and the value 
observed in atomic is the signed type corresponding to the dynamic type 
A*, so that is all good. And the signed type corresponding to the 
dynamic type A* may safely be casted to A*.

In summary, my point is that as long as the performed stores and loads 
in Atomic preserves the bit representation of whatever pointer was 
passed in, the dynamic type of that pointer is unchanged, invariantly of 
casting it to/from intptr_t, and hence the aliasing is allowed.

Do you have a different interpretation of the standard than I do?

>
>> By solving this general casting problem problem in IntegerTypes, it
>> turned the casting issues in Atomic into trivial non-problems, and
>> hopefully can be used elsewhere in hotspot. For example the jfloat_cast
>> with friend functions are not standards compliant and can now use
>> IntegerTypes instead, for that mindblowingly seemless casting experience.
>>
>> I hope this explanation makes sense.
> I don't believe that it entirely does, no.  But it is the sort of
> commentary which ought to be in the source code.
>

I can put some commentary in atomic.hpp describing the use of 
IntegerTypes for casting if you like.

Thanks,
/Erik