RFR (M): 8184334: Generalizing Atomic with templates

Andrew Haley aph at redhat.com
Tue Jul 18 14:41:14 UTC 2017


On 18/07/17 14:46, Erik Österlund wrote:
> 
> 
> On 2017-07-18 11:57, Andrew Haley wrote:
>> On 18/07/17 10:38, Erik Österlund wrote:
>>
>>>> ------------------------------------------------------------------------
>>>> 3.10, Lvalues and rvalues
>>>>
>>>> If a program attempts to access the stored value of an object through
>>>> a glvalue of other than one of the following types the behavior is
>>>> undefined:
>>>>
>>>> — the dynamic type of the object,
>>>>
>>>> — a cv-qualified version of the dynamic type of the object,
>>>>
>>>> — a type similar (as defined in 4.4) to the dynamic type of the object,
>>>>
>>>> — a type that is the signed or unsigned type corresponding to the
>>>>     dynamic type of the object,
>>>>
>>>> — a type that is the signed or unsigned type corresponding to a
>>>>     cv-qualified version of the dynamic type of the object,
>>>>
>>>> — an aggregate or union type that includes one of the aforementioned
>>>>     types among its elements or non-static data members (including,
>>>>     recursively, an element or non-static data member of a subaggregate
>>>>     or contained union),
>>>>
>>>> — a type that is a (possibly cv-qualified) base class type of the
>>>>     dynamic type of the object,
>>>>
>>>> — a char or unsigned char type.
>>>> ------------------------------------------------------------------------
>>>>
>>>> You only have permission to convert pointers to intptr_t and back: you
>>>> do not have permission to access the stored value of a pointer as an
>>>> intptr_t.
>>> I would say the scenario you describe goes under "the dynamic type of
>>> the object" or "a type that is the signed or unsigned type corresponding
>>> to the dynamic type of the object",
>> OK.
>>
>>> in the quoted section 3.10 of the standard, depending on specific
>>> use case.  The problem that type aliasing is aimed at is if you
>>> store an A* and then load it as a B*, then the dynamic type is A*,
>>> yet it is loaded as B*, where B is not compatible with A.
>> Precisely.  That is what is happening in this case.  A is, say, void*
>> and B is intptr_t.  void* is not compatible with intptr_t.
> 
> My interpretation is that the aliasing rules exist for points-to
> analysis: if somebody stores an A* and other code then loads it as a
> B* and accesses the B, the compiler may assume that the B* does not
> point to the A, as they are of incompatible types,

It is more general than that.  If you access the stored value of an
object through a glvalue of other than one of the allowed types, then
your *whole program is undefined*.

> and that therefore it is fine to load something (that was stored as
> A*) as intptr_t and subsequently cast it to A* before the A itself
> is being accessed. Am I wrong?

No, that is correct.  The question is whether intptr_t is a type
compatible with a pointer type, and I don't think you will find any
language to the effect that it is.

Pointer types are distinct from one another and from all integer
types.  People sometimes get confused by this: the fact that you can
cast from A* to e.g. void*, doesn't tell you that you can cast from
A** to void** and use the result: they are different types.

> For example, the following test program compiles and runs with g++ 
> -fstrict-aliasing -Wstrict-aliasing=3 -O3 -std=c++03:
> 
> #include <stdio.h>
> #include <stdint.h>
> 
> class A{
> public:
>    int _x;
>    A() : _x(0) {}
> };
> 
> int main(int argc, char* argv[]) {
>    A a;
>    A b;
>    A* ptr = &a;
>    A** ptr_ptr = &ptr;
>    intptr_t* iptr_ptr = reinterpret_cast<intptr_t*>(ptr_ptr);
> 
>    *ptr_ptr = &b;
>    intptr_t iptr = *iptr_ptr;
>    A* ptr2 = reinterpret_cast<A*>(iptr);
> 
>    printf("iptr = %lx, &a = %lx, &b = %lx, iptr->_x = %d\n", iptr,
>           reinterpret_cast<intptr_t>(&a),
>           reinterpret_cast<intptr_t>(&b), ptr2->_x);
> 
>    return 0;
> }
> 
> The program stores an A*, reads it as intptr_t and casts it to A*, and 
> then dereferences into the A. Even with -Wstrict-aliasing=3 GCC does not 
> complain about this. Is GCC wrong about not complaining about this?

No, GCC is not wrong, because GCC does not have to complain about this.
You are, however, wrong to write it!

Try

  g++ -fsanitize=undefined pp.cc

and you'll get

  iptr = 7ffffcf68ec0, &a = 7ffffcf68ed0, &b = 7ffffcf68ec0, iptr->_x = 0

at one point of undefined behaviour.

[ A warning here: don't assume that the sanitizer can detect all UB
caused by messing with pointer types, but it can detect this
example. ]

> The way I interpret the standard, intptr_t is the signed type 
> corresponding to the dynamic type of A*, which seems compliant to me.

But that's not what the standard says.  Remember that whatever is
allowed is explicitly allowed, and if something is not allowed it is
forbidden.

> Of course the way it is stated in the standard is a bit vague (as
> usual), but the compilers seem to support my interpretation. Are my
> interpretation and GCC -Wstrict-aliasing=3 wrong here in allowing
> this?

Your interpretation is wrong.  But a C++ compiler can do anything with
undefined behaviour, which includes the possibility that it does what
the programmer expected.

>> Let me reiterate: you may cast from any pointer type to intptr_t.  You
>> may not access a pointer in memory as though it were an intptr_t.
> 
> Are you 100% sure about this? Again, I do not know where in the standard 
> this is stated,

It's in my quote above.

> but -Wstrict-aliasing=3 allows me to do that as long as the intptr_t
> is appropriately cast to the A* before the A behind the dynamic A*
> type is used. Perhaps I have missed something?

Yes, I am sure.  Warnings don't help you here because if you throw in
enough casts you can suppress any type-based warnings.

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

