RFR (M): 8184334: Generalizing Atomic with templates

Erik Österlund erik.osterlund at oracle.com
Tue Jul 18 15:26:04 UTC 2017



On 2017-07-18 16:41, Andrew Haley wrote:
> On 18/07/17 14:46, Erik Österlund wrote:
>>
>> On 2017-07-18 11:57, Andrew Haley wrote:
>>> On 18/07/17 10:38, Erik Österlund wrote:
>>>
>>>>> ------------------------------------------------------------------------
>>>>> 3.10, Lvalues and rvalues
>>>>>
>>>>> If a program attempts to access the stored value of an object through
>>>>> a glvalue of other than one of the following types the behavior is
>>>>> undefined:
>>>>>
>>>>> — the dynamic type of the object,
>>>>>
>>>>> — a cv-qualified version of the dynamic type of the object,
>>>>>
>>>>> — a type similar (as defined in 4.4) to the dynamic type of the object,
>>>>>
>>>>> — a type that is the signed or unsigned type corresponding to the
>>>>>      dynamic type of the object,
>>>>>
>>>>> — a type that is the signed or unsigned type corresponding to a
>>>>>      cv-qualified version of the dynamic type of the object,
>>>>>
>>>>> — an aggregate or union type that includes one of the aforementioned
>>>>>      types among its elements or non- static data members (including,
>>>>>      recursively, an element or non-static data member of a subaggregate
>>>>>      or contained union),
>>>>>
>>>>> — a type that is a (possibly cv-qualified) base class type of the
>>>>>      dynamic type of the object,
>>>>>
>>>>> — a char or unsigned char type.
>>>>> ------------------------------------------------------------------------
>>>>>
>>>>> You only have permission to convert pointers to intptr_t and back: you
>>>>> do not have permission to access the stored value of a pointer as an
>>>>> intptr_t.
>>>> I would say the scenario you describe goes under "the dynamic type of
>>>> the object" or "a type that is the signed or unsigned type corresponding
>>>> to the dynamic type of the object",
>>> OK.
>>>
>>>> in the quoted section 3.10 of the standard, depending on specific
>>>> use case.  The problem that type aliasing is aimed at is if you
>>>> store an A* and then load it as a B*, then the dynamic type is A*,
>>>> yet it is loaded as B*, where B is not compatible with A.
>>> Precisely.  That is what is happening in this case.  A is, say, void*
>>> and B is intptr_t.  void* is not compatible with intptr_t.
>> My interpretation is that the aliasing rules are for points-to analysis
>> being able to alias that if somebody stores A* and then other code loads
>> that as B* and accesses B, then it is assumed that the B* does not
>> points-to the A* as they are of incompatible types,
> It is more general than that.  If you access the stored value of an
> object through a glvalue of other than one of the allowed types, then
> your *whole program is undefined*.

And I think it is in that list of allowed types. Specifically, it is my 
interpretation of the standard that intptr_t is "a type that is the 
signed or unsigned type corresponding to the dynamic type of the object" 
(cf. 5.2.10) if the dynamic type is e.g. void*.

>> and that therefore it is fine to load something (that was stored as
>> A*) as intptr_t and subsequently cast it to A* before the A itself
>> is being accessed. Am I wrong?
> No, that is correct.  The question is whether intptr_t is a type
> compatible with a pointer type, and I don't think you will find any
> language to the effect that it is.

I think 5.2.10 describing reinterpret_cast seems to suggest that they 
are compatible by saying "A pointer converted to an integer of 
sufficient size (if any such exists on the implementation) and back to 
the same pointer type will have its original value". That is precisely 
what is done.
Conversely, I do not find any language suggesting that intptr_t would 
not be compatible with pointer types.

> Pointer types are distinct from one another and from all integer
> types.  People sometimes get confused by this: the fact that you can
> cast from A* to e.g. void*, doesn't tell you that you can cast from
> A** to void** and use the result: they are different types.

Agreed. But I do not see any problem with reinterpret_cast from A* to 
intptr_t back to A* and using the A from that result. That is explicitly 
allowed as per 5.2.10 as previously described.
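
To make the 5.2.10 round trip concrete, here is a minimal sketch. It assumes the implementation provides intptr_t; the helper name round_trip is hypothetical and only for illustration. It converts the pointer value itself, so no stored A* is read through an intptr_t lvalue:

```cpp
#include <cassert>
#include <stdint.h>

class A {
public:
    int _x;
    A() : _x(0) {}
};

// Hypothetical helper: convert a pointer value to an integer and back.
// Only the value is converted; no object of type A* is accessed
// through an lvalue of type intptr_t.
A* round_trip(A* p) {
    intptr_t bits = reinterpret_cast<intptr_t>(p);  // pointer -> integer
    A* back = reinterpret_cast<A*>(bits);           // integer -> same pointer type
    assert(back == p);  // 5.2.10: the original value is restored
    return back;
}
```

Calling round_trip(&a) and dereferencing the result accesses the A through an A*, which is the dynamic type of the object.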

>> For example, the following test program compiles and runs with g++
>> -fstrict-aliasing -Wstrict-aliasing=3 -O3 -std=c++03:
>>
>> #include <stdio.h>
>> #include <stdint.h>
>>
>> class A{
>> public:
>>     int _x;
>>     A() : _x(0) {}
>> };
>>
>> int main(int argc, char* argv[]) {
>>     A a;
>>     A b;
>>     A* ptr = &a;
>>     A** ptr_ptr = &ptr;
>>     intptr_t* iptr_ptr = reinterpret_cast<intptr_t*>(ptr_ptr);
>>
>>     *ptr_ptr = &b;
>>     intptr_t iptr = *iptr_ptr;
>>     A* ptr2 = reinterpret_cast<A*>(iptr);
>>
>>     printf("iptr = %lx, &a = %lx, &b = %lx, iptr->_x = %d\n", iptr,
>>            reinterpret_cast<intptr_t>(&a),
>>            reinterpret_cast<intptr_t>(&b), ptr2->_x);
>>
>>     return 0;
>> }
>>
>> The program stores an A*, reads it as intptr_t and casts it to A*, and
>> then dereferences into the A. Even with -Wstrict-aliasing=3 GCC does not
>> complain about this. Is GCC wrong about not complaining about this?
> No, GCC is not wrong because GCC does not have to complain about this.
> You are, however, wrong to write it!

Oh dear. My fingers slipped.

> Try
>
>    g++ -fsanitize=undefined pp.cc
>
> and you'll get
>
>    iptr = 7ffffcf68ec0, &a = 7ffffcf68ed0, &b = 7ffffcf68ec0, iptr->_x = 0
>
> at one point of undefined behaviour.
>
> [ A warning here: don't assume that the sanitizer can detect all UB
> caused by messing with pointer types, but it can detect this
> example. ]

That looks like the correct (and indeed expected) output. iptr should be 
the same as &b, and iptr->_x should be 0, and &a should be different 
from &b and iptr. I tried this experiment locally too with similar 
correct result. Am I missing something here?

>> The way I interpret the standard, intptr_t is the signed type
>> corresponding to the dynamic type of A*, which seems compliant to me.
> But that's not what the standard says.  Remember that whatever is
> allowed is explicitly allowed, and if something is not allowed it is
> forbidden.

Agreed. But my interpretation is that this is explicitly allowed.

In the case where A* is stored, and subsequently loaded as intptr_t (and
cast to A*), the dynamic type observed by the load is A* and intptr_t
is the signed type of A*. This is explicitly allowed w.r.t. aliasing
because the type of the load may be:
3.10: "a type that is the signed or unsigned type corresponding to the 
dynamic type of the object"
...which it is as intptr_t is the signed type corresponding to A*.

This intptr_t is then cast back to the dynamic type A* through
reinterpret_cast, which is explicitly allowed because:
5.2.10: "A pointer converted to an integer of sufficient size (if any 
such exists on the implementation) and back to the same pointer type 
will have its original value"
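
For comparison, here is a sketch of a load that does not depend on how 3.10 is read: copying the bytes of the stored pointer with memcpy, which is an access through char-like types. The helper name load_pointer_bits is hypothetical. The sketch assumes that intptr_t and A* have the same size and that the implementation's pointer-to-integer mapping preserves the object representation, as on common flat-address platforms; the standard itself leaves that mapping implementation-defined:

```cpp
#include <cassert>
#include <stdint.h>
#include <string.h>

class A {
public:
    int _x;
    A() : _x(0) {}
};

// Hypothetical helper: read the bytes of a stored A* into an intptr_t.
// memcpy copies the object representation byte by byte, so the stored
// pointer is never accessed through an lvalue of type intptr_t.
intptr_t load_pointer_bits(A* const* slot) {
    intptr_t bits;
    // Assumption: both types have the same size on this platform.
    assert(sizeof(bits) == sizeof(*slot));
    memcpy(&bits, slot, sizeof(bits));
    return bits;
}
```

The result can then be passed to reinterpret_cast<A*> exactly as in the test program.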

Thanks,
/Erik

>> Of course the way it is stated in the standard is a bit vague (as
>> usual), but the compilers seem to support my interpretation. Is my
>> interpretation and GCC -Wstrict-aliasing=3 wrong here in allowing
>> this?
> Your interpretation is wrong.  But a C++ compiler can do anything with
> undefined behaviour, which includes the possibility that it does what
> the programmer expected.
>
>>> Let me reiterate: you may cast from any pointer type to intptr_t.  You
>>> may not access a pointer in memory as though it were an intptr_t.
>> Are you 100% sure about this? Again, I do not know where in the standard
>> this is stated,
> It's in my quote above.
>
>> but -Wstrict-aliasing=3 allows me to do that as long as the intptr_t
>> is appropriately cast to the A* before the A behind the dynamic A*
>> type is used. Perhaps I have missed something?
> Yes, I am sure.  Warnings don't help you here because if you throw in
> enough casts you can suppress any type-based warnings.
>


