RFR: 8145096: Undefined behaviour in HotSpot
Kim Barrett
kim.barrett at oracle.com
Fri Dec 11 19:33:34 UTC 2015
On Dec 11, 2015, at 1:39 PM, John Rose <john.r.rose at oracle.com> wrote:
>
> P.S.
>
> On Dec 11, 2015, at 10:17 AM, John Rose <john.r.rose at oracle.com> wrote:
>>
>> http://en.cppreference.com/w/cpp/language/reinterpret_cast
>
> After reading the fine print, I see that integral-to-integral reinterpretations
> are not part of the portfolio of reinterpret_cast. But you can reinterpret
> an lvalue of type unsigned int as a reference to type signed int, which
> activates type aliasing rules that allow the intended conversion:
>
>> • AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType
>
> These rules (or similar rules elsewhere in the spec) may (or may not)
> imply that the union trick, and/or memcpy trick, gets the desired result.
>
> So, the lvalue cast makes the following macro definition plausible:
>
> #define JAVA_INTEGER_OP(OP, NAME, TYPE, UNSIGNED_TYPE) \
> inline TYPE NAME (TYPE in1, TYPE in2) { \
> STATIC_ASSERT(sizeof(TYPE) == sizeof(UNSIGNED_TYPE)); \
> UNSIGNED_TYPE ures = static_cast<UNSIGNED_TYPE>(in1); \
> ures OP ## = static_cast<UNSIGNED_TYPE>(in2); \
> return reinterpret_cast<TYPE&>(ures); \
> }

You beat me to this. Yes, I think this works.

The lvalue to reference conversion has the desired meaning: C++03
5.2.10 p10. And while this is a form of pointer type punning, it is
one of the few cases of such that are permitted: C++03 3.10 p15 3rd
bullet.
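
For illustration, here is roughly what the macro would expand to for a 32-bit add. This is only a sketch: the names jint_t, juint_t, and java_add are made up for the example rather than taken from the HotSpot sources, the STATIC_ASSERT is replaced by a plain C++03-style typedef check, and it assumes int is 32 bits with two's complement representation.

```cpp
#include <climits>

// Hypothetical stand-ins for HotSpot's jint/juint (assumed 32-bit int).
typedef int          jint_t;
typedef unsigned int juint_t;

// Hand-expanded sketch of JAVA_INTEGER_OP(+, java_add, jint_t, juint_t).
inline jint_t java_add(jint_t in1, jint_t in2) {
  // Compile-time size check: the array size is -1 (ill-formed) on mismatch.
  typedef char sizes_match[sizeof(jint_t) == sizeof(juint_t) ? 1 : -1];
  juint_t ures = static_cast<juint_t>(in1);
  ures += static_cast<juint_t>(in2);        // unsigned arithmetic wraps, no UB
  // Read the unsigned result through a reference to the corresponding
  // signed type (the aliasing case allowed by C++03 3.10 p15).
  return reinterpret_cast<jint_t&>(ures);
}
```

With those assumptions, java_add(INT_MAX, 1) should wrap to INT_MIN the way Java's int addition does, without relying on signed overflow.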

Recent gcc generates the desired code at -O1.

Recent Sun Studio compilers also generate the desired code at -O1,
whereas the memcpy method needed -O4.

I'll collect information on other platforms.

ps. The memcpy approach is applicable to more cases, such as
converting an integer to a float having the same bit pattern.
Fortunately we have this reinterpret_cast approach available for the
case at hand, since the memcpy idiom seems to be poorly handled by a
lot of compilers.
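
For reference, a minimal sketch of the memcpy idiom for the integer-to-float case. The helper name bits_to_float is hypothetical, and the sketch assumes float and unsigned int are both 32 bits and that float uses the IEEE 754 single-precision format.

```cpp
#include <cstring>

// Hypothetical helper: returns the float whose object representation
// equals that of 'bits'. Assumes 32-bit float and 32-bit unsigned int.
inline float bits_to_float(unsigned int bits) {
  // Compile-time size check: the array size is -1 (ill-formed) on mismatch.
  typedef char sizes_match[sizeof(float) == sizeof(unsigned int) ? 1 : -1];
  float f;
  std::memcpy(&f, &bits, sizeof f);   // copy the bytes, no aliasing involved
  return f;
}
```

Whether the memcpy call is reduced to a plain register move depends on the compiler and optimization level, which is the concern raised above.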
pps. The union trick is undefined behavior because it reads a union
member other than the last one written. gcc explicitly permits the
union trick; see the documentation for the -fstrict-aliasing option.
I know I've heard of a compiler which did not permit it and could
generate "wrong" code, but I've not yet tracked down the discussion
where I read about that.