RFR: 8145096: Undefined behaviour in HotSpot

Kim Barrett kim.barrett at oracle.com
Fri Dec 11 19:33:34 UTC 2015


On Dec 11, 2015, at 1:39 PM, John Rose <john.r.rose at oracle.com> wrote:
> 
> P.S.
> 
> On Dec 11, 2015, at 10:17 AM, John Rose <john.r.rose at oracle.com> wrote:
>> 
>> http://en.cppreference.com/w/cpp/language/reinterpret_cast
> 
> After reading the fine print, I see that integral-to-integral reinterpretations
> are not part of the portfolio of reinterpret_cast.  But you can reinterpret
> an lvalue of type unsigned int as a reference to type signed int, which
> activates type aliasing rules that allow the intended conversion:
> 
>> • AliasedType is the (possibly cv-qualified) signed or unsigned variant of DynamicType
> 
> These rules (or similar rules elsewhere in the spec) may (or may not)
> imply that the union trick, and/or memcpy trick, gets the desired result.
> 
> So, the lvalue cast makes the following macro definition plausible:
> 
> #define JAVA_INTEGER_OP(OP, NAME, TYPE, UNSIGNED_TYPE)  \
> inline TYPE NAME (TYPE in1, TYPE in2) {                 \
>   STATIC_ASSERT(sizeof(TYPE) == sizeof(UNSIGNED_TYPE)); \
>   UNSIGNED_TYPE ures = static_cast<UNSIGNED_TYPE>(in1); \
>   ures OP ## = static_cast<UNSIGNED_TYPE>(in2);         \
>   return reinterpret_cast<TYPE&>(ures);                 \
> }

You beat me to this.  Yes, I think this works.

The lvalue-to-reference conversion has the desired meaning (C++03
5.2.10 p10).  And while this is a form of pointer type punning, it is
one of the few such cases that are permitted (C++03 3.10 p15, 3rd
bullet: access through the signed or unsigned type corresponding to
the dynamic type of the object).
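
For concreteness, here is a rough sketch of what one expansion of the
macro might look like for 32-bit addition.  The name java_add_sketch
and the use of C++11 static_assert in place of HotSpot's STATIC_ASSERT
are assumptions made for illustration, not part of the proposed change.

```cpp
#include <cstdint>

// Hypothetical expansion of JAVA_INTEGER_OP(+, java_add_sketch, int32_t, uint32_t).
// The function name is illustrative only; it is not a HotSpot identifier.
inline int32_t java_add_sketch(int32_t in1, int32_t in2) {
  // Assumption: C++11 static_assert stands in for HotSpot's STATIC_ASSERT.
  static_assert(sizeof(int32_t) == sizeof(uint32_t), "size mismatch");
  // Do the arithmetic in the unsigned type, where overflow wraps
  // modulo 2^32 instead of being undefined behavior.
  uint32_t ures = static_cast<uint32_t>(in1);
  ures += static_cast<uint32_t>(in2);
  // Read the unsigned object through an lvalue of the corresponding
  // signed type; this access is allowed by the aliasing rules.
  return reinterpret_cast<int32_t&>(ures);
}
```

With this sketch, adding 1 to the largest int32_t value wraps around to
the smallest one, which is the Java semantics the macro is after.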

Recent gcc -O1 generates the desired code.

Recent Sun Studio compilers with -O1 generate the desired code, unlike
with the memcpy method, where -O4 was needed.

I'll collect information on other platforms.

ps. The memcpy approach is applicable to more cases, such as
converting an integer to a float having the same bit pattern.
Fortunately we have this reinterpret_cast approach available for the
case at hand, since many compilers seem to optimize the memcpy idiom
poorly.
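
For reference, the memcpy idiom for the integer-to-float case might
look roughly like the following.  This is only a sketch: the name
int_bits_to_float_sketch is made up for illustration, and it assumes
float is a 32-bit IEEE 754 type.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not a HotSpot function): returns the float whose
// object representation is the same as that of the given 32-bit integer.
inline float int_bits_to_float_sketch(int32_t bits) {
  // Assumption: float and int32_t have the same size on this platform.
  static_assert(sizeof(float) == sizeof(int32_t), "size mismatch");
  float result;
  // Copying the bytes avoids any aliasing violation; whether the copy is
  // optimized down to a register move depends on the compiler and flags.
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}
```

Under the IEEE 754 assumption, the bit pattern 0x3F800000 maps to 1.0f.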

pps. The union trick is undefined behavior because it reads a union
member other than the last one written.  gcc explicitly permits the
union trick; see the documentation for the -fstrict-aliasing option.
I know I've heard of a compiler which did not permit it and could
generate "wrong" code, but I've not yet tracked down the discussion
where I read about that.



More information about the hotspot-dev mailing list