RFR: 8291649: multiple tests failing with -Xcomp after JDK-8290034 [v2]

John Rose john.r.rose at oracle.com
Fri Aug 26 17:00:46 UTC 2022


After sleeping on it I realize I like Kim’s improvements
better than my code which does not derive the type `I`.

The use of metaprogramming (or whatever it is) to derive
the ad hoc type `I` using funny conditionals does in fact
make for logic that more clearly expresses how the integral
promotion rules of C++ will affect the outcome.

Suggestion:  To the gtest unit test, add the following block
to advise folks how to check code quality:

```
// To manually check code quality, try these commands:
//
// $ make hotspot exploded-test TEST=gtest:moveBits
// $ objdump -D $(find build -name test_moveBits.o) > /tmp/foo
// $ sed < /tmp/foo -n /quality/,/stream/p | grep -B20 bswap
//
// Or adapt $(find build -name test_moveBits.o.cmdline)
// replacing -c by -S and -o test_moveBits.s.
```

On 26 Aug 2022, at 1:07, John Rose wrote:

> Thanks; right.  I like your improvements, except the metaprogramming.
>
> Here’s what I got:
>
> https://github.com/openjdk/jdk/compare/master...rose00:jdk:moveBits
>
> I’m not against metaprogramming when we need it, but there are 
> alternatives here to deriving the extra types; the tricks are 
> explained in the comments.  I suppose it’s a matter of taste whether 
> the metaprogramming is more or less obscure than the bit-twiddling 
> tricks, but I will observe that bit-twiddling tricks are the focus of 
> this file.
>
> I verified by hand that you get good code in all cases, even without 
> explicit mentions of the intrinsics.  Here’s an easy way to start 
> that process:
>
> ```
> $ make hotspot exploded-test TEST=gtest:moveBits
> $ objdump -D $(find build -name test_moveBits.o) > /tmp/foo
> $ sed < /tmp/foo -n /quality/,/stream/p | grep -B20 bswap
>
> ```
>
>
> On 25 Aug 2022, at 23:39, Kim Barrett wrote:
>
>> On Fri, 26 Aug 2022 01:06:25 GMT, John R Rose <jrose at openjdk.org> 
>> wrote:
>>
>>>> src/hotspot/share/utilities/moveBits.hpp line 31:
>>>>
>>>>> 29: #include "utilities/globalDefinitions.hpp"
>>>>> 30:
>>>>> 31: inline uint32_t reverse_bits_in_bytes_int(uint32_t x) {
>>>>
>>>> Instead of two functions with different names (suffixed `_int` and 
>>>> `_long`), have one function name that has two overloads 
>>>> disambiguated by the size of the argument, e.g.
>>>>
>>>>
>>>> template<typename T, ENABLE_IF(sizeof(T) <= 4)>
>>>> constexpr T reverse_bits_in_bytes(T x) { ... }
>>>>
>>>> template<typename T, ENABLE_IF(sizeof(T) == 8)>
>>>> constexpr T reverse_bits_in_bytes(T x) { ... }
>>>>
>>>>
>>>> This allows `reverse_bits` to look something like
>>>>
>>>>
>>>> template<typename T, ENABLE_IF(std::is_integral<T>::value)>
>>>> constexpr T reverse_bits(T v) {
>>>>   return reverse_bytes(reverse_bits_in_bytes(v));
>>>> }
>>>
>>> Yes, that could work.
>>>
>>> This also works, and feels appropriately short + sweet to me:
>>>
>>> https://github.com/rose00/jdk/pull/new/moveBits
>>>
>>> (I also added a 90+% level gtest.)
>>>
>>> (Sorry; amended with `constexpr` and force-pushed…)
>>
>> I took John's code and made various (I think) improvements:
>> https://github.com/openjdk/jdk/compare/master...kimbarrett:openjdk-jdk:reverse-bits?expand=1
>>
>>>> src/hotspot/share/utilities/moveBits.hpp line 63:
>>>>
>>>>> 61: 
>>>>> /*****************************************************************************
>>>>> 62:  * GCC and compatible (including Clang)
>>>>> 63:  
>>>>> *****************************************************************************/
>>>>
>>>> Is it really worth the extra code to allow the use of the gcc 
>>>> builtins?
>>>
>>> Kim, is there a way to do this in less code?
>>>
>>> Seems to me that something like this could work:
>>>
>>>
>>> template<typename T, size_t S>
>>> T reverse_bytes(T x) {
>>>   if (size == 1)  return x;
>>>   T alt8bs = (T)((uint64_t)-1 / 0xFFFF * 0xFF);  // 0x00FF…00FF
>>>   x = ((x & alt8bs) << 8) | (((uint64_t)x >> 8 & alt8bs);
>>>   if (size == 2)  return x;
>>>   T alt16bs = (T)((uint64_t)-1 / 0xFFFFFFFFll * 0xFFFF);  // 
>>> 0x0000FFFF…0000FFFF
>>>>>> }
>>>
>>>
>>> That is, one function bodies handles cases for 1/2/4/8 byte values.
>>>
>>> Could it be just a function template, with an explicit instantiation 
>>> (overriding the general template) where it matches an intrinsic?
>>
>> Answering my own question, no, it's not worth the extra code.  gcc 
>> (and likely other compilers) are quite good at recognizing byte 
>> swapping code and transforming it into the appropriate 
>> platform-specific code like x86 bswap.
>>
>> -------------
>>
>> PR: https://git.openjdk.org/jdk/pull/9987
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-compiler-dev/attachments/20220826/076acdcc/attachment-0001.htm>


More information about the hotspot-compiler-dev mailing list