RFR: 8322770: Implement C2 VectorizedHashCode on AArch64 [v2]
Andrew Haley
aph at openjdk.org
Wed Aug 21 15:00:08 UTC 2024
On Wed, 21 Aug 2024 14:49:05 GMT, Mikhail Ablakatov <duke at openjdk.org> wrote:
>> Mikhail Ablakatov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:
>>
>> 8322770: AArch64: C2: Implement VectorizedHashCode
>>
>> The code to calculate a hash code consists of two parts: a stub method that
>> implements a vectorized loop using Neon instruction which processes 16 or 32
>> elements per iteration depending on the data type; and an unrolled inlined
>> scalar loop that processes remaining tail elements.
>>
>> [Performance]
>>
>> [[Neoverse V2]]
>> ```
>> | 328a053 (master) | dc2909f (this) |
>> ----------------------------------------------------------------------------------------------------------
>> Benchmark (size) Mode Cnt | Score Error | Score Error | Units
>> ----------------------------------------------------------------------------------------------------------
>> ArraysHashCode.bytes 1 avgt 15 | 0.805 ? 0.206 | 0.815 ? 0.141 | ns/op
>> ArraysHashCode.bytes 10 avgt 15 | 4.362 ? 0.013 | 3.522 ? 0.124 | ns/op
>> ArraysHashCode.bytes 100 avgt 15 | 78.374 ? 0.136 | 12.935 ? 0.016 | ns/op
>> ArraysHashCode.bytes 10000 avgt 15 | 9247.335 ? 13.691 | 1344.770 ? 1.898 | ns/op
>> ArraysHashCode.chars 1 avgt 15 | 0.731 ? 0.035 | 0.723 ? 0.046 | ns/op
>> ArraysHashCode.chars 10 avgt 15 | 4.359 ? 0.007 | 3.385 ? 0.004 | ns/op
>> ArraysHashCode.chars 100 avgt 15 | 78.374 ? 0.117 | 11.903 ? 0.023 | ns/op
>> ArraysHashCode.chars 10000 avgt 15 | 9248.328 ? 13.644 | 1344.007 ? 1.795 | ns/op
>> ArraysHashCode.ints 1 avgt 15 | 0.746 ? 0.083 | 0.631 ? 0.020 | ns/op
>> ArraysHashCode.ints 10 avgt 15 | 4.357 ? 0.009 | 3.387 ? 0.005 | ns/op
>> ArraysHashCode.ints 100 avgt 15 | 78.391 ? 0.103 | 10.934 ? 0.015 | ns/op
>> ArraysHashCode.ints 10000 avgt 15 | 9248.125 ? 12.583 | 1340.644 ? 1.869 | ns/op
>> ArraysHashCode.multibytes 1 avgt 15 | 0.555 ? 0.020 | 0.559 ? 0.020 | ns/op
>> ArraysHashCode.multibytes 10 avgt 1...
>
> src/hotspot/share/utilities/intpow.hpp line 58:
>
>> 56: struct intpow<T, v, 1, no_overflow> {
>> 57: static const T value = v;
>> 58: };
>
>> There's no need for any of this metaprogramming. A constexpr function would be better.
>
> The main advantage of this implementation over one using a `constexpr` function is the ability to verify input parameter values and detect overflows at compile time using `static_assert`. This feature helps prevent user errors that could otherwise be difficult to debug. I chose this approach for its functional benefits over a `constexpr` function. If you have any suggestions on how to implement this functionality in a `constexpr` function, they would be highly welcome, as I might be missing something here. 😃
>
> Below is a similar `constexpr` implementation. **Please don't accept it through the UI**. I kept a `typedef` template parameter to ensure the return value has the desired width, if needed, without applying a mask at the call site.
>
> If you'd still prefer a `constexpr` function, please let me know if you have any comments on the implementation below (I'll rename it, remove `no_overflow` parameter and the comments).
>
> Suggestion:
>
> #include <limits>
>
> template <typename T>
> static constexpr T intpow_c(T v, unsigned p, bool no_overflow = false) {
> // Can't be used in constexpr function
> // static_assert(v || p, "0^0 is not defined");
>
> if (p == 0) {
> return 1;
> }
>
> T a = intpow_c(v, p / 2, no_overflow);
> T b = (p % 2) ? v : 1;
>
> // Can't be used in constexpr function
> // static_assert(!no_overflow || a <= std::numeric_limits<T>::max() / a, "Integer overflow");
> // static_assert(!no_overflow || a * a <= std::numeric_limits<T>::max() / b, "Integer overflow");
>
> return a * a * b;
> }
You're assuming a requirement that isn't really there. HotSpot calculates a great many things at startup time, and given that the arguments to intpow here are constants, there's no possibility of any surprises at runtime. Just assert what needs to be asserted, and please keep utility functions like this as simple as possible, leaving template metaprogramming for places that need it.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18487#discussion_r1725219259
More information about the hotspot-dev
mailing list