RFR: 8257815: Replace global log2 functions with efficient implementations [v8]
Stefan Karlsson
stefank at openjdk.java.net
Wed Dec 9 19:30:42 UTC 2020
On Wed, 9 Dec 2020 18:26:10 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>> This patch replaces the log2 functions in globalDefinitions.hpp with more efficient counterparts in utilities/powerOfTwo.hpp
>>
>> Naming is hard, but I think the following scheme is reasonable:
>>
>> - log2i: any integral type. 0-hostile
>> - log2i_allow_zero: any integral type. gracefully handles zero (adds a branch)
>> - exact_log2i: any integral type. value must be a power of two
>>
>> I chose log2i rather than log2 to stand out from the log2 functions defined in various standard libraries.
>>
>> Going through all usage, quite a few uses of log2_long et.c. could be replaced by exact_log2i since they take something that has been checked to be a power of two. Most of the remaining usage seem to be able to use the 0-hostile variant, which avoids a branch.
>>
>> To sanity check that calculating log2 using count_leading_zeros gives better performance I added a couple of trivial and short-running microbenchmarks to test_powerOfTwo. For small values (<= 1025) the new impl is ~5x faster, with a larger speed-up for larger integer values:
>>
>> [ RUN ] power_of_2.log2_long_micro
>> [ OK ] power_of_2.log2_long_micro (3581 ms)
>> [ RUN ] power_of_2.log2_long_small_micro
>> [ OK ] power_of_2.log2_long_small_micro (549 ms)
>> [ RUN ] power_of_2.log2i_micro
>> [ OK ] power_of_2.log2i_micro (259 ms)
>> [ RUN ] power_of_2.log2i_small_micro
>> [ OK ] power_of_2.log2i_small_micro (113 ms)
>>
>> I'm not sure if this naive microbenchmark carries its own weight, but it just adds a few seconds and can be useful for quickly checking this performance assumption on other H/W
>>
>> (Intending this for 17)
>
> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
>
> Issue with using count_trailing_zeros with 64-bit values on 32-bit, revert and add TODO
I'm fine with the change as is. I have a few comments that you might want to consider.
src/hotspot/share/utilities/powerOfTwo.hpp line 79:
> 77: (uint64_t)static_cast<typename std::make_unsigned<T>::type>(value));
> 78: // TODO: this could be "return count_trailing_zeros(value)"", but that fails on
> 79: // 32-bit builds since that takes uintx and is thus lossy for 64-bit values.
I don't think we need to leave this bread crumbs in the code. If there's really something we want "to do", then we should create an RFE for it instead.
src/hotspot/share/gc/z/zHeuristics.cpp line 48:
> 46: // Enable medium pages
> 47: ZPageSizeMedium = size;
> 48: ZPageSizeMediumShift = log2i(ZPageSizeMedium);
I think you missed the comment that this could be log2i_exact.
src/hotspot/cpu/x86/vm_version_x86.cpp line 784:
> 782: cpu_family(), _model, _stepping, os::cpu_microcode_revision());
> 783: assert(res > 0, "not enough temporary space allocated");
> 784: assert(log2i_exact((uint64_t)CPU_MAX_FEATURE) + 1 == sizeof(_features_names) / sizeof(char*), "wrong size features_names");
If you want to get rid of this cast (and the one added to interp_masm_x86.cpp, you could register your log2i functions to accept enums:
-template <typename T, ENABLE_IF(std::is_integral<T>::value)>
+template <typename T, ENABLE_IF(std::is_integral<T>::value || std::is_enum<T>::value)>
test/hotspot/gtest/utilities/test_powerOfTwo.cpp line 276:
> 274: }
> 275: {
> 276: /* Test log2i_graceful handles 0 input */
Could convert /* */ to //
-------------
Marked as reviewed by stefank (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/1663
More information about the hotspot-dev
mailing list