RFR: 8290034: Auto vectorize reverse bit operations. [v2]
John R Rose
jrose at openjdk.org
Fri Aug 19 06:51:41 UTC 2022
On Wed, 27 Jul 2022 15:55:11 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Summary of changes:
>> - Intrinsify scalar bit reverse APIs to emit efficient instruction sequence for X86 targets with and w/o GFNI feature.
>> - Handle auto-vectorization of Integer/Long.reverse bit operations.
>> - Backend implementation for these were added with 4th incubation of VectorAPIs.
>>
>> Following are performance number for newly added JMH mocro benchmarks:-
>>
>>
>> No-GFNI(CLX):
>> =============
>> Baseline:
>> Benchmark (size) Mode Cnt Score Error Units
>> Integers.reverse 500 avgt 2 1.085 us/op
>> Longs.reverse 500 avgt 2 1.236 us/op
>> WithOpt:
>> Benchmark (size) Mode Cnt Score Error Units
>> Integers.reverse 500 avgt 2 0.104 us/op
>> Longs.reverse 500 avgt 2 0.255 us/op
>>
>> With-GFNI(ICX):
>> ===============
>> Baseline:
>> Benchmark (size) Mode Cnt Score Error Units
>> Integers.reverse 500 avgt 2 0.887 us/op
>> Longs.reverse 500 avgt 2 1.095 us/op
>>
>> Without:
>> Benchmark (size) Mode Cnt Score Error Units
>> Integers.reverse 500 avgt 2 0.037 us/op
>> Longs.reverse 500 avgt 2 0.145 us/op
>>
>>
>> Kindly review and share feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>
> - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8290034
> - 8290034: Styling comments resolved.
> - 8290034: Adding descriptive comments.
> - 8290034: Auto vectorize reverse bit operations.
src/hotspot/share/opto/subnode.cpp line 1917:
> 1915: const TypeInt* t1int = t1->isa_int();
> 1916: if (t1int && t1int->is_con()) {
> 1917: jint res = reverse_bits(t1int->get_con());
This is clearly wrong, if I understand correctly that ReverseI is an intrinsic for `Integer::reverse`, which reverses not only bits in bytes but also the bytes themselves.
Also, it seems noxious (IIUC) because it widens t1int's con to 64 bits with sign extension, does some sort of 64-bit reversal operation, and then chops off the high order bits assigning to `res`.
The root cause here is that `reverse_bits` should be properly factored out and maintained in its own modular header file, akin to `utilities/powerOfTwo.hpp`. I suggest `utilities/moveBits.hpp`, to hold bits-in-bytes reversal, byte reversal, full reversal, compress/expand (with which it should be grouped) and who knows what other future permutations.
I say it's a root cause in that something randomly coded in the bowels of `subnode.cpp` cannot be properly reviewed in isolation, especially if it is misnamed.
Also, the nearly identical use of the HD algorithm in `code/compressedStream.cpp` should use the new version in the header file.
The new hand-written assembly code for this changeset should refer the reader to utilities/moveBits.hpp as well.
-------------
PR: https://git.openjdk.org/jdk/pull/9535
More information about the hotspot-compiler-dev
mailing list