RFR: 8267663: [vector] Add unsigned comparison operators on AArch64 [v2]
Andrew Haley
aph at openjdk.java.net
Mon Jun 7 10:13:58 UTC 2021
On Mon, 7 Jun 2021 09:46:22 GMT, Eric Liu <eliu at openjdk.org> wrote:
>> This patch implements unsigned vector comparison on AArch64. Unsigned
>> comparison throughput improves by about 4x~5x on my local machine with
>> Byte128Vector.java[1].
>>
>> Before:
>> Benchmark                             Score(op/ms)  Error
>> Byte128Vector.UNSIGNED_GE#size(1024)        99.953  6.17
>> Byte128Vector.UNSIGNED_GT#size(1024)        95.334  8.865
>> Byte128Vector.UNSIGNED_LE#size(1024)        76.908  24.332
>> Byte128Vector.UNSIGNED_LT#size(1024)        78.362  23.507
>>
>> After:
>> Benchmark                             Score(op/ms)  Error
>> Byte128Vector.UNSIGNED_GE#size(1024)       421.809  25.57
>> Byte128Vector.UNSIGNED_GT#size(1024)       420.653  26.779
>> Byte128Vector.UNSIGNED_LE#size(1024)       316.754  92.889
>> Byte128Vector.UNSIGNED_LT#size(1024)       423.683  26.508
>>
>> [Test]
>> - All vector API test cases pass with no new failures. 8265312[2]
>> already implemented this on x86 and supplied sufficient test cases for
>> all kinds of vectors.
>> - No performance regression for other comparisons.
>> - libjvm.so shrinks by about 200KB after this patch because the
>> vector compare rules are combined.
>>
>> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte128Vector.java#L1198
>> [2] https://github.com/openjdk/panama-vector/pull/68
>
> Eric Liu has updated the pull request incrementally with one additional commit since the last revision:
>
> Refactor code
>
> - Make elemBytes_to_Arrangement public and more general, since it
> may be useful in the future.
> - Move neon_compare into macroAssembler.
>
> Change-Id: I7596de03abb066574cf430d935edd07cd627e14b
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5373:
> 5371: case BoolTest::gt: fcmgt(dst, size, src1, src2); break;
> 5372: case BoolTest::le: fcmge(dst, size, src2, src1); break;
> 5373: case BoolTest::lt: fcmgt(dst, size, src2, src1); break;
Are you sure about these two lines?

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5389:
> 5387: case BoolTest::gt: cmgt(dst, size, src1, src2); break;
> 5388: case BoolTest::le: cmge(dst, size, src2, src1); break;
> 5389: case BoolTest::lt: cmgt(dst, size, src2, src1); break;
And these two?
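
For readers following along, the quoted le/lt cases reuse the ge/gt compares with src1 and src2 swapped. The sketch below checks only the scalar identity behind that pattern (a <= b exactly when b >= a, and a < b exactly when b > a) for float lanes, including NaN, and for unsigned byte lanes. The class and method names are hypothetical, and the code is a plain Java model of a single lane; it says nothing about the operand order the AArch64 instructions or the assembler helpers actually expect.

```java
class SwapCompareSketch {
    // Hypothetical scalar stand-ins for one lane of a "gt" and a "ge" compare.
    static boolean gt(float a, float b) { return a > b; }
    static boolean ge(float a, float b) { return a >= b; }

    // lt and le expressed through gt and ge with the operands swapped.
    static boolean lt(float a, float b) { return gt(b, a); }
    static boolean le(float a, float b) { return ge(b, a); }

    // The same swap for unsigned byte lanes, using Byte.compareUnsigned.
    static boolean ltUnsigned(byte a, byte b) { return Byte.compareUnsigned(b, a) > 0; }
    static boolean leUnsigned(byte a, byte b) { return Byte.compareUnsigned(b, a) >= 0; }

    // Returns true if the swapped forms agree with the direct comparisons
    // for every pair of sample floats and every pair of byte values.
    static boolean allPairsAgree() {
        float[] samples = { -1.0f, -0.0f, 0.0f, 1.5f,
                            Float.NaN, Float.POSITIVE_INFINITY };
        for (float a : samples) {
            for (float b : samples) {
                if (lt(a, b) != (a < b)) return false;
                if (le(a, b) != (a <= b)) return false;
            }
        }
        for (int i = -128; i <= 127; i++) {
            for (int j = -128; j <= 127; j++) {
                byte a = (byte) i;
                byte b = (byte) j;
                // Reference result: compare the zero-extended values.
                if (ltUnsigned(a, b) != ((a & 0xFF) < (b & 0xFF))) return false;
                if (leUnsigned(a, b) != ((a & 0xFF) <= (b & 0xFF))) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("swapped forms agree: " + allPairsAgree());
    }
}
```

An ordered comparison with a NaN operand is false in either operand order, so the swap does not change the result for unordered float inputs.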
-------------
PR: https://git.openjdk.java.net/jdk/pull/4358