RFR: 8267663: [vector] Add unsigned comparison operators on AArch64 [v2]

Andrew Haley aph at openjdk.java.net
Mon Jun 7 10:13:58 UTC 2021


On Mon, 7 Jun 2021 09:46:22 GMT, Eric Liu <eliu at openjdk.org> wrote:

>> This patch implements unsigned vector comparison on AArch64. Unsigned
>> comparison performance improves by about 4x to 5x on my local machine
>> with Byte128Vector.java[1].
>> 
>> Before:
>> Benchmark                               Score(op/ms)     Error
>> Byte128Vector.UNSIGNED_GE#size(1024)    99.953           6.17
>> Byte128Vector.UNSIGNED_GT#size(1024)    95.334           8.865
>> Byte128Vector.UNSIGNED_LE#size(1024)    76.908           24.332
>> Byte128Vector.UNSIGNED_LT#size(1024)    78.362           23.507
>> 
>> After:
>> Benchmark                               Score(op/ms)     Error
>> Byte128Vector.UNSIGNED_GE#size(1024)    421.809          25.57
>> Byte128Vector.UNSIGNED_GT#size(1024)    420.653          26.779
>> Byte128Vector.UNSIGNED_LE#size(1024)    316.754          92.889
>> Byte128Vector.UNSIGNED_LT#size(1024)    423.683          26.508
>> 
>> [Test]
>> - All vector API test cases passed without new failures. 8265312[2]
>>   has already implemented this on x86 and supplied sufficient test
>>   cases for all kinds of vectors.
>> - No performance regression for other comparisons.
>> - libjvm.so shrinks by about 200KB after this patch, because the
>>   vector compare rules are combined.
>> 
>> [1] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/jdk/jdk/incubator/vector/benchmark/src/main/java/benchmark/jdk/incubator/vector/Byte128Vector.java#L1198
>> [2] https://github.com/openjdk/panama-vector/pull/68
>
> Eric Liu has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Refactor code
>   
>   - Make elemBytes_to_Arrangement public and more general, since it
>     may be useful in the future.
>   - Move neon_compare into macroAssembler.
>   
>   Change-Id: I7596de03abb066574cf430d935edd07cd627e14b

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5373:

> 5371:       case BoolTest::gt: fcmgt(dst, size, src1, src2); break;
> 5372:       case BoolTest::le: fcmge(dst, size, src2, src1); break;
> 5373:       case BoolTest::lt: fcmgt(dst, size, src2, src1); break;

Are you sure about these two lines?

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 5389:

> 5387:       case BoolTest::gt: cmgt(dst, size, src1, src2); break;
> 5388:       case BoolTest::le: cmge(dst, size, src2, src1); break;
> 5389:       case BoolTest::lt: cmgt(dst, size, src2, src1); break;

And these two?
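
One way to reason about the swapped operands in the le and lt cases is a
scalar model of a single lane. The sketch below is not HotSpot code: the
lane_* helpers are hypothetical stand-ins named after the mnemonics, and it
assumes the floating-point compares yield false whenever either input is
NaN. Under those assumptions it checks that "ge/gt with the sources
swapped" agrees with a direct le/lt.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical scalar stand-ins for one lane of the NEON compares named in
// the quoted cases. A true lane is all-ones, a false lane is all-zeros.
static const uint32_t kTrue = 0xFFFFFFFFu;

uint32_t lane_fcmge(float a, float b)     { return a >= b ? kTrue : 0u; }
uint32_t lane_fcmgt(float a, float b)     { return a >  b ? kTrue : 0u; }
uint32_t lane_cmge(int32_t a, int32_t b)  { return a >= b ? kTrue : 0u; }
uint32_t lane_cmgt(int32_t a, int32_t b)  { return a >  b ? kTrue : 0u; }

// le and lt written the way the quoted cases do it: swap the two sources.
uint32_t lane_fle(float s1, float s2)     { return lane_fcmge(s2, s1); }
uint32_t lane_flt(float s1, float s2)     { return lane_fcmgt(s2, s1); }
uint32_t lane_le(int32_t s1, int32_t s2)  { return lane_cmge(s2, s1); }
uint32_t lane_lt(int32_t s1, int32_t s2)  { return lane_cmgt(s2, s1); }

// Compares the swapped forms against the direct operators over a few
// values, including NaN and infinities for the floating-point lanes.
bool swapped_forms_match_direct() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  const float inf = std::numeric_limits<float>::infinity();
  const float fs[] = {-inf, -1.0f, -0.0f, 0.0f, 1.0f, inf, nan};
  for (float a : fs) {
    for (float b : fs) {
      if (lane_fle(a, b) != (a <= b ? kTrue : 0u)) return false;
      if (lane_flt(a, b) != (a <  b ? kTrue : 0u)) return false;
    }
  }
  const int32_t is[] = {INT32_MIN, -1, 0, 1, INT32_MAX};
  for (int32_t a : is) {
    for (int32_t b : is) {
      if (lane_le(a, b) != (a <= b ? kTrue : 0u)) return false;
      if (lane_lt(a, b) != (a <  b ? kTrue : 0u)) return false;
    }
  }
  return true;
}
```

Calling swapped_forms_match_direct() from a test harness exercises every
pair of sample values. This only models the comparison semantics; it says
nothing about whether the assembler calls in the patch pass the registers
in the intended order.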

-------------

PR: https://git.openjdk.java.net/jdk/pull/4358

