RFR: 8302906: AArch64: Add SVE backend support for vector unsigned comparison [v3]
Andrew Haley
aph at openjdk.org
Mon Mar 13 10:32:35 UTC 2023
On Tue, 7 Mar 2023 07:02:27 GMT, changpeng1997 <duke at openjdk.org> wrote:
>> This patch implements unsigned vector comparison on SVE.
>>
>> 1: Test:
>> All vector API test cases [1][2] passed with no new failures. The existing test cases cover all unsigned comparison conditions for all vector types.
>>
>> 2: Performance:
>> (1): Benchmark:
>> Because the existing benchmarks in the panama repo (such as [3]) have some issues [4], which we will fix in a separate patch, I collected performance data with a reduced JMH benchmark [5]. For example, for ByteVector unsigned comparison:
>>
>>
>> @Benchmark
>> public void byteVectorUnsignedCompare() {
>>     for (int j = 0; j < 200; j++) {
>>         for (int i = 0; i < bspecies.length(); i++) {
>>             ByteVector av = ByteVector.fromArray(bspecies, ba, i);
>>             ByteVector ca = ByteVector.fromArray(bspecies, bb, i);
>>             av.compare(VectorOperators.UNSIGNED_GT, ca).intoArray(br, i);
>>         }
>>     }
>> }
>>
>>
>> (2): Performance data
>>
>> Before:
>>
>>
>> Benchmark                            Score(op/ms)    Error
>> ByteVector.UNSIGNED_GT#size(1024)           4.846    3.419
>> ShortVector.UNSIGNED_GE#size(1024)          3.055    1.369
>> IntVector.UNSIGNED_LT#size(1024)            3.475    1.269
>> LongVector.UNSIGNED_LE#size(1024)           4.515    1.812
>>
>>
>> After:
>>
>>
>> Benchmark                            Score(op/ms)    Error
>> ByteVector.UNSIGNED_GT#size(1024)         493.937    1.389
>> ShortVector.UNSIGNED_GE#size(1024)       5308.796   20.557
>> IntVector.UNSIGNED_LT#size(1024)         4944.744   10.606
>> LongVector.UNSIGNED_LE#size(1024)        8459.605   28.683
>>
>>
>> [1] https://github.com/openjdk/jdk/tree/master/test/jdk/jdk/incubator/vector
>> [2] https://github.com/openjdk/jdk/tree/master/test/hotspot/jtreg/compiler/vectorapi
>> [3] https://github.com/openjdk/panama-vector/blob/2aade73adeabdf6a924136b17fd96ccc95c1d160/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/ByteMaxVector.java#L1459
>> [4] https://bugs.openjdk.org/browse/JDK-8282850
>> [5] https://gist.github.com/changpeng1997/d311127e1015c107197f9b56a92b0fae
>
> changpeng1997 has updated the pull request incrementally with one additional commit since the last revision:
>
> Refactor part of code in C2 assembler and remove some switch-case stmts.
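
For readers less familiar with the operation being benchmarked above, here is a minimal scalar sketch of what an unsigned greater-than comparison computes lane by lane. It is only a reference for the semantics, not the SVE code in the patch; the class and method names are hypothetical.

```java
public class UnsignedCompareSketch {
    // One lane: compare two bytes as values in the range 0..255
    // instead of -128..127.
    static boolean unsignedGt(byte a, byte b) {
        return Byte.toUnsignedInt(a) > Byte.toUnsignedInt(b);
    }

    // Lane-wise reference over two equal-length arrays; mask[i] is the
    // comparison result for lane i.
    static boolean[] unsignedGt(byte[] a, byte[] b) {
        boolean[] mask = new boolean[a.length];
        for (int i = 0; i < a.length; i++) {
            mask[i] = unsignedGt(a[i], b[i]);
        }
        return mask;
    }

    public static void main(String[] args) {
        byte x = (byte) 0x80; // -128 when signed, 128 when unsigned
        byte y = 0x7f;        // 127 either way
        System.out.println("signed   x > y: " + (x > y));
        System.out.println("unsigned x > y: " + unsignedGt(x, y));
    }
}
```

The two interpretations disagree exactly when the operands differ in their top bit, which is why a backend needs distinct instructions for the unsigned conditions.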
src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 3218:
> 3216: f(1, 21), rf(Vm, 16), f(0b111001, 15, 10), rf(Vn, 5), rf(Vd, 0);
> 3217: }
> 3218:
This looks OK, but it's in the wrong place in the file. See section C4.1, "A64 instruction set encoding", in the Arm Architecture Reference Manual. These instructions belong to the "Advanced SIMD three same" group, so they must appear in the "Advanced SIMD three same" section of assembler_aarch64.hpp.
The patch currently puts them in the "AdvSIMD two-reg misc" section.
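
As a rough illustration of why the encoding groups matter: the quoted line builds the instruction word by placing values into fixed bit ranges, and those ranges are what C4.1 classifies. The sketch below assumes that f(val, msb, lsb) places a value into bits msb..lsb, that f(val, bit) sets a single bit, and that rf(reg, lsb) places a 5-bit register number at lsb. The class name, helpers, and register numbers are hypothetical, and only the fields visible in the quoted line are packed, so the result is not a complete instruction.

```java
public class EncodingSketch {
    // Place val into bits msb..lsb (inclusive) of a 32-bit word.
    static int f(int val, int msb, int lsb) {
        int width = msb - lsb + 1;
        if (width < 32 && (val >>> width) != 0) {
            throw new IllegalArgumentException("value does not fit in field");
        }
        return val << lsb;
    }

    // Single-bit field.
    static int f(int val, int bit) {
        return f(val, bit, bit);
    }

    // 5-bit register field starting at lsb.
    static int rf(int reg, int lsb) {
        return f(reg, lsb + 4, lsb);
    }

    // Mirrors only the fields in the quoted line; the upper bits stay zero.
    static int packVisibleFields(int vm, int vn, int vd) {
        return f(1, 21) | rf(vm, 16) | f(0b111001, 15, 10) | rf(vn, 5) | rf(vd, 0);
    }

    public static void main(String[] args) {
        // Hypothetical register numbers: Vm = 2, Vn = 1, Vd = 0.
        System.out.println(Integer.toHexString(packVisibleFields(2, 1, 0)));
    }
}
```

Keeping each emitter next to the others in its encoding group makes it easier to check the fixed bits against the corresponding table in the manual.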
-------------
PR: https://git.openjdk.org/jdk/pull/12725
More information about the hotspot-compiler-dev
mailing list