RFR: 8289186: Support predicated vector load/store operations over X86 AVX2 targets. [v4]
Jatin Bhateja
jbhateja at openjdk.org
Wed Jul 6 13:24:27 UTC 2022
On Wed, 6 Jul 2022 13:14:53 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> test/micro/org/openjdk/bench/jdk/incubator/vector/LoadMaskedIOOBEBenchmark.java line 98:
>>
>>> 96: for (int i = 0; i < inSize; i += bspecies.length()) {
>>> 97: VectorMask<Byte> mask = VectorMask.fromArray(bspecies, m, i);
>>> 98: ByteVector.fromArray(bspecies, byteIn, i, mask).intoArray(byteOut, i, mask);
>>
>> Could you please add new benchmarks for masked `store` ?
>
> Done.
Here are the results of the new masked store benchmark.
BaseLine:
Benchmark                                            (inSize)  (outSize)   Mode  Cnt     Score  Error   Units
StoreMaskedIOOBEBenchmark.byteStoreArrayMaskIOOBE        1024       1022  thrpt    2   772.555         ops/ms
StoreMaskedIOOBEBenchmark.doubleStoreArrayMaskIOOBE      1024       1022  thrpt    2   180.548         ops/ms
StoreMaskedIOOBEBenchmark.floatStoreArrayMaskIOOBE       1024       1022  thrpt    2   311.500         ops/ms
StoreMaskedIOOBEBenchmark.intStoreArrayMaskIOOBE         1024       1022  thrpt    2   312.457         ops/ms
StoreMaskedIOOBEBenchmark.longStoreArrayMaskIOOBE        1024       1022  thrpt    2   181.013         ops/ms
StoreMaskedIOOBEBenchmark.shortStoreArrayMaskIOOBE       1024       1022  thrpt    2   538.537         ops/ms
WithOpt:
Benchmark                                            (inSize)  (outSize)   Mode  Cnt     Score  Error   Units
StoreMaskedIOOBEBenchmark.byteStoreArrayMaskIOOBE        1024       1022  thrpt    2   757.079         ops/ms
StoreMaskedIOOBEBenchmark.doubleStoreArrayMaskIOOBE      1024       1022  thrpt    2  1553.923         ops/ms
StoreMaskedIOOBEBenchmark.floatStoreArrayMaskIOOBE       1024       1022  thrpt    2  3060.020         ops/ms
StoreMaskedIOOBEBenchmark.intStoreArrayMaskIOOBE         1024       1022  thrpt    2  3025.225         ops/ms
StoreMaskedIOOBEBenchmark.longStoreArrayMaskIOOBE        1024       1022  thrpt    2  1562.263         ops/ms
StoreMaskedIOOBEBenchmark.shortStoreArrayMaskIOOBE       1024       1022  thrpt    2   538.931         ops/ms
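
For a quick reading of the two tables, the per-type ratio WithOpt / BaseLine can be computed with a small standalone program. This is only a sketch for interpreting the scores above: the class name StoreMaskedSpeedup and the helper speedup() are made up for illustration and are not part of the patch or the benchmark. Note also that each score comes from only 2 iterations with no reported error, so the ratios are rough.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the PR: compares the scores posted above.
public class StoreMaskedSpeedup {

    // Ratio of optimized to baseline throughput; a value above 1.0 means faster.
    static double speedup(double baseline, double withOpt) {
        return withOpt / baseline;
    }

    public static void main(String[] args) {
        // Element type -> { BaseLine ops/ms, WithOpt ops/ms }, copied from the tables.
        Map<String, double[]> scores = new LinkedHashMap<>();
        scores.put("byte",   new double[] {772.555, 757.079});
        scores.put("double", new double[] {180.548, 1553.923});
        scores.put("float",  new double[] {311.500, 3060.020});
        scores.put("int",    new double[] {312.457, 3025.225});
        scores.put("long",   new double[] {181.013, 1562.263});
        scores.put("short",  new double[] {538.537, 538.931});

        for (Map.Entry<String, double[]> e : scores.entrySet()) {
            double[] s = e.getValue();
            System.out.printf("%-6s %.2fx%n", e.getKey(), speedup(s[0], s[1]));
        }
    }
}
```

By these numbers the int, float, long and double cases improve by roughly 8.6x to 9.8x, while the byte and short scores are essentially unchanged between the two runs.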
-------------
PR: https://git.openjdk.org/jdk/pull/9324