RFR: 8290322: Optimize Vector.rearrange over byte vectors for AVX512BW targets. [v5]
Vladimir Kozlov
kvn at openjdk.org
Tue Aug 23 16:59:43 UTC 2022
On Sat, 20 Aug 2022 13:42:01 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Hi All,
>>
>> Currently re-arrange over 512bit bytevector is optimized for targets supporting AVX512_VBMI feature, this patch generates efficient JIT sequence to handle it for AVX512BW targets. Following performance results with newly added benchmark shows
>> significant speedup.
>>
>> System: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (CascadeLake 28C 2S)
>>
>>
>> Baseline:
>> =========
>> Benchmark (size) Mode Cnt Score Error Units
>> RearrangeBytesBenchmark.testRearrangeBytes16 512 thrpt 2 16350.330 ops/ms
>> RearrangeBytesBenchmark.testRearrangeBytes32 512 thrpt 2 15991.346 ops/ms
>> RearrangeBytesBenchmark.testRearrangeBytes64 512 thrpt 2 34.423 ops/ms
>> RearrangeBytesBenchmark.testRearrangeBytes8 512 thrpt 2 10873.348 ops/ms
>>
>>
>> With-opt:
>> =========
>> Benchmark (size) Mode Cnt Score Error Units
>> RearrangeBytesBenchmark.testRearrangeBytes16 512 thrpt 2 16062.624 ops/ms
>> RearrangeBytesBenchmark.testRearrangeBytes32 512 thrpt 2 16028.494 ops/ms
>> RearrangeBytesBenchmark.testRearrangeBytes64 512 thrpt 2 8741.901 ops/ms
>> RearrangeBytesBenchmark.testRearrangeBytes8 512 thrpt 2 10983.226 ops/ms
>>
>>
>> Kindly review and share your feedback.
>>
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>
> 8290322: Further optimization based on review comments.
FTR, the latest version passed tier1-4 clean.
-------------
PR: https://git.openjdk.org/jdk/pull/9498
More information about the hotspot-compiler-dev
mailing list