RFR: 8290322: Optimize Vector.rearrange over byte vectors for AVX512BW targets.
Joshua Zhu
jzhu at openjdk.org
Thu Jul 21 11:44:05 UTC 2022
On Thu, 14 Jul 2022 18:23:51 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
> Hi All,
>
> Currently, rearrange over 512-bit byte vectors is optimized for targets supporting the AVX512_VBMI feature. This patch generates an efficient JIT sequence to handle it on AVX512BW targets. The following performance results from a newly added benchmark show a
> significant speedup.
>
> System: Intel(R) Xeon(R) Platinum 8280 CPU @ 2.70GHz (CascadeLake 28C 2S)
>
>
> Baseline:
> =========
> Benchmark                                     (size)   Mode  Cnt      Score  Error  Units
> RearrangeBytesBenchmark.testRearrangeBytes16     512  thrpt    2  16350.330         ops/ms
> RearrangeBytesBenchmark.testRearrangeBytes32     512  thrpt    2  15991.346         ops/ms
> RearrangeBytesBenchmark.testRearrangeBytes64     512  thrpt    2     34.423         ops/ms
> RearrangeBytesBenchmark.testRearrangeBytes8      512  thrpt    2  10873.348         ops/ms
>
>
> With-opt:
> =========
> Benchmark                                     (size)   Mode  Cnt      Score  Error  Units
> RearrangeBytesBenchmark.testRearrangeBytes16     512  thrpt    2  16062.624         ops/ms
> RearrangeBytesBenchmark.testRearrangeBytes32     512  thrpt    2  16028.494         ops/ms
> RearrangeBytesBenchmark.testRearrangeBytes64     512  thrpt    2   8741.901         ops/ms
> RearrangeBytesBenchmark.testRearrangeBytes8      512  thrpt    2  10983.226         ops/ms
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Looks good to me. My Cascade Lake server benefits from this change.
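
For anyone following the thread who is less familiar with the operation, here is a minimal scalar sketch of what a byte rearrange computes: each result lane i takes the source byte selected by the i-th shuffle index. It is only a reference model, not the JIT sequence from the patch and not the Vector API itself (there the call is ByteVector.rearrange with a VectorShuffle<Byte> argument). The class name RearrangeSketch, the method signature, and the choice to throw on out-of-range indexes are assumptions made for illustration.

```java
public class RearrangeSketch {
    // Scalar model of a byte rearrange: out[i] = src[indexes[i]].
    // Hypothetical helper; rejecting out-of-range indexes is an assumption of this sketch.
    static byte[] rearrange(byte[] src, int[] indexes) {
        if (src.length != indexes.length) {
            throw new IllegalArgumentException("src and indexes must have the same length");
        }
        byte[] out = new byte[src.length];
        for (int i = 0; i < out.length; i++) {
            int idx = indexes[i];
            if (idx < 0 || idx >= src.length) {
                throw new IndexOutOfBoundsException("shuffle index out of range: " + idx);
            }
            out[i] = src[idx];
        }
        return out;
    }

    public static void main(String[] args) {
        // 64 byte lanes correspond to one 512-bit byte vector.
        byte[] src = new byte[64];
        int[] reverse = new int[64];
        for (int i = 0; i < 64; i++) {
            src[i] = (byte) i;
            reverse[i] = 63 - i; // shuffle that reverses the lanes
        }
        byte[] r = rearrange(src, reverse);
        System.out.println(r[0] + " " + r[63]); // prints "63 0"
    }
}
```

With 64 lanes, any source byte may end up in any result lane, which is the full cross-lane case that testRearrangeBytes64 exercises.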
-------------
PR: https://git.openjdk.org/jdk/pull/9498