RFR: 8367292: VectorAPI: Optimize VectorMask.fromLong/toLong() for SVE [v6]
Chiranmoy Bhattacharya
duke at openjdk.org
Wed Nov 12 08:54:23 UTC 2025
On Wed, 12 Nov 2025 07:38:17 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
> Internal tests pass (just sanity testing, did not run it on SVE). Code looks reasonable.
>
> @XiaohongGong Thanks for all the updates and bearing with all the review comments 😊
Tested the patch on AWS Graviton4 with the benchmarks provided, and the results match the reported numbers.
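For readers skimming the numbers, here is a scalar sketch of what the benchmarked operation computes: `VectorMask.toLong()` packs the mask lanes into a `long` with bit i set when lane i is set, and `fromLong` is the inverse. The class and method names below (`MaskToLongSketch`, `toLong(boolean[])`, `fromLong(int, long)`) are hypothetical helpers for illustration only. They are neither the JDK API signatures nor the SVE code generated by the patch.

```java
public class MaskToLongSketch {
    // Hypothetical scalar reference: bit i of the result is set iff lane i is set.
    static long toLong(boolean[] lanes) {
        if (lanes.length > Long.SIZE) {
            throw new IllegalArgumentException("at most 64 lanes fit in a long");
        }
        long bits = 0L;
        for (int i = 0; i < lanes.length; i++) {
            if (lanes[i]) {
                bits |= 1L << i;
            }
        }
        return bits;
    }

    // Hypothetical inverse: lane i is set iff bit i of the input is set.
    static boolean[] fromLong(int laneCount, long bits) {
        if (laneCount < 0 || laneCount > Long.SIZE) {
            throw new IllegalArgumentException("lane count must be in [0, 64]");
        }
        boolean[] lanes = new boolean[laneCount];
        for (int i = 0; i < laneCount; i++) {
            lanes[i] = ((bits >>> i) & 1L) != 0;
        }
        return lanes;
    }

    public static void main(String[] args) {
        // A 4-lane mask, e.g. int lanes in a 128-bit vector.
        boolean[] mask = {true, false, true, true};
        long packed = toLong(mask);
        System.out.println(Long.toBinaryString(packed));
        System.out.println(java.util.Arrays.equals(mask, fromLong(4, packed)));
    }
}
```

With 128-bit vectors the lane counts would be 16 for byte, 8 for short, 4 for int and 2 for long, assuming the `bits` column in the tables is the vector size.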
With VM options `-XX:UseSVE=2 --add-modules=jdk.incubator.vector`:
Benchmark                                     bits  inputs  Mode   Unit          Before           After  Gain
MaskQueryOperationsBenchmark.testToLongByte   128   1       thrpt  ops/s  269101754.957  1154781149.715  4.29
MaskQueryOperationsBenchmark.testToLongByte   128   2       thrpt  ops/s  269106841.271  1020391639.317  3.79
MaskQueryOperationsBenchmark.testToLongByte   128   3       thrpt  ops/s  269108088.073  1178242624.232  4.37
MaskQueryOperationsBenchmark.testToLongInt    128   1       thrpt  ops/s  833720082.241  1183112162.420  1.41
MaskQueryOperationsBenchmark.testToLongInt    128   2       thrpt  ops/s  851866517.512   905381882.385  1.06
MaskQueryOperationsBenchmark.testToLongInt    128   3       thrpt  ops/s  841908544.850  1010800908.258  1.20
MaskQueryOperationsBenchmark.testToLongLong   128   1       thrpt  ops/s  752714074.556  1116755995.074  1.48
MaskQueryOperationsBenchmark.testToLongLong   128   2       thrpt  ops/s  733777062.242  1117923992.880  1.52
MaskQueryOperationsBenchmark.testToLongLong   128   3       thrpt  ops/s  755390508.217  1125159886.042  1.48
MaskQueryOperationsBenchmark.testToLongShort  128   1       thrpt  ops/s  915079922.329  1183247213.309  1.29
MaskQueryOperationsBenchmark.testToLongShort  128   2       thrpt  ops/s  898902990.501  1157778493.700  1.28
MaskQueryOperationsBenchmark.testToLongShort  128   3       thrpt  ops/s  913979902.412  1183483647.121  1.29
With VM options `-XX:UseSVE=1 --add-modules=jdk.incubator.vector`:
Benchmark                                     bits  inputs  Mode   Unit          Before          After  Gain
MaskQueryOperationsBenchmark.testToLongByte   128   1       thrpt  ops/s  578862813.032  674722742.273  1.16
MaskQueryOperationsBenchmark.testToLongByte   128   2       thrpt  ops/s  577292103.016  671339970.996  1.16
MaskQueryOperationsBenchmark.testToLongByte   128   3       thrpt  ops/s  576827529.288  673882123.264  1.16
MaskQueryOperationsBenchmark.testToLongInt    128   1       thrpt  ops/s  792212973.997  957781054.650  1.20
MaskQueryOperationsBenchmark.testToLongInt    128   2       thrpt  ops/s  790683237.790  965247861.666  1.22
MaskQueryOperationsBenchmark.testToLongInt    128   3       thrpt  ops/s  794710366.832  981858552.787  1.23
MaskQueryOperationsBenchmark.testToLongLong   128   1       thrpt  ops/s  738425667.560  994493069.759  1.34
MaskQueryOperationsBenchmark.testToLongLong   128   2       thrpt  ops/s  736805923.837  979981983.578  1.33
MaskQueryOperationsBenchmark.testToLongLong   128   3       thrpt  ops/s  740591712.584  972150308.391  1.31
MaskQueryOperationsBenchmark.testToLongShort  128   1       thrpt  ops/s  784464050.733  994221594.464  1.26
MaskQueryOperationsBenchmark.testToLongShort  128   2       thrpt  ops/s  789528903.130  994094688.740  1.25
MaskQueryOperationsBenchmark.testToLongShort  128   3       thrpt  ops/s  779944943.316  979813192.314  1.25
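
For anyone re-deriving the Gain column: the values in the tables above are consistent with After divided by Before, truncated (not rounded) to two decimals. That truncation is an inference from the numbers, not something stated anywhere in the PR. A minimal sketch, with `GainColumn` and `gain` as hypothetical names:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class GainColumn {
    // Assumed formula: gain = after / before, truncated to two decimal places.
    static String gain(String before, String after) {
        return new BigDecimal(after)
                .divide(new BigDecimal(before), 2, RoundingMode.DOWN)
                .toPlainString();
    }

    public static void main(String[] args) {
        // testToLongByte, inputs = 1, -XX:UseSVE=2
        System.out.println(gain("269101754.957", "1154781149.715"));
        // testToLongInt, inputs = 1, -XX:UseSVE=2
        System.out.println(gain("833720082.241", "1183112162.420"));
        // testToLongByte, inputs = 1, -XX:UseSVE=1
        System.out.println(gain("578862813.032", "674722742.273"));
    }
}
```

Rounding half-up instead would give 1.42 for the second case, where the table reports 1.41.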
-------------
PR Comment: https://git.openjdk.org/jdk/pull/27481#issuecomment-3520532925