RFR: 8322174: RISC-V: C2 VectorizedHashCode RVV Version [v16]
Yuri Gaevsky
duke at openjdk.org
Tue Aug 5 19:38:12 UTC 2025
On Tue, 5 Aug 2025 12:53:24 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:
>> The patch adds the possibility of using RVV instructions for faster vectorizedHashCode calculations on RVV v1.0.0-capable hardware.
>>
>> Testing: hotspot/jtreg/compiler/ under QEMU-8.1 with RVV v1.0.0.
>
> Yuri Gaevsky has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits:
>
> - Merge master
> - replaced vmul_vv + vadd_vv by vmadd_vv
> - returned lmul==m4
> - fixed error made for previous lmul-m1 experiment
> - make an experiment with lmul==1 instead of lmul==4.
> - move vredsum_vs out of VEC_LOOP to improve performance
> - - removed tail processing with RVV instructions as simple scalar loop provides in general better results
> - simplified arrays_hashcode_v() to be closer to VLA and use less general-purpose registers; minor cosmetic changes
> - change slli+add sequence to shadd
> - reorder instructions to make RVV instructions contiguous
> - ... and 7 more: https://git.openjdk.org/jdk/compare/ba0ae4cb...e7fac6c7
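For readers following the commit list above, here is a minimal scalar Java model of the loop shape those commits suggest: one multiply-add per lane inside the vector loop, a single reduction after the loop, and a plain scalar loop for the tail. It is a sketch under stated assumptions, not the code in arrays_hashcode_v(); the class name, the method names, and the `lanes` parameter are hypothetical and only illustrate the arithmetic.

```java
import java.util.Arrays;

public class LaneHashSketch {
    // Scalar reference: the same polynomial that Arrays.hashCode(int[]) evaluates.
    static int scalarHash(int[] a) {
        int h = 1;
        for (int v : a) {
            h = 31 * h + v;
        }
        return h;
    }

    // Lane-wise model. `lanes` is a hypothetical stand-in for the number of
    // 32-bit elements processed per vector iteration.
    static int laneHash(int[] a, int lanes) {
        // weights[j] = 31^(lanes-1-j) and step = 31^lanes; int arithmetic wraps mod 2^32.
        int[] weights = new int[lanes];
        int step = 1;
        for (int j = lanes - 1; j >= 0; j--) {
            weights[j] = step;
            step *= 31;
        }

        int[] acc = new int[lanes];
        int seed = 1; // carries the initial hash value 1 through the vector body
        int body = a.length - a.length % lanes;
        for (int i = 0; i < body; i += lanes) {
            for (int j = 0; j < lanes; j++) {
                // One multiply-add per lane (the role a fused vector multiply-add would play).
                acc[j] = acc[j] * step + a[i + j];
            }
            seed *= step;
        }

        // Single weighted reduction, performed once after the loop.
        int h = seed;
        for (int j = 0; j < lanes; j++) {
            h += acc[j] * weights[j];
        }

        // Scalar tail for the elements that do not fill a whole chunk.
        for (int i = body; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = new int[300];
        for (int i = 0; i < data.length; i++) {
            data[i] = i * 2654435 - 17;
        }
        // Compares the lane-wise model against the JDK's own result.
        System.out.println(laneHash(data, 8) == Arrays.hashCode(data));
    }
}
```

Because Java `int` arithmetic wraps modulo 2^32, reordering the sums and products this way leaves the result unchanged, so the model can be checked directly against `Arrays.hashCode(int[])` for any array length and any positive `lanes`.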
Updated data after the previous merge (`e7fac6c`), which includes [JDK-8362596](https://github.com/openjdk/jdk/commit/4189fcbac40943f3b26c3a01938837b4e4762285):
bpif3-16g% ( for i in "-XX:DisableIntrinsic=_vectorizedHashCode" "-XX:-UseRVV" "-XX:+UseRVV" ; do ( echo "--- ${i} ---" && jdk/bin/java -jar benchmarks.jar --jvmArgs="-XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions ${i}" org.openjdk.bench.java.lang.ArraysHashCode.ints -p size=1,5,10,20,30,40,50,60,70,80,90,100,200,300 -f 1 -r 1 -w 1 -wi 5 -i 10 2>&1 | tail -15 ) done )
--- -XX:DisableIntrinsic=_vectorizedHashCode ---
Benchmark            (size)  Mode  Cnt    Score    Error  Units
ArraysHashCode.ints       1  avgt   10   11.274 ± 0.004  ns/op
ArraysHashCode.ints       5  avgt   10   28.837 ± 0.115  ns/op
ArraysHashCode.ints      10  avgt   10   43.109 ± 0.091  ns/op
ArraysHashCode.ints      20  avgt   10   68.190 ± 0.317  ns/op
ArraysHashCode.ints      30  avgt   10   88.075 ± 0.490  ns/op
ArraysHashCode.ints      40  avgt   10  115.032 ± 0.230  ns/op
ArraysHashCode.ints      50  avgt   10  136.004 ± 0.474  ns/op
ArraysHashCode.ints      60  avgt   10  161.900 ± 0.358  ns/op
ArraysHashCode.ints      70  avgt   10  169.663 ± 0.419  ns/op
ArraysHashCode.ints      80  avgt   10  193.207 ± 0.317  ns/op
ArraysHashCode.ints      90  avgt   10  208.696 ± 0.595  ns/op
ArraysHashCode.ints     100  avgt   10  232.698 ± 0.291  ns/op
ArraysHashCode.ints     200  avgt   10  447.169 ± 0.791  ns/op
ArraysHashCode.ints     300  avgt   10  655.249 ± 0.520  ns/op
--- -XX:-UseRVV ---
Benchmark            (size)  Mode  Cnt    Score    Error  Units
ArraysHashCode.ints       1  avgt   10   11.273 ± 0.003  ns/op
ArraysHashCode.ints       5  avgt   10   23.180 ± 0.008  ns/op
ArraysHashCode.ints      10  avgt   10   32.735 ± 0.076  ns/op
ArraysHashCode.ints      20  avgt   10   50.745 ± 0.056  ns/op
ArraysHashCode.ints      30  avgt   10   71.264 ± 0.148  ns/op
ArraysHashCode.ints      40  avgt   10   88.367 ± 0.034  ns/op
ArraysHashCode.ints      50  avgt   10  108.355 ± 0.058  ns/op
ArraysHashCode.ints      60  avgt   10  125.885 ± 0.055  ns/op
ArraysHashCode.ints      70  avgt   10  146.049 ± 0.213  ns/op
ArraysHashCode.ints      80  avgt   10  163.479 ± 0.049  ns/op
ArraysHashCode.ints      90  avgt   10  183.507 ± 0.170  ns/op
ArraysHashCode.ints     100  avgt   10  201.041 ± 0.032  ns/op
ArraysHashCode.ints     200  avgt   10  389.416 ± 0.517  ns/op
ArraysHashCode.ints     300  avgt   10  576.795 ± 0.364  ns/op
--- -XX:+UseRVV ---
Benchmark            (size)  Mode  Cnt    Score    Error  Units
ArraysHashCode.ints       1  avgt   10   11.283 ± 0.005  ns/op
ArraysHashCode.ints       5  avgt   10   23.197 ± 0.023  ns/op
ArraysHashCode.ints      10  avgt   10   38.824 ± 0.007  ns/op
ArraysHashCode.ints      20  avgt   10   70.612 ± 0.372  ns/op
ArraysHashCode.ints      30  avgt   10  101.474 ± 0.027  ns/op
ArraysHashCode.ints      40  avgt   10  108.357 ± 0.034  ns/op
ArraysHashCode.ints      50  avgt   10  139.659 ± 0.061  ns/op
ArraysHashCode.ints      60  avgt   10  171.644 ± 0.047  ns/op
ArraysHashCode.ints      70  avgt   10  112.136 ± 0.051  ns/op
ArraysHashCode.ints      80  avgt   10  146.094 ± 0.289  ns/op
ArraysHashCode.ints      90  avgt   10  177.230 ± 0.032  ns/op
ArraysHashCode.ints     100  avgt   10  119.787 ± 0.270  ns/op
ArraysHashCode.ints     200  avgt   10  161.705 ± 0.086  ns/op
ArraysHashCode.ints     300  avgt   10  216.808 ± 0.364  ns/op
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17413#issuecomment-3156355237