RFR: 8322174: RISC-V: C2 VectorizedHashCode RVV Version [v10]

Yuri Gaevsky duke at openjdk.org
Thu Jul 17 12:45:03 UTC 2025


On Tue, 15 Jul 2025 14:05:25 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:

>> The patch adds the possibility of using RVV instructions for faster vectorizedHashCode calculations on RVV v1.0.0 capable hardware.
>> 
>> Testing: hotspot/jtreg/compiler/ under QEMU-8.1 with RVV v1.0.0.
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
> 
>   - removed tail processing with RVV instructions, as a simple scalar loop generally provides better results
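
For readers unfamiliar with the shape of this computation, here is a minimal Java sketch of a blocked hash with a scalar tail. It is not the C2 intrinsic or the RVV code from the patch: the class name, the method names, and the block width `LANES` are hypothetical, and the inner loop only stands in for what vector lanes would compute. It relies on the documented definition of `Arrays.hashCode(int[])` (start at 1, then `31 * h + a[i]` per element).

```java
import java.util.Arrays;

public class HashCodeSketch {
    // Hypothetical block width standing in for the number of vector lanes.
    static final int LANES = 8;

    // Reference form: h = 31 * h + a[i], starting from 1.
    static int scalarHash(int[] a) {
        int h = 1;
        for (int v : a) {
            h = 31 * h + v;
        }
        return h;
    }

    // Blocked form: consume LANES elements per step, then finish with a scalar tail.
    static int blockedHash(int[] a) {
        // powers[j] = 31^(LANES - 1 - j); int overflow wraps, as in the reference form.
        int[] powers = new int[LANES];
        powers[LANES - 1] = 1;
        for (int j = LANES - 2; j >= 0; j--) {
            powers[j] = 31 * powers[j + 1];
        }
        int blockPower = 31 * powers[0]; // 31^LANES

        int h = 1;
        int i = 0;
        for (; i + LANES <= a.length; i += LANES) {
            int sum = 0;
            for (int j = 0; j < LANES; j++) {
                sum += a[i + j] * powers[j]; // per-lane multiply, then reduce
            }
            h = h * blockPower + sum;
        }
        // Scalar tail for the remaining (fewer than LANES) elements.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        for (int size : new int[] {0, 1, 5, 8, 30, 100, 300}) {
            int[] a = new int[size];
            for (int k = 0; k < size; k++) {
                a[k] = k * 7919 - 3;
            }
            boolean same = blockedHash(a) == Arrays.hashCode(a)
                    && scalarHash(a) == Arrays.hashCode(a);
            System.out.println("size=" + size + " matches Arrays.hashCode: " + same);
        }
    }
}
```

The tail loop is the plain scalar recurrence, which is the part the commit above chose to keep scalar rather than vectorize.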

> > Looking at the JMH numbers, it's interesting to find that `-XX:DisableIntrinsic=_vectorizedHashCode` outperforms `-XX:-UseRVV`. If that is the case, then why would we want the scalar version (that is `C2_MacroAssembler::arrays_hashcode()`)?
> 
> You are right: the non-RVV version of the intrinsic performs worse on BPI-F3 hardware for sizes > 70, though it was originally better on StarFive JH7110 and T-Head RVB-ICE; please see #16629.

Hm, the non-RVV intrinsic is still good on Lichee Pi 4A:

$  ( for i in "-XX:DisableIntrinsic=_vectorizedHashCode" " " ; do ( echo "--- ${i} ---" && ${JAVA_HOME}/bin/java  -jar benchmarks.jar  --jvmArgs="-XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions ${i}" org.openjdk.bench.java.lang.ArraysHashCode.ints  -p size=1,5,10,20,30,40,50,60,70,80,90,100,200,300 -f 3 -r 1 -w 1 -wi 10 -i 10 2>&1 | tail -15 ) done )
--- -XX:DisableIntrinsic=_vectorizedHashCode ---
Benchmark            (size)  Mode  Cnt     Score    Error  Units
ArraysHashCode.ints       1  avgt   30    51.709 ±  3.815  ns/op
ArraysHashCode.ints       5  avgt   30    68.146 ±  1.833  ns/op
ArraysHashCode.ints      10  avgt   30    89.217 ±  0.496  ns/op
ArraysHashCode.ints      20  avgt   30   140.807 ±  9.335  ns/op
ArraysHashCode.ints      30  avgt   30   172.030 ±  4.025  ns/op
ArraysHashCode.ints      40  avgt   30   222.927 ± 10.342  ns/op
ArraysHashCode.ints      50  avgt   30   251.719 ±  0.686  ns/op
ArraysHashCode.ints      60  avgt   30   305.947 ± 10.532  ns/op
ArraysHashCode.ints      70  avgt   30   347.602 ±  7.024  ns/op
ArraysHashCode.ints      80  avgt   30   382.057 ±  1.520  ns/op
ArraysHashCode.ints      90  avgt   30   426.022 ± 31.800  ns/op
ArraysHashCode.ints     100  avgt   30   457.737 ±  0.652  ns/op
ArraysHashCode.ints     200  avgt   30   913.501 ±  3.258  ns/op
ArraysHashCode.ints     300  avgt   30  1297.355 ±  2.383  ns/op
---   ---
Benchmark            (size)  Mode  Cnt    Score    Error  Units
ArraysHashCode.ints       1  avgt   30   50.141 ±  0.463  ns/op
ArraysHashCode.ints       5  avgt   30   62.921 ±  2.538  ns/op
ArraysHashCode.ints      10  avgt   30   77.686 ±  2.577  ns/op
ArraysHashCode.ints      20  avgt   30  102.736 ±  0.136  ns/op
ArraysHashCode.ints      30  avgt   30  137.592 ±  4.232  ns/op
ArraysHashCode.ints      40  avgt   30  157.376 ±  0.302  ns/op
ArraysHashCode.ints      50  avgt   30  196.068 ±  3.812  ns/op
ArraysHashCode.ints      60  avgt   30  212.956 ±  2.075  ns/op
ArraysHashCode.ints      70  avgt   30  251.260 ±  1.176  ns/op
ArraysHashCode.ints      80  avgt   30  266.223 ±  0.655  ns/op
ArraysHashCode.ints      90  avgt   30  313.465 ±  6.810  ns/op
ArraysHashCode.ints     100  avgt   30  373.024 ±  1.005  ns/op
ArraysHashCode.ints     200  avgt   30  620.723 ± 24.313  ns/op
ArraysHashCode.ints     300  avgt   30  881.358 ±  1.320  ns/op
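
To compare the two runs side by side, a small Java sketch such as the following computes the ratio of the `-XX:DisableIntrinsic=_vectorizedHashCode` score to the default-flags score for a few sizes. The scores are copied from the tables above; the class and method names are hypothetical, and the error columns are ignored, so treat the ratios as rough indications only.

```java
public class SpeedupSketch {
    // Ratio > 1 means the run with the intrinsic disabled took longer per operation.
    static double ratio(double disabledNsPerOp, double defaultNsPerOp) {
        return disabledNsPerOp / defaultNsPerOp;
    }

    public static void main(String[] args) {
        // Selected rows from the two JMH tables (ns/op, avgt).
        int[] sizes = {1, 10, 100, 300};
        double[] disabled = {51.709, 89.217, 457.737, 1297.355};
        double[] defaults = {50.141, 77.686, 373.024, 881.358};

        for (int k = 0; k < sizes.length; k++) {
            System.out.printf("size=%d  disabled/default=%.3f%n",
                    sizes[k], ratio(disabled[k], defaults[k]));
        }
    }
}
```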

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17413#issuecomment-3083927127

