RFR: 8333964: RISC-V: C2: Check "requires_strict_order" flag for floating-point add reduction
Gui Cao
gcao at openjdk.org
Wed Jun 12 04:56:19 UTC 2024
Hi, we want to support non-strictly-ordered floating-point add reduction. The implementation follows the RVV v1.0 specification [1]. Please take a look and review. Thanks a lot.
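As background on why the ordering flag matters: floating-point addition is not associative, so a reduction that may add lanes in any order can return a different value than a strictly ordered one. The following is a minimal sketch in plain Java, not code from this PR; the class name, method names, and lane values are made up for illustration, and the pairwise order is only one of many orders an unordered reduction could use.

```java
public class ReductionOrderDemo {
    // Strictly ordered reduction: lanes are added left to right, one at a time.
    static float strictOrderSum(float[] lanes) {
        float acc = 0.0f;
        for (float v : lanes) {
            acc += v;
        }
        return acc;
    }

    // One possible non-strict order (illustrative assumption): add neighbouring
    // pairs first, then combine the partial sums. Assumes a power-of-two length.
    static float pairwiseSum(float[] lanes) {
        float[] t = lanes.clone();
        for (int n = t.length; n > 1; n /= 2) {
            for (int i = 0; i < n / 2; i++) {
                t[i] = t[2 * i] + t[2 * i + 1];
            }
        }
        return t[0];
    }

    public static void main(String[] args) {
        // 1e8f has a unit in the last place of 8, so adding 1.0f to it is lost.
        float[] lanes = {1e8f, 1.0f, -1e8f, 1.0f};
        System.out.println(strictOrderSum(lanes)); // prints 1.0
        System.out.println(pairwiseSum(lanes));    // prints 0.0
    }
}
```

Both results are valid roundings of the same mathematical sum under different association orders, which is why the unordered form can only be used when the operation does not require strict order.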
We can use the test/jdk/jdk/incubator/vector/Float256VectorTests.java test case to print the Opto JIT code and check which nodes are generated.
For example, the following command prints the Opto JIT code of a jtreg test case:
/home/zifeihan/jtreg/bin/jtreg \
-v:default \
-concurrency:16 -timeout:50 \
-javaoption:-XX:+UnlockExperimentalVMOptions \
-javaoption:-XX:+UseRVV \
-javaoption:-XX:+PrintOptoAssembly \
-javaoption:-XX:LogFile=/home/zifeihan/jdk/Float256VectorTests_PrintOptoAssembly.log \
-jdk:/home/zifeihan/jdk/build/linux-riscv64-server-fastdebug/jdk \
/home/zifeihan/jdk/test/jdk/jdk/incubator/vector/Float256VectorTests.java
The resulting log, Float256VectorTests_PrintOptoAssembly.log, contains the reduce_addF_unordered instruction added by this PR:
1e4 B28: # out( B28 B29 ) <- in( B41 B28 ) Loop( B28-B28 inner post of N2310) Freq: 98.8164
1e4 shadd R17, R15, R10, #2 # ptr, #@shaddP_reg_reg_ext_b
1e8 addi R17, R17, #16 # ptr, #@addP_reg_imm
1ea loadV V1, [R17] # vector (rvv)
1f2 reduce_addF_unordered F2, F0, V1 # KILL V2
202 fadd.s F1, F1, F2 #@addF_reg_reg
206 addiw R15, R15, #8 #@addI_reg_imm
208 blt R15, R31, B28 #@cmpI_loop P=0.500000 C=19400.000000
Similarly, the `reduce_addD_unordered` instruction can be observed with the `test/jdk/jdk/incubator/vector/Double256VectorTests.java` test case.
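The loop in the listing above can be modelled roughly as follows: each iteration loads one vector, reduces its lanes to a scalar, and adds that scalar to a running accumulator before advancing the index by 8. This is a plain-Java sketch under stated assumptions, not the test source or the generated code: `LANES = 8` is inferred from the `addiw ... #8` step, the names are hypothetical, and the halving tree inside `reduceChunkUnordered` is only one order the hardware might pick for an unordered reduction.

```java
import java.util.Arrays;

public class ChunkedAddReduceSketch {
    // Assumption: 8 float lanes per vector (256 bits / 32 bits per float).
    static final int LANES = 8;

    // Stand-in for the unordered vector reduction: a halving tree over one chunk.
    static float reduceChunkUnordered(float[] a, int offset) {
        float[] t = Arrays.copyOfRange(a, offset, offset + LANES);
        for (int width = LANES / 2; width >= 1; width /= 2) {
            for (int i = 0; i < width; i++) {
                t[i] += t[i + width];
            }
        }
        return t[0];
    }

    static float addReduce(float[] a) {
        float acc = 0.0f;
        int i = 0;
        // Main loop: one chunk reduction plus one scalar add per iteration.
        for (; i + LANES <= a.length; i += LANES) {
            acc += reduceChunkUnordered(a, i);
        }
        // Scalar tail for elements that do not fill a whole chunk.
        for (; i < a.length; i++) {
            acc += a[i];
        }
        return acc;
    }

    public static void main(String[] args) {
        float[] a = new float[19];
        for (int i = 0; i < a.length; i++) {
            a[i] = i + 1; // 1, 2, ..., 19: small integers, exact in float
        }
        System.out.println(addReduce(a)); // prints 190.0
    }
}
```

Note that the accumulation across chunks stays a sequential scalar add; only the reduction within each vector is free of ordering constraints.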
### Performance testing:
FloatMaxVector.ADDLanes [2] measures the throughput of floating-point add reduction.
Without Patch:
Benchmark                (size)   Mode  Cnt    Score   Error   Units
FloatMaxVector.ADDLanes    1024  thrpt    5  394.558 ± 0.044  ops/ms
With Patch:
Benchmark                (size)   Mode  Cnt    Score   Error   Units
FloatMaxVector.ADDLanes    1024  thrpt    5  627.510 ± 1.095  ops/ms
### Correctness testing:
- [x] test/jdk/jdk/incubator/vector (fastdebug) qemu 8.1.50 with UseRVV
- [x] Run tier1-3 tests on SOPHON SG2042 (release)
- [ ] Run tier1-3 tests (release) on qemu 8.1.50 with UseRVV
[1] https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc
[2] https://github.com/openjdk/panama-vector/blob/vectorIntrinsics/test/micro/org/openjdk/bench/jdk/incubator/vector/operation/FloatMaxVector.java#L316
-------------
Commit messages:
- Enable AddReduction test for riscv rvv1.0
- 8333964: RISC-V: C2: Check "requires_strict_order" flag for floating-point add reduction
Changes: https://git.openjdk.org/jdk/pull/19649/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19649&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8333964
Stats: 90 lines in 3 files changed: 86 ins; 0 del; 4 mod
Patch: https://git.openjdk.org/jdk/pull/19649.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/19649/head:pull/19649
PR: https://git.openjdk.org/jdk/pull/19649