RFR: 8357551: RISC-V: support CMoveF/D vectorization [v5]
Fei Yang
fyang at openjdk.org
Fri Nov 21 03:41:46 UTC 2025
On Tue, 18 Nov 2025 09:27:44 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> Hi,
>>
>> This PR adds CMoveF/D support on RISC-V, which enables vectorization of statements like `op_1 bop op_2 ? res_f_d_1 : res_f_d_2` in a loop.
>>
>> This pr is also a preparation for further vectorization in https://github.com/openjdk/jdk/pull/28231.
>>
>> This work was previously https://github.com/openjdk/jdk/pull/25341, but at that time C2 SLP had an issue with unsigned comparisons. That issue is now fixed, so the work can continue.
>>
>> # Test
>> ## Jtreg
>>
>> in progress...
>>
>> ## Performance
>>
>> Column name meanings:
>> * p: with patch
>> * p+v: with patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on
>> * m: without patch
>> * m+v: without patch, `-XX:+UseVectorCmov -XX:+UseCMoveUnconditionally` turned on
>>
>> #### Average improvement
>>
>> NOTE: On its own, this PR brings a performance benefit for `CMoveF+CmpF`, `CMoveD+CmpD`, `CMoveF+CmpI`, and `CMoveD+CmpL`. The data below is based on fully implementing the vectorization of `CMoveI/L/F/D+CmpI/L/F/D`, which will be achieved by https://github.com/openjdk/jdk/pull/28231.
>>
>> For details, check the performance data in https://github.com/openjdk/jdk/pull/25341 on riscv.
>>
>> Opt (m/p) | Opt (m+v/p+v) | Opt (p/p+v) | Opt (m/p+v)
>> -- | -- | -- | --
>> 1.022782609 | 2.198717391 | 2.162673913 | 2.199
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
> replace assert with log_warning
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1590:
> 1588: // jump if cmp1 < cmp2 or either is NaN
> 1589: // not jump (i.e. move src to dst) if cmp1 >= cmp2
> 1590: float_blt(cmp1, cmp2, no_set);
I compared this with the existing `MacroAssembler::cmov_cmp_fp_ge` [1] and noticed a difference in `NaN` handling. `MacroAssembler::cmov_cmp_fp_ge` sets the `is_unordered` parameter to true when calling `float_blt` or `double_blt`, which is not the case here. I assume we need similar handling here as well, right?
[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L1338
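To make the concern concrete, here is a minimal standalone sketch of the two branch flavors. It is not HotSpot code: `branch_lt_ordered`, `branch_lt_unordered`, `cmov_ge`, and `check_nan_cases` are hypothetical names, and it assumes that `is_unordered` means "also take the branch when either operand is NaN", as the quoted code comment suggests.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-ins for the two branch flavors; not HotSpot code.

// Ordered less-than: false when either operand is NaN.
static bool branch_lt_ordered(float a, float b) { return a < b; }

// Less-than or unordered: true when either operand is NaN,
// because (a >= b) is false for any NaN operand.
static bool branch_lt_unordered(float a, float b) { return !(a >= b); }

// Models "dst = (cmp1 >= cmp2) ? src : dst": taking the branch skips the move.
static int cmov_ge(float cmp1, float cmp2, int dst, int src, bool is_unordered) {
  bool skip_move = is_unordered ? branch_lt_unordered(cmp1, cmp2)
                                : branch_lt_ordered(cmp1, cmp2);
  return skip_move ? dst : src;
}

// Checks that the two flavors agree on ordered inputs and differ on NaN.
static void check_nan_cases() {
  const float qnan = std::numeric_limits<float>::quiet_NaN();
  // Ordered inputs: both flavors move src when cmp1 >= cmp2.
  assert(cmov_ge(2.0f, 1.0f, 10, 20, true) == 20);
  assert(cmov_ge(2.0f, 1.0f, 10, 20, false) == 20);
  // NaN input: the unordered flavor keeps dst, the ordered flavor moves src.
  assert(cmov_ge(qnan, 1.0f, 10, 20, true) == 10);
  assert(cmov_ge(qnan, 1.0f, 10, 20, false) == 20);
}
```

Under these assumptions, the ordered flavor moves `src` to `dst` on a NaN input even though `cmp1 >= cmp2` is false, which is the mismatch with the code comment that I am asking about.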
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1636:
> 1634: // jump if cmp1 <= cmp2 or either is NaN
> 1635: // not jump (i.e. move src to dst) if cmp1 > cmp2
> 1636: float_ble(cmp1, cmp2, no_set);
Same question here.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2548424215
PR Review Comment: https://git.openjdk.org/jdk/pull/28309#discussion_r2548424568