RFR: 8302908: RISC-V: Support masked vector arithmetic instructions for Vector API [v12]

Mon Apr 17 13:56:40 UTC 2023

On Mon, 17 Apr 2023 07:27:28 GMT, Feilong Jiang <fjiang at openjdk.org> wrote:

>> Thanks for the review!
>> 
>> I think there may be no problem here. The foating-point compare instructions follow the semantics of the scalar floating-point compare instructions[1] in RVV. For all three instructions (FEQ.S, FLT.S, FLE.S), the result is 0 if either operand is NaN[2]. So when one of the operands is NaN, `BoolTest::ge`, `BoolTest::gt`, `BoolTest::le` and `BoolTest::lt` will all generate a 0 on the corresponding bit. 
>> 
>> Also a jtreg test case[3] proves that our current logic is fine. `GTFloat512VectorTests` covers the case where the input is Nan. The test will pass properly and generate the following compilation log which contains `vmaskcmp_rvv`:
>> 
>> 
>> 1ac     B20: # out( B49 B21 ) <- in( B48 B19 )  Freq: 4188.06
>> 1ac     vmaskcmp_rvv V0, V4, V5, #3
>> 1b8     
>> 1b8     MEMBAR-store-store #@membar_storestore
>> 1bc     # checkcastPP of R11, #@checkCastPP
>> 1bc     vstoremask V1, V0
>> 1c8     addi  R7, R11, #16 # ptr, #@addP_reg_imm
>> 1cc     spill R11 -> [sp, #104] # spill size = 64
>> 1ce     storeV [R7], V1 # vector (rvv)
>> 1d6     ld  R19, [R23, #264] # ptr, #@loadP
>> 1da     ld  R7, [R23, #280] # ptr, #@loadP
>> 1de     addi  R28, R19, #16 # ptr, #@addP_reg_imm
>> 1e2     bgeu  R28, R7, B49 #@cmpP_branch  P=0.000100 C=-1.000000
>> 
>> 
>> [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#1313-vector-floating-point-compare-instructions
>> [2] https://github.com/riscv/riscv-isa-manual/releases/download/draft-20230131-c0b298a/riscv-spec.pdf
>> [3] https://github.com/openjdk/jdk/tree/master/test/jdk/jdk/incubator/vector/Float512VectorTests.java
>
> Hi, @DingliZhang, I have the same concern here. As the scalar version of Float-point comparison, we have `is_unordered` flag to return the right result if operands contain NaN(s) [1]. Does vfcmp need this flag too?
> 
> Also, I see an example of implementing `isgreater()` when operands contain NaN(s) for vfcmp [2], which checks NaN for both operands.
> 
> 1. https://github.com/openjdk/jdk/blob/7f56de8f78c0b54e5cf313f53213102a3495234f/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L1010-L1037
> 2. https://github.com/riscv/riscv-v-spec/blob/b9afd6f5709fe3f91ce39bb83695bcfaa78eef94/v-spec.adoc?plain=1#L3700-L3706

Hi @feilongjiang , thanks for the view!

I think maybe we don't need to distinguish whether it is unordered or not here at the moment. For example, the greater than or less than comparison in sve_comapre in the vmaskcmp call in aarch64 uses the GE/GT[1][2] in the Assembler condition to make the determination (instead of using the LE/LT flag bits, which would include unordered as well).

The most obvious difference in the results of isgreater() and the logic in the patch is that `isgreater()`[3] does not set the invalid operation exception flag (NV=1) when input is Nan. It does seem that we would be better off using the logic in [3] for the case when input is Nan.

We will update PR later.

[1] https://developer.arm.com/documentation/ddi0596/2021-12/Shared-Pseudocode/Shared-Functions?lang=en#impl-shared.FPCompareGE.3
[2] https://developer.arm.com/documentation/ddi0596/2021-12/Shared-Pseudocode/Shared-Functions?lang=en#impl-shared.FPCompareGT.3
[3] https://github.com/riscv/riscv-v-spec/blob/b9afd6f5709fe3f91ce39bb83695bcfaa78eef94/v-spec.adoc?plain=1#L3700-L3706

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/12682#discussion_r1168736383