RFR: 8317720: RISC-V: Implement Adler32 intrinsic [v11]
Fei Yang
fyang at openjdk.org
Wed Jun 19 03:40:14 UTC 2024
On Tue, 18 Jun 2024 17:22:14 GMT, ArsenyBochkarev <duke at openjdk.org> wrote:
>> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5100:
>>
>>> 5098: // 1. Need to do vector-widening reduction sum
>>> 5099: // 2. It is safe to perform sign-extension during vmv.x.s with 32-bits elements
>>> 5100: __ vwredsumu_vs(vtemp1, vs2acc[0], vzero);
>>
>> Can we simplify reduction sum for the multiplication result into something like this?
>>
>> // Summing up calculated results for s2_new
>> if (MaxVectorSize > 16) {
>> __ vsetvli(temp0, count, Assembler::e16, LMUL);
>> } else {
>> // Half of vector-widening multiplication result is in successor of vs2acc[0]
>> // group if MaxVectorSize == 16, in which case we need to double vector register
>> // group width in order to reduction sum all of them.
>> Assembler::LMUL LMULx2 = (LMUL == Assembler::m1) ? Assembler::m2 :
>> (LMUL == Assembler::m2) ? Assembler::m4 : Assembler::m8;
>> __ vsetvli(temp0, count, Assembler::e16, LMULx2);
>> }
>>
>> // Upper bound for reduction sum:
>> // 0xFF * (64 + 63 + ... + 2 + 1) = 0x817E0 max for whole register group, so:
>> // 1. Need to do vector-widening reduction sum
>> // 2. It is safe to perform sign-extension during vmv.x.s with 32-bits elements
>> __ vwredsumu_vs(vtemp1, vs2acc[0], vzero);
>>
>>
>> This will be faster when `MaxVectorSize == 16`, as it avoids the two `if` checks for `MaxVectorSize == 16` and saves us the `vtemp2` register. Also, the code becomes easier to understand.
>
> Sounds good, thanks! As far as I can see, it is safe to do this for any `MaxVectorSize`, so the code will be even simpler. Please correct me if I'm wrong.
Yeah, it would be simpler if you used it for any MaxVectorSize. But to my surprise, it has a negative performance impact when I test on my BananaPi-F3 (MaxVectorSize = 32). I am not sure whether this is an issue across boards. Maybe you can give it a try on your hardware to see?
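
For reference, the upper bound quoted in the code comment above can be checked at compile time. This is only a sketch and not part of the patch: the helper `max_weighted_sum` and the constant `kBound` are hypothetical names, and the block assumes the bound is taken over 64 bytes with weights 64 down to 1, as the comment states.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): largest possible value of
// 0xFF * (n + (n - 1) + ... + 2 + 1), i.e. every byte is 0xFF.
constexpr uint32_t max_weighted_sum(uint32_t n) {
  return 0xFFu * (n * (n + 1) / 2);
}

// Bound for 64 bytes, as in the quoted comment.
constexpr uint32_t kBound = max_weighted_sum(64);

// Matches the value given in the comment.
static_assert(kBound == 0x817E0u, "0xFF * (64 + 63 + ... + 1)");

// Does not fit in 16 bits, so the reduction has to widen to 32 bits.
static_assert(kBound > static_cast<uint32_t>(UINT16_MAX),
              "needs a widening reduction sum");

// Bit 31 is clear, so sign-extending the 32-bit result leaves the value unchanged.
static_assert(kBound <= static_cast<uint32_t>(INT32_MAX),
              "sign-extension is harmless");
```

If all three assertions compile, both points in the comment follow: the sum overflows 16-bit elements, and the 32-bit result is far enough below 2^31 that sign-extension cannot alter it.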
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18382#discussion_r1645356197