RFR: 8314125: RISC-V: implement Base64 intrinsic - encoding [v8]
Fei Yang
fyang at openjdk.org
Thu Aug 8 14:57:35 UTC 2024
On Mon, 5 Aug 2024 15:44:07 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> Hi,
>> Can you help to review the patch?
>>
>> I'm also working on a Base64 decode intrinsic, but it shows a performance regression in some cases. Decode and encode are completely independent of each other, so I will send the decode change out for review in a separate PR once I have fixed the regression.
>>
>> Thanks.
>>
>> ## Test
>> Benchmarks were run on CanVM-K230 (vlenb == 16) and banana-pi (vlenb == 32).
>>
>> I've tried several implementations, using the following vector group combinations:
>> * m2+m1+scalar
>> * m2+scalar
>> * m1+scalar
>> * pure scalar
>>
>> The best one is the m2+m1 combination; it has the best performance across all source sizes.
>>
>> ### K230
>>
>> this implementation (m2+m1)
>> Benchmark | (maxNumBytes) | Mode | Cnt | Score -intrinsic | Score +intrinsic (m2+m1) | Error | Units | -intrinsic/+intrinsic
>> -- | -- | -- | -- | -- | -- | -- | -- | --
>> Base64Encode.testBase64Encode | 1 | avgt | 10 | 86.784 | 86.996 | 0.459 | ns/op | 0.9975631063
>> Base64Encode.testBase64Encode | 2 | avgt | 10 | 93.603 | 94.026 | 1.081 | ns/op | 0.9955012443
>> Base64Encode.testBase64Encode | 3 | avgt | 10 | 121.927 | 123.227 | 0.342 | ns/op | 0.989450364
>> Base64Encode.testBase64Encode | 6 | avgt | 10 | 139.554 | 137.4 | 1.221 | ns/op | 1.015676856
>> Base64Encode.testBase64Encode | 7 | avgt | 10 | 160.698 | 162.25 | 2.36 | ns/op | 0.9904345146
>> Base64Encode.testBase64Encode | 9 | avgt | 10 | 161.085 | 153.772 | 1.505 | ns/op | 1.047557423
>> Base64Encode.testBase64Encode | 10 | avgt | 10 | 187.963 | 174.763 | 1.204 | ns/op | 1.075530862
>> Base64Encode.testBase64Encode | 48 | avgt | 10 | 405.212 | 199.4 | 6.374 | ns/op | 2.032156469
>> Base64Encode.testBase64Encode | 512 | avgt | 10 | 3652.555 | 1111.009 | 3.462 | ns/op | 3.287601631
>> Base64Encode.testBase64Encode | 1000 | avgt | 10 | 7217.187 | 2011.943 | 227.784 | ns/op | 3.587172698
>> Base64Encode.testBase64Encode | 20000 | avgt | 10 | 135165.706 | 33864.592 | 57.557 | ns/op | 3.991357876
>>
>> vector with only m2
>> ...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
> rename
Hi, sorry for the late reply. Several comments remain after a second look. Thanks.
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5202:
> 5200: Register length = c_rarg7; // total length of src data in bytes
> 5201:
> 5202: Label ProcessData, Exit, Greater;
Is `Greater` used anywhere?
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5243:
> 5241:
> 5242: __ sub(length, length, stepSrcM2);
> 5243: __ j(ProcessM2);
Why not move the preceding `__ blt(length, stepSrcM1, ProcessScalar);` at line 5231 here, replacing this direct jump to `ProcessM2`? That would save one conditional branch instruction in this `ProcessM2` loop. We would only need to adjust the branch target of `__ blt(length, stepSrcM2, ProcessM1);` to point here, right?
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5265:
> 5263:
> 5264: __ mv(step, 3);
> 5265: __ blt(length, step, Exit);
Should we simply do `__ beqz(length, Exit)` instead here? That seems more readable to me.
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5314:
> 5312: __ addi(dst, dst, 4);
> 5313: // loop back
> 5314: __ bge(length, step, ScalarLoop);
Similar here. Maybe `__ bnez(length, ScalarLoop)`?
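
To illustrate why the two zero-based branches can stand in for the compare-against-step ones, here is a minimal standalone sketch. It is not HotSpot code: the helper names are hypothetical, and it assumes that `length` is always a multiple of 3 when the scalar part is reached (the vector steps quoted above are multiples of 3, and the sketch assumes the caller passes a multiple of 3 as well). Under that assumption `length < 3` and `length == 0` are the same condition, so both forms run the loop the same number of times.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the scalar loop as quoted: each pass consumes
// 3 source bytes and the guards compare length against the step.
int iterations_compare_step(int64_t length) {
  const int64_t step = 3;
  int n = 0;
  if (length < step) return n;   // models: blt(length, step, Exit)
  do {
    length -= step;
    ++n;
  } while (length >= step);      // models: bge(length, step, ScalarLoop)
  return n;
}

// Same loop with the suggested guards. Only valid when length is a
// non-negative multiple of 3 (assumption); otherwise it never hits zero.
int iterations_compare_zero(int64_t length) {
  int n = 0;
  if (length == 0) return n;     // models: beqz(length, Exit)
  do {
    length -= 3;
    ++n;
  } while (length != 0);         // models: bnez(length, ScalarLoop)
  return n;
}

// Checks that both forms agree for every multiple of 3 up to max_len.
bool guards_agree(int64_t max_len) {
  for (int64_t len = 0; len <= max_len; len += 3) {
    if (iterations_compare_step(len) != iterations_compare_zero(len)) {
      return false;
    }
  }
  return true;
}
```

If the multiple-of-3 assumption does not hold at that point in the stub, the `beqz`/`bnez` form would not terminate correctly, so the original comparisons would have to stay.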
-------------
PR Review: https://git.openjdk.org/jdk/pull/19973#pullrequestreview-2227880524
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1709502837
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1709685922
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1709611711
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1709652002