RFR: 8314125: RISC-V: implement Base64 intrinsic - encoding [v2]
Ludovic Henry
luhenry at openjdk.org
Mon Jul 1 17:16:22 UTC 2024
On Mon, 1 Jul 2024 15:33:29 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> Hi,
>> Can you help to review the patch?
>>
>> I'm also working a base64 decode instrinsic, but there is some performance regression in some cases, and decode and encode are totally independent with each other, so I will send out review of decode in another pr when I fix the performance regression in it.
>>
>> Thanks.
>>
>> ## Test
>> benchmarks run on CanVM-K230
>>
>> I've tried several implementations, respectively with vector group
>> * m2+m1+scalar
>> * m2+scalar
>> * m1+scalar
>> * pure scalar
>> The best one is combination of m2+m1, it have best performance in all source size.
>>
>> this implementation (m2+m1)
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
>> Benchmark | (maxNumBytes) | Mode | Cnt | Score -intrinsic | Score + instrinsic, m1+m2 | Error | Units | -intrinsic/+intrinsic
>> -- | -- | -- | -- | -- | -- | -- | -- | --
>> Base64Encode.testBase64Encode | 1 | avgt | 10 | 86.784 | 86.996 | 0.459 | ns/op | 0.9975631063
>> Base64Encode.testBase64Encode | 2 | avgt | 10 | 93.603 | 94.026 | 1.081 | ns/op | 0.9955012443
>> Base64Encode.testBase64Encode | 3 | avgt | 10 | 121.927 | 123.227 | 0.342 | ns/op | 0.989450364
>> Base64Encode.testBase64Encode | 6 | avgt | 10 | 139.554 | 137.4 | 1.221 | ns/op | 1.015676856
>> Base64Encode.testBase64Encode | 7 | avgt | 10 | 160.698 | 162.25 | 2.36 | ns/op | 0.9904345146
>> Base64Encode.testBase64Encode | 9 | avgt | 10 | 161.085 | 153.772 | 1.505 | ns/op | 1.047557423
>> Base64Encode.testBase64Encode | 10 | avgt | 10 | 187.963 | 174.763 | 1.204 | ns/op | 1.075530862
>> Base64Encode.testBase64Encode | 48 | avgt | 10 | 405.212 | 199.4 | 6.374 | ns/op | 2.032156469
>> Base64Encode.testBase64Encode | 512 | avgt | 10 | 3652.555 | 1111.009 | 3.462 | ns/op | 3.287601631
>> Base64Encode.testBase64Encode | 1000 | avgt | 10 | 7217.187 | 2011.943 | 227.784 | ns/op | 3.587172698
>> Base64Encode.testBase64Encode | 20000 | avgt | 10 | 135165.706 | 33864.592 | 57.557 | ns/op | 3.991357876
>>
>> </google-sheets-html-origin>
>>
>> vector with only m2
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: st...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
> use pure scalar version when rvv is not supported
Changes requested by luhenry (Committer).
src/hotspot/cpu/riscv/assembler_riscv.hpp line 1828:
> 1826: }
> 1827:
> 1828: // Vector Unit-Stride Instructions
Suggestion:
// Vector Unit-Stride Load Instructions
src/hotspot/cpu/riscv/assembler_riscv.hpp line 1831:
> 1829: INSN(vlseg3e8_v, 0b0000111, 0b000, 0b00000, 0b00, 0b0, g3);
> 1830:
> 1831: INSN(vsseg4e8_v, 0b0100111, 0b000, 0b00000, 0b00, 0b0, g4);
Suggestion:
// Vector Unit-Stride Store Instructions
INSN(vsseg4e8_v, 0b0100111, 0b000, 0b00000, 0b00, 0b0, g4);
src/hotspot/cpu/riscv/assembler_riscv.hpp line 1832:
> 1830:
> 1831: INSN(vsseg4e8_v, 0b0100111, 0b000, 0b00000, 0b00, 0b0, g4);
> 1832: #undef INSN
Blank like before the `#undef INSN`
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5115:
> 5113: * NOTE: each field will occupy a vector register group
> 5114: */
> 5115: void encodeVector(Register src, Register dst, Register codec, Register step,
Suggestion:
void generate_base64_encodeVector(Register src, Register dst, Register codec, Register step,
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5230:
> 5228:
> 5229: // vector version
> 5230: {
You should not even generate the vectorized code if `UseRVV` is false. You can then remove https://github.com/openjdk/jdk/pull/19973/files#diff-97f199af6d1c8c17b2fa4f50eb1bbc0081858cc59a899f32792a2d31f933ccc4R5225-R5227
Suggestion:
if (UseRVV) {
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5263:
> 5261:
> 5262: // scalar version
> 5263: __ BIND(ProcessScalar);
You can move that in the previous block at https://github.com/openjdk/jdk/pull/19973/files#diff-97f199af6d1c8c17b2fa4f50eb1bbc0081858cc59a899f32792a2d31f933ccc4R5260 as it's the only block where it's used.
-------------
PR Review: https://git.openjdk.org/jdk/pull/19973#pullrequestreview-2151861929
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661333608
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661333775
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661333402
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661334220
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661335985
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661336651
More information about the hotspot-dev
mailing list