RFR: 8314125: RISC-V: implement Base64 intrinsic - encoding [v2]

Ludovic Henry luhenry at openjdk.org
Mon Jul 1 17:16:22 UTC 2024


On Mon, 1 Jul 2024 15:33:29 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Hi,
>> Can you help to review the patch?
>> 
>> I'm also working a base64 decode instrinsic, but there is some performance regression in some cases, and decode and encode are totally independent with each other, so I will send out review of decode in another pr when I fix the performance regression in it.
>> 
>> Thanks.
>> 
>> ## Test
>> benchmarks run on CanVM-K230
>> 
>> I've tried several implementations, respectively with vector group
>> * m2+m1+scalar
>> * m2+scalar
>> * m1+scalar
>> * pure scalar
>> The best one is combination of m2+m1, it have best performance in all source size.
>> 
>> this implementation (m2+m1)
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
>> Benchmark | (maxNumBytes) | Mode | Cnt | Score -intrinsic | Score + instrinsic, m1+m2 | Error | Units | -intrinsic/+intrinsic
>> -- | -- | -- | -- | -- | -- | -- | -- | --
>> Base64Encode.testBase64Encode | 1 | avgt | 10 | 86.784 | 86.996 | 0.459 | ns/op | 0.9975631063
>> Base64Encode.testBase64Encode | 2 | avgt | 10 | 93.603 | 94.026 | 1.081 | ns/op | 0.9955012443
>> Base64Encode.testBase64Encode | 3 | avgt | 10 | 121.927 | 123.227 | 0.342 | ns/op | 0.989450364
>> Base64Encode.testBase64Encode | 6 | avgt | 10 | 139.554 | 137.4 | 1.221 | ns/op | 1.015676856
>> Base64Encode.testBase64Encode | 7 | avgt | 10 | 160.698 | 162.25 | 2.36 | ns/op | 0.9904345146
>> Base64Encode.testBase64Encode | 9 | avgt | 10 | 161.085 | 153.772 | 1.505 | ns/op | 1.047557423
>> Base64Encode.testBase64Encode | 10 | avgt | 10 | 187.963 | 174.763 | 1.204 | ns/op | 1.075530862
>> Base64Encode.testBase64Encode | 48 | avgt | 10 | 405.212 | 199.4 | 6.374 | ns/op | 2.032156469
>> Base64Encode.testBase64Encode | 512 | avgt | 10 | 3652.555 | 1111.009 | 3.462 | ns/op | 3.287601631
>> Base64Encode.testBase64Encode | 1000 | avgt | 10 | 7217.187 | 2011.943 | 227.784 | ns/op | 3.587172698
>> Base64Encode.testBase64Encode | 20000 | avgt | 10 | 135165.706 | 33864.592 | 57.557 | ns/op | 3.991357876
>> 
>> </google-sheets-html-origin>
>> 
>> vector with only m2
>> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: st...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
> 
>   use pure scalar version when rvv is not supported

Changes requested by luhenry (Committer).

src/hotspot/cpu/riscv/assembler_riscv.hpp line 1828:

> 1826:   }
> 1827: 
> 1828:   // Vector Unit-Stride Instructions

Suggestion:

  // Vector Unit-Stride Load Instructions

src/hotspot/cpu/riscv/assembler_riscv.hpp line 1831:

> 1829:   INSN(vlseg3e8_v, 0b0000111, 0b000, 0b00000, 0b00, 0b0, g3);
> 1830: 
> 1831:   INSN(vsseg4e8_v, 0b0100111, 0b000, 0b00000, 0b00, 0b0, g4);

Suggestion:

  // Vector Unit-Stride Store Instructions
  INSN(vsseg4e8_v, 0b0100111, 0b000, 0b00000, 0b00, 0b0, g4);

src/hotspot/cpu/riscv/assembler_riscv.hpp line 1832:

> 1830: 
> 1831:   INSN(vsseg4e8_v, 0b0100111, 0b000, 0b00000, 0b00, 0b0, g4);
> 1832: #undef INSN

Blank like before the `#undef INSN`

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5115:

> 5113:    * NOTE: each field will occupy a vector register group
> 5114:    */
> 5115:   void encodeVector(Register src, Register dst, Register codec, Register step,

Suggestion:

  void generate_base64_encodeVector(Register src, Register dst, Register codec, Register step,

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5230:

> 5228: 
> 5229:     // vector version
> 5230:     {

You should not even generate the vectorized code if `UseRVV` is false. You can then remove https://github.com/openjdk/jdk/pull/19973/files#diff-97f199af6d1c8c17b2fa4f50eb1bbc0081858cc59a899f32792a2d31f933ccc4R5225-R5227 

Suggestion:

    if (UseRVV) {

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5263:

> 5261: 
> 5262:     // scalar version
> 5263:     __ BIND(ProcessScalar);

You can move that in the previous block at https://github.com/openjdk/jdk/pull/19973/files#diff-97f199af6d1c8c17b2fa4f50eb1bbc0081858cc59a899f32792a2d31f933ccc4R5260 as it's the only block where it's used.

-------------

PR Review: https://git.openjdk.org/jdk/pull/19973#pullrequestreview-2151861929
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661333608
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661333775
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661333402
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661334220
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661335985
PR Review Comment: https://git.openjdk.org/jdk/pull/19973#discussion_r1661336651


More information about the hotspot-dev mailing list