RFR: 8314125: RISC-V: implement Base64 intrinsic - encoding [v2]
Hamlin Li
mli at openjdk.org
Mon Jul 1 15:38:18 UTC 2024
On Mon, 1 Jul 2024 15:33:29 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> Hi,
>> Can you help to review the patch?
>>
>> I'm also working on a Base64 decode intrinsic, but it shows a performance regression in some cases. Since decode and encode are completely independent of each other, I will send the decode change out for review in a separate PR once I fix the regression.
>>
>> Thanks.
>>
>> ## Test
>> Benchmarks were run on a CanMV-K230.
>>
>> I've tried several implementations, each using a different vector group setting:
>> * m2+m1+scalar
>> * m2+scalar
>> * m1+scalar
>> * pure scalar
>>
>> The best one is the m2+m1 combination; it has the best performance across all source sizes.
>>
>> this implementation (m2+m1)
>>
>> Benchmark | (maxNumBytes) | Mode | Cnt | Score -intrinsic | Score +intrinsic, m1+m2 | Error | Units | -intrinsic/+intrinsic
>> -- | -- | -- | -- | -- | -- | -- | -- | --
>> Base64Encode.testBase64Encode | 1 | avgt | 10 | 86.784 | 86.996 | 0.459 | ns/op | 0.9975631063
>> Base64Encode.testBase64Encode | 2 | avgt | 10 | 93.603 | 94.026 | 1.081 | ns/op | 0.9955012443
>> Base64Encode.testBase64Encode | 3 | avgt | 10 | 121.927 | 123.227 | 0.342 | ns/op | 0.989450364
>> Base64Encode.testBase64Encode | 6 | avgt | 10 | 139.554 | 137.4 | 1.221 | ns/op | 1.015676856
>> Base64Encode.testBase64Encode | 7 | avgt | 10 | 160.698 | 162.25 | 2.36 | ns/op | 0.9904345146
>> Base64Encode.testBase64Encode | 9 | avgt | 10 | 161.085 | 153.772 | 1.505 | ns/op | 1.047557423
>> Base64Encode.testBase64Encode | 10 | avgt | 10 | 187.963 | 174.763 | 1.204 | ns/op | 1.075530862
>> Base64Encode.testBase64Encode | 48 | avgt | 10 | 405.212 | 199.4 | 6.374 | ns/op | 2.032156469
>> Base64Encode.testBase64Encode | 512 | avgt | 10 | 3652.555 | 1111.009 | 3.462 | ns/op | 3.287601631
>> Base64Encode.testBase64Encode | 1000 | avgt | 10 | 7217.187 | 2011.943 | 227.784 | ns/op | 3.587172698
>> Base64Encode.testBase64Encode | 20000 | avgt | 10 | 135165.706 | 33864.592 | 57.557 | ns/op | 3.991357876
>>
>> vector with only m2
>> ...
>
> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>
> use pure scalar version when rvv is not supported
The pure scalar implementation also brings some performance improvement across source sizes, so the intrinsic is now also enabled when RVV is not supported.
Performance data:

Benchmark | (maxNumBytes) | Mode | Cnt | Score -intrinsic | Score +intrinsic, scalar | Error | Units | Perf opt
-- | -- | -- | -- | -- | -- | -- | -- | --
Base64Encode.testBase64Encode | 1 | avgt | 10 | 86.784 | 86.75 | 0.38 | ns/op | 1
Base64Encode.testBase64Encode | 2 | avgt | 10 | 93.71 | 93.824 | 1.954 | ns/op | 0.999
Base64Encode.testBase64Encode | 3 | avgt | 10 | 121.824 | 123.487 | 0.559 | ns/op | 0.987
Base64Encode.testBase64Encode | 6 | avgt | 10 | 138.984 | 137.697 | 0.273 | ns/op | 1.009
Base64Encode.testBase64Encode | 7 | avgt | 10 | 161.243 | 157.696 | 0.875 | ns/op | 1.022
Base64Encode.testBase64Encode | 9 | avgt | 10 | 169.724 | 155.223 | 1.908 | ns/op | 1.093
Base64Encode.testBase64Encode | 10 | avgt | 10 | 185.92 | 176.339 | 5.875 | ns/op | 1.054
Base64Encode.testBase64Encode | 48 | avgt | 10 | 408.467 | 347.269 | 1.799 | ns/op | 1.176
Base64Encode.testBase64Encode | 512 | avgt | 10 | 3665.34 | 2718.442 | 26.954 | ns/op | 1.348
Base64Encode.testBase64Encode | 1000 | avgt | 10 | 7022.025 | 5290.003 | 33.216 | ns/op | 1.327
Base64Encode.testBase64Encode | 20000 | avgt | 10 | 135819.7 | 101988.94 | 2209.887 | ns/op | 1.332
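
For readers checking the last column: it appears to be the average time without the intrinsic divided by the average time with it, so values above 1 mean the intrinsic is faster. A minimal sketch of that arithmetic, using four rows copied from the scalar table; the class and method names here are hypothetical and not part of the patch:

```java
import java.util.Locale;

public class SpeedupRatio {
    // Ratio of baseline time to intrinsic time (both in ns/op).
    // Assumption: this is how the "Perf opt" column was derived.
    static double speedup(double baselineNsPerOp, double intrinsicNsPerOp) {
        if (intrinsicNsPerOp <= 0.0) {
            throw new IllegalArgumentException("intrinsic score must be positive");
        }
        return baselineNsPerOp / intrinsicNsPerOp;
    }

    public static void main(String[] args) {
        // {maxNumBytes, score without intrinsic, score with scalar intrinsic}
        double[][] rows = {
            {48, 408.467, 347.269},
            {512, 3665.34, 2718.442},
            {1000, 7022.025, 5290.003},
            {20000, 135819.7, 101988.94},
        };
        for (double[] row : rows) {
            System.out.println(String.format(Locale.ROOT,
                    "maxNumBytes=%d speedup=%.3f", (long) row[0], speedup(row[1], row[2])));
        }
    }
}
```

Rounded to three decimals, these ratios match the table's 1.176, 1.348, 1.327 and 1.332.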
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19973#issuecomment-2200477845