RFR: 8314124: RISC-V: implement Base64 intrinsic - decoding [v3]
Hamlin Li
mli at openjdk.org
Thu Aug 22 17:43:05 UTC 2024
On Thu, 22 Aug 2024 15:47:32 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5497:
>>
>>> 5495:
>>> 5496: // load 4 bytes encoded src data
>>> 5497: __ lbu(byte0, Address(src, 0));
>>
>> Is it faster to issue four 8-bit loads instead of one 32-bit load and getting the four values with shifting and masking?
>
> Yeh, it could be, I will test it later.
There is a bit of a regression with `lw` instead of `lb`.
I think the rationale is that the four `lb` instructions are not really four separate loads, as their addresses are contiguous. With `lw`, on the other hand, extra operations are needed to move the data from the word into the 4 separate bytes.
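To make the two variants concrete, here is a minimal C++ sketch of what is being compared. It is not the stub code from the PR: the names `Bytes4`, `load_bytes` and `load_word` are hypothetical, and the word variant assumes a little-endian memory layout, which is the usual RISC-V configuration.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical container for the four encoded source bytes (not from the PR).
struct Bytes4 {
    uint8_t b0, b1, b2, b3;
};

// Variant A: four byte loads from consecutive addresses (analogous to 4 x lbu).
inline Bytes4 load_bytes(const uint8_t* src) {
    return { src[0], src[1], src[2], src[3] };
}

// Variant B: one 32-bit load (analogous to lw), then shift and mask
// to split the word into bytes. Assumes little-endian byte order.
inline Bytes4 load_word(const uint8_t* src) {
    uint32_t w;
    std::memcpy(&w, src, sizeof w);  // safe even if src is not 4-byte aligned
    return {
        static_cast<uint8_t>(w & 0xFF),
        static_cast<uint8_t>((w >> 8) & 0xFF),
        static_cast<uint8_t>((w >> 16) & 0xFF),
        static_cast<uint8_t>((w >> 24) & 0xFF),
    };
}
```

Variant B saves three loads but adds the shifts and masks, which is the extra work referred to above.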
Benchmark | (addSpecial) | (errorIndex) | (lineSize) | (maxNumBytes) | Mode | Cnt | Score (lb) | Score (lw) | Units | Ratio (lw/lb)
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
Base64Decode.testBase64Decode | 0 | 144 | 4 | 1 | avgt | | 123.03 | 124.96 | ns/op | 1.016
Base64Decode.testBase64Decode | 0 | 144 | 4 | 3 | avgt | | 146.828 | 145.344 | ns/op | 0.99
Base64Decode.testBase64Decode | 0 | 144 | 4 | 7 | avgt | | 197.021 | 202.769 | ns/op | 1.029
Base64Decode.testBase64Decode | 0 | 144 | 4 | 32 | avgt | | 310.964 | 328.367 | ns/op | 1.056
Base64Decode.testBase64Decode | 0 | 144 | 4 | 64 | avgt | | 432.836 | 464.795 | ns/op | 1.074
Base64Decode.testBase64Decode | 0 | 144 | 4 | 80 | avgt | | 543.394 | 570.661 | ns/op | 1.05
Base64Decode.testBase64Decode | 0 | 144 | 4 | 96 | avgt | | 599.538 | 659.938 | ns/op | 1.101
Base64Decode.testBase64Decode | 0 | 144 | 4 | 112 | avgt | | 714.922 | 793.329 | ns/op | 1.11
Base64Decode.testBase64Decode | 0 | 144 | 4 | 512 | avgt | | 3054.931 | 3356.059 | ns/op | 1.099
Base64Decode.testBase64Decode | 0 | 144 | 4 | 1000 | avgt | | 4921.95 | 5413.909 | ns/op | 1.1
Base64Decode.testBase64Decode | 0 | 144 | 4 | 20000 | avgt | | 78169.374 | 89504.671 | ns/op | 1.145
Base64Decode.testBase64Decode | 0 | 144 | 4 | 50000 | avgt | | 188385.163 | 218491.724 | ns/op | 1.16
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20026#discussion_r1727560148