RFR: 8256245: AArch64: Implement Base64 decoding intrinsic

Dong Bo dongbo at openjdk.java.net
Fri Apr 2 07:08:33 UTC 2021


On Tue, 30 Mar 2021 03:24:16 GMT, Dong Bo <dongbo at openjdk.org> wrote:

>>> I think I can rewrite this part as loops.
>>> With an initial implementation, we can reduce the code size by almost half (1312B -> 748B). Sounds OK to you?
>> 
>> Sounds great, but I'm still somewhat concerned that the non-SIMD case only offers 3-12% performance gain. Make it just 748 bytes, and therefore not icache-hostile, then perhaps the balance of risk and reward is justified.
> 
> Hi, @theRealAph @nick-arm 
> 
> The code is updated. The error handling in the SIMD case was rewritten as loops.
> 
> The two non-SIMD code blocks are also combined into one.
> Because we now have only one non-SIMD loop, it has been moved into `generate_base64_decodeBlock`.
> The stub is now 692 bytes; the non-SIMD loop takes about 92 bytes, if my calculation is right.
> 
> Verified with tests `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java`.
> Compared with the previous implementation, the performance changes are negligible.
> 
> Other comments are addressed too. Thanks.

PING... Any suggestions on the updated commit?
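
For anyone following the thread without the diff open, here is a minimal Java sketch of what a scalar (non-SIMD) Base64 decode loop generally looks like. It is an illustration only, not the AArch64 stub. The class name `ScalarBase64Sketch`, the table `DECODE`, and the simplified `decodeBlock` signature are hypothetical. It assumes the standard alphabet only (no URL or MIME variants) and stops at the first 4-character group that contains an illegal character.

```java
import java.util.Arrays;

// Hypothetical illustration of a scalar Base64 decode loop; not the stub code.
final class ScalarBase64Sketch {
    // Lookup table: byte value -> 6-bit value, or -1 for an illegal character.
    private static final int[] DECODE = new int[256];
    static {
        Arrays.fill(DECODE, -1);
        String alphabet =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        for (int i = 0; i < alphabet.length(); i++) {
            DECODE[alphabet.charAt(i)] = i;
        }
    }

    // Decodes whole 4-character groups from src[sp, sl) into dst starting at dp.
    // Returns the number of bytes written. Stops early at the first group that
    // contains an illegal character (including the '=' padding character).
    static int decodeBlock(byte[] src, int sp, int sl, byte[] dst, int dp) {
        int start = dp;
        while (sl - sp >= 4) {
            int b0 = DECODE[src[sp] & 0xff];
            int b1 = DECODE[src[sp + 1] & 0xff];
            int b2 = DECODE[src[sp + 2] & 0xff];
            int b3 = DECODE[src[sp + 3] & 0xff];
            // Any -1 entry makes the OR negative, so one test covers all four.
            if ((b0 | b1 | b2 | b3) < 0) {
                break;
            }
            // Pack four 6-bit values into 24 bits, then emit three bytes.
            int bits = (b0 << 18) | (b1 << 12) | (b2 << 6) | b3;
            dst[dp++] = (byte) (bits >> 16);
            dst[dp++] = (byte) (bits >> 8);
            dst[dp++] = (byte) bits;
            sp += 4;
        }
        return dp - start;
    }
}
```

With the input `TWFuTWE=`, this sketch decodes the first group to `Man` and returns 3, leaving the padded tail for the caller to handle.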

-------------

PR: https://git.openjdk.java.net/jdk/pull/3228


More information about the hotspot-compiler-dev mailing list