RFR: JDK-8300808: Accelerate Base64 on x86 for AVX2 [v11]
    Sandhya Viswanathan 
    sviswanathan at openjdk.org
       
    Tue Feb  7 00:51:46 UTC 2023
    
    
  
On Tue, 7 Feb 2023 00:12:21 GMT, Scott Gibbons <duke at openjdk.org> wrote:
>> Added code for Base64 acceleration (encode and decode), which gives a speedup of ~4x on AVX2 platforms.
>> 
>> Encode performance:
>> **Old:**
>> 
>> Benchmark                      (maxNumBytes)   Mode  Cnt     Score   Error   Units
>> Base64Encode.testBase64Encode           1024  thrpt    3  4309.439 ± 2.632  ops/ms
>> 
>> 
>> **New:**
>> 
>> Benchmark                      (maxNumBytes)   Mode  Cnt      Score     Error   Units
>> Base64Encode.testBase64Encode           1024  thrpt    3  24211.397 ± 102.026  ops/ms
>> 
>> 
>> Decode performance:
>> **Old:**
>> 
>> Benchmark                      (errorIndex)  (lineSize)  (maxNumBytes)   Mode  Cnt     Score    Error   Units
>> Base64Decode.testBase64Decode           144           4           1024  thrpt    3  3961.768 ± 93.409  ops/ms
>> 
>> **New:**
>> Benchmark                      (errorIndex)  (lineSize)  (maxNumBytes)   Mode  Cnt      Score    Error   Units
>> Base64Decode.testBase64Decode           144           4           1024  thrpt    3  14738.051 ± 24.383  ops/ms
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add algorithm comments
src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2720:
> 2718:     __ vpshufb(xmm5, xmm9, xmm1, Assembler::AVX_256bit);
> 2719:     // If the and of the two is non-zero, we have an invalid input character
> 2720:     __ vptest(xmm3, xmm5);
With isURL set, it looks to me like the vptest check will reject the valid URL-safe input character 0x5F ("_"):
  upper_nibble = 0x5;
  lower_nibble = 0xF;
  lut_lo_URL = 0x1B; (entry for lower nibble 0xF)
  lut_hi = 0x8; (entry for upper nibble 0x5)
  lut_lo_URL & lut_hi = 0x8; (non-zero, so the character is treated as invalid and the loop exits)
Could you please verify this on your end and fix it?
My understanding is that this happens because upper nibbles 5 and 7 both map to the same lut_hi value, 0x8.
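
To make the arithmetic easy to re-check, here is a minimal scalar sketch of the per-byte test for the single character 0x5F. It is not the stub code: the constant and function names are hypothetical, and only the two table entries quoted in this comment are used, since the full lut_lo_URL and lut_hi tables are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Table entries as quoted in this review comment (assumed, not copied from
// the stub source). The names are illustrative only.
constexpr uint8_t LUT_LO_URL_AT_F = 0x1B;  // lut_lo_URL entry for lower nibble 0xF
constexpr uint8_t LUT_HI_AT_5     = 0x08;  // lut_hi entry for upper nibble 0x5

// Split an input byte into the two nibbles used as shuffle indices.
constexpr uint8_t upper_nibble(uint8_t c) { return static_cast<uint8_t>(c >> 4); }
constexpr uint8_t lower_nibble(uint8_t c) { return static_cast<uint8_t>(c & 0x0F); }

// Scalar analogue of the check described in the quoted code comment:
// a non-zero AND of the two looked-up entries marks the byte as invalid.
constexpr bool rejected(uint8_t lo_entry, uint8_t hi_entry) {
    return (lo_entry & hi_entry) != 0;
}

// '_' is 0x5F: upper nibble 0x5, lower nibble 0xF.
static_assert(upper_nibble(0x5F) == 0x5, "upper nibble of '_'");
static_assert(lower_nibble(0x5F) == 0xF, "lower nibble of '_'");

// 0x1B & 0x08 == 0x08, which is non-zero, so '_' would be rejected.
static_assert((LUT_LO_URL_AT_F & LUT_HI_AT_5) == 0x08, "AND of the quoted entries");
static_assert(rejected(LUT_LO_URL_AT_F, LUT_HI_AT_5), "'_' fails the check");
```

If the quoted entries are right, the static_asserts compile, which matches the failure described above for URL-safe input containing "_".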
-------------
PR: https://git.openjdk.org/jdk/pull/12126