RFR: JDK-8314901: AES-GCM interleaved implementation using AVX2 instructions [v2]

Sandhya Viswanathan sviswanathan at openjdk.org
Fri Sep 22 20:51:18 UTC 2023


On Wed, 13 Sep 2023 20:25:22 GMT, Smita Kamath <svkamath at openjdk.org> wrote:

>> Hi All, 
>> I would like to submit an AES-GCM optimization for x86_64 architectures using AVX2 instructions. This optimization interleaves AES and GHASH operations.
>> 
>> Below are the performance numbers on my desktop system with the -XX:UseAVX=2 option:
>> 
>> | Benchmark | Data Size | Base version (ops/s) | Patched version (ops/s) | Speedup |
>> |---|---|---|---|---|
>> | full.AESGCMBench.decrypt | 8192 | 526274.678 | 670014.543 | 1.27 |
>> | full.AESGCMBench.encrypt | 8192 | 538293.315 | 680716.207 | 1.26 |
>> | small.AESGCMBench.decrypt | 8192 | 527854.353 | 663131.48 | 1.25 |
>> | small.AESGCMBench.encrypt | 8192 | 548193.804 | 683624.232 | 1.24 |
>> | full.AESGCMBench.decryptMultiPart | 8192 | 299865.766 | 299815.851 | 0.99 |
>> | full.AESGCMBench.encryptMultiPart | 8192 | 534406.564 | 539235.462 | 1.00 |
>> | small.AESGCMBench.decryptMultiPart | 8192 | 299960.202 | 298913.629 | 0.99 |
>> | small.AESGCMBench.encryptMultiPart | 8192 | 542669.258 | 540552.293 | 0.99 |
>> | | | | | |
>> | full.AESGCMBench.decrypt | 16384 | 307266.364 | 390397.778 | 1.27 |
>> | full.AESGCMBench.encrypt | 16384 | 311491.901 | 397279.681 | 1.27 |
>> | small.AESGCMBench.decrypt | 16384 | 306257.801 | 389531.665 | 1.27 |
>> | small.AESGCMBench.encrypt | 16384 | 311468.972 | 397804.753 | 1.27 |
>> | full.AESGCMBench.decryptMultiPart | 16384 | 159634.341 | 181271.487 | 1.13 |
>> | full.AESGCMBench.encryptMultiPart | 16384 | 308980.992 | 385606.113 | 1.24 |
>> | small.AESGCMBench.decryptMultiPart | 16384 | 160476.064 | 181019.205 | 1.12 |
>> | small.AESGCMBench.encryptMultiPart | 16384 | 308382.656 | 391126.417 | 1.26 |
>> | | | | | |
>> | full.AESGCMBench.decrypt | 32768 | 162284.703 | 213257.481 | 1.31 |
>> | full.AESGCMBench.encrypt | 32768 | 164833.104 | 215568.639 | 1.30 |
>> | small.AESGCMBench.decrypt | 32768 | 164416.491 | 213422.347 | 1.29 |
>> | small.AESGCMBench.encrypt | 32768 | 166619.205 | 214584.208 | 1.28 |
>> | full.AESGCMBench.decryptMultiPart | 32768 | 83306.239 | 93762.988 | 1.12 |
>> | full.AESGCMBench.encryptMultiPart | 32768 | 166109.391 | 211701.969 | 1.27 |
>> | small.AESGCMBench.decryptMultiPart | 32768 | 83792.559 | 94530.786 | 1.12 |
>> | small.AESGCMBench.encryptMultiPart | 32768 | 162975.904 | 212085.047 | 1.30 |
>> | | | | | |
>> | full.AESGCMBench.decrypt | 65536 | 85765.835 | 112244.611 | 1.30 |
>> | full.AESGCMBench.encrypt | 65536 | 86471.805 | 113320.536 | 1.31 |
>> | small.AESGCMBench.decrypt | 65536 | 84490.816 | 112122.358 | 1.32 |
>> | small.AESGCMBench.encrypt | 65536 | 85403.025 | 112741.811 | 1.32 |
>> full.AES...
>
> Smita Kamath has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Removed isEncrypt boolean variable

src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp line 3627:

> 3625:   __ cmpl(rounds, 52);
> 3626:   __ jcc(Assembler::greaterEqual, aes_192);
> 3627:   __ jmp(last_aes_rnd);

The jcc/jmp pair could be replaced with a single inverted branch, letting the aes_192 case fall through:
__ jcc(Assembler::below, last_aes_rnd);

src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp line 3649:

> 3647:   __ cmpl(rounds, 60);
> 3648:   __ jcc(Assembler::aboveEqual, aes_256);
> 3649:   __ jmp(last_aes_rnd);

Likewise, the jcc/jmp pair could be replaced with a single inverted branch, letting the aes_256 case fall through:
__ jcc(Assembler::below, last_aes_rnd);
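
As a rough illustration of why the two branch shapes agree, here is a minimal standalone C++ sketch (not HotSpot code). The enum, the function names and the threshold parameter are hypothetical. It assumes the aes_192 / aes_256 label is bound right after the branch, and that rounds only holds non-negative values such as 44, 52 or 60, so the signed and unsigned conditions coincide:

```cpp
// Labels modeled as enum values (hypothetical stand-ins for the stub's labels).
enum class Target { next_key_size, last_aes_rnd };

// Current shape: jcc(greaterEqual / aboveEqual, next); jmp(last_aes_rnd).
constexpr Target current_shape(int rounds, int threshold) {
  return rounds >= threshold ? Target::next_key_size : Target::last_aes_rnd;
}

// Suggested shape: jcc(below, last_aes_rnd); otherwise fall through.
// `below` is an unsigned condition, so the comparison is modeled as unsigned.
constexpr Target suggested_shape(int rounds, int threshold) {
  return static_cast<unsigned>(rounds) < static_cast<unsigned>(threshold)
             ? Target::last_aes_rnd
             : Target::next_key_size;  // fall through (assumed to be the next code)
}

// True when both shapes pick the same target for the given inputs.
constexpr bool same_target(int rounds, int threshold) {
  return current_shape(rounds, threshold) == suggested_shape(rounds, threshold);
}
```

Under these assumptions, same_target(44, 52), same_target(52, 52) and same_target(60, 60) all hold, so the extra jmp buys nothing.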

src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp line 4199:

> 4197:   //The entire message was encrypted processed in initial and now need to be hashed
> 4198:   __ cmpl(len, 0);
> 4199:   __ jcc(Assembler::equal, encrypt_done);

We should check that len is at least 128 here, since the block that follows processes 128 bytes:
  __ cmpl(len, 128);
  __ jcc(Assembler::less, encrypt_done);
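
To make the concern concrete, a minimal standalone C++ sketch (not the stub itself) of the two guards. The names are hypothetical, and whether len can actually fall between 1 and 127 at this point in the real stub is an assumption of the sketch:

```cpp
// Size handled by the block that follows the guard, per the comment above.
constexpr int kBlockBytes = 128;

// Guard as currently written: skip the block only when nothing is left.
constexpr bool enters_with_zero_check(int len) { return len != 0; }

// Guard as suggested: skip the block whenever fewer than 128 bytes are left.
constexpr bool enters_with_size_check(int len) { return len >= kBlockBytes; }

// Bytes the 128-byte block would touch beyond the remaining input.
constexpr int overrun(bool enters, int len) {
  return (enters && len < kBlockBytes) ? kBlockBytes - len : 0;
}
```

With len == 64, the zero check lets the block run and overrun(...) reports 64 bytes past the end, while the size check skips the block and reports 0. For len == 0 and len >= 128 the two guards behave the same.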

src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp line 4241:

> 4239:   __ jcc(Assembler::equal, encrypt_done);
> 4240: 
> 4241:   __ bind(encrypt_done);

This is a fall-through case, since encrypt_done is bound immediately after the branch:
  __ cmpl(r14, 0);
  __ jcc(Assembler::equal, encrypt_done);
The above two instructions can be removed.
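
A tiny standalone C++ model of the point (hypothetical names, not HotSpot code): a conditional jump whose target is the very next instruction lands in the same place whether or not it is taken. The sketch only models control flow, so it assumes nothing after encrypt_done depends on the flags set by the cmpl:

```cpp
// Index of the instruction executed after a conditional jump located at `pc`.
constexpr int next_pc(int pc, bool condition, int target) {
  return condition ? target : pc + 1;  // taken: go to target; not taken: fall through
}

// A jump is redundant when both outcomes reach the same instruction.
constexpr bool jump_is_redundant(int pc, int target) {
  return next_pc(pc, true, target) == next_pc(pc, false, target);
}
```

With the jcc at index 0 and the label bound at index 1, jump_is_redundant(0, 1) holds; with the label any further away it does not.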

src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp line 4246:

> 4244:   __ bind(ghash_done);
> 4245:   __ movdqu(xmm15, ExternalAddress(counter_mask_linc1_addr()), rbx /*rscratch*/);
> 4246:   __ vpaddd(xmm9, xmm9, xmm15, Assembler::AVX_128bit);

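We could fold the constant load into the add here, which makes the separate movdqu unnecessary (provided xmm15 is not needed afterwards):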
__ vpaddd(xmm9, xmm9, ExternalAddress(counter_mask_linc1_addr()), Assembler::AVX_128bit, rbx);
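
A minimal portable C++ model of the arithmetic (not the MacroAssembler API): vpaddd adds four 32-bit lanes independently with wraparound, so routing the constant through a temporary register first should not change the result. The Lanes struct, the function names and the value {1, 0, 0, 0} for the constant are assumptions made for illustration:

```cpp
#include <cstdint>

// Four 32-bit lanes of a 128-bit register (hypothetical model type).
struct Lanes { std::uint32_t l0, l1, l2, l3; };

// Assumed contents of the constant at counter_mask_linc1_addr().
constexpr Lanes kLinc1{1, 0, 0, 0};

// Lane-wise 32-bit add with wraparound and no carry between lanes.
constexpr Lanes add_lanes(Lanes a, Lanes b) {
  return Lanes{a.l0 + b.l0, a.l1 + b.l1, a.l2 + b.l2, a.l3 + b.l3};
}

// Models movdqu: copy the constant into a temporary register.
constexpr Lanes load(const Lanes& mem) { return mem; }

// Current form: load into a temporary, then add register to register.
constexpr Lanes two_step(Lanes ctr) { return add_lanes(ctr, load(kLinc1)); }

// Suggested form: add with the constant as a direct operand.
constexpr Lanes one_step(Lanes ctr) { return add_lanes(ctr, kLinc1); }
```

For a counter of {0xFFFFFFFF, 7, 0, 0}, both forms give {0, 7, 0, 0} in this model: the low lane wraps and the neighbouring lane is untouched.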

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15410#discussion_r1334673738
PR Review Comment: https://git.openjdk.org/jdk/pull/15410#discussion_r1334674168
PR Review Comment: https://git.openjdk.org/jdk/pull/15410#discussion_r1334660702
PR Review Comment: https://git.openjdk.org/jdk/pull/15410#discussion_r1334657499
PR Review Comment: https://git.openjdk.org/jdk/pull/15410#discussion_r1334665625


