RFR: 8376164: Optimize AES/ECB/PKCS5Padding implementation using full-message intrinsic stub and parallel RoundKey addition [v5]
xinyangwu
duke at openjdk.org
Thu Feb 26 12:47:49 UTC 2026
On Tue, 24 Feb 2026 04:00:49 GMT, xinyangwu <duke at openjdk.org> wrote:
>> ### Summary
>> This PR introduces a parallel intrinsic for AES/ECB operations to replace the current per-block processing approach, reducing native call overhead and improving throughput for multi-block operations.
>> ### Problem
>> Except on platforms that support AVX-512, the existing AES/ECB/PKCS5Padding implementation suffers from three major performance issues:
>> 1. Excessive stub call overhead: Each 16-byte block requires a separate intrinsic call, resulting in high invocation frequency
>>
>> 2. Inefficient instruction-level parallelism: The serialized block processing fails to fully utilize instruction-level parallelism
>>
>> 3. Redundant setup/teardown: Repeated initialization of encryption state for each block
>> ### Changes
>> Added parallel AES intrinsic implementation
>> ### Testing
>> JMH benchmarks
>>
>> In the benchmark below, the optimized implementation is about **37.43%** faster: average time per operation drops from 11518.846 ns to 8381.499 ns.
>>
>> On an Intel(R) Core(TM) i9-14900HX machine with the original implementation:
>>
>>
>> Benchmark     Mode  Cnt      Score    Error  Units
>> AesTest.test  avgt    5  11518.846 ± 68.621  ns/op
>>
>>
>> On the same machine with the optimized implementation:
>>
>>
>> Benchmark     Mode  Cnt     Score    Error  Units
>> AesTest.test  avgt    5  8381.499 ± 57.751  ns/op
>>
>>
>> All Tier-1 tests pass on linux-x64. This modification does not involve changing the encryption or decryption logic.
>
> xinyangwu has updated the pull request incrementally with one additional commit since the last revision:
>
> 8376164: Optimize AES/ECB/PKCS5Padding implementation using full-message intrinsic stub and parallel RoundKey addition
I managed to find two Intel machines with AVX-512 support:
**Intel(R) Xeon(R) 6982P-C** and **Intel(R) Xeon(R) Platinum 8480**.
On both systems, I confirmed that `UseAVX=3` and `UseKNLSetting=true` were enabled, and that execution indeed reached my newly added `generate_electronicCodeBook_[en/de]cryptAESCrypt_Parallel` path.
However, I still was not able to reproduce the failure you observed. I ran the test with the following command:
    jtreg -vmoptions:"-XX:UseAVX=3 -XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting" \
        -va -jdk build/debug-aes/images/jdk \
        test/hotspot/jtreg/compiler/codegen/aes/TestAESMain.java
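For a quick functional check outside jtreg, a minimal standalone sketch could look like the following. It is not part of the PR or of `TestAESMain.java`; the class name `EcbRoundTripCheck` and its helper methods are hypothetical. It encrypts the same message once with a single `doFinal` call and once in 16-byte `update` chunks, compares the two ciphertexts, and round-trips the result. Whether a given run actually reaches the intrinsic path depends on JIT compilation, so this only checks that the two call patterns agree.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class EcbRoundTripCheck {
    static final int BLOCK = 16; // AES block size in bytes

    static Cipher newCipher(int mode, byte[] key) throws Exception {
        Cipher c = Cipher.getInstance("AES/ECB/PKCS5Padding");
        c.init(mode, new SecretKeySpec(key, "AES"));
        return c;
    }

    // Encrypts the whole message with one doFinal call (multi-block input).
    static byte[] encryptWhole(byte[] key, byte[] msg) throws Exception {
        return newCipher(Cipher.ENCRYPT_MODE, key).doFinal(msg);
    }

    // Encrypts the same message by feeding at most 16 bytes per update call.
    static byte[] encryptChunked(byte[] key, byte[] msg) throws Exception {
        Cipher c = newCipher(Cipher.ENCRYPT_MODE, key);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int off = 0; off < msg.length; off += BLOCK) {
            int len = Math.min(BLOCK, msg.length - off);
            byte[] part = c.update(msg, off, len);
            if (part != null) {          // update may return null when it emits nothing
                out.write(part);
            }
        }
        out.write(c.doFinal());          // flushes the PKCS5 padding block
        return out.toByteArray();
    }

    static byte[] decryptWhole(byte[] key, byte[] ct) throws Exception {
        return newCipher(Cipher.DECRYPT_MODE, key).doFinal(ct);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16];
        Arrays.fill(key, (byte) 0x2b);   // arbitrary fixed test key
        for (int n : new int[] {0, 1, 15, 16, 17, 64, 1000}) {
            byte[] msg = new byte[n];
            for (int i = 0; i < n; i++) {
                msg[i] = (byte) i;
            }
            byte[] whole = encryptWhole(key, msg);
            if (!Arrays.equals(whole, encryptChunked(key, msg))) {
                throw new AssertionError("chunked/whole mismatch at length " + n);
            }
            if (!Arrays.equals(msg, decryptWhole(key, whole))) {
                throw new AssertionError("round-trip mismatch at length " + n);
            }
        }
        System.out.println("all lengths consistent");
    }
}
```

The lengths are chosen to straddle block boundaries (15, 16, 17) so that both the padding-only and the multi-block cases are exercised.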
@marc-chevalier Could you please let me know:
- Were there any additional JVM flags or test parameters used in your run?
- Did you run only `TestAESMain.java`, or was this part of a larger test suite execution?
Any extra details would be very helpful for me to reproduce and investigate this issue.
Thanks again for reporting this and for your help in tracking it down!
-------------
PR Comment: https://git.openjdk.org/jdk/pull/29385#issuecomment-3966401774