RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v31]
Suchismith Roy
sroy at openjdk.org
Mon Apr 28 12:51:55 UTC 2025
On Thu, 24 Apr 2025 14:13:50 GMT, Suchismith Roy <sroy at openjdk.org> wrote:
>> JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437)
>>
>> Currently acceleration code for GHASH is missing for PPC64.
>>
>> The current implementation utilises SIMD instructions on Power and uses Karatsuba multiplication to obtain the final result.
>
> Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision:
>
> masm
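For readers unfamiliar with the Karatsuba step mentioned in the description, here is a minimal Java sketch of a 128x128-bit carry-less multiply built from three 64-bit carry-less multiplies instead of four. It is only an illustration, not the intrinsic from this PR: the class and method names (`KaratsubaClmul`, `clmul64`, `karatsuba128`) are made up for this example, the 64-bit multiply is a plain bit loop rather than a SIMD instruction, and the GHASH-specific bit ordering and the reduction modulo the GCM polynomial are left out.

```java
public class KaratsubaClmul {
    // Carry-less (XOR-based) 64x64 -> 128-bit multiply; returns {hi, lo}.
    // Hypothetical helper: a bit loop standing in for a hardware instruction.
    static long[] clmul64(long a, long b) {
        long hi = 0L, lo = 0L;
        for (int i = 0; i < 64; i++) {
            if (((b >>> i) & 1L) != 0) {
                lo ^= a << i;
                if (i != 0) {
                    hi ^= a >>> (64 - i); // bits shifted out of the low word
                }
            }
        }
        return new long[] {hi, lo};
    }

    // 128x128 -> 256-bit carry-less product of (a1:a0) and (b1:b0).
    // Result is {r3, r2, r1, r0}, most significant 64-bit word first.
    static long[] karatsuba128(long a1, long a0, long b1, long b0) {
        long[] high = clmul64(a1, b1);           // a1*b1
        long[] low  = clmul64(a0, b0);           // a0*b0
        long[] mid  = clmul64(a1 ^ a0, b1 ^ b0); // (a1^a0)*(b1^b0)
        // In GF(2): mid ^ high ^ low == a1*b0 ^ a0*b1 (the middle term).
        long midHi = mid[0] ^ high[0] ^ low[0];
        long midLo = mid[1] ^ high[1] ^ low[1];
        // product = high * x^128  ^  middle * x^64  ^  low
        return new long[] {high[0], high[1] ^ midHi, low[0] ^ midLo, low[1]};
    }

    public static void main(String[] args) {
        long[] r = karatsuba128(0L, 3L, 0L, 3L); // (x + 1)^2 over GF(2)
        System.out.printf("%016x %016x %016x %016x%n", r[0], r[1], r[2], r[3]);
    }
}
```

The saving is one 64-bit multiply per 128-bit product, paid for with a few extra XORs.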
Without the GHASH change:
Benchmark (dataMethod) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
o.o.b.j.c.full.AESGCMByteBuffer.decrypt direct 1024 128 thrpt 8 52020.466 ± 756.766 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt direct 1500 128 thrpt 8 35524.179 ± 587.709 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt direct 4096 128 thrpt 8 14065.545 ± 94.679 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt direct 16384 128 thrpt 8 3494.208 ± 36.804 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt heap 1024 128 thrpt 8 53579.051 ± 521.148 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt heap 1500 128 thrpt 8 37105.385 ± 755.540 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt heap 4096 128 thrpt 8 14122.494 ± 78.641 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt heap 16384 128 thrpt 8 3570.723 ± 18.136 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart direct 1024 128 thrpt 8 50573.814 ± 858.171 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart direct 1500 128 thrpt 8 35402.422 ± 761.839 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart direct 4096 128 thrpt 8 13948.808 ± 121.955 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart direct 16384 128 thrpt 8 3555.491 ± 27.543 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart heap 1024 128 thrpt 8 52583.092 ± 786.567 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart heap 1500 128 thrpt 8 36563.715 ± 365.381 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart heap 4096 128 thrpt 8 13974.515 ± 88.673 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart heap 16384 128 thrpt 8 3552.996 ± 25.234 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt direct 1024 128 thrpt 8 53387.361 ± 690.909 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt direct 1500 128 thrpt 8 36970.383 ± 495.504 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt direct 4096 128 thrpt 8 13919.025 ± 88.704 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt direct 16384 128 thrpt 8 3582.015 ± 12.920 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt heap 1024 128 thrpt 8 53631.653 ± 449.160 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt heap 1500 128 thrpt 8 37890.654 ± 291.797 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt heap 4096 128 thrpt 8 14324.705 ± 33.475 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt heap 16384 128 thrpt 8 3563.167 ± 18.069 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart direct 1024 128 thrpt 8 52676.705 ± 828.404 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart direct 1500 128 thrpt 8 36329.914 ± 475.700 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart direct 4096 128 thrpt 8 14062.787 ± 118.448 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart direct 16384 128 thrpt 8 3579.154 ± 16.530 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart heap 1024 128 thrpt 8 53562.594 ± 317.060 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart heap 1500 128 thrpt 8 36811.085 ± 320.696 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart heap 4096 128 thrpt 8 14086.269 ± 54.366 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart heap 16384 128 thrpt 8 3563.559 ± 19.188 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decrypt direct 1024 128 thrpt 8 52021.706 ± 827.404 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decrypt heap 1024 128 thrpt 8 53550.519 ± 457.500 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decryptMultiPart direct 1024 128 thrpt 8 50392.121 ± 890.139 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decryptMultiPart heap 1024 128 thrpt 8 52771.665 ± 547.670 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encrypt direct 1024 128 thrpt 8 53258.597 ± 758.263 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encrypt heap 1024 128 thrpt 8 54603.228 ± 343.555 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encryptMultiPart direct 1024 128 thrpt 8 52796.661 ± 870.566 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encryptMultiPart heap 1024 128 thrpt 8 53488.007 ± 441.574 ops/s
Finished running test 'micro:AESGCMByteBuffer'
Test report is stored in build/linux-ppc64le-server-fastdebug/test-results/micro_AESGCMByteBuffer
With my change:
Benchmark (dataMethod) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
o.o.b.j.c.full.AESGCMByteBuffer.decrypt direct 1024 128 thrpt 8 192164.655 ± 2499.922 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt direct 1500 128 thrpt 8 138590.675 ± 1718.893 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt direct 4096 128 thrpt 8 60015.129 ± 516.554 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt direct 16384 128 thrpt 8 15705.840 ± 101.889 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt heap 1024 128 thrpt 8 234618.808 ± 3508.043 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt heap 1500 128 thrpt 8 153490.970 ± 1991.507 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt heap 4096 128 thrpt 8 59706.883 ± 393.104 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt heap 16384 128 thrpt 8 15282.959 ± 35.228 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart direct 1024 128 thrpt 8 169563.728 ± 3262.014 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart direct 1500 128 thrpt 8 125917.360 ± 2171.133 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart direct 4096 128 thrpt 8 57233.798 ± 1219.124 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart direct 16384 128 thrpt 8 15314.450 ± 267.215 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart heap 1024 128 thrpt 8 199834.254 ± 2929.256 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart heap 1500 128 thrpt 8 143659.707 ± 2019.578 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart heap 4096 128 thrpt 8 57676.269 ± 760.886 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart heap 16384 128 thrpt 8 14899.282 ± 194.883 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt direct 1024 128 thrpt 8 217833.792 ± 2839.966 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt direct 1500 128 thrpt 8 152150.607 ± 2203.853 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt direct 4096 128 thrpt 8 60091.726 ± 812.084 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt direct 16384 128 thrpt 8 15720.273 ± 85.991 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt heap 1024 128 thrpt 8 218901.548 ± 2687.554 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt heap 1500 128 thrpt 8 153527.621 ± 1816.675 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt heap 4096 128 thrpt 8 58896.329 ± 1637.968 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt heap 16384 128 thrpt 8 15226.399 ± 17.957 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart direct 1024 128 thrpt 8 197339.940 ± 2428.986 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart direct 1500 128 thrpt 8 136931.341 ± 2782.111 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart direct 4096 128 thrpt 8 59652.962 ± 750.375 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart direct 16384 128 thrpt 8 15667.096 ± 58.490 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart heap 1024 128 thrpt 8 214639.739 ± 4077.556 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart heap 1500 128 thrpt 8 155557.214 ± 2422.094 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart heap 4096 128 thrpt 8 58895.472 ± 1538.650 ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart heap 16384 128 thrpt 8 15038.955 ± 44.792 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decrypt direct 1024 128 thrpt 8 192555.048 ± 3710.757 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decrypt heap 1024 128 thrpt 8 235177.894 ± 4321.018 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decryptMultiPart direct 1024 128 thrpt 8 167625.340 ± 2418.147 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decryptMultiPart heap 1024 128 thrpt 8 200193.172 ± 3319.042 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encrypt direct 1024 128 thrpt 8 216340.878 ± 4651.345 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encrypt heap 1024 128 thrpt 8 231760.813 ± 4271.094 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encryptMultiPart direct 1024 128 thrpt 8 195748.230 ± 5825.305 ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encryptMultiPart heap 1024 128 thrpt 8 215594.033 ± 4254.075 ops/s
Finished running test 'micro:AESGCMByteBuffer'
Test report is stored in build/linux-ppc64le-server-fastdebug/test-results/micro_AESGCMByteBuffer
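To read the two tables side by side, the speedup for a row is simply the patched score divided by the baseline score. The small Java sketch below does that for four rows copied from the `full` results above; the class name `GhashSpeedup` and the choice of rows are only for illustration, and the ± error columns are ignored.

```java
public class GhashSpeedup {
    // Ratio of patched throughput to baseline throughput (both in ops/s).
    static double speedup(double baseline, double patched) {
        return patched / baseline;
    }

    public static void main(String[] args) {
        // Rows taken from the o.o.b.j.c.full.AESGCMByteBuffer results above.
        String[] rows = {
            "decrypt direct 1024",
            "decrypt direct 16384",
            "encrypt heap 1024",
            "encrypt heap 16384",
        };
        double[] without = {52020.466, 3494.208, 53631.653, 3563.167};
        double[] with = {192164.655, 15705.840, 218901.548, 15226.399};

        for (int i = 0; i < rows.length; i++) {
            System.out.printf("%-22s %.2fx%n", rows[i], speedup(without[i], with[i]));
        }
    }
}
```

Note that both runs come from a fastdebug build (see the test-results path), so the absolute scores may differ from what a release build would show.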
-------------
PR Comment: https://git.openjdk.org/jdk/pull/20235#issuecomment-2835135608
PR Comment: https://git.openjdk.org/jdk/pull/20235#issuecomment-2835136297