RFR: JDK-8216437 : PPC64: Add intrinsic for GHASH algorithm [v31]

Suchismith Roy sroy at openjdk.org
Mon Apr 28 12:51:55 UTC 2025


On Thu, 24 Apr 2025 14:13:50 GMT, Suchismith Roy <sroy at openjdk.org> wrote:

>> JBS Issue : [JDK-8216437](https://bugs.openjdk.org/browse/JDK-8216437)
>> 
>> Currently acceleration code for GHASH is missing for PPC64. 
>> 
>> The current implementation utilises SIMD instructions on Power and uses Karatsuba multiplication to obtain the final result.
>
> Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision:
> 
>   masm
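
For readers unfamiliar with the Karatsuba step mentioned in the description, here is a minimal scalar sketch in plain Java. It is not the stub code from this PR: the class and method names are hypothetical, it uses a bit-by-bit loop where the intrinsic uses Power vector instructions, and it leaves out GHASH's bit ordering and the reduction modulo x^128 + x^7 + x^2 + x + 1. It only shows how a 128x128-bit carry-less product can be built from three 64x64-bit products instead of four.

```java
// Hypothetical illustration only; names are not taken from the PR.
final class KaratsubaClmulSketch {

    /** Carry-less (GF(2)) product of two 64-bit polynomials; returns {low, high}. */
    static long[] clmul64(long a, long b) {
        long lo = 0L, hi = 0L;
        for (int i = 0; i < 64; i++) {
            if (((b >>> i) & 1L) != 0L) {
                lo ^= a << i;
                if (i != 0) {
                    hi ^= a >>> (64 - i); // bits of a that spill past bit 63
                }
            }
        }
        return new long[] {lo, hi};
    }

    /**
     * 128x128 -> 256-bit carry-less product using three 64-bit multiplies.
     * Result words are ordered from least to most significant.
     * No modular reduction is performed here.
     */
    static long[] clmul128(long aHi, long aLo, long bHi, long bLo) {
        long[] z0 = clmul64(aLo, bLo);             // low halves
        long[] z2 = clmul64(aHi, bHi);             // high halves
        long[] z1 = clmul64(aLo ^ aHi, bLo ^ bHi); // (a0 + a1)(b0 + b1)
        // In GF(2), addition and subtraction are both XOR, so the middle
        // term a0*b1 + a1*b0 is z1 ^ z0 ^ z2.
        long midLo = z1[0] ^ z0[0] ^ z2[0];
        long midHi = z1[1] ^ z0[1] ^ z2[1];
        return new long[] {z0[0], z0[1] ^ midLo, z2[0] ^ midHi, z2[1]};
    }

    public static void main(String[] args) {
        // (x + 1)^2 = x^2 + 1 over GF(2), i.e. 3 * 3 = 5 without carries.
        long[] r = clmul128(0L, 3L, 0L, 3L);
        System.out.println(Long.toHexString(r[0])); // prints 5
    }
}
```

The saving is one 64-bit multiply per 128-bit product, paid for with a few extra XORs.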

Without the GHASH change:


Benchmark                                          (dataMethod)  (dataSize)  (keyLength)  (provider)   Mode  Cnt      Score     Error  Units
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                  direct        1024          128              thrpt    8  52020.466 ± 756.766  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                  direct        1500          128              thrpt    8  35524.179 ± 587.709  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                  direct        4096          128              thrpt    8  14065.545 ±  94.679  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                  direct       16384          128              thrpt    8   3494.208 ±  36.804  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                    heap        1024          128              thrpt    8  53579.051 ± 521.148  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                    heap        1500          128              thrpt    8  37105.385 ± 755.540  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                    heap        4096          128              thrpt    8  14122.494 ±  78.641  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                    heap       16384          128              thrpt    8   3570.723 ±  18.136  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart         direct        1024          128              thrpt    8  50573.814 ± 858.171  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart         direct        1500          128              thrpt    8  35402.422 ± 761.839  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart         direct        4096          128              thrpt    8  13948.808 ± 121.955  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart         direct       16384          128              thrpt    8   3555.491 ±  27.543  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart           heap        1024          128              thrpt    8  52583.092 ± 786.567  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart           heap        1500          128              thrpt    8  36563.715 ± 365.381  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart           heap        4096          128              thrpt    8  13974.515 ±  88.673  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart           heap       16384          128              thrpt    8   3552.996 ±  25.234  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                  direct        1024          128              thrpt    8  53387.361 ± 690.909  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                  direct        1500          128              thrpt    8  36970.383 ± 495.504  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                  direct        4096          128              thrpt    8  13919.025 ±  88.704  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                  direct       16384          128              thrpt    8   3582.015 ±  12.920  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                    heap        1024          128              thrpt    8  53631.653 ± 449.160  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                    heap        1500          128              thrpt    8  37890.654 ± 291.797  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                    heap        4096          128              thrpt    8  14324.705 ±  33.475  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                    heap       16384          128              thrpt    8   3563.167 ±  18.069  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart         direct        1024          128              thrpt    8  52676.705 ± 828.404  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart         direct        1500          128              thrpt    8  36329.914 ± 475.700  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart         direct        4096          128              thrpt    8  14062.787 ± 118.448  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart         direct       16384          128              thrpt    8   3579.154 ±  16.530  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart           heap        1024          128              thrpt    8  53562.594 ± 317.060  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart           heap        1500          128              thrpt    8  36811.085 ± 320.696  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart           heap        4096          128              thrpt    8  14086.269 ±  54.366  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart           heap       16384          128              thrpt    8   3563.559 ±  19.188  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decrypt                 direct        1024          128              thrpt    8  52021.706 ± 827.404  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decrypt                   heap        1024          128              thrpt    8  53550.519 ± 457.500  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decryptMultiPart        direct        1024          128              thrpt    8  50392.121 ± 890.139  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decryptMultiPart          heap        1024          128              thrpt    8  52771.665 ± 547.670  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encrypt                 direct        1024          128              thrpt    8  53258.597 ± 758.263  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encrypt                   heap        1024          128              thrpt    8  54603.228 ± 343.555  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encryptMultiPart        direct        1024          128              thrpt    8  52796.661 ± 870.566  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encryptMultiPart          heap        1024          128              thrpt    8  53488.007 ± 441.574  ops/s
Finished running test 'micro:AESGCMByteBuffer'
Test report is stored in build/linux-ppc64le-server-fastdebug/test-results/micro_AESGCMByteBuffer

With my change:
Benchmark                                          (dataMethod)  (dataSize)  (keyLength)  (provider)   Mode  Cnt       Score      Error  Units
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                  direct        1024          128              thrpt    8  192164.655 ± 2499.922  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                  direct        1500          128              thrpt    8  138590.675 ± 1718.893  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                  direct        4096          128              thrpt    8   60015.129 ±  516.554  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                  direct       16384          128              thrpt    8   15705.840 ±  101.889  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                    heap        1024          128              thrpt    8  234618.808 ± 3508.043  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                    heap        1500          128              thrpt    8  153490.970 ± 1991.507  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                    heap        4096          128              thrpt    8   59706.883 ±  393.104  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decrypt                    heap       16384          128              thrpt    8   15282.959 ±   35.228  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart         direct        1024          128              thrpt    8  169563.728 ± 3262.014  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart         direct        1500          128              thrpt    8  125917.360 ± 2171.133  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart         direct        4096          128              thrpt    8   57233.798 ± 1219.124  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart         direct       16384          128              thrpt    8   15314.450 ±  267.215  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart           heap        1024          128              thrpt    8  199834.254 ± 2929.256  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart           heap        1500          128              thrpt    8  143659.707 ± 2019.578  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart           heap        4096          128              thrpt    8   57676.269 ±  760.886  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.decryptMultiPart           heap       16384          128              thrpt    8   14899.282 ±  194.883  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                  direct        1024          128              thrpt    8  217833.792 ± 2839.966  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                  direct        1500          128              thrpt    8  152150.607 ± 2203.853  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                  direct        4096          128              thrpt    8   60091.726 ±  812.084  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                  direct       16384          128              thrpt    8   15720.273 ±   85.991  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                    heap        1024          128              thrpt    8  218901.548 ± 2687.554  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                    heap        1500          128              thrpt    8  153527.621 ± 1816.675  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                    heap        4096          128              thrpt    8   58896.329 ± 1637.968  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encrypt                    heap       16384          128              thrpt    8   15226.399 ±   17.957  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart         direct        1024          128              thrpt    8  197339.940 ± 2428.986  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart         direct        1500          128              thrpt    8  136931.341 ± 2782.111  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart         direct        4096          128              thrpt    8   59652.962 ±  750.375  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart         direct       16384          128              thrpt    8   15667.096 ±   58.490  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart           heap        1024          128              thrpt    8  214639.739 ± 4077.556  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart           heap        1500          128              thrpt    8  155557.214 ± 2422.094  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart           heap        4096          128              thrpt    8   58895.472 ± 1538.650  ops/s
o.o.b.j.c.full.AESGCMByteBuffer.encryptMultiPart           heap       16384          128              thrpt    8   15038.955 ±   44.792  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decrypt                 direct        1024          128              thrpt    8  192555.048 ± 3710.757  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decrypt                   heap        1024          128              thrpt    8  235177.894 ± 4321.018  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decryptMultiPart        direct        1024          128              thrpt    8  167625.340 ± 2418.147  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.decryptMultiPart          heap        1024          128              thrpt    8  200193.172 ± 3319.042  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encrypt                 direct        1024          128              thrpt    8  216340.878 ± 4651.345  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encrypt                   heap        1024          128              thrpt    8  231760.813 ± 4271.094  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encryptMultiPart        direct        1024          128              thrpt    8  195748.230 ± 5825.305  ops/s
o.o.b.j.c.small.AESGCMByteBuffer.encryptMultiPart          heap        1024          128              thrpt    8  215594.033 ± 4254.075  ops/s
Finished running test 'micro:AESGCMByteBuffer'
Test report is stored in build/linux-ppc64le-server-fastdebug/test-results/micro_AESGCMByteBuffer
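
To make the two tables easier to compare, the throughput ratio can be computed directly from the Score columns. The snippet below is only a convenience sketch with a hypothetical class name. It uses the full.AESGCMByteBuffer.decrypt / direct rows copied from the output above and ignores the reported error margins. For those rows the ratio comes out at roughly 3.7x for 1024 bytes, rising to roughly 4.5x for 16384 bytes.

```java
// Hypothetical helper; scores are copied from the benchmark output above.
final class GhashSpeedupSketch {

    /** Ratio of throughput with the intrinsic to throughput without it. */
    static double speedup(double withIntrinsic, double baseline) {
        return withIntrinsic / baseline;
    }

    public static void main(String[] args) {
        // full.AESGCMByteBuffer.decrypt, dataMethod = direct, keyLength = 128 (ops/s)
        int[] sizes = {1024, 1500, 4096, 16384};
        double[] baseline = {52020.466, 35524.179, 14065.545, 3494.208};
        double[] patched = {192164.655, 138590.675, 60015.129, 15705.840};
        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("%5d bytes: %.2fx%n",
                    sizes[i], speedup(patched[i], baseline[i]));
        }
    }
}
```

Both runs report a fastdebug build directory, so the absolute numbers are best read as relative indicators.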

-------------

PR Comment: https://git.openjdk.org/jdk/pull/20235#issuecomment-2835135608
PR Comment: https://git.openjdk.org/jdk/pull/20235#issuecomment-2835136297

