Integrated: 8302113: Improve CRC32 intrinsic with crypto pmull on AArch64

Yi-Fan Tsai duke at openjdk.org
Fri Feb 17 14:46:54 UTC 2023


On Thu, 9 Feb 2023 02:25:27 GMT, Yi-Fan Tsai <duke at openjdk.org> wrote:

> The pmull and pmull2 instructions support 64-bit operands when the Cryptographic Extension is present. The execution throughput of this form rises from 1 on Neoverse N1 to 4 on Neoverse V1, while the latency remains 2. The CRC32 instructions did not change: latency 2, throughput 1. As a result, computing CRC32 with pmull can perform better than using the crc32 instructions.
> 
> The following test has passed.
> test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java
> 
> The throughput reported by [the micro benchmark](https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/java/util/TestCRC32.java) was measured on an EC2 c7g instance. The optimization shows an 11-99% improvement when the input is at least 384 bytes.
> 
> | input (bytes) | 64         | 128        | 256        | 384        | 511        | 512        | 1,024      |
> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | improvement   | 0.02%      | 0.02%      | 0.00%      | 16.00%     | 11.94%     | 34.75%     | 69.80%     |
> 
> | input (bytes) | 2,048      | 4,096      | 8,192      | 16,384     | 32,768     | 65,536     |
> | ------------- | ---------- | ---------- | ---------- | ---------- | ---------- | ---------- |
> | improvement   | 77.61%     | 92.33%     | 95.98%     | 97.95%     | 99.33%     | 98.36%     |
> 
> 
> Baseline
> 
> TestCRC32.testCRC32Update         64  thrpt   12  173126.358 ± 118.330  ops/ms
> TestCRC32.testCRC32Update        128  thrpt   12  112910.118 ±  47.305  ops/ms
> TestCRC32.testCRC32Update        256  thrpt   12   66601.990 ±   7.294  ops/ms
> TestCRC32.testCRC32Update        384  thrpt   12   47229.319 ±   3.949  ops/ms
> TestCRC32.testCRC32Update        511  thrpt   12   33733.119 ±   4.076  ops/ms
> TestCRC32.testCRC32Update        512  thrpt   12   36584.565 ±   4.211  ops/ms
> TestCRC32.testCRC32Update       1024  thrpt   12   19239.083 ±   1.040  ops/ms
> TestCRC32.testCRC32Update       2048  thrpt   12    9875.652 ±   0.435  ops/ms
> TestCRC32.testCRC32Update       4096  thrpt   12    5004.425 ±   0.290  ops/ms
> TestCRC32.testCRC32Update       8192  thrpt   12    2519.185 ±   0.169  ops/ms
> TestCRC32.testCRC32Update      16384  thrpt   12    1263.909 ±   0.194  ops/ms
> TestCRC32.testCRC32Update      32768  thrpt   12     632.018 ±   0.053  ops/ms
> TestCRC32.testCRC32Update      65536  thrpt   12     315.471 ±   0.095  ops/ms
> 
> 
> Crypto pmull
> 
> TestCRC32.testCRC32Update         64  thrpt   12  173168.669 ±   4.746  ops/ms
> TestCRC32.testCRC32Update        128  thrpt   12  112933.519 ±   4.583  ops/ms
> TestCRC32.testCRC32Update        256  thrpt   12   66602.462 ±   3.150  ops/ms
> TestCRC32.testCRC32Update        384  thrpt   12   54784.739 ±   2.110  ops/ms
> TestCRC32.testCRC32Update        511  thrpt   12   37760.816 ±  69.911  ops/ms
> TestCRC32.testCRC32Update        512  thrpt   12   49297.609 ±  21.983  ops/ms
> TestCRC32.testCRC32Update       1024  thrpt   12   32667.507 ±  90.610  ops/ms
> TestCRC32.testCRC32Update       2048  thrpt   12   17539.986 ± 511.416  ops/ms
> TestCRC32.testCRC32Update       4096  thrpt   12    9625.249 ±   9.713  ops/ms
> TestCRC32.testCRC32Update       8192  thrpt   12    4937.135 ±   6.121  ops/ms
> TestCRC32.testCRC32Update      16384  thrpt   12    2501.936 ±   1.270  ops/ms
> TestCRC32.testCRC32Update      32768  thrpt   12    1259.831 ±   0.119  ops/ms
> TestCRC32.testCRC32Update      65536  thrpt   12     625.773 ±   0.242  ops/ms
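
The improvement percentages quoted above appear to be the relative throughput gain of the crypto pmull run over the baseline run. A minimal sketch of that arithmetic, assuming the formula (optimized / baseline - 1) * 100; the class and method names are hypothetical and not part of the patch:

```java
import java.util.Locale;

final class CrcImprovement {
    /** Relative throughput gain of optimized over baseline, in percent. */
    static double improvementPercent(double baseline, double optimized) {
        return (optimized / baseline - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // {input size in bytes, baseline ops/ms, crypto pmull ops/ms},
        // copied from the benchmark results quoted above.
        double[][] rows = {
            {384, 47229.319, 54784.739},
            {512, 36584.565, 49297.609},
            {65536, 315.471, 625.773},
        };
        for (double[] r : rows) {
            System.out.printf(Locale.ROOT, "%6d bytes: %.2f%%%n",
                    (long) r[0], improvementPercent(r[1], r[2]));
        }
    }
}
```

Only the mean throughput is used here; the reported error margins are ignored.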

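For readers unfamiliar with pmull: the 64-bit form computes a carry-less (polynomial, GF(2)) product of two 64-bit operands and yields a 128-bit result. The sketch below models that operation bit by bit in plain Java, purely to illustrate the semantics. It is not the HotSpot intrinsic, and the class and method names are hypothetical.

```java
final class CarrylessMultiply {
    /**
     * Multiplies two 64-bit polynomials over GF(2) and returns the
     * 128-bit product as {high 64 bits, low 64 bits}.
     */
    static long[] clmul(long a, long b) {
        long hi = 0;
        long lo = 0;
        for (int i = 0; i < 64; i++) {
            if (((b >>> i) & 1L) != 0) {
                // XOR replaces addition, so no carries propagate between bits.
                lo ^= a << i;
                // Bits shifted out of the low word land in the high word.
                // Java masks shift counts to 6 bits, so i == 0 is skipped.
                if (i != 0) {
                    hi ^= a >>> (64 - i);
                }
            }
        }
        return new long[] {hi, lo};
    }

    public static void main(String[] args) {
        // (x + 1)^2 = x^2 + 1 over GF(2): the cross terms cancel.
        long[] p = clmul(0b11L, 0b11L);
        System.out.println(Long.toBinaryString(p[1]));
    }
}
```

A CRC is a remainder of polynomial division over GF(2), which is why a wide carry-less multiply can be used to fold large blocks of input at once.
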
This pull request has now been integrated.

Changeset: 57fde75b
Author:    Yi-Fan Tsai <yftsai at amazon.com>
Committer: Volker Simonis <simonis at openjdk.org>
URL:       https://git.openjdk.org/jdk/commit/57fde75b2a9d853c2abe1396ace6a83d198dd284
Stats:     215 lines in 5 files changed: 213 ins; 0 del; 2 mod

8302113: Improve CRC32 intrinsic with crypto pmull on AArch64

Reviewed-by: simonis

-------------

PR: https://git.openjdk.org/jdk/pull/12480

