RFR: 8277358: Accelerate CRC32-C [v2]
eric.caspole at oracle.com
Wed Dec 1 22:16:42 UTC 2021
Hi Scott,
Thanks for the JMH benchmark. I would like to use Mode.Throughput
(e.g. 9368.786 ± 96.956 ops/ms) so that the scores are not very tiny
numbers, and to use the default iteration settings so that a run takes
about 35 minutes instead of 1h30. What do you think? The iterations are
very stable, so the defaults are fine in my testing.
Regards,
Eric
diff --git a/test/micro/org/openjdk/bench/java/util/TestCRC32C.java b/test/micro/org/openjdk/bench/java/util/TestCRC32C.java
index 10681e19bbf..0c3b39fc59a 100644
--- a/test/micro/org/openjdk/bench/java/util/TestCRC32C.java
+++ b/test/micro/org/openjdk/bench/java/util/TestCRC32C.java
@@ -27,12 +27,10 @@ import java.util.concurrent.TimeUnit;
import java.util.zip.CRC32C;
import org.openjdk.jmh.annotations.*;
-@BenchmarkMode(Mode.AverageTime)
-@OutputTimeUnit(TimeUnit.MICROSECONDS)
+@BenchmarkMode(Mode.Throughput)
+@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(value = 2)
-@Warmup(iterations = 2, time = 30, timeUnit = TimeUnit.SECONDS)
-@Measurement(iterations = 3, time = 60, timeUnit = TimeUnit.SECONDS)
public class TestCRC32C {
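
To illustrate why the Throughput mode gives more readable scores, here is a
minimal sketch of the conversion between the two units. It assumes a
single-threaded benchmark, where a score in ops/ms is roughly the reciprocal
of the average time in us/op scaled by 1000. The class and method names are
hypothetical and not part of the patch.

```java
public class ScoreConversion {

    // Converts a throughput score (operations per millisecond) into the
    // equivalent average time per operation in microseconds.
    // Assumption: one benchmark thread, so the two scores are reciprocal.
    static double opsPerMsToMicrosPerOp(double opsPerMs) {
        return 1000.0 / opsPerMs;
    }

    public static void main(String[] args) {
        double opsPerMs = 9368.786; // example score quoted in this mail
        double microsPerOp = opsPerMsToMicrosPerOp(opsPerMs);
        // The same measurement reported as AverageTime would be close to 0.1 us/op.
        System.out.printf("%.3f ops/ms is about %.4f us/op%n", opsPerMs, microsPerOp);
    }
}
```
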
On 11/30/21 7:13 PM, Scott Gibbons wrote:
> On Wed, 1 Dec 2021 00:02:14 GMT, Scott Gibbons <duke at openjdk.java.net> wrote:
>
>>> Accelerates CRC32-C by utilizing vpclmulqdq similarly to CRC32. This change achieves ~4x throughput improvement.
>>>
>>> 5986.947899319073 MB/s => 24041.05203089616 MB/s
>>> 5840.02689336947 MB/s => 24898.781468710356 MB/s
>>>
>>> ********** Original ***********
>>>
>>>
>>> scottgi@96974-ICX32:~/crc/jdk (asgibbons-crc32c)$ java test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java 20000000
>>> offset = 0
>>> msgSize = 512 bytes
>>> iters = 20000000
>>> -------------------------------------------------------
>>> CRCs: crc = ae10ee5a, crcReference = ae10ee5a
>>> CRC32C.update(byte[]) runtime = 1.710387358 seconds
>>> CRC32C.update(byte[]) throughput = 5986.947899319073 MB/s
>>> CRCs: crc = ae10ee5a, crcReference = ae10ee5a
>>> -------------------------------------------------------
>>> CRCs: crc = ae10ee5a, crcReference = ae10ee5a
>>> CRC32C.update(ByteBuffer) runtime = 1.753416583 seconds
>>> CRC32C.update(ByteBuffer) throughput = 5840.02689336947 MB/s
>>> CRCs: crc = ae10ee5a, crcReference = ae10ee5a
>>> -------------------------------------------------------
>>>
>>>
>>>
>>>
>>> *********** With my changes: *************
>>>
>>>
>>>
>>> scottgi@96974-ICX32:~/crc/jdk (asgibbons-crc32c)$ java test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java 20000000
>>> offset = 0
>>> msgSize = 512 bytes
>>> iters = 20000000
>>> -------------------------------------------------------
>>> CRCs: crc = ae10ee5a, crcReference = ae10ee5a
>>> CRC32C.update(byte[]) runtime = 0.425938099 seconds
>>> CRC32C.update(byte[]) throughput = 24041.05203089616 MB/s
>>> CRCs: crc = ae10ee5a, crcReference = ae10ee5a
>>> -------------------------------------------------------
>>> CRCs: crc = ae10ee5a, crcReference = ae10ee5a
>>> CRC32C.update(ByteBuffer) runtime = 0.411265106 seconds
>>> CRC32C.update(ByteBuffer) throughput = 24898.781468710356 MB/s
>>> CRCs: crc = ae10ee5a, crcReference = ae10ee5a
>>> -------------------------------------------------------
>> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Adding CRC32-C microbenchmark.
> Hi, Eric. I added a microbenchmark for CRC32-C. I'm waiting for full completion, but it looks like somewhere around 40GB/s throughput on average. I'll post the results once completed.
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/6595
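
As a cross-check of the quoted byte[] figures, the sketch below recomputes
the throughput as msgSize * iters / runtime. It assumes 1 MB = 10^6 bytes,
which is inferred from the numbers rather than taken from the test source;
the class and method names are hypothetical.

```java
public class ThroughputCheck {

    // Throughput in MB/s for `iterations` updates of `msgSizeBytes` bytes each,
    // completed in `seconds`. Assumption: 1 MB = 1,000,000 bytes.
    static double megabytesPerSecond(long msgSizeBytes, long iterations, double seconds) {
        return (double) msgSizeBytes * iterations / seconds / 1_000_000.0;
    }

    public static void main(String[] args) {
        long msgSize = 512;          // bytes, from the quoted output
        long iters = 20_000_000L;    // from the quoted output

        // CRC32C.update(byte[]) runtimes from the quoted output.
        double before = megabytesPerSecond(msgSize, iters, 1.710387358);
        double after = megabytesPerSecond(msgSize, iters, 0.425938099);

        System.out.printf("before:  %.2f MB/s%n", before);
        System.out.printf("after:   %.2f MB/s%n", after);
        System.out.printf("speedup: %.2fx%n", after / before);
    }
}
```

Both results agree with the reported 5986.95 MB/s and 24041.05 MB/s, and their
ratio is consistent with the ~4x improvement stated in the PR description.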