Expected vs. observed performance of java.util.zip.CRC32 in Java 7 and 8
Ariel Weisberg
ariel at weisberg.ws
Wed Jan 21 17:34:19 UTC 2015
Hi,
Hopefully this is the right mailing list. I have some questions about
the performance of java.util.zip.CRC32 in OpenJDK.
I heard that CRC32 became an intrinsic in Java 8 that uses hardware
support if available. I tried it out, but got some odd numbers and I am
having trouble nailing down the difference between Java 7 and 8.
I saw some mailing list traffic mentioning that Adler32 got the
intrinsic treatment as well, but looking at the source I don't see any
mention in C1. Maybe it is just non-obvious?
I compared the JDK CRC32 and Adler32 with a pure Java slicing-by-8
implementation. I was not expecting to see 13 gigabytes/second per core,
certainly not in Java 7, or in Java 8 without hardware support. I also
was not expecting the results to be so close for small sizes but so far
apart for large sizes. I would expect the throughput to level out
quickly as you start checksumming larger sizes.
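For anyone unfamiliar with the technique, here is a minimal sketch of a slicing-by-8 CRC-32 in plain Java. It is not the benchmarked code from the pastebin; the class name SlicingBy8Crc32 and its method names are made up for illustration, and it assumes the same reflected polynomial (0xEDB88320) that java.util.zip.CRC32 uses. The idea is to consume eight input bytes per loop iteration using eight 256-entry lookup tables instead of one.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Illustrative slicing-by-8 CRC-32; hypothetical class, not the benchmarked one. */
final class SlicingBy8Crc32 {
    private static final int POLY = 0xEDB88320; // reflected CRC-32 polynomial
    private static final int[][] T = new int[8][256];

    static {
        // T[0] is the classic byte-at-a-time table.
        for (int i = 0; i < 256; i++) {
            int c = i;
            for (int bit = 0; bit < 8; bit++) {
                c = (c & 1) != 0 ? (c >>> 1) ^ POLY : c >>> 1;
            }
            T[0][i] = c;
        }
        // T[k] advances the CRC of a byte by k additional zero bytes.
        for (int i = 0; i < 256; i++) {
            for (int k = 1; k < 8; k++) {
                int prev = T[k - 1][i];
                T[k][i] = (prev >>> 8) ^ T[0][prev & 0xFF];
            }
        }
    }

    /** Reads four bytes starting at off as a little-endian int. */
    private static int readIntLE(byte[] b, int off) {
        return (b[off] & 0xFF)
                | (b[off + 1] & 0xFF) << 8
                | (b[off + 2] & 0xFF) << 16
                | (b[off + 3] & 0xFF) << 24;
    }

    /** Returns the CRC-32 of b[off .. off+len) as an unsigned 32-bit value. */
    static long crc32(byte[] b, int off, int len) {
        int crc = 0xFFFFFFFF;
        int end = off + len;
        // Main loop: eight bytes per iteration, one lookup per table.
        while (end - off >= 8) {
            int one = readIntLE(b, off) ^ crc;
            int two = readIntLE(b, off + 4);
            crc = T[7][one & 0xFF] ^ T[6][(one >>> 8) & 0xFF]
                    ^ T[5][(one >>> 16) & 0xFF] ^ T[4][one >>> 24]
                    ^ T[3][two & 0xFF] ^ T[2][(two >>> 8) & 0xFF]
                    ^ T[1][(two >>> 16) & 0xFF] ^ T[0][two >>> 24];
            off += 8;
        }
        // Tail: fall back to one byte at a time.
        while (off < end) {
            crc = (crc >>> 8) ^ T[0][(crc ^ b[off++]) & 0xFF];
        }
        return (~crc) & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        CRC32 jdk = new CRC32();
        jdk.update(data, 0, data.length);
        // Both lines should print the same value if the sketch is correct.
        System.out.println(Long.toHexString(crc32(data, 0, data.length)));
        System.out.println(Long.toHexString(jdk.getValue()));
    }
}
```

The main loop only pays off once the input is at least eight bytes long, and for short inputs the fixed per-call overhead dominates in any implementation.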
The parameter is the number of bytes being checksummed. The JMH code is
at http://pastebin.com/y96EFwcL. I ran it on a Haswell quad-core MacBook
Pro. I can tell the correct JDK is running because the benchmark setup
prints the java.version property.
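Since the scores are reported in ops/s and each op checksums byteSize bytes, throughput in bytes per second is simply the score multiplied by byteSize. The sketch below does that arithmetic for a few of the jdk1.7.0_71 scores. The class and method names are made up for illustration, and GB here means 10^9 bytes.

```java
/** Converts JMH ops/s scores into bytes per second; hypothetical helper. */
final class ThroughputEstimate {
    /** One op checksums byteSize bytes, so bytes/s = ops/s * byteSize. */
    static double bytesPerSecond(double opsPerSecond, long byteSize) {
        return opsPerSecond * byteSize;
    }

    private static void report(String name, double opsPerSecond, long byteSize) {
        double gb = bytesPerSecond(opsPerSecond, byteSize) / 1e9;
        System.out.printf("%-20s %8d bytes  %.2f GB/s%n", name, byteSize, gb);
    }

    public static void main(String[] args) {
        // Scores copied from the jdk1.7.0_71 results.
        report("CRC32OriginalArray", 8675457.907, 128);
        report("PureJavaCrc32Array", 11013067.959, 128);
        report("CRC32OriginalArray", 13103.481, 1048576);
        report("PureJavaCrc32Array", 1483.413, 1048576);
    }
}
```

For the 1048576-byte case this works out to roughly 13.7 GB/s for the JDK CRC32 and roughly 1.6 GB/s for the pure Java version, which is where the 13 gigabytes/second figure comes from.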
Apologies for the formatting disaster. I attached the output as a text
file so the columns line up.
If you could shed any light on what the performance should be, it would
help when choosing whether to use the JDK CRC32 implementation or
another one depending on the current runtime.
Thanks,
Ariel
jdk1.7.0_71
[java] Benchmark                            (byteSize)   Mode  Samples         Score          Error  Units
[java] o.a.c.t.m.Sample.Adler32Array               128  thrpt        6   9484470.705 ±  3544496.362  ops/s
[java] o.a.c.t.m.Sample.Adler32Array               512  thrpt        6   7553107.572 ±  2017788.822  ops/s
[java] o.a.c.t.m.Sample.Adler32Array              1024  thrpt        6   5925103.324 ±  1237581.263  ops/s
[java] o.a.c.t.m.Sample.Adler32Array           1048576  thrpt        6     12958.857 ±      313.405  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray         128  thrpt        6   8675457.907 ±  2920818.797  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray         512  thrpt        6   6906837.280 ±   949573.737  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray        1024  thrpt        6   5537421.656 ±   658220.086  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray     1048576  thrpt        6     13103.481 ±      400.833  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array         128  thrpt        6  11013067.959 ±   688587.910  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array         512  thrpt        6   2991944.703 ±    72920.216  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array        1024  thrpt        6   1516586.386 ±    68061.147  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array     1048576  thrpt        6      1483.413 ±       88.246  ops/s
jdk1.8.0_25
[java] Benchmark                            (byteSize)   Mode  Samples         Score          Error  Units
[java] o.a.c.t.m.Sample.Adler32Array               128  thrpt        6   9216616.134 ±  4037583.644  ops/s
[java] o.a.c.t.m.Sample.Adler32Array               512  thrpt        6   7470492.783 ±  2059216.459  ops/s
[java] o.a.c.t.m.Sample.Adler32Array              1024  thrpt        6   5792188.710 ±  1363845.066  ops/s
[java] o.a.c.t.m.Sample.Adler32Array           1048576  thrpt        6     12273.582 ±     1077.855  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray         128  thrpt        6  25619960.160 ± 34966219.926  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray         512  thrpt        6  13858414.869 ±  9113069.866  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray        1024  thrpt        6   8781575.118 ±  3524581.216  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray     1048576  thrpt        6     12631.141 ±     1340.355  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array         128  thrpt        6  10772800.588 ±   398939.396  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array         512  thrpt        6   2917829.545 ±    96841.740  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array        1024  thrpt        6   1541478.127 ±    27234.072  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array     1048576  thrpt        6      1526.471 ±       27.515  ops/s
jdk1.8.0_25 with -XX:-UseCLMUL -XX:-UseCRC32Intrinsics
[java] Benchmark                            (byteSize)   Mode  Samples         Score          Error  Units
[java] o.a.c.t.m.Sample.Adler32Array               128  thrpt        6   9469461.434 ±  3770031.916  ops/s
[java] o.a.c.t.m.Sample.Adler32Array               512  thrpt        6   7675500.241 ±  2044377.592  ops/s
[java] o.a.c.t.m.Sample.Adler32Array              1024  thrpt        6   5693849.558 ±  1477732.642  ops/s
[java] o.a.c.t.m.Sample.Adler32Array           1048576  thrpt        6     13110.626 ±      386.136  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray         128  thrpt        6   8850789.721 ±  3204385.552  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray         512  thrpt        6   7118956.038 ±  1004576.517  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray        1024  thrpt        6   5590757.571 ±   448188.497  ops/s
[java] o.a.c.t.m.Sample.CRC32OriginalArray     1048576  thrpt        6     13270.839 ±      598.086  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array         128  thrpt        6  11351558.878 ±   546114.618  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array         512  thrpt        6   3011605.270 ±   217000.560  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array        1024  thrpt        6   1529659.010 ±    70247.359  ops/s
[java] o.a.c.t.m.Sample.PureJavaCrc32Array     1048576  thrpt        6      1466.362 ±      211.293  ops/s
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: benchmark.txt
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20150121/62b1274e/benchmark.txt>