RFR: JDK-8184947: ZipCoder performance improvements

Sun Dec 10 23:12:49 UTC 2017

Hi Sherman,

On 2017-12-09 00:09, Xueming Shen wrote:
> Hi,
>
> Please help review the changes for j.u.z.ZipCoder/JDK-8184947 (which 
> also includes
> cleanup/improvement work in java.lang.StringCoding.java to speed up 
> general String
> coding performance, especially for UTF8).
>
> issue: https://bugs.openjdk.java.net/browse/JDK-8184947
> webrev: http://cr.openjdk.java.net/~sherman/8184947/webrev

I've not fully reviewed this yet, but something struck me halfway 
through: As the ASCII
fast-path is what's really important here, we could write that part 
without ever having
to go via a StringCoding.Result.

On four of your ZipCodingBM micros this improves performance a bit 
further (roughly 10-14%):

diff -r 848591d85052 src/java.base/share/classes/java/lang/StringCoding.java

--- a/src/java.base/share/classes/java/lang/StringCoding.java    Sun Dec 10 18:48:21 2017 +0100
+++ b/src/java.base/share/classes/java/lang/StringCoding.java    Sun Dec 10 18:55:38 2017 +0100
@@ -937,7 +937,13 @@
       * Throws iae, instead of replacing, if malformed or unmmappble.
       */
      static String newStringUTF8NoRepl(byte[] bytes, int off, int len) {
-        Result ret = decodeUTF8(bytes, off, len, false);
+        if (COMPACT_STRINGS && !hasNegatives(bytes, off, len)) {
+            return new String(Arrays.copyOfRange(bytes, off, off + len), LATIN1);
+        }
+        Result ret = decodeUTF8_0(bytes, off, len, false);
          return new String(ret.value, ret.coder);
      }

Before:
Benchmark                Mode  Cnt    Score   Error  Units
ZipCodingBM.jf_entries   avgt   25   43.682 ± 0.656  us/op
ZipCodingBM.jf_stream    avgt   25   42.075 ± 0.444  us/op
ZipCodingBM.zf_entries   avgt   25   43.323 ± 0.572  us/op
ZipCodingBM.zf_stream    avgt   25   40.237 ± 0.604  us/op

After:
Benchmark                Mode  Cnt    Score   Error  Units
ZipCodingBM.jf_entries   avgt   25   37.551 ± 1.198  us/op
ZipCodingBM.jf_stream    avgt   25   38.065 ± 0.628  us/op
ZipCodingBM.zf_entries   avgt   25   37.595 ± 0.686  us/op
ZipCodingBM.zf_stream    avgt   25   35.734 ± 0.442  us/op

(I don't know which jar you're using as test.jar, but the results seem 
consistent across a
few different ones)

The gain is achieved by not going via the 
ThreadLocal<StringCoding.Result> resultCache,
which checks out when inspecting the perfasm output.
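
For anyone who wants to try the idea outside java.lang, here is a minimal 
sketch of the same fast path using only public APIs. It is not the patch 
itself: the class name AsciiFastPath is made up, hasNegatives below is a 
plain-loop stand-in for the JDK-internal (intrinsified) method, and the 
slow path uses a CharsetDecoder rather than decodeUTF8_0.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class AsciiFastPath {
    // Stand-in for the JDK-internal hasNegatives (assumption: same contract):
    // returns true if any byte in [off, off + len) has its high bit set.
    static boolean hasNegatives(byte[] bytes, int off, int len) {
        for (int i = off; i < off + len; i++) {
            if (bytes[i] < 0) {
                return true;
            }
        }
        return false;
    }

    // Decodes UTF-8 without replacement; throws IllegalArgumentException
    // on malformed or unmappable input.
    static String newStringUTF8NoRepl(byte[] bytes, int off, int len) {
        if (!hasNegatives(bytes, off, len)) {
            // Pure ASCII: each byte is the same code point in ISO-8859-1
            // and UTF-8, so no intermediate result object is needed.
            return new String(bytes, off, len, StandardCharsets.ISO_8859_1);
        }
        try {
            // Slow path: strict UTF-8 decoding that reports errors.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes, off, len))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

The point is only that the common case (ASCII entry names) returns before 
any result holder is touched.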

I'm a bit skeptical of ThreadLocal caching optimizations for such small 
objects
(StringCoding.Result), and wonder if there's something else we can do to 
help the
optimizer out here, possibly eliminating the allocation entirely.
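
To make the two options concrete, here is a rough sketch of the patterns I 
have in mind. Everything in it is hypothetical (ResultCacheSketch, the 
Result holder and both decode methods are illustrative stand-ins, not the 
StringCoding code), and I have not measured whether the JIT actually 
removes the allocation in the second variant; that would depend on 
inlining and escape analysis.

```java
public class ResultCacheSketch {
    // Hypothetical stand-in for a small (value, coder) result holder.
    static final class Result {
        byte[] value;
        byte coder;

        Result with(byte[] value, byte coder) {
            this.value = value;
            this.coder = coder;
            return this;
        }
    }

    // One reusable Result per thread.
    private static final ThreadLocal<Result> RESULT_CACHE =
            ThreadLocal.withInitial(Result::new);

    // Variant A: avoids the allocation, but every call pays for a
    // ThreadLocal lookup and returns a shared, mutable object.
    static Result decodeCached(byte[] bytes) {
        return RESULT_CACHE.get().with(bytes, (byte) 0);
    }

    // Variant B: allocates a fresh, short-lived object. If the caller is
    // inlined and the object does not escape, the JIT may be able to
    // scalar-replace it (an assumption, not verified here).
    static Result decodeFresh(byte[] bytes) {
        return new Result().with(bytes, (byte) 0);
    }
}
```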

Thanks!

/Claes