RFR: JDK-8184947: ZipCoder performance improvements
Claes Redestad
claes.redestad at oracle.com
Sun Dec 10 23:12:49 UTC 2017
Hi Sherman,
On 2017-12-09 00:09, Xueming Shen wrote:
> Hi,
>
> Please help review the changes for j.u.z.ZipCoder/JDK-8184947 (which
> also includes
> cleanup/improvement work in java.lang.StringCoding.java to speed up
> general String
> coding performance, especially for UTF8).
>
> issue: https://bugs.openjdk.java.net/browse/JDK-8184947
> webrev: http://cr.openjdk.java.net/~sherman/8184947/webrev
I've not fully reviewed this yet, but something struck me halfway through:
as the ASCII fast-path is what's really important here, we could write that
part without ever having to go via a StringCoding.Result.
On four of your ZipCodingBM micros this improves performance a bit further
(about 10-14%):
diff -r 848591d85052 src/java.base/share/classes/java/lang/StringCoding.java
--- a/src/java.base/share/classes/java/lang/StringCoding.java Sun Dec 10 18:48:21 2017 +0100
+++ b/src/java.base/share/classes/java/lang/StringCoding.java Sun Dec 10 18:55:38 2017 +0100
@@ -937,7 +937,13 @@
      * Throws iae, instead of replacing, if malformed or unmmappble.
      */
     static String newStringUTF8NoRepl(byte[] bytes, int off, int len) {
-        Result ret = decodeUTF8(bytes, off, len, false);
+        if (COMPACT_STRINGS && !hasNegatives(bytes, off, len)) {
+            return new String(Arrays.copyOfRange(bytes, off, off + len), LATIN1);
+        }
+        Result ret = decodeUTF8_0(bytes, off, len, false);
         return new String(ret.value, ret.coder);
     }
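To illustrate the idea outside the JDK, here is a minimal standalone sketch of
the same fast path built only on public APIs. It is not the patch itself:
the class name AsciiFastPathSketch is made up, hasNegatives is a plain loop
standing in for the JDK-internal (intrinsified) StringCoding.hasNegatives, and
a strict CharsetDecoder stands in for decodeUTF8_0 on the slow path.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class AsciiFastPathSketch {

    // Assumption: a simple loop in place of the JDK-internal hasNegatives.
    static boolean hasNegatives(byte[] bytes, int off, int len) {
        for (int i = off; i < off + len; i++) {
            if (bytes[i] < 0) {
                return true; // high bit set, so not pure ASCII
            }
        }
        return false;
    }

    // Decodes UTF-8, throwing IllegalArgumentException instead of replacing.
    static String newStringUTF8NoRepl(byte[] bytes, int off, int len) {
        if (!hasNegatives(bytes, off, len)) {
            // Fast path: ASCII bytes decode identically as ISO-8859-1,
            // so no intermediate result holder is needed.
            return new String(bytes, off, len, StandardCharsets.ISO_8859_1);
        }
        try {
            // Slow path: strict decoding that reports malformed input.
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes, off, len))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Zip entry names are overwhelmingly ASCII, so in a setup like this almost every
call would return from the first branch.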
Before:
Benchmark               Mode  Cnt   Score   Error  Units
ZipCodingBM.jf_entries  avgt   25  43.682 ± 0.656  us/op
ZipCodingBM.jf_stream   avgt   25  42.075 ± 0.444  us/op
ZipCodingBM.zf_entries  avgt   25  43.323 ± 0.572  us/op
ZipCodingBM.zf_stream   avgt   25  40.237 ± 0.604  us/op
After:
Benchmark               Mode  Cnt   Score   Error  Units
ZipCodingBM.jf_entries  avgt   25  37.551 ± 1.198  us/op
ZipCodingBM.jf_stream   avgt   25  38.065 ± 0.628  us/op
ZipCodingBM.zf_entries  avgt   25  37.595 ± 0.686  us/op
ZipCodingBM.zf_stream   avgt   25  35.734 ± 0.442  us/op
(I don't know which jar you're using as test.jar, but the results seem
consistent across a few different ones.)
The gain is achieved by not going via the ThreadLocal<StringCoding.Result>
resultCache, which checks out when inspecting the perfasm output.
I'm a bit skeptical of ThreadLocal caching optimizations for such small
objects (StringCoding.Result), and wonder if there's something else we can
do to help the optimizer out here, possibly eliminating the allocation
entirely.
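To make the trade-off concrete, here is a rough sketch of the two shapes I
have in mind. Everything in it is hypothetical: Result is a simplified
stand-in for StringCoding.Result, and the class and method names are made up.
Whether the JIT actually removes the allocation in the second variant (via
escape analysis, once everything is inlined) is an assumption that would have
to be confirmed with a benchmark and perfasm.

```java
import java.nio.charset.StandardCharsets;

final class ResultCacheSketch {

    // Hypothetical stand-in for StringCoding.Result: a tiny mutable holder.
    static final class Result {
        byte[] value;
        byte coder;

        Result set(byte[] value, byte coder) {
            this.value = value;
            this.coder = coder;
            return this;
        }
    }

    private static final ThreadLocal<Result> resultCache =
            ThreadLocal.withInitial(Result::new);

    // Variant A: reuse one holder per thread. This avoids allocating a
    // Result, but every call pays for a ThreadLocal lookup.
    static Result viaThreadLocal(byte[] bytes) {
        return resultCache.get().set(bytes.clone(), (byte) 0);
    }

    // Variant B: allocate a fresh holder on every call. If the holder never
    // escapes after inlining, the JIT may be able to drop the allocation.
    static Result viaFreshObject(byte[] bytes) {
        return new Result().set(bytes.clone(), (byte) 0);
    }

    // Consumes the holder right away, as the String constructors would.
    static String toLatin1(Result r) {
        return new String(r.value, StandardCharsets.ISO_8859_1);
    }
}
```

Both variants produce the same string; the difference is only in where the
small holder object comes from.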
Thanks!
/Claes