RFR: 8279833: Loop optimization issue in String.encodeUTF8_UTF16
Claes Redestad
redestad at openjdk.java.net
Tue Jan 11 12:44:46 UTC 2022
In `String.encodeUTF8_UTF16`, declaring `char c` locally within each loop, rather than once for the whole method, improves performance by letting C2 optimize each loop individually.
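To illustrate the shape of the change, here is a minimal sketch, not the actual JDK source. It assumes a plain `char[]` input instead of the JDK's internal `byte[]` representation, and the names `EncodeSketch`, `encodeSharedLocal`, `encodePerLoopLocal` and `putUtf8` are hypothetical. Both variants produce the same bytes; the only difference is where `char c` is declared.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodeSketch {

    // Shape before the patch: a single `char c` is shared by both loops.
    public static byte[] encodeSharedLocal(char[] val) {
        int sl = val.length, sp = 0, dp = 0;
        byte[] dst = new byte[sl * 3];
        char c;
        // ASCII fast loop
        while (sp < sl && (c = val[sp]) < '\u0080') {
            dst[dp++] = (byte) c;
            sp++;
        }
        // General loop, reusing the same local
        while (sp < sl) {
            c = val[sp++];
            char low = (Character.isHighSurrogate(c) && sp < sl
                    && Character.isLowSurrogate(val[sp])) ? val[sp++] : 0;
            dp = putUtf8(dst, dp, c, low);
        }
        return Arrays.copyOf(dst, dp);
    }

    // Shape after the patch: each loop declares its own `char c`.
    public static byte[] encodePerLoopLocal(char[] val) {
        int sl = val.length, sp = 0, dp = 0;
        byte[] dst = new byte[sl * 3];
        // ASCII fast loop
        while (sp < sl) {
            char c = val[sp];
            if (c >= '\u0080') {
                break;
            }
            dst[dp++] = (byte) c;
            sp++;
        }
        // General loop
        while (sp < sl) {
            char c = val[sp++];
            char low = (Character.isHighSurrogate(c) && sp < sl
                    && Character.isLowSurrogate(val[sp])) ? val[sp++] : 0;
            dp = putUtf8(dst, dp, c, low);
        }
        return Arrays.copyOf(dst, dp);
    }

    // Writes the UTF-8 form of c (paired with `low` if c starts a surrogate
    // pair, low == 0 otherwise) at dst[dp] and returns the new dp.
    // Unpaired surrogates are replaced with '?'.
    private static int putUtf8(byte[] dst, int dp, char c, char low) {
        if (c < 0x80) {
            dst[dp++] = (byte) c;
        } else if (c < 0x800) {
            dst[dp++] = (byte) (0xC0 | (c >> 6));
            dst[dp++] = (byte) (0x80 | (c & 0x3F));
        } else if (Character.isSurrogate(c)) {
            if (low != 0) {
                int cp = Character.toCodePoint(c, low);
                dst[dp++] = (byte) (0xF0 | (cp >> 18));
                dst[dp++] = (byte) (0x80 | ((cp >> 12) & 0x3F));
                dst[dp++] = (byte) (0x80 | ((cp >> 6) & 0x3F));
                dst[dp++] = (byte) (0x80 | (cp & 0x3F));
            } else {
                dst[dp++] = (byte) '?';
            }
        } else {
            dst[dp++] = (byte) (0xE0 | (c >> 12));
            dst[dp++] = (byte) (0x80 | ((c >> 6) & 0x3F));
            dst[dp++] = (byte) (0x80 | (c & 0x3F));
        }
        return dp;
    }

    public static void main(String[] args) {
        String s = "abc\u00e9\u65e5\uD83D\uDE00";
        byte[] expected = s.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(expected, encodeSharedLocal(s.toCharArray())));
        System.out.println(Arrays.equals(expected, encodePerLoopLocal(s.toCharArray())));
    }
}
```

Whether a sketch like this reproduces the performance difference depends on the JIT and the exact code shape; it is meant to show the scoping change only, not to serve as a benchmark.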
Results on the updated micros:
19-b04:
Benchmark                          (charsetName)  Mode  Cnt      Score     Error  Units
StringEncode.encodeUTF16                   UTF-8  avgt   15    164.457 ±   7.296  ns/op
StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15   2294.160 ±  40.580  ns/op
StringEncode.encodeUTF16LongStart          UTF-8  avgt   15   9128.698 ± 124.636  ns/op
Patch:
Benchmark                          (charsetName)  Mode  Cnt      Score     Error  Units
StringEncode.encodeUTF16                   UTF-8  avgt   15    131.296 ±   6.693  ns/op
StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15   2282.750 ±  46.891  ns/op
StringEncode.encodeUTF16LongStart          UTF-8  avgt   15   4786.965 ±  64.896  ns/op
Going back, this seems to have been an issue since this method was introduced with JEP 254 in JDK 9:
8u311:
Benchmark                          (charsetName)  Mode  Cnt      Score     Error  Units
StringEncode.encodeUTF16                   UTF-8  avgt   15    194.057 ±   3.913  ns/op
StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15   3024.860 ±  88.386  ns/op
StringEncode.encodeUTF16LongStart          UTF-8  avgt   15   5282.849 ± 247.230  ns/op
9:
Benchmark                          (charsetName)  Mode  Cnt      Score     Error  Units
StringEncode.encodeUTF16                   UTF-8  avgt   15    148.481 ±   9.374  ns/op
StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15   2832.754 ± 263.372  ns/op
StringEncode.encodeUTF16LongStart          UTF-8  avgt   15  10447.115 ± 408.338  ns/op
(Interestingly, JDK 9 seems faster on some of the micros than later releases, while slower on others. The main issue is the slow non-ASCII loop, where the patch speeds things up ~2x. With the patch we're significantly faster than both JDK 8 and 9 on all measures.)
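As a quick check of the "~2x" figure, the speedup per benchmark is simply the 19-b04 score divided by the patched score. The small sketch below does that arithmetic on the mean scores reported above; the class and method names are hypothetical, and the error bars are ignored.

```java
public class SpeedupRatios {

    // Speedup factor: baseline time divided by patched time (both in ns/op).
    public static double speedup(double baselineNsPerOp, double patchedNsPerOp) {
        return baselineNsPerOp / patchedNsPerOp;
    }

    public static void main(String[] args) {
        String[] names = {
            "StringEncode.encodeUTF16",
            "StringEncode.encodeUTF16LongEnd",
            "StringEncode.encodeUTF16LongStart"
        };
        // Mean scores copied from the 19-b04 and Patch tables.
        double[] baseline = {164.457, 2294.160, 9128.698};
        double[] patched = {131.296, 2282.750, 4786.965};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-34s %.2fx%n", names[i], speedup(baseline[i], patched[i]));
        }
    }
}
```

This works out to roughly 1.25x for `encodeUTF16`, essentially no change for `encodeUTF16LongEnd`, and roughly 1.91x for `encodeUTF16LongStart`.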
There's a JIT compiler bug hiding in plain sight here where the potentially uninitialized local `char c` appears to mess up optimization of the second loop. I'll file a separate bug for this (a standalone reproducer should be straightforward to produce). I think this patch is reasonable to put back into the JDK while we investigate if/how C2 can better handle this kind of condition. It might also be easier to backport, depending on whether the C2 fix is trivial or not.
-------------
Commit messages:
- Merge branch 'master' into string_encodeutf8_16
- Loop optimization issue in String.encodeUTF8_UTF16
Changes: https://git.openjdk.java.net/jdk/pull/7026/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7026&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8279833
Stats: 85 lines in 2 files changed: 44 ins; 3 del; 38 mod
Patch: https://git.openjdk.java.net/jdk/pull/7026.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/7026/head:pull/7026
PR: https://git.openjdk.java.net/jdk/pull/7026