RFR: 8279833: Loop optimization issue in String.encodeUTF8_UTF16 [v3]

Tue Jan 11 13:09:07 UTC 2022

> In `String.encodeUTF8_UTF16`, making the `char c` local to each loop improves the performance of the method by letting C2 optimize each individual loop better.
> 
> Results on the updated micros:
> 19-b04:
> 
> Benchmark                          (charsetName)  Mode  Cnt     Score     Error  Units
> StringEncode.encodeUTF16                   UTF-8  avgt   15   164.457 ±   7.296  ns/op
> StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15  2294.160 ±  40.580  ns/op
> StringEncode.encodeUTF16LongStart          UTF-8  avgt   15  9128.698 ± 124.636  ns/op
> 
> 
> Patch:
> 
> Benchmark                          (charsetName)  Mode  Cnt     Score    Error  Units
> StringEncode.encodeUTF16                   UTF-8  avgt   15   131.296 ±  6.693  ns/op
> StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15  2282.750 ± 46.891  ns/op
> StringEncode.encodeUTF16LongStart          UTF-8  avgt   15  4786.965 ± 64.896  ns/op
> 
> 
> Going back, this seems to have been an issue ever since this method was introduced with JEP 254 in JDK 9:
> 
> 8u311:
> 
> Benchmark                          (charsetName)  Mode  Cnt     Score     Error  Units
> StringEncode.encodeUTF16                   UTF-8  avgt   15   194.057 ±   3.913  ns/op
> StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15  3024.860 ±  88.386  ns/op
> StringEncode.encodeUTF16LongStart          UTF-8  avgt   15  5282.849 ± 247.230  ns/op
> 
> 9:
> 
> Benchmark                          (charsetName)  Mode  Cnt      Score     Error  Units
> StringEncode.encodeUTF16                   UTF-8  avgt   15    148.481 ±   9.374  ns/op
> StringEncode.encodeUTF16LongEnd            UTF-8  avgt   15   2832.754 ± 263.372  ns/op
> StringEncode.encodeUTF16LongStart          UTF-8  avgt   15  10447.115 ± 408.338  ns/op
> 
> 
> (Interestingly, JDK 9 seems faster on some of the micros than later iterations, while slower on others. The main issue is the slow non-ASCII loop, where the patch speeds things up ~2x. With the patch we're significantly faster than both JDK 8 and 9 on all measures.)
> 
> There's a JIT compiler bug hiding in plain sight here where the potentially uninitialized local `char c` appears to mess up optimization of the second loop. I'll file a separate bug for this (a standalone reproducer should be straightforward to produce). I think this patch is reasonable to put back into the JDK while we investigate if/how C2 can better handle this kind of condition. It might also be easier to backport, depending on whether the C2 fix is trivial or not.
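
For readers without the diff at hand, the shape of the change can be sketched roughly as below. This is a simplified illustration, not the code from the patch: the class `LoopLocalSketch` and the helpers `encodeShared`, `encodeLocal` and `put` are hypothetical names, the input is a plain `char[]` rather than the `byte[]` the real method works on, and surrogate pairs are assumed not to occur.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LoopLocalSketch {
    // Shape before the change: a single `char c` is declared up front
    // and shared by the ASCII loop and the general loop.
    static byte[] encodeShared(char[] val) {
        int sp = 0, dp = 0, sl = val.length;
        byte[] dst = new byte[sl * 3];
        char c;
        while (sp < sl && (c = val[sp]) < '\u0080') {
            dst[dp++] = (byte) c;   // ASCII fast path
            sp++;
        }
        while (sp < sl) {
            c = val[sp++];
            dp = put(dst, dp, c);
        }
        return Arrays.copyOf(dst, dp);
    }

    // Shape after the change: each loop declares its own `char c`,
    // so no local is live across the two loops.
    static byte[] encodeLocal(char[] val) {
        int sp = 0, dp = 0, sl = val.length;
        byte[] dst = new byte[sl * 3];
        while (sp < sl) {
            char c = val[sp];
            if (c >= '\u0080') {
                break;
            }
            dst[dp++] = (byte) c;   // ASCII fast path
            sp++;
        }
        while (sp < sl) {
            char c = val[sp++];
            dp = put(dst, dp, c);
        }
        return Arrays.copyOf(dst, dp);
    }

    // Hypothetical helper: writes one non-surrogate BMP char as UTF-8
    // (1, 2 or 3 bytes) and returns the new write position.
    static int put(byte[] dst, int dp, char c) {
        if (c < 0x80) {
            dst[dp++] = (byte) c;
        } else if (c < 0x800) {
            dst[dp++] = (byte) (0xc0 | (c >> 6));
            dst[dp++] = (byte) (0x80 | (c & 0x3f));
        } else {
            dst[dp++] = (byte) (0xe0 | (c >> 12));
            dst[dp++] = (byte) (0x80 | ((c >> 6) & 0x3f));
            dst[dp++] = (byte) (0x80 | (c & 0x3f));
        }
        return dp;
    }

    public static void main(String[] args) {
        char[] in = "abc\u00e9\u20acxyz".toCharArray();
        byte[] expected = new String(in).getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(encodeShared(in), expected));
        System.out.println(Arrays.equals(encodeLocal(in), expected));
    }
}
```

Both variants produce the same bytes; the only difference is where `c` is declared. Whether a standalone sketch like this reproduces the C2 slowdown has not been checked here, so it should be read as an illustration of the source-level change only.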

Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:

  Fix odd comment placement

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7026/files
  - new: https://git.openjdk.java.net/jdk/pull/7026/files/61f7f3d4..5b3dd69a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7026&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7026&range=01-02

  Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7026.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7026/head:pull/7026

PR: https://git.openjdk.java.net/jdk/pull/7026