RFR: 8279833: Loop optimization issue in String.encodeUTF8_UTF16 [v2]
Claes Redestad
redestad at openjdk.java.net
Tue Jan 11 13:05:00 UTC 2022
> In `String.encodeUTF8_UTF16`, making the `char c` local to each loop helps the performance of the method by helping C2 optimize each individual loop better.
>
> Results on the updated micros:
> 19-b04:
>
> Benchmark (charsetName) Mode Cnt Score Error Units
> StringEncode.encodeUTF16 UTF-8 avgt 15 164.457 ± 7.296 ns/op
> StringEncode.encodeUTF16LongEnd UTF-8 avgt 15 2294.160 ± 40.580 ns/op
> StringEncode.encodeUTF16LongStart UTF-8 avgt 15 9128.698 ± 124.636 ns/op
>
>
> Patch:
>
> Benchmark (charsetName) Mode Cnt Score Error Units
> StringEncode.encodeUTF16 UTF-8 avgt 15 131.296 ± 6.693 ns/op
> StringEncode.encodeUTF16LongEnd UTF-8 avgt 15 2282.750 ± 46.891 ns/op
> StringEncode.encodeUTF16LongStart UTF-8 avgt 15 4786.965 ± 64.896 ns/op
>
>
> Going back this seem to have been an issue since this method was introduced with JEP 254 in JDK 9:
>
> 8u311:
>
> Benchmark (charsetName) Mode Cnt Score Error Units
> StringEncode.encodeUTF16 UTF-8 avgt 15 194.057 ± 3.913 ns/op
> StringEncode.encodeUTF16LongEnd UTF-8 avgt 15 3024.860 ± 88.386 ns/op
> StringEncode.encodeUTF16LongStart UTF-8 avgt 15 5282.849 ± 247.230 ns/op
>
> 9:
>
> Benchmark (charsetName) Mode Cnt Score Error Units
> StringEncode.encodeUTF16 UTF-8 avgt 15 148.481 ± 9.374 ns/op
> StringEncode.encodeUTF16LongEnd UTF-8 avgt 15 2832.754 ± 263.372 ns/op
> StringEncode.encodeUTF16LongStart UTF-8 avgt 15 10447.115 ± 408.338 ns/op
>
>
> (Interestingly JDK 9 seem faster on some of the micros than later iterations, while slower on others. The main issue is the slow non-ASCII loop, where the patch speeds things up ~2x. With the patch we're significantly faster than both JDK 8 and 9 on all measures.)
>
> There's a JIT compiler bug hiding in plain sight here where the potentially uninitialized local `char c` appears to mess up optimization of the second loop. I'll file a separate bug for this (a standalone reproducer should be straightforward to produce). I think this patch is reasonable to put back into the JDK while we investigate if/how C2 can better handle this kind of condition. It might also be easier to backport, depending on whether the C2 fix is trivial or not.
Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
Remove unused Blackholes
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/7026/files
- new: https://git.openjdk.java.net/jdk/pull/7026/files/70d0c5c6..61f7f3d4
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7026&range=01
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7026&range=00-01
Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod
Patch: https://git.openjdk.java.net/jdk/pull/7026.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/7026/head:pull/7026
PR: https://git.openjdk.java.net/jdk/pull/7026
More information about the core-libs-dev
mailing list