RFR: 8326653: Remove jdk.internal.reflect.UTF8

Claes Redestad redestad at openjdk.org
Mon Feb 26 11:12:00 UTC 2024


jdk.internal.reflect.UTF8 is used to encode Strings as UTF-8 when generating some classes.

Since JDK 9 we have a fast path for UTF-8-encoding strings that avoids creating encoders and is bootstrapped very early, so it seems safe to rewire the callers to use it and remove the UTF8 helper class, whose stated raison d'être is to avoid bootstrapping issues.

This cleanup also removes a latent bug: the custom encoder isn't able to deal with classfile constants containing surrogate pairs.
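To make the surrogate-pair concern concrete, here is a minimal sketch of how a char-at-a-time encoder can diverge from `String.getBytes(StandardCharsets.UTF_8)`. The `encodePerChar` method and the `SurrogatePairDemo` class are hypothetical and written for illustration only; they are not the code of the removed class, and the sketch only assumes that an encoder looking at one `char` at a time never combines a surrogate pair into a single code point.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class SurrogatePairDemo {
    // Hypothetical encoder for illustration: it handles each UTF-16 char
    // independently, so a surrogate pair becomes two 3-byte sequences
    // instead of one 4-byte UTF-8 sequence.
    static byte[] encodePerChar(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i);
            if (c < 0x80) {
                out.write(c);
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else {
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // U+1F600, stored in the String as a surrogate pair
        String s = "\uD83D\uDE00";
        byte[] standard = s.getBytes(StandardCharsets.UTF_8);
        byte[] perChar = encodePerChar(s);
        System.out.println("String.getBytes(UTF_8): " + standard.length + " bytes");
        System.out.println("per-char encoder:       " + perChar.length + " bytes");
    }
}
```

Under these assumptions the standard encoder yields the 4-byte sequence for U+1F600, while the per-char sketch yields 6 bytes, which a strict UTF-8 decoder would not accept as that character.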

For a quick comparison I copied the UTF8 code to the `StringEncode` microbenchmark and set up a benchmark testing the same inputs as `encodeAllMixed`:


Benchmark                                (charsetName)  Mode  Cnt       Score      Error  Units
StringEncode.encodeAllMixed                      UTF-8  avgt   10   12894,551 ±  164,816  ns/op
StringEncode.encodeUTF8InternalAllMixed          UTF-8  avgt   10  236614,548 ± 1445,975  ns/op


That is, `String.getBytes(UTF_8.INSTANCE)` is about 18x faster on mixed inputs. (I plan on removing `encodeUTF8InternalAllMixed` from the PR before merging, but wanted to include it initially to show what I've measured.)

-------------

Commit messages:
 - 8326653: Remove jdk.internal.reflect.UTF8

Changes: https://git.openjdk.org/jdk/pull/18006/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18006&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8326653
  Stats: 143 lines in 3 files changed: 64 ins; 78 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/18006.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18006/head:pull/18006

PR: https://git.openjdk.org/jdk/pull/18006

