<i18n dev> RFR: 8261418: Reduce decoder creation overheads for sun.nio.cs.ext Charsets

Claes Redestad redestad at openjdk.java.net
Tue Feb 9 13:27:12 UTC 2021


On Tue, 9 Feb 2021 12:54:12 GMT, Claes Redestad <redestad at openjdk.org> wrote:

> This refactors some `sun.nio.cs.ext` charsets, such as ISO-2022-CN-GB, ISO-2022-CN-CNS, ISO-2022-KR and a few others, to use static rather than per-instance auxiliary decoders. Doing so reduces the overhead of calling `charset.newDecoder()`. This reduces or eliminates regressions on `new String(byte[], String)` operations due to the removal of thread-local decoder caching in [JDK-8259842](https://bugs.openjdk.java.net/browse/JDK-8259842).
> 
> Most ISO-2022 Charsets already define a specialized decoder. The `ISO2022.Decoder` class was only used by `ISO2022_KR`, so folding it into that implementation and simplifying the code brings a rather significant speed-up, both to decoder creation and to actual decoding.
> 
> Testing: tier1-3, manual runs of sun.nio.cs tests
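
To illustrate the idea of static versus per-instance auxiliary decoders, here is a minimal sketch. It is not the code from the patch: `AuxTable`, `PerInstanceDecoder` and `SharedDecoder` are hypothetical names, and the sketch assumes the auxiliary object is an immutable lookup table with no state carried between calls. That assumption matters, since `CharsetDecoder` instances are in general stateful and not safe to share across threads.

```java
public class SharedAuxDecoderSketch {

    // Hypothetical stand-in for an auxiliary decoder: an immutable lookup
    // table whose decodeSingle method keeps no state between calls.
    static final class AuxTable {
        static int created = 0;            // counts constructions, for illustration only
        private final char[] table = new char[256];

        AuxTable() {
            created++;
            for (int i = 0; i < table.length; i++) {
                table[i] = (char) i;       // identity mapping as placeholder data
            }
        }

        char decodeSingle(byte b) {
            return table[b & 0xff];
        }
    }

    // "Before": every decoder instance builds its own auxiliary table.
    static final class PerInstanceDecoder {
        private final AuxTable aux = new AuxTable();

        char decode(byte b) {
            return aux.decodeSingle(b);
        }
    }

    // "After": all decoder instances share one table held in a static field.
    static final class SharedDecoder {
        private static final AuxTable AUX = new AuxTable();

        char decode(byte b) {
            return AUX.decodeSingle(b);
        }
    }

    // Number of AuxTable objects built while creating n per-instance decoders.
    static int tablesBuiltPerInstance(int n) {
        int before = AuxTable.created;
        for (int i = 0; i < n; i++) {
            new PerInstanceDecoder();
        }
        return AuxTable.created - before;
    }

    // Number of AuxTable objects built while creating n shared decoders,
    // measured after the class (and its static table) has been initialized.
    static int tablesBuiltShared(int n) {
        new SharedDecoder();               // force class initialization first
        int before = AuxTable.created;
        for (int i = 0; i < n; i++) {
            new SharedDecoder();
        }
        return AuxTable.created - before;
    }

    public static void main(String[] args) {
        System.out.println("per-instance tables built: " + tablesBuiltPerInstance(1000));
        System.out.println("shared tables built:       " + tablesBuiltShared(1000));
    }
}
```

In this sketch the per-instance variant pays for one table per decoder, while the shared variant pays once at class initialization, which is the kind of saving that shows up on every `newDecoder()` call.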

`StringDecode.WithCharset.decodeCharsetName` microbenchmark runs:

Pre-JDK-8259842:
Benchmark                                           (charsetName)     Mode  Cnt     Score     Error   Units
decodeCharsetName                                   x-ISO-2022-CN-GB  avgt    5   756.293 ±  56.082   ns/op
decodeCharsetName:·gc.alloc.rate.norm               x-ISO-2022-CN-GB  avgt    5   488.040 ±   0.013    B/op
decodeCharsetName                                   ISO-2022-KR       avgt    5  1076.745 ± 299.789   ns/op
decodeCharsetName:·gc.alloc.rate.norm               ISO-2022-KR       avgt    5   808.068 ±   0.022    B/op
Current:
Benchmark                                           (charsetName)     Mode  Cnt     Score     Error   Units
decodeCharsetName                                   x-ISO-2022-CN-GB  avgt    5   906.281 ± 109.035   ns/op
decodeCharsetName:·gc.alloc.rate.norm               x-ISO-2022-CN-GB  avgt    5   920.077 ±   0.025    B/op
decodeCharsetName                                   ISO-2022-KR       avgt    5  1175.228 ± 144.509   ns/op
decodeCharsetName:·gc.alloc.rate.norm               ISO-2022-KR       avgt    5  1256.105 ±   0.043    B/op
Patched:
Benchmark                                           (charsetName)     Mode  Cnt    Score     Error   Units
decodeCharsetName                                   x-ISO-2022-CN-GB  avgt    5  783.385 ± 255.048   ns/op
decodeCharsetName:·gc.alloc.rate.norm               x-ISO-2022-CN-GB  avgt    5  584.050 ±   0.016    B/op
decodeCharsetName                                   ISO-2022-KR       avgt    5  814.654 ± 196.860   ns/op
decodeCharsetName:·gc.alloc.rate.norm               ISO-2022-KR       avgt    5  584.050 ±   0.024    B/op
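
As a rough way to read these tables, the relative change between the "Current" and "Patched" scores can be computed directly from the numbers above. The helper below is only a sketch (`reductionPercent` is a hypothetical name), and it ignores the reported error margins, which are large for some of the timing rows, so the timing percentages should be treated as indicative only.

```java
public class RelativeChange {

    // Percentage reduction going from 'before' to 'after' (positive means an improvement).
    static double reductionPercent(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // Scores copied from the "Current" and "Patched" tables.
        System.out.printf("x-ISO-2022-CN-GB time:  %.1f%%%n", reductionPercent(906.281, 783.385));
        System.out.printf("x-ISO-2022-CN-GB alloc: %.1f%%%n", reductionPercent(920.077, 584.050));
        System.out.printf("ISO-2022-KR time:       %.1f%%%n", reductionPercent(1175.228, 814.654));
        System.out.printf("ISO-2022-KR alloc:      %.1f%%%n", reductionPercent(1256.105, 584.050));
    }
}
```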

-------------

PR: https://git.openjdk.java.net/jdk/pull/2480

