RFR: 8259842: Remove Result cache from StringCoding [v7]

Tue Jan 19 16:48:50 UTC 2021

On Mon, 18 Jan 2021 09:26:07 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> The `StringCoding.resultCached` mechanism is used to remove the allocation of a `StringCoding.Result` object on potentially hot paths used in some `String` constructors. Using a `ThreadLocal` has overheads though, and the overhead got a bit worse after JDK-8258596 which addresses a leak by adding a `SoftReference`.
>> 
>> This patch refactors much of the decode logic back into `String` and gets rid of not only the `Result` cache, but the `Result` class itself along with the `StringDecoder` class and cache.
>> 
>> Microbenchmark results:
>> Baseline
>> 
>> Benchmark                                           (charsetName)  Mode  Cnt    Score    Error   Units
>> decodeCharset                                            US-ASCII  avgt   15  193.043 ±  8.207   ns/op
>> decodeCharset:·gc.alloc.rate.norm                        US-ASCII  avgt   15  112.009 ±  0.001    B/op
>> decodeCharset                                          ISO-8859-1  avgt   15  164.580 ±  6.514   ns/op
>> decodeCharset:·gc.alloc.rate.norm                      ISO-8859-1  avgt   15  112.009 ±  0.001    B/op
>> decodeCharset                                               UTF-8  avgt   15  328.370 ± 18.420   ns/op
>> decodeCharset:·gc.alloc.rate.norm                           UTF-8  avgt   15  224.019 ±  0.002    B/op
>> decodeCharset                                               MS932  avgt   15  328.870 ± 20.056   ns/op
>> decodeCharset:·gc.alloc.rate.norm                           MS932  avgt   15  232.020 ±  0.002    B/op
>> decodeCharset                                          ISO-8859-6  avgt   15  193.603 ± 12.305   ns/op
>> decodeCharset:·gc.alloc.rate.norm                      ISO-8859-6  avgt   15  112.010 ±  0.001    B/op
>> decodeCharsetName                                        US-ASCII  avgt   15  209.454 ±  9.040   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                    US-ASCII  avgt   15  112.009 ±  0.001    B/op
>> decodeCharsetName                                      ISO-8859-1  avgt   15  188.234 ±  7.533   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                  ISO-8859-1  avgt   15  112.009 ±  0.001    B/op
>> decodeCharsetName                                           UTF-8  avgt   15  399.463 ± 12.437   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                       UTF-8  avgt   15  224.019 ±  0.003    B/op
>> decodeCharsetName                                           MS932  avgt   15  358.839 ± 15.385   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                       MS932  avgt   15  208.017 ±  0.003    B/op
>> decodeCharsetName                                      ISO-8859-6  avgt   15  162.570 ±  7.090   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                  ISO-8859-6  avgt   15  112.009 ±  0.001    B/op
>> decodeDefault                                                 N/A  avgt   15  316.081 ± 11.101   ns/op
>> decodeDefault:·gc.alloc.rate.norm                             N/A  avgt   15  224.019 ±  0.002    B/op
>> 
>> Patched:
>> Benchmark                                           (charsetName)  Mode  Cnt    Score    Error   Units
>> decodeCharset                                            US-ASCII  avgt   15  159.153 ±  6.082   ns/op
>> decodeCharset:·gc.alloc.rate.norm                        US-ASCII  avgt   15  112.009 ±  0.001    B/op
>> decodeCharset                                          ISO-8859-1  avgt   15  134.433 ±  6.203   ns/op
>> decodeCharset:·gc.alloc.rate.norm                      ISO-8859-1  avgt   15  112.009 ±  0.001    B/op
>> decodeCharset                                               UTF-8  avgt   15  297.234 ± 21.654   ns/op
>> decodeCharset:·gc.alloc.rate.norm                           UTF-8  avgt   15  224.019 ±  0.002    B/op
>> decodeCharset                                               MS932  avgt   15  311.583 ± 16.445   ns/op
>> decodeCharset:·gc.alloc.rate.norm                           MS932  avgt   15  208.018 ±  0.002    B/op
>> decodeCharset                                          ISO-8859-6  avgt   15  156.216 ±  6.522   ns/op
>> decodeCharset:·gc.alloc.rate.norm                      ISO-8859-6  avgt   15  112.010 ±  0.001    B/op
>> decodeCharsetName                                        US-ASCII  avgt   15  180.752 ±  9.411   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                    US-ASCII  avgt   15  112.010 ±  0.001    B/op
>> decodeCharsetName                                      ISO-8859-1  avgt   15  156.170 ±  8.003   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                  ISO-8859-1  avgt   15  112.010 ±  0.001    B/op
>> decodeCharsetName                                           UTF-8  avgt   15  370.337 ± 22.199   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                       UTF-8  avgt   15  224.019 ±  0.001    B/op
>> decodeCharsetName                                           MS932  avgt   15  312.589 ± 15.067   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                       MS932  avgt   15  208.018 ±  0.002    B/op
>> decodeCharsetName                                      ISO-8859-6  avgt   15  173.205 ±  9.647   ns/op
>> decodeCharsetName:·gc.alloc.rate.norm                  ISO-8859-6  avgt   15  112.009 ±  0.001    B/op
>> decodeDefault                                                 N/A  avgt   15  303.459 ± 16.452   ns/op
>> decodeDefault:·gc.alloc.rate.norm                             N/A  avgt   15  224.019 ±  0.001    B/op
>> 
>> Most variants improve. There's a small added overhead in `String charsetName` variants for some charsets such as `ISO-8859-6` that benefited slightly from the `StringDecoder` cache due avoiding a lookup, but most variants are not helped by this cache and instead see a significant gain from skipping that step. `Charset` variants don't need a lookup and improve across the board.
>> 
>> Another drawback is that we need to cram more logic into `String` to bypass the `StringCoding.Result` indirection - but getting rid of two commonly used `ThreadLocal` caches and most cases getting a bit better raw throughput in the process I think more than enough makes up for that.
>> 
>> Testing: tier1-4
>
> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Harmonize empty string checking in newString methods

The large number of package exposed methods in StringCoding is ugly and makes the code harder to maintain.
Can the code duplication of UTF8 in the String constructors be reduced?
It would be cleaner to move all of the remaining StringCoding methods into String and make them private again. Reading the code now requires quite a bit of cross referencing and the invariants are hard to verify.

src/java.base/share/classes/java/lang/StringCoding.java line 424:

> 422: 
> 423:     static int decodeUTF8_UTF16(byte[] bytes, int offset, int sl, byte[] dst, int dp, boolean doReplace) {
> 424:         while (offset < sl) {

The renaming of sp -> offset seems unmotivated and breaks the symmetry with dp.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2102