RFR: 8148937: (str) Adapt StringJoiner for Compact Strings

Claes Redestad redestad at openjdk.java.net
Mon Mar 15 12:40:16 UTC 2021


On Thu, 18 Feb 2021 20:14:12 GMT, Сергей Цыпанов <github.com+10835776+stsypanov at openjdk.org> wrote:

>> Some of these changes conflict with #2334, which suggests removing the `coder` and `isLatin1` methods from `String`.
>> 
>> As a more general point I think it would be good to explore options that do not increase leakage of the implementation detail that `String`s are latin1- or utf16-encoded outside of java.lang.
>
> Hi @cl4es,
>> Some of these changes conflict with #2334, which suggests removing the `coder` and `isLatin1` methods from `String`.
> 
> I've checked out Aleksey's branch and applied my changes on top of it. The only thing I had to change to make it work was replacing
> public boolean isLatin1(String str) {
>     return str.isLatin1();
> }
> with
> public boolean isLatin1(String str) {
>     return str.coder == String.LATIN1;
> }
> The rest of the code was left intact. `jdk:tier1` is OK after the change.
>> As a more general point I think it would be good to explore options that do not increase leakage of the implementation detail that `String`s are latin1- or utf16-encoded outside of java.lang.
> 
> Apart from `JavaLangAccess`, the only thing that comes to mind is reflection, but that would wipe out all of the improvement. Otherwise I cannot see any other way to reach the package-private latin1/non-latin1 functionality of `j.l.String` from the `java.util` package. Am I missing any other options?

A less intrusive alternative would be to use a `StringBuilder`, see changes in this branch: https://github.com/openjdk/jdk/compare/master...cl4es:stringjoin_improvement?expand=1 (I adapted your StringJoinerBenchmark to work with the ascii-only build constraint)

This underperforms compared to your patch, since `StringBuilder.toString` needs to make a copy, but it improves over the baseline.

Baseline:
Benchmark                                                            (count)  (length)  (mode)  Mode  Cnt      Score      Error   Units
StringJoinerBenchmark.stringJoiner                                       100        64   latin  avgt    5   5420.701 ± 1433.485   ns/op
StringJoinerBenchmark.stringJoiner:·gc.alloc.rate.norm                   100        64   latin  avgt    5  20640.428 ±    0.130    B/op
Patch:
Benchmark                                                            (count)  (length)  (mode)  Mode  Cnt      Score      Error   Units
StringJoinerBenchmark.stringJoiner                                       100        64   latin  avgt    5   4271.401 ±  677.560   ns/op
StringJoinerBenchmark.stringJoiner:·gc.alloc.rate.norm                   100        64   latin  avgt    5  14136.294 ±    0.095    B/op

The comparative benefit is that we'd avoid punching more holes into String implementation details for now. Not ruling that out indefinitely, but I think it needs a stronger motivation than to improve StringJoiner alone.
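
To illustrate the idea, joining through a `StringBuilder` looks roughly like the sketch below. This is a simplified, hypothetical example (the class `BuilderJoinSketch` and its `join` method are made up for this message) and not the code in the branch linked above. It only relies on public API, so nothing about the latin1/utf16 coder has to be exposed outside of java.lang; as far as I know the builder tracks the encoding internally. The cost is the copy made by `toString()`.

```java
import java.util.List;

// Hypothetical illustration only; not the code from the linked branch.
public class BuilderJoinSketch {

    // Joins the elements with the given delimiter, prefix and suffix,
    // using nothing but public StringBuilder API.
    static String join(CharSequence delimiter, CharSequence prefix,
                       CharSequence suffix, List<String> elements) {
        // Compute the final length up front so the builder does not
        // have to grow its backing array while appending.
        int len = prefix.length() + suffix.length();
        for (String e : elements) {
            len += e.length();
        }
        if (elements.size() > 1) {
            len += delimiter.length() * (elements.size() - 1);
        }

        StringBuilder sb = new StringBuilder(len);
        sb.append(prefix);
        boolean first = true;
        for (String e : elements) {
            if (!first) {
                sb.append(delimiter);
            }
            sb.append(e);
            first = false;
        }
        sb.append(suffix);

        // toString() copies the builder's contents into a new String;
        // this is the extra copy mentioned above.
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(", ", "[", "]", List.of("a", "b", "c")));
    }
}
```

A patch that reaches into `String` internals can build the result array directly and skip that final copy, which is where the remaining difference in the numbers would come from.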

-------------

PR: https://git.openjdk.java.net/jdk/pull/2627

