RFR: 8301958: Avoid Arrays.copyOfRange overhead in java.lang.String [v3]

Alan Bateman alanb at openjdk.org
Tue Feb 7 15:01:01 UTC 2023


On Tue, 7 Feb 2023 14:39:31 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> This adds a local, specialized `copyBytes` method to `String` that avoids certain redundant range checks and clamping that JIT has issues removing fully.
>> 
>> This has a small but statistically significant effect on `String` microbenchmarks, eg.:
>> 
>> Baseline
>> 
>> Benchmark                                            (size)  Mode  Cnt   Score   Error  Units
>> StringConstructor.newStringFromArray                      7  avgt   15  16.817 ± 0.369  ns/op
>> StringConstructor.newStringFromArrayWithCharset           7  avgt   15  16.866 ± 0.449  ns/op
>> StringConstructor.newStringFromArrayWithCharsetName       7  avgt   15  22.198 ± 0.396  ns/op
>> 
>> 
>> Patch:
>> 
>> Benchmark                                            (size)  Mode  Cnt   Score   Error  Units
>> StringConstructor.newStringFromArray                      7  avgt   15  15.477 ± 0.342  ns/op
>> StringConstructor.newStringFromArrayWithCharset           7  avgt   15  15.557 ± 0.352  ns/op
>> StringConstructor.newStringFromArrayWithCharsetName       7  avgt   15  21.272 ± 0.398  ns/op
>> 
>> 
>> Care has to be taken to ensure preconditions have been checked when using `checkBytes`. In the case of `String(AbstractStringBuilder)` there's a possible pre-existing issue where the constructor might either throw an exception or truncate the buffer if the builder byte array and length is not in agreement (theoretically possible if you clear/remove and call `trimToSize()` concurrently). Adding an explicit check here seem to be the right thing to do regardless of this RFE.
>
> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Update StringLatin1.trim for consistency

src/java.base/share/classes/java/lang/String.java line 4546:

> 4544:         // To avoid surprises due to data races (which would either truncate or throw an exception)
> 4545:         // we should check that length <= val.length up front
> 4546:         checkOffset(length, val.length);

I agree a check is needed here but I assume using checkOffset means that SB::toString could fail with SIOOBE. I wonder if Math.min(length, val.length) would be better here.

-------------

PR: https://git.openjdk.org/jdk/pull/12453


More information about the core-libs-dev mailing list