RFR: 8240094: Optimize empty substring
Сергей Цыпанов
sergei.tsypanov at yandex.ru
Wed Feb 26 14:42:49 UTC 2020
Hello,
thanks for doing this! I've checked it myself with similar benchmark
@Warmup(iterations = 10, time = 1)
@Measurement(iterations = 10, time = 1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(value = 3, jvmArgsAppend = {"-Xms4g", "-Xmx4g", "-XX:+UseParallelGC"})
public class SubstringAndStripBenchmark {
private static final String str = "Tolstoy";
@Benchmark
public String substring() {
return str.substring(1, 1);
}
}
Which gave me this output:
Before
Benchmark Mode Cnt Score Error Units
SubstringAndStripBenchmark.substring avgt 30 5.877 ± 0.066 ns/op
SubstringAndStripBenchmark.substring:·gc.alloc.rate avgt 30 4325.945 ± 47.259 MB/sec
SubstringAndStripBenchmark.substring:·gc.alloc.rate.norm avgt 30 40.003 ± 0.001 B/op
SubstringAndStripBenchmark.substring:·gc.churn.G1_Eden_Space avgt 30 4338.893 ± 86.555 MB/sec
SubstringAndStripBenchmark.substring:·gc.churn.G1_Eden_Space.norm avgt 30 40.122 ± 0.647 B/op
SubstringAndStripBenchmark.substring:·gc.churn.G1_Survivor_Space avgt 30 0.015 ± 0.003 MB/sec
SubstringAndStripBenchmark.substring:·gc.churn.G1_Survivor_Space.norm avgt 30 ≈ 10⁻⁴ B/op
SubstringAndStripBenchmark.substring:·gc.count avgt 30 557.000 counts
SubstringAndStripBenchmark.substring:·gc.time avgt 30 387.000 ms
After
SubstringAndStripBenchmark.substring avgt 30 2.423 ± 0.172 ns/op
SubstringAndStripBenchmark.substring:·gc.alloc.rate avgt 30 0.001 ± 0.001 MB/sec
SubstringAndStripBenchmark.substring:·gc.alloc.rate.norm avgt 30 ≈ 10⁻⁵ B/op
SubstringAndStripBenchmark.substring:·gc.count avgt 30 ≈ 0 counts
So with this change we save 40 bytes when empty String returned and execution takes > 2x less time.
With best regards,
Sergey Tsypanov
26.02.2020, 16:11, "Claes Redestad" <claes.redestad at oracle.com>:
> Hi,
>
> I took the liberty to file
> https://bugs.openjdk.java.net/browse/JDK-8240094 for this.
>
> Webrev: http://cr.openjdk.java.net/~redestad/8240094/open.00/
>
> I added a trivial micro and verified that there's a roughly 2.5x
> improvement in this targeted test on my machine (~10 ns/op -> ~4ns/op),
> while staying neutral on a selection of benchmarks that exercise
> newString.
>
> /Claes
>
> On 2020-02-26 14:19, Claes Redestad wrote:
>> On 2020-02-26 11:01, Сергей Цыпанов wrote:
>>> Currently we have
>>>
>>> public static String stripLeading(byte[] value) {
>>> int left = indexOfNonWhitespace(value);
>>> if (left == value.length) {
>>> return "";
>>> }
>>> return (left != 0) ? newString(value, left, value.length - left) :
>>> null;
>>> }
>>>
>>> With the patch we change behaviour of this method for the case when
>>> value.length == 0:
>>>
>>> public static String stripLeading(byte[] value) {
>>> int left = indexOfNonWhitespace(value);
>>> return (left != 0) ? newString(value, left, value.length - left) :
>>> null;
>>> }
>>>
>>> Unlike original method this code returns null instead of "" for empty
>>> array.
>>> This does not affect caller (String.stripLeading()) i. e. visible
>>> behaviour
>>> remains the same for String user, but is it OK in general?
>>
>> One observable difference on the public API I think will happen here is
>> that new String("").stripLeading() == "" will change from true to false,
>> since the null return from the inner method mean we return the argument.
>>
>> From a compatibility point of view I think this should be fine, as
>> the identity of the returned empty string isn't specified. I don't think
>> a CSR is required, but bringing it up since others think otherwise.
>>
>> Overall I like how the patch now cleans things up a bit on top of
>> (likely) optimizing a few edge cases, and can volunteer to sponsor it if
>> no one else has spoken up already.
>>
>> Doing some quick performance testing and possibly adding a
>> microbenchmark before push would be good.
>>
>> Thanks!
>>
>> /Claes
More information about the core-libs-dev
mailing list