RFR(S): 8067471: Use private static final char[0] for empty Strings
Claes Redestad
claes.redestad at oracle.com
Fri Dec 19 19:13:36 UTC 2014
No, as mentioned, I didn't check for mixed inputs, which is probably
necessary to ensure we don't slow down the typical case due
to adding an extra branch, so I went about and made an attempt to write
a microbenchmark to capture this[1] by making the count
parameter unpredictable without inducing much overhead (no randomness in
the benchmark code itself), which should make
it sufficiently hard for the compiler to prune the branch checks. I
don't think generating different char[]s would change things,
but feel free to improve upon this and generate random char[]s to verify
that.
Doing this, I added Ivan Gerasimov's idea to fold the rare count == 0
check into the count < 0 clause, which seemed like a
reasonable way to avoid adding the extra branch to the normal path[2],
although it makes the code a little weirder. I'll denote
this version as experiment2, and my original code as experiment.
Results grouped on maxCount parameter, then ordered internally from
baseline, experiment, experiment2. Probability of getting
count == 0 is roughly 1/maxCount, and the cost per operation (average)
increases as maxCount goes up as we're copying more
data on average, the interesting metric is whether or not this
optimization breaks the normal flow when we're only rarely
hitting the non-allocating branch, which would show up as the experiment
underperforming versus the baseline as we increase
maxCount:
Benchmark (maxCount) (numCount) Mode
Samples Score Error Units
o.s.SimpleBench.mixedNewStringFromCount 2 1000000
thrpt 50 46.114 ± 0.391 ops/us
o.s.SimpleBench.mixedNewStringFromCount 2 1000000
thrpt 50 64.320 ± 0.569 ops/us
o.s.SimpleBench.mixedNewStringFromCount 2 1000000
thrpt 50 64.874 ± 1.095 ops/us
o.s.SimpleBench.mixedNewStringFromCount 10 1000000
thrpt 50 48.609 ± 0.921 ops/us
o.s.SimpleBench.mixedNewStringFromCount 10 1000000
thrpt 50 50.562 ± 0.598 ops/us
o.s.SimpleBench.mixedNewStringFromCount 10 1000000
thrpt 50 50.724 ± 0.535 ops/us
o.s.SimpleBench.mixedNewStringFromCount 100 1000000
thrpt 50 31.422 ± 0.553 ops/us
o.s.SimpleBench.mixedNewStringFromCount 100 1000000
thrpt 50 31.416 ± 0.512 ops/us
o.s.SimpleBench.mixedNewStringFromCount 100 1000000
thrpt 50 31.852 ± 0.441 ops/us
o.s.SimpleBench.mixedNewStringFromCount 1000 1000000
thrpt 50 5.528 ± 0.053 ops/us
o.s.SimpleBench.mixedNewStringFromCount 1000 1000000
thrpt 50 5.531 ± 0.039 ops/us
o.s.SimpleBench.mixedNewStringFromCount 1000 1000000
thrpt 50 5.550 ± 0.033 ops/us
To me it looks like the micro-optimization can be applied either way
without much risk that it'll slow down the normal flow.
There's some indication that Ivan's suggestion is a slight improvement
across the board.
/Claes
[1]
package org.sample;
import org.openjdk.jmh.annotations.*;
import java.util.concurrent.ThreadLocalRandom;
@State(Scope.Thread)
public class SimpleBench {
@Param("1000000")
public int numCount;
@Param("10")
public int maxCount;
public int[] counts;
public char[] data;
public int index = 0;
@Setup
public void buildCounts() {
counts = new int[numCount];
data = new char[maxCount];
ThreadLocalRandom r = ThreadLocalRandom.current();
for (int i = 0; i < numCount; i++) {
counts[i] = r.nextInt(0, maxCount);
}
for (int i = 0; i < maxCount; i++) {
data[i] = 'a';
}
}
@Benchmark
public String mixedNewStringFromCount() {
if (index >= numCount) {
index = 0;
}
return new String(data, 0, counts[index++]);
}
}
[2] http://cr.openjdk.java.net/~redestad/8067471/webrev.03/
On 2014-12-21 17:08, Lev Priima wrote:
> Claes,
>
> Is the source of arrays in these benchmarks unpredictable enough to
> avoid branch checks elimination in compiled code?
>
> On 12/19/2014 05:25 PM, Claes Redestad wrote:
>> Hi again,
>>
>> I just had a nagging thought. For applications reporting some or a
>> lot of different empty String objects,
>> this might be due not to use of this constructor, but due to things
>> like an empty StringBuilder.toString() ending
>> up calling java.lang.String(char[] value, int offset, int count) with
>> a count of 0.
>>
>> My feeling is that this would be a more probable/reasonable source of
>> distinct empty strings in an application,
>> and would not be helped much by the current patch.
>>
>> For completeness the java.lang.String(char/int[], int, int)
>> constructors could have a check for count == 0 in
>> which case it assigns value to "".value, see
>> http://cr.openjdk.java.net/~redestad/8067471/webrev.01/
>>
>> Running a few simple microbenchmarks on this (sorry, Aleksey!)[1]
>> yields a bit of speedup, at no discernible
>> cost for a quick sanity check of an actually copying path:
>>
>> baseline:
>> Benchmark Mode Samples Score Error
>> Units
>> o.s.SimpleBench.emptySBBench thrpt 5 48.213 ± 3.509
>> ops/us
>> o.s.SimpleBench.emptyString thrpt 5 103.900 ± 4.869
>> ops/us
>> o.s.SimpleBench.emptyStringCount thrpt 5 105.802 ± 3.355
>> ops/us
>> o.s.SimpleBench.emptyStringCountCP thrpt 5 104.464 ± 8.251
>> ops/us
>> o.s.SimpleBench.singleStringCount thrpt 5 76.379 ± 4.139
>> ops/us
>> o.s.SimpleBench.singleStringCountCP thrpt 5 88.202 ± 4.945
>> ops/us
>>
>> experiment:
>> Benchmark Mode Samples Score Error
>> Units
>> o.s.SimpleBench.emptySBBench thrpt 5 153.822 ± 7.900
>> ops/us 3x
>> o.s.SimpleBench.emptyString thrpt 5 183.588 ± 4.429
>> ops/us 1.75x
>> o.s.SimpleBench.emptyStringCount thrpt 5 166.457 ± 9.202
>> ops/us 1.6x
>> o.s.SimpleBench.emptyStringCountCP thrpt 5 168.089 ± 4.676
>> ops/us 1.6x
>> o.s.SimpleBench.singleStringCount thrpt 5 75.780 ± 8.490
>> ops/us 1x
>> o.s.SimpleBench.singleStringCountCP thrpt 5 89.202 ± 3.714
>> ops/us 1x
>>
>> I guess we'd need to check that compiled code don't get messed up for
>> a mix of inputs where count can be both
>> zero and non-zero with some probability.
> Do we have such benchmarks examples?
>>
>> /Claes
>>
>> [1] public char[] emptyCharArray = new char[0];
>> public int[] emptyIntArray = new int[0];
>>
>> public char[] singleCharArray = new char[] { 'x' };
>> public int[] singleIntArray = new int[] { 102 };
>> public StringBuilder emptySB = new StringBuilder();
>>
>> @Benchmark
>> public String emptySBBench() {
>> return emptySB.toString();
>> }
>>
>> @Benchmark
>> public String emptyString() {
>> return new String();
>> }
>>
>> @Benchmark
>> public String emptyStringCount() {
>> return new String(emptyCharArray, 0, 0);
>> }
>>
>> @Benchmark
>> public String emptyStringCountCP() {
>> return new String(emptyIntArray, 0, 0);
>> }
>>
>> @Benchmark
>> public String singleStringCount() {
>> return new String(singleCharArray, 0, 1);
>> }
>>
>> @Benchmark
>> public String singleStringCountCP() {
>> return new String(singleIntArray, 0, 1);
>> }
>>
>>
>> On 2014-12-19 01:41, Lev Priima wrote:
>>> Claes,
>>> Thanks for cool suggestion:
>>> http://cr.openjdk.java.net/~lpriima/8067471/webrev.01/:
>>> public java.lang.String();
>>> Signature: ()V
>>> flags: ACC_PUBLIC
>>> LineNumberTable:
>>> line 137: 0
>>> line 138: 4
>>> line 139: 13
>>> Code:
>>> stack=2, locals=1, args_size=1
>>> 0: aload_0
>>> 1: invokespecial #1 // Method
>>> java/lang/Object."<init>":()V
>>> 4: aload_0
>>> 5: ldc #2 // String
>>> 7: getfield #3 // Field value:[C
>>> 10: putfield #3 // Field value:[C
>>> 13: return
>>> LineNumberTable:
>>> line 137: 0
>>> line 138: 4
>>>
>>>
>>> Aleksey ,
>>> By naive glance, new constructor thrpt is bigger(jdk9/dev vs 9b42
>>> in attach) on:
>>> @Benchmark
>>> public String getNewString() {
>>> return new String();
>>> }
>>> Please suggest things to compare . Do we have bigapps-like
>>> benchmarks which may show regression on this?
>>>
>>> On 17.12.2014 21:10, Aleksey Shipilev wrote:
>>>> On 17.12.2014 18:58, Claes Redestad wrote:
>>>>> On 2014-12-17 11:22, Lev Priima wrote:
>>>>>> Please review space optimization in no args String constructor.
>>>>>> Originally, it was already rejected once by suggestion in
>>>>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-May/010300.html
>>>>>>
>>>>>> w/o formal justification of pros and contras. And enhancement was
>>>>>> requested again by Nathan Reynolds.
>>>>>>
>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8067471
>>>>>> Patch: http://cr.openjdk.java.net/~lpriima/8067471/webrev.00/
>>>> I am OK with this change pending we understand the performance
>>>> impact a
>>>> little better. Please do benchmarks.
>>>>
>>>>
>>>>> Could this have some obscure security implications, such as the
>>>>> ability
>>>>> to lock up some subsystem via retrieving and synchronizing on the
>>>>> shared
>>>>> new String().value? I guess not, for practical purposes.
>>>> This would be an issue if we published String.value directly, but
>>>> we don't.
>>>>
>>>>> I guess it could also be written:
>>>>>
>>>>> public String() {
>>>>> this.value = "".value;
>>>>> }
>>>>>
>>>>> ... which would saves some byte code and should avoid having to
>>>>> allocate
>>>>> a distinct object for this.
>>>> Oh, I like this trick, given "" is probably already in constant
>>>> pool for
>>>> some other reason. However, there is an additional dereference. We
>>>> need
>>>> a targeted nanobenchmark to properly quantify the costs for either
>>>> variant.
>>>>
>>>> -Aleksey.
>>>>
>>>>
>>>
>>
>
More information about the core-libs-dev
mailing list