RFR(S): 8067471: Use private static final char[0] for empty Strings
Ivan Gerasimov
ivan.gerasimov at oracle.com
Fri Dec 19 19:14:30 UTC 2014
On 19.12.2014 18:36, Ivan Gerasimov wrote:
> Hi everyone!
>
> I guess something like the following might be used to avoid slowing
> down the common route:
>
Sorry, it was meant to be:
if (count <= 0) {
if (count < 0) {
throw new StringIndexOutOfBoundsException(count);
}
if (offset <= value.length) {
this.value = "".value;
return;
}
}
// Note: offset or count might be near -1>>>1.
if (offset > value.length - count) {
throw new StringIndexOutOfBoundsException(offset + count);
}
this.value = Arrays.copyOfRange(value, offset, offset + count);
Otherwise, the behavior would have changed.
Sincerely yours,
Ivan
> Sincerely yours,
> Ivan
>
>
> On 19.12.2014 17:25, Claes Redestad wrote:
>> Hi again,
>>
>> I just had a nagging thought. For applications reporting some or a
>> lot of different empty String objects,
>> this might be due not to use of this constructor, but due to things
>> like an empty StringBuilder.toString() ending
>> up calling java.lang.String(char[] value, int offset, int count) with
>> a count of 0.
>>
>> My feeling is that this would be a more probable/reasonable source of
>> distinct empty strings in an application,
>> and would not be helped much by the current patch.
>>
>> For completeness the java.lang.String(char/int[], int, int)
>> constructors could have a check for count == 0 in
>> which case it assigns value to "".value, see
>> http://cr.openjdk.java.net/~redestad/8067471/webrev.01/
>>
>> Running a few simple microbenchmarks on this (sorry, Aleksey!)[1]
>> yields a bit of speedup, at no discernible
>> cost for a quick sanity check of an actually copying path:
>>
>> baseline:
>> Benchmark Mode Samples Score Error
>> Units
>> o.s.SimpleBench.emptySBBench thrpt 5 48.213 ± 3.509
>> ops/us
>> o.s.SimpleBench.emptyString thrpt 5 103.900 ± 4.869
>> ops/us
>> o.s.SimpleBench.emptyStringCount thrpt 5 105.802 ± 3.355
>> ops/us
>> o.s.SimpleBench.emptyStringCountCP thrpt 5 104.464 ± 8.251
>> ops/us
>> o.s.SimpleBench.singleStringCount thrpt 5 76.379 ± 4.139
>> ops/us
>> o.s.SimpleBench.singleStringCountCP thrpt 5 88.202 ± 4.945
>> ops/us
>>
>> experiment:
>> Benchmark Mode Samples Score Error
>> Units
>> o.s.SimpleBench.emptySBBench thrpt 5 153.822 ± 7.900
>> ops/us 3x
>> o.s.SimpleBench.emptyString thrpt 5 183.588 ± 4.429
>> ops/us 1.75x
>> o.s.SimpleBench.emptyStringCount thrpt 5 166.457 ± 9.202
>> ops/us 1.6x
>> o.s.SimpleBench.emptyStringCountCP thrpt 5 168.089 ± 4.676
>> ops/us 1.6x
>> o.s.SimpleBench.singleStringCount thrpt 5 75.780 ± 8.490
>> ops/us 1x
>> o.s.SimpleBench.singleStringCountCP thrpt 5 89.202 ± 3.714
>> ops/us 1x
>>
>> I guess we'd need to check that compiled code don't get messed up for
>> a mix of inputs where count can be both
>> zero and non-zero with some probability.
>>
>> /Claes
>>
>> [1] public char[] emptyCharArray = new char[0];
>> public int[] emptyIntArray = new int[0];
>>
>> public char[] singleCharArray = new char[] { 'x' };
>> public int[] singleIntArray = new int[] { 102 };
>> public StringBuilder emptySB = new StringBuilder();
>>
>> @Benchmark
>> public String emptySBBench() {
>> return emptySB.toString();
>> }
>>
>> @Benchmark
>> public String emptyString() {
>> return new String();
>> }
>>
>> @Benchmark
>> public String emptyStringCount() {
>> return new String(emptyCharArray, 0, 0);
>> }
>>
>> @Benchmark
>> public String emptyStringCountCP() {
>> return new String(emptyIntArray, 0, 0);
>> }
>>
>> @Benchmark
>> public String singleStringCount() {
>> return new String(singleCharArray, 0, 1);
>> }
>>
>> @Benchmark
>> public String singleStringCountCP() {
>> return new String(singleIntArray, 0, 1);
>> }
>>
>>
>> On 2014-12-19 01:41, Lev Priima wrote:
>>> Claes,
>>> Thanks for cool suggestion:
>>> http://cr.openjdk.java.net/~lpriima/8067471/webrev.01/:
>>> public java.lang.String();
>>> Signature: ()V
>>> flags: ACC_PUBLIC
>>> LineNumberTable:
>>> line 137: 0
>>> line 138: 4
>>> line 139: 13
>>> Code:
>>> stack=2, locals=1, args_size=1
>>> 0: aload_0
>>> 1: invokespecial #1 // Method
>>> java/lang/Object."<init>":()V
>>> 4: aload_0
>>> 5: ldc #2 // String
>>> 7: getfield #3 // Field value:[C
>>> 10: putfield #3 // Field value:[C
>>> 13: return
>>> LineNumberTable:
>>> line 137: 0
>>> line 138: 4
>>>
>>>
>>> Aleksey ,
>>> By naive glance, new constructor thrpt is bigger(jdk9/dev vs 9b42
>>> in attach) on:
>>> @Benchmark
>>> public String getNewString() {
>>> return new String();
>>> }
>>> Please suggest things to compare . Do we have bigapps-like
>>> benchmarks which may show regression on this?
>>>
>>> On 17.12.2014 21:10, Aleksey Shipilev wrote:
>>>> On 17.12.2014 18:58, Claes Redestad wrote:
>>>>> On 2014-12-17 11:22, Lev Priima wrote:
>>>>>> Please review space optimization in no args String constructor.
>>>>>> Originally, it was already rejected once by suggestion in
>>>>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-May/010300.html
>>>>>>
>>>>>> w/o formal justification of pros and contras. And enhancement was
>>>>>> requested again by Nathan Reynolds.
>>>>>>
>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8067471
>>>>>> Patch: http://cr.openjdk.java.net/~lpriima/8067471/webrev.00/
>>>> I am OK with this change pending we understand the performance
>>>> impact a
>>>> little better. Please do benchmarks.
>>>>
>>>>
>>>>> Could this have some obscure security implications, such as the
>>>>> ability
>>>>> to lock up some subsystem via retrieving and synchronizing on the
>>>>> shared
>>>>> new String().value? I guess not, for practical purposes.
>>>> This would be an issue if we published String.value directly, but
>>>> we don't.
>>>>
>>>>> I guess it could also be written:
>>>>>
>>>>> public String() {
>>>>> this.value = "".value;
>>>>> }
>>>>>
>>>>> ... which would saves some byte code and should avoid having to
>>>>> allocate
>>>>> a distinct object for this.
>>>> Oh, I like this trick, given "" is probably already in constant
>>>> pool for
>>>> some other reason. However, there is an additional dereference. We
>>>> need
>>>> a targeted nanobenchmark to properly quantify the costs for either
>>>> variant.
>>>>
>>>> -Aleksey.
>>>>
>>>>
>>>
>>
>>
>>
>
>
>
More information about the core-libs-dev
mailing list