RFR(S): 8067471: Use private static final char[0] for empty Strings

Sun Dec 21 16:08:26 UTC 2014

Claes,

Is the source of arrays in these benchmarks unpredictable enough to 
avoid branch checks elimination in compiled code?

On 12/19/2014 05:25 PM, Claes Redestad wrote:
> Hi again,
>
> I just had a nagging thought. For applications reporting some or a lot 
> of different empty String objects,
> this might be due not to use of this constructor, but due to things 
> like an empty StringBuilder.toString() ending
> up calling java.lang.String(char[] value, int offset, int count) with 
> a count of 0.
>
> My feeling is that this would be a more probable/reasonable source of 
> distinct empty strings in an application,
> and would not be helped much by the current patch.
>
> For completeness the java.lang.String(char/int[], int, int) 
> constructors could have a check for count == 0 in
> which case it assigns value to "".value, see 
> http://cr.openjdk.java.net/~redestad/8067471/webrev.01/
>
> Running a few simple microbenchmarks on this (sorry, Aleksey!)[1] 
> yields a bit of speedup, at no discernible
> cost for a quick sanity check of an actually copying path:
>
> baseline:
> Benchmark                            Mode  Samples    Score Error   Units
> o.s.SimpleBench.emptySBBench        thrpt        5   48.213 ± 3.509  
> ops/us
> o.s.SimpleBench.emptyString         thrpt        5  103.900 ± 4.869  
> ops/us
> o.s.SimpleBench.emptyStringCount    thrpt        5  105.802 ± 3.355  
> ops/us
> o.s.SimpleBench.emptyStringCountCP  thrpt        5  104.464 ± 8.251  
> ops/us
> o.s.SimpleBench.singleStringCount   thrpt        5   76.379 ± 4.139  
> ops/us
> o.s.SimpleBench.singleStringCountCP thrpt        5   88.202 ± 4.945  
> ops/us
>
> experiment:
> Benchmark                            Mode  Samples    Score Error   Units
> o.s.SimpleBench.emptySBBench        thrpt        5  153.822 ± 7.900  
> ops/us 3x
> o.s.SimpleBench.emptyString         thrpt        5  183.588 ± 4.429  
> ops/us 1.75x
> o.s.SimpleBench.emptyStringCount    thrpt        5  166.457 ± 9.202  
> ops/us 1.6x
> o.s.SimpleBench.emptyStringCountCP  thrpt        5  168.089 ± 4.676  
> ops/us 1.6x
> o.s.SimpleBench.singleStringCount   thrpt        5   75.780 ± 8.490  
> ops/us 1x
> o.s.SimpleBench.singleStringCountCP thrpt        5   89.202 ± 3.714  
> ops/us 1x
>
> I guess we'd need to check that compiled code don't get messed up for 
> a mix of inputs where count can be both
> zero and non-zero with some probability.
Do we have such benchmarks examples?
>
> /Claes
>
> [1] public char[] emptyCharArray = new char[0];
>     public int[] emptyIntArray = new int[0];
>
>     public char[] singleCharArray = new char[] { 'x' };
>     public int[] singleIntArray = new int[] { 102 };
>     public StringBuilder emptySB = new StringBuilder();
>
>     @Benchmark
>     public String emptySBBench() {
>         return emptySB.toString();
>     }
>
>     @Benchmark
>     public String emptyString() {
>         return new String();
>     }
>
>     @Benchmark
>     public String emptyStringCount() {
>         return new String(emptyCharArray, 0, 0);
>     }
>
>     @Benchmark
>     public String emptyStringCountCP() {
>         return new String(emptyIntArray, 0, 0);
>     }
>
>     @Benchmark
>     public String singleStringCount() {
>         return new String(singleCharArray, 0, 1);
>     }
>
>     @Benchmark
>     public String singleStringCountCP() {
>         return new String(singleIntArray, 0, 1);
>     }
>
>
> On 2014-12-19 01:41, Lev Priima wrote:
>> Claes,
>> Thanks for cool suggestion: 
>> http://cr.openjdk.java.net/~lpriima/8067471/webrev.01/:
>>   public java.lang.String();
>>     Signature: ()V
>>     flags: ACC_PUBLIC
>>     LineNumberTable:
>>       line 137: 0
>>       line 138: 4
>>       line 139: 13
>>     Code:
>>       stack=2, locals=1, args_size=1
>>          0: aload_0
>>          1: invokespecial #1                  // Method 
>> java/lang/Object."<init>":()V
>>          4: aload_0
>>          5: ldc           #2                  // String
>>          7: getfield      #3                  // Field value:[C
>>         10: putfield      #3                  // Field value:[C
>>         13: return
>>       LineNumberTable:
>>         line 137: 0
>>         line 138: 4
>>
>>
>> Aleksey ,
>> By naive glance, new constructor thrpt is  bigger(jdk9/dev vs 9b42 in 
>> attach) on:
>>     @Benchmark
>>     public String getNewString() {
>>         return new String();
>>     }
>> Please suggest things to compare . Do we have bigapps-like benchmarks 
>> which may show regression on this?
>>
>> On 17.12.2014 21:10, Aleksey Shipilev wrote:
>>> On 17.12.2014 18:58, Claes Redestad wrote:
>>>> On 2014-12-17 11:22, Lev Priima wrote:
>>>>> Please review space optimization in no args String constructor.
>>>>> Originally, it was already rejected once by suggestion in
>>>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-May/010300.html 
>>>>>
>>>>> w/o formal justification of pros and contras.  And enhancement was
>>>>> requested again by Nathan Reynolds.
>>>>>
>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8067471
>>>>> Patch: http://cr.openjdk.java.net/~lpriima/8067471/webrev.00/
>>> I am OK with this change pending we understand the performance impact a
>>> little better. Please do benchmarks.
>>>
>>>
>>>> Could this have some obscure security implications, such as the 
>>>> ability
>>>> to lock up some subsystem via retrieving and synchronizing on the 
>>>> shared
>>>> new String().value? I guess not, for practical purposes.
>>> This would be an issue if we published String.value directly, but we 
>>> don't.
>>>
>>>> I guess it could also be written:
>>>>
>>>>      public String() {
>>>>          this.value = "".value;
>>>>      }
>>>>
>>>> ... which would saves some byte code and should avoid having to 
>>>> allocate
>>>> a distinct object for this.
>>> Oh, I like this trick, given "" is probably already in constant pool 
>>> for
>>> some other reason. However, there is an additional dereference. We need
>>> a targeted nanobenchmark to properly quantify the costs for either 
>>> variant.
>>>
>>> -Aleksey.
>>>
>>>
>>
>

-- 
Best Regards,
Lev