RFR(S): 8067471: Use private static final char[0] for empty Strings

Fri Dec 19 19:14:30 UTC 2014

On 19.12.2014 18:36, Ivan Gerasimov wrote:
> Hi everyone!
>
> I guess something like the following might be used to avoid slowing 
> down the common route:
>

Sorry, it was meant to be:

              if (count <= 0) {
                  if (count < 0) {
                      throw new StringIndexOutOfBoundsException(count);
                  }
                  if (offset <= value.length) {
                      this.value = "".value;
                      return;
                  }
              }
              // Note: offset or count might be near -1>>>1.
              if (offset > value.length - count) {
                  throw new StringIndexOutOfBoundsException(offset + count);
              }
              this.value = Arrays.copyOfRange(value, offset, offset + count);

Otherwise, the behavior would have changed.
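
To make the intent concrete, here is a minimal standalone sketch of the same checks, outside of java.lang.String. The class name EmptySliceSketch, the helper slice() and the EMPTY constant are hypothetical; EMPTY merely stands in for "".value. The offset < 0 check is not part of the snippet above and is assumed to run before it.

```java
import java.util.Arrays;

final class EmptySliceSketch {
    // Hypothetical stand-in for "".value: one shared zero-length array.
    private static final char[] EMPTY = new char[0];

    // Hypothetical helper mirroring the constructor checks discussed above.
    static char[] slice(char[] value, int offset, int count) {
        // Assumption: this check precedes the snippet in the real constructor.
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            // count == 0: share the empty array only if offset is in range;
            // otherwise fall through so the exception below is still thrown.
            if (offset <= value.length) {
                return EMPTY;
            }
        }
        // Written as a subtraction so that offset + count cannot overflow.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        return Arrays.copyOfRange(value, offset, offset + count);
    }

    public static void main(String[] args) {
        char[] data = {'a', 'b', 'c'};
        // Both zero-length requests return the very same array instance.
        System.out.println(slice(data, 0, 0) == slice(data, 3, 0));
        // The copying path is unchanged.
        System.out.println(new String(slice(data, 1, 2)));
        try {
            slice(data, 4, 0); // offset beyond the array, count == 0
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The last call is the case that a bare count == 0 shortcut would silently accept, which is why the offset check is kept inside the count <= 0 branch.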

Sincerely yours,
Ivan

> Sincerely yours,
> Ivan
>
>
> On 19.12.2014 17:25, Claes Redestad wrote:
>> Hi again,
>>
>> I just had a nagging thought. For applications reporting some or a lot
>> of distinct empty String objects, the cause might not be this
>> constructor but things like an empty StringBuilder.toString() ending up
>> calling java.lang.String(char[] value, int offset, int count) with a
>> count of 0.
>>
>> My feeling is that this is a more probable source of distinct empty
>> strings in an application, and one the current patch would not help
>> much with.
>>
>> For completeness the java.lang.String(char/int[], int, int) 
>> constructors could have a check for count == 0 in
>> which case it assigns value to "".value, see 
>> http://cr.openjdk.java.net/~redestad/8067471/webrev.01/
>>
>> Running a few simple microbenchmarks on this (sorry, Aleksey!)[1]
>> yields a speedup for the empty cases, at no discernible cost in a quick
>> sanity check of a path that actually copies:
>>
>> baseline:
>> Benchmark                            Mode  Samples    Score Error   Units
>> o.s.SimpleBench.emptySBBench        thrpt        5   48.213 ± 3.509  ops/us
>> o.s.SimpleBench.emptyString         thrpt        5  103.900 ± 4.869  ops/us
>> o.s.SimpleBench.emptyStringCount    thrpt        5  105.802 ± 3.355  ops/us
>> o.s.SimpleBench.emptyStringCountCP  thrpt        5  104.464 ± 8.251  ops/us
>> o.s.SimpleBench.singleStringCount   thrpt        5   76.379 ± 4.139  ops/us
>> o.s.SimpleBench.singleStringCountCP thrpt        5   88.202 ± 4.945  ops/us
>>
>> experiment:
>> Benchmark                            Mode  Samples    Score Error   Units   vs. baseline
>> o.s.SimpleBench.emptySBBench        thrpt        5  153.822 ± 7.900  ops/us  3x
>> o.s.SimpleBench.emptyString         thrpt        5  183.588 ± 4.429  ops/us  1.75x
>> o.s.SimpleBench.emptyStringCount    thrpt        5  166.457 ± 9.202  ops/us  1.6x
>> o.s.SimpleBench.emptyStringCountCP  thrpt        5  168.089 ± 4.676  ops/us  1.6x
>> o.s.SimpleBench.singleStringCount   thrpt        5   75.780 ± 8.490  ops/us  1x
>> o.s.SimpleBench.singleStringCountCP thrpt        5   89.202 ± 3.714  ops/us  1x
>>
>> I guess we'd need to check that the compiled code doesn't get messed up
>> for a mix of inputs where count can be both zero and non-zero with some
>> probability.
>>
>> /Claes
>>
>> [1] public char[] emptyCharArray = new char[0];
>>     public int[] emptyIntArray = new int[0];
>>
>>     public char[] singleCharArray = new char[] { 'x' };
>>     public int[] singleIntArray = new int[] { 102 };
>>     public StringBuilder emptySB = new StringBuilder();
>>
>>     @Benchmark
>>     public String emptySBBench() {
>>         return emptySB.toString();
>>     }
>>
>>     @Benchmark
>>     public String emptyString() {
>>         return new String();
>>     }
>>
>>     @Benchmark
>>     public String emptyStringCount() {
>>         return new String(emptyCharArray, 0, 0);
>>     }
>>
>>     @Benchmark
>>     public String emptyStringCountCP() {
>>         return new String(emptyIntArray, 0, 0);
>>     }
>>
>>     @Benchmark
>>     public String singleStringCount() {
>>         return new String(singleCharArray, 0, 1);
>>     }
>>
>>     @Benchmark
>>     public String singleStringCountCP() {
>>         return new String(singleIntArray, 0, 1);
>>     }
>>
>>
>> On 2014-12-19 01:41, Lev Priima wrote:
>>> Claes,
>>> Thanks for the cool suggestion:
>>> http://cr.openjdk.java.net/~lpriima/8067471/webrev.01/:
>>>   public java.lang.String();
>>>     Signature: ()V
>>>     flags: ACC_PUBLIC
>>>     LineNumberTable:
>>>       line 137: 0
>>>       line 138: 4
>>>       line 139: 13
>>>     Code:
>>>       stack=2, locals=1, args_size=1
>>>          0: aload_0
>>>          1: invokespecial #1                  // Method java/lang/Object."<init>":()V
>>>          4: aload_0
>>>          5: ldc           #2                  // String
>>>          7: getfield      #3                  // Field value:[C
>>>         10: putfield      #3                  // Field value:[C
>>>         13: return
>>>       LineNumberTable:
>>>         line 137: 0
>>>         line 138: 4
>>>
>>>
>>> Aleksey,
>>> At a naive glance, throughput of the new constructor is higher
>>> (jdk9/dev vs 9b42, see attachment) on:
>>>     @Benchmark
>>>     public String getNewString() {
>>>         return new String();
>>>     }
>>> Please suggest things to compare. Do we have bigapps-like
>>> benchmarks which may show a regression on this?
>>>
>>> On 17.12.2014 21:10, Aleksey Shipilev wrote:
>>>> On 17.12.2014 18:58, Claes Redestad wrote:
>>>>> On 2014-12-17 11:22, Lev Priima wrote:
>>>>>> Please review this space optimization in the no-args String constructor.
>>>>>> It was rejected once before, following the suggestion in
>>>>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-May/010300.html
>>>>>> without a formal justification of pros and cons. The enhancement has
>>>>>> been requested again by Nathan Reynolds.
>>>>>>
>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8067471
>>>>>> Patch: http://cr.openjdk.java.net/~lpriima/8067471/webrev.00/
>>>> I am OK with this change, provided we understand the performance impact
>>>> a little better. Please run benchmarks.
>>>>
>>>>
>>>>> Could this have some obscure security implications, such as the ability
>>>>> to lock up some subsystem via retrieving and synchronizing on the shared
>>>>> new String().value? I guess not, for practical purposes.
>>>> This would be an issue if we published String.value directly, but we
>>>> don't.
>>>>
>>>>> I guess it could also be written:
>>>>>
>>>>>      public String() {
>>>>>          this.value = "".value;
>>>>>      }
>>>>>
>>>>> ... which would save some bytecode and should avoid having to allocate
>>>>> a distinct object for this.
>>>> Oh, I like this trick, given "" is probably already in the constant pool
>>>> for some other reason. However, there is an additional dereference. We
>>>> need a targeted nanobenchmark to properly quantify the costs of either
>>>> variant.
>>>>
>>>> -Aleksey.