[PATCH] Trivial improvement for j.l.Character.toString()

Roger Riggs Roger.Riggs at oracle.com
Wed Mar 25 23:12:02 UTC 2020


Agreed,  +1

On 3/25/20 5:53 PM, Claes Redestad wrote:
> Looks good and trivial, including the drive-by cleanups.
>
> I can sponsor.
>
> /Claes
>
> On 2020-03-25 22:18, Сергей Цыпанов wrote:
>> Hello,
>>
>> I think we can reduce the allocation rate of j.l.Character.toString() by
>> calling String.valueOf(char) instead of String.valueOf(char[]).
>>
>> The current implementation creates a one-element char[], which is then
>> converted into the String's byte[].
>>
>> String.valueOf(char) instead converts the char directly and consumes less
>> memory. I ran benchmark [1] and got these results on my machine (JDK 14):
>>
>> Benchmark                                                      Mode   Score    Error  Units
>>
>> CharacterToStringBenchmark.toString_utf8                       avgt  14.723 ±  1.354  ns/op
>> CharacterToStringBenchmark.valueOf_utf8                        avgt   7.678 ±  0.601  ns/op
>>
>> CharacterToStringBenchmark.toString_latin                      avgt  10.992 ±  1.371  ns/op
>> CharacterToStringBenchmark.valueOf_latin                       avgt   7.844 ±  1.044  ns/op
>>
>> CharacterToStringBenchmark.toString_utf8:·gc.alloc.rate.norm   avgt  96.003 ±  0.001   B/op
>> CharacterToStringBenchmark.valueOf_utf8:·gc.alloc.rate.norm    avgt  48.002 ±  0.001   B/op
>>
>> CharacterToStringBenchmark.toString_latin:·gc.alloc.rate.norm  avgt  72.003 ±  0.001   B/op
>> CharacterToStringBenchmark.valueOf_latin:·gc.alloc.rate.norm   avgt  48.002 ±  0.001   B/op
>>
>>
>> Patch is below.
>>
>> With best regards,
>> Sergey Tsypanov
>>
>> [1] 
>> https://github.com/stsypanov/strings/blob/master/src/jmh/java/tsypanov/strings/character/CharacterToStringBenchmark.java
>>
>> diff --git a/src/java.base/share/classes/java/lang/Character.java b/src/java.base/share/classes/java/lang/Character.java
>> --- a/src/java.base/share/classes/java/lang/Character.java
>> +++ b/src/java.base/share/classes/java/lang/Character.java
>> @@ -3285,7 +3285,7 @@
>>                                "SYMBOLS AND PICTOGRAPHS EXTENDED-A",
>>                                "SYMBOLSANDPICTOGRAPHSEXTENDED-A");
>>
>> -        private static final int blockStarts[] = {
>> +        private static final int[] blockStarts = {
>>               0x0000,   // 0000..007F; Basic Latin
>>               0x0080,   // 0080..00FF; Latin-1 Supplement
>>               0x0100,   // 0100..017F; Latin Extended-A
>> @@ -8068,7 +8068,7 @@
>>               UNKNOWN,                  // E01F0..10FFFF
>>           };
>>
>> -        private static HashMap<String, Character.UnicodeScript> aliases;
>> +        private static final HashMap<String, Character.UnicodeScript> aliases;
>>           static {
>>               aliases = new HashMap<>((int)(153 / 0.75f + 1.0f));
>>               aliases.put("ADLM", ADLAM);
>> @@ -8421,8 +8421,7 @@
>>        * @return  a string representation of this object.
>>        */
>>       public String toString() {
>> -        char buf[] = {value};
>> -        return String.valueOf(buf);
>> +        return toString(value);
>>       }
>>
>>       /**
>>
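
For readers who want to see the two code shapes side by side, here is a minimal sketch of what the last hunk changes. The class and method names (CharacterToStringSketch, viaCharArray, viaChar) are made up for illustration and are not part of the patch; the sketch only checks that both paths produce equal strings for a Latin-1 and a non-Latin-1 character, and does not measure allocation.

```java
public class CharacterToStringSketch {

    // Pre-patch shape: wrap the char in a one-element array, then build the String from it.
    // (Hypothetical helper name; mirrors the two removed lines in the diff.)
    static String viaCharArray(char value) {
        char[] buf = {value};
        return String.valueOf(buf);
    }

    // Post-patch shape: pass the char straight to String.valueOf(char),
    // which is what Character.toString(char) delegates to.
    static String viaChar(char value) {
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        char latin = 'a';          // fits in Latin-1
        char nonLatin = '\u0416';  // Cyrillic capital Zhe, outside Latin-1

        // Both paths should yield equal one-character strings.
        System.out.println(viaCharArray(latin).equals(viaChar(latin)));
        System.out.println(viaCharArray(nonLatin).equals(viaChar(nonLatin)));

        // Character.toString() on a boxed value should agree as well.
        System.out.println(Character.valueOf(nonLatin).toString().equals(viaChar(nonLatin)));
    }
}
```

The observable result is the same either way; the difference reported in the benchmark above is only in the temporary objects created along the way.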



More information about the core-libs-dev mailing list