RFR: 8310929: Optimization for Integer.toString [v13]

Thu Aug 31 14:18:05 UTC 2023

On Tue, 18 Jul 2023 01:49:17 GMT, 温绍锦 <duke at openjdk.org> wrote:

>> Optimization for:
>> Integer.toString
>> Long.toString
>> StringBuilder#append(int)
>> 
>> # Benchmark Result
>> 
>> 
>> sh make/devkit/createJMHBundle.sh
>> bash configure --with-jmh=build/jmh/jars
>> make test TEST="micro:java.lang.Integers.toString*" 
>> make test TEST="micro:java.lang.Longs.toString*" 
>> make test TEST="micro:java.lang.StringBuilders.toStringCharWithInt8"
>> 
>> 
>> ## 1. [aliyun_ecs_c8i.xlarge](https://help.aliyun.com/document_detail/25378.html#c8i)
>> * cpu : intel xeon sapphire rapids (x64)
>> 
>> ``` diff
>> -Benchmark               (size)  Mode  Cnt  Score   Error  Units (baseline)
>> -Integers.toStringBig       500  avgt   15  6.825 ± 0.023  us/op
>> -Integers.toStringSmall     500  avgt   15  4.823 ± 0.023  us/op
>> -Integers.toStringTiny      500  avgt   15  3.878 ± 0.101  us/op
>> 
>> +Benchmark               (size)  Mode  Cnt  Score   Error  Units (PR Update 04 f4aa1989)
>> +Integers.toStringBig       500  avgt   15  6.002 ± 0.054  us/op (+13.71%)
>> +Integers.toStringSmall     500  avgt   15  4.025 ± 0.020  us/op (+19.82%)
>> +Integers.toStringTiny      500  avgt   15  3.874 ± 0.067  us/op (+0.10%)
>> 
>> -Benchmark            (size)  Mode  Cnt  Score   Error  Units (baseline)
>> -Longs.toStringBig       500  avgt   15  9.224 ± 0.021  us/op
>> -Longs.toStringSmall     500  avgt   15  4.621 ± 0.087  us/op
>> 
>> +Benchmark            (size)  Mode  Cnt  Score   Error  Units (PR Update 04 f4aa1989)
>> +Longs.toStringBig       500  avgt   15  7.483 ± 0.018  us/op (+23.26%)
>> +Longs.toStringSmall     500  avgt   15  4.020 ± 0.016  us/op (+14.95%)
>> 
>> -Benchmark                           Mode  Cnt     Score    Error  Units (baseline)
>> -StringBuilders.toStringCharWithInt8 avgt   15    89.327 ±  0.733  ns/op
>> 
>> +Benchmark                            Mode  Cnt   Score   Error  Units (PR Update 04 f4aa1989)
>> +StringBuilders.toStringCharWithInt8  avgt   15  36.639 ± 0.422  ns/op (+143.80%)
>> 
>> 
>> 
>> ## 2. [aliyun_ecs_c8a.xlarge](https://help.aliyun.com/document_detail/25378.html#c8a)
>> * cpu : amd epc genoa (x64)
>> 
>> ``` diff
>> -Benchmark               (size)  Mode  Cnt  Score   Error  Units (baseline)
>> -Integers.toStringBig       500  avgt   15  6.753 ± 0.007  us/op
>> -Integers.toStringSmall     500  avgt   15  4.470 ± 0.005  us/op
>> -Integers.toStringTiny      500  avgt   15  2.764 ± 0.020  us/op
>> 
>> +Benchmark               (size)  Mode  Cnt  Score   Error  Units (PR Update 04 f4aa1989)
>> +Integers.toStringBig       500  avgt   15  5.036 ± 0.005  us/op (+...
>
> 温绍锦 has updated the pull request incrementally with one additional commit since the last revision:
> 
>   assert bounds check

I'm wondering if a micro benchmark like this is very realistic.  They may score positive as eventually these helper tables are in the inner most cache level, but it may be a net negative for larger functions that only do integer conversion occasionally, and almost always have a cache miss when doing a conversion.  The tables may also be displacing other more useful cache lines.

In my opinion, cache is a limited resource, and having a low level function use a big chunk of it may be optimal for a micro benchmark, but seems unlikely to be optimal overall.  If you are only doing mass string to integer conversion, I'm sure it will do better.  If you are writing out JSON that sometimes contains integers, the integer conversions may now sometimes have cache misses (in which case a non-cached based conversion would perform better), or the conversion is displacing other more useful cache lines.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14699#issuecomment-1701138931