RFR: 8261847: performance of java.lang.Record::toString should be improved [v3]

Fri Nov 19 05:15:39 UTC 2021

On Fri, 19 Nov 2021 05:07:23 GMT, Vicente Romero <vromero at openjdk.org> wrote:

>> Please review this PR, which aims to optimize the implementation of the `toString` method we provide for records. A benchmark comparing our implementation for records with the one generated by lombok found that lombok is much faster, mainly because our implementation uses `String::format`. The fix delegates to `StringConcatFactory::makeConcatWithConstants`, which is faster.
>> 
>> TIA
>> 
>> This is the result of the benchmark comparing records to lombok with vanilla JDK:
>> 
>> Benchmark                          Mode  Cnt    Score    Error  Units
>> MyBenchmark.base                   avgt    4    0.849 ±  0.111  ns/op
>> MyBenchmark.equals_record          avgt    4    7.343 ±  2.740  ns/op
>> MyBenchmark.equals_value           avgt    4    6.644 ±  1.920  ns/op
>> MyBenchmark.record_hash_code       avgt    4    5.763 ±  3.882  ns/op
>> MyBenchmark.record_to_string       avgt    4  262.626 ± 12.574  ns/op               <------ Before
>> MyBenchmark.value_class_to_string  avgt    4   30.325 ± 21.389  ns/op
>> MyBenchmark.value_hash_code        avgt    4    5.048 ±  3.936  ns/op
>> 
>> 
>> after this patch:
>> 
>> Benchmark                          Mode  Cnt   Score   Error  Units
>> MyBenchmark.base                   avgt    4   0.680 ± 0.185  ns/op
>> MyBenchmark.equals_record          avgt    4   5.599 ± 1.348  ns/op
>> MyBenchmark.equals_value           avgt    4   5.718 ± 4.633  ns/op
>> MyBenchmark.record_hash_code       avgt    4   4.628 ± 4.368  ns/op
>> MyBenchmark.record_to_string       avgt    4  26.791 ± 1.817  ns/op                 <------- After
>> MyBenchmark.value_class_to_string  avgt    4  35.473 ± 2.626  ns/op
>> MyBenchmark.value_hash_code        avgt    4   6.152 ± 5.101  ns/op
>
> Vicente Romero has updated the pull request incrementally with one additional commit since the last revision:
> 
>   adding the benchmark

As Claes suggested (thanks), I ran some additional experiments with different numbers of record components. In all cases every component is of type `int`, so each one occupies exactly one slot. I tried 0, 1, 10, 100 and 254 record components; here are the numbers:

Benchmark                                  Mode  Cnt      Score       Error  Units
MyBenchmark.base                           avgt    4      0.775 ±     0.552  ns/op
MyBenchmark.record0_toString               avgt    4      4.973 ±     2.800  ns/op
MyBenchmark.record1_toString               avgt    4     16.026 ±     5.100  ns/op
MyBenchmark.record10_toString              avgt    4     81.412 ±     3.365  ns/op
MyBenchmark.record100_toString             avgt    4  12269.500 ±   179.784  ns/op
MyBenchmark.record254_toString             avgt    4  51191.953 ± 11679.762  ns/op
MyBenchmark.valueClass0_toString           avgt    4      5.134 ±     2.372  ns/op
MyBenchmark.valueClass1_toString           avgt    4     23.321 ±     9.234  ns/op
MyBenchmark.valueClass10_toString          avgt    4     94.048 ±     7.017  ns/op
MyBenchmark.valueClass100_toString         avgt    4   9253.282 ±  4843.738  ns/op
MyBenchmark.valueClass254_toString         avgt    4  31963.158 ± 24050.499  ns/op

Then, also following a suggestion from Claes, I changed the maximum number of slots per chunk used when chopping up the arguments. With a limit of 20 slots I got these numbers:

Benchmark                                  Mode  Cnt      Score       Error  Units
MyBenchmark.record0_toString               avgt    4      5.009 ±     3.454  ns/op
MyBenchmark.record1_toString               avgt    4     14.207 ±    10.551  ns/op
MyBenchmark.record10_toString              avgt    4     81.018 ±     7.320  ns/op
MyBenchmark.record100_toString             avgt    4   2862.641 ±  1233.862  ns/op
MyBenchmark.record254_toString             avgt    4  23002.280 ± 97103.923  ns/op
MyBenchmark.valueClass0_toString           avgt    4      4.967 ±     3.947  ns/op
MyBenchmark.valueClass1_toString           avgt    4     23.756 ±     8.499  ns/op
MyBenchmark.valueClass10_toString          avgt    4     87.691 ±     7.956  ns/op
MyBenchmark.valueClass100_toString         avgt    4   9539.272 ±  9461.516  ns/op
MyBenchmark.valueClass254_toString         avgt    4  28323.478 ± 11932.474  ns/op

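For large records the execution is much faster now: `record100_toString` drops from about 12270 ns/op to about 2863 ns/op (roughly 4x). `record254_toString` also scores lower (about 23002 vs. 51192 ns/op), but its error (± 97103 ns/op) is larger than the score itself, so that result is inconclusive. Records with 0, 1 and 10 components are essentially unchanged.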
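
To illustrate the idea of chopping the arguments into chunks with a bounded number of slots, here is a minimal sketch. It is not the code in the PR: the class `ChunkedConcat`, its methods `chunkHandles` and `render`, and the choice to join the chunk results with a `StringBuilder` are all assumptions made for illustration. Only `StringConcatFactory::makeConcatWithConstants` and its recipe format (`\1` marks an argument) come from the JDK. All components are assumed to be `int`, so one component is one slot.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not the implementation in the PR.
final class ChunkedConcat {

    // Builds one concat handle per chunk of at most maxSlots int components.
    static List<MethodHandle> chunkHandles(String[] names, int maxSlots) throws Exception {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        List<MethodHandle> handles = new ArrayList<>();
        for (int start = 0; start < names.length; start += maxSlots) {
            int end = Math.min(start + maxSlots, names.length);
            StringBuilder recipe = new StringBuilder();
            for (int i = start; i < end; i++) {
                if (i > 0) {
                    recipe.append(", ");
                }
                // '\u0001' is the recipe tag for "insert the next argument here".
                recipe.append(names[i]).append('=').append('\u0001');
            }
            Class<?>[] params = new Class<?>[end - start];
            Arrays.fill(params, int.class);
            MethodType type = MethodType.methodType(String.class, params);
            handles.add(StringConcatFactory
                    .makeConcatWithConstants(lookup, "concat", type, recipe.toString())
                    .dynamicInvoker());
        }
        return handles;
    }

    // Renders "Type[a=1, b=2, ...]" by invoking each chunk handle in turn.
    // A real implementation would build the handles once and cache them.
    static String render(String typeName, String[] names, int[] values, int maxSlots) {
        try {
            StringBuilder out = new StringBuilder(typeName).append('[');
            int start = 0;
            for (MethodHandle handle : chunkHandles(names, maxSlots)) {
                int count = handle.type().parameterCount();
                Object[] args = new Object[count];
                for (int i = 0; i < count; i++) {
                    args[i] = values[start + i];
                }
                out.append((String) handle.invokeWithArguments(args));
                start += count;
            }
            return out.append(']').toString();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        String[] names = {"a", "b", "c"};
        int[] values = {1, 2, 3};
        // With maxSlots = 2 the three components are split into chunks of 2 and 1.
        System.out.println(render("P", names, values, 2));
    }
}
```

In this sketch a smaller `maxSlots` means more, but simpler, concat handles. Which chunk size is best for a given number of components is an empirical question, and that is what the numbers above are probing.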

-------------

PR: https://git.openjdk.java.net/jdk/pull/6403