RFR: 8315585: Optimization for decimal to string [v7]

Claes Redestad redestad at openjdk.org
Fri Sep 8 14:52:42 UTC 2023


On Fri, 8 Sep 2023 12:00:17 GMT, 温绍锦 <duke at openjdk.org> wrote:

>> BigDecimal is a commonly used class in business development, It is often necessary to perform toString or toPlainString operations on BigDecimal.
>> 
>> The current version uses StringBuilder resulting in multiple memory allocations, I made a modification to improve performance.
>> 
>> Because BigDecimal uses stringCache to cache the result of toString, the performance of toString needs special treatment before testing, such as clearing stringCache by unsafe operation before each test, The performance of toString is similar to that of toEngineering.
>> 
>> The performance data is as follows: 
>> 
>> ## 1. benchmark script
>> 
>> sh make/devkit/createJMHBundle.sh
>> bash configure --with-jmh=build/jmh/jars
>> make test TEST="micro:java.math.BigDecimals.*ToPlainString"
>> make test TEST="micro:java.math.BigDecimals.*ToEngineering"
>> 
>> 
>> ## 2. benchmark environment
>> * virtual machine : [aliyun_ecs_c8i.xlarge](https://help.aliyun.com/zh/ecs/user-guide/overview-of-instance-families#c8i)
>> * cpu intel xeon sapphire rapids (x64)
>> 
>> ## 3. benchmark result
>> 
>> -BigDecimals.testHugeToPlainString         avgt   15   188.691 ±  0.822  ns/op  (baseline)
>> -BigDecimals.testLargeToPlainString        avgt   15    36.656 ±  0.065  ns/op
>> -BigDecimals.testSmallToPlainString        avgt   15    34.342 ±  0.068  ns/op
>> -BigDecimals.testToPlainString             avgt   15  1719.494 ± 24.886  ns/op
>> 
>> +Benchmark                                 Mode  Cnt     Score    Error  Units (optimize)
>> +BigDecimals.testHugeToPlainString         avgt   15   133.972 ?  0.328  ns/op (+40.84%)
>> +BigDecimals.testLargeToPlainString        avgt   15    14.957 ?  0.047  ns/op (145.07%)
>> +BigDecimals.testSmallToPlainString        avgt   15    12.045 ?  0.036  ns/op (+185.11)
>> +BigDecimals.testToPlainString             avgt   15  1643.500 ?  3.217  ns/op (+4.62%)
>> 
>> -Benchmark                                 Mode  Cnt     Score    Error  Units (baseline)
>> -BigDecimals.testHugeToEngineeringString   avgt   15   207.621 ±  5.018  ns/op
>> -BigDecimals.testLargeToEngineeringString  avgt   15    35.658 ±  3.144  ns/op
>> -BigDecimals.testSmallToEngineeringString  avgt   15    15.142 ±  0.053  ns/op
>> -BigDecimals.testToEngineeringString       avgt   15  1813.959 ± 12.842  ns/op
>> 
>> +Benchmark                                 Mode  Cnt     Score    Error  Units (optimize)
>> +BigDecimals.testHugeToEngineeringString   avgt   15   142.110 ?  0.987  ns/op (+45.09%)
>> +BigDecimals.testLargeToEngineeringStr...
>
> 温绍锦 has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Continue to optimize and reduce duplicate code

Seems like a step in the right direction w.r.t. code duplication. There's still a lot going on in this PR so it'll take some time to digest. Is there some way to split this into a series of enhancements for easier review? The `jla.newStringLatin1NoRepl` additions could be split out and done first - along with evidence that it helps in other places like UUID and HexFormat.

src/java.base/share/classes/java/lang/String.java line 699:

> 697:     }
> 698: 
> 699:     static String newStringLatin1NoRepl(byte[] bytes) {

How much does this help compared to calling `jla.newStringNoRepl(bytes, StandardCharsets.ISO_8859_1)`? Maybe something that would be better split out to a separate enhancement and examined in isolation and apply it in other places where applicable (`java.util.UUID`, `java.util.HexFormat`)

src/java.base/share/classes/jdk/internal/access/JavaLangAccess.java line 358:

> 356: 
> 357:     /**
> 358:      * Pack the two ascii characters corresponding to the value from 0 to 100 into a short

Wording and name of this method can be improved. Suggestion:
`short digitPair(int i)`
"For values from 0 to 99 return a `short` encoding a pair of ASCII-encoded digit characters in little-endian, e.g. `0 -> ('0' << 8 | '0')`. Used for formatting."

-------------

PR Review: https://git.openjdk.org/jdk/pull/15555#pullrequestreview-1617373581
PR Review Comment: https://git.openjdk.org/jdk/pull/15555#discussion_r1319803899
PR Review Comment: https://git.openjdk.org/jdk/pull/15555#discussion_r1319939577


More information about the core-libs-dev mailing list