RFR: 8315789: Minor HexFormat performance improvements [v2]
Claes Redestad
redestad at openjdk.org
Wed Sep 13 19:56:41 UTC 2023
On Fri, 8 Sep 2023 10:32:40 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>> This PR seeks to improve formatting of hex digits using `java.util.HexFormat` somewhat.
>>
>> This is achieved by getting rid of a couple of lookup tables, caching the result of `HexFormat.of().withUpperCase()`, and removing a tiny allocation that happens in the `formatHex(A, byte)` method. Improvements range from 20-40% on throughput, and some operations allocate less:
>>
>>
>> Name                                Cnt    Base   Error    Test   Error  Unit      Diff%
>> HexFormatBench.appenderLower         15   1,330 ± 0,021   1,065 ± 0,067  us/op     19,9% (p = 0,000*)
>> :gc.alloc.rate                       15  11,481 ± 0,185   0,007 ± 0,000  MB/sec   -99,9% (p = 0,000*)
>> :gc.alloc.rate.norm                  15  16,009 ± 0,000   0,007 ± 0,000  B/op    -100,0% (p = 0,000*)
>> :gc.count                            15   3,000           0,000          counts
>> :gc.time                              3   2,000                          ms
>> HexFormatBench.appenderLowerCached   15   1,317 ± 0,013   1,065 ± 0,054  us/op     19,1% (p = 0,000*)
>> :gc.alloc.rate                       15  11,590 ± 0,111   0,007 ± 0,000  MB/sec   -99,9% (p = 0,000*)
>> :gc.alloc.rate.norm                  15  16,009 ± 0,000   0,007 ± 0,000  B/op    -100,0% (p = 0,000*)
>> :gc.count                            15   3,000           0,000          counts
>> :gc.time                              3   2,000                          ms
>> HexFormatBench.appenderUpper         15   1,330 ± 0,022   1,065 ± 0,036  us/op     19,9% (p = 0,000*)
>> :gc.alloc.rate                       15  34,416 ± 0,559   0,007 ± 0,000  MB/sec  -100,0% (p = 0,000*)
>> :gc.alloc.rate.norm                  15  48,009 ± 0,000   0,007 ± 0,000  B/op    -100,0% (p = 0,000*)
>> :gc.count                            15   0,000           0,000          counts
>> HexFormatBench.appenderUpperCached   15   1,353 ± 0,009   1,033 ± 0,014  us/op     23,6% (p = 0,000*)
>> :gc.alloc.rate                       15  11,284 ± 0,074   0,007 ± 0,000  MB/sec   -99,9% (p = 0,000*)
>> :gc.alloc.rate.norm                  15  16,009 ± 0,000   0,007 ± 0,000  B/op    -100,0% (p = 0,000*)
>> :gc.count                            15   3,000           0,000          counts
>> :gc.time                              3   2,000                          ms
>> HexFormatBench.toHexLower            15   0,198 ± 0,001   0,119 ± 0,008  us/op     40,1% (p = 0,000*)
>> :gc.alloc.rate                       15   0,007 ± 0,000   0,007 ± 0,000  MB/sec    -0,0% (p = 0,816 )
>> :gc.alloc.rate.norm                  15   0,001 ± 0,000   0,001 ± 0,000  B/op     -40,1% (p = 0,000*)
>> :gc....
>
> Claes Redestad has updated the pull request incrementally with one additional commit since the last revision:
>
> Add toHexDigitsByte|Short|Int|Long microbenchmarks
I ran some experiments with a lookup-table approach (based on `HexDigits`), and I can get some of these to be marginally faster when combining the lookup table with the `ByteArray` hack, but there's no win when using either one in isolation. So I think much of the win comes not from the lookup table itself, but from tickling the JIT into inlining more and optimizing a bit more aggressively. This might be a case of microbenchmarks telling us sweet little lies, and we should favor the intuition that lookup tables should be avoided unless absolutely necessary.
I prefer the simplicity of this PR as it stands and think we should backtrack on some of the lookup tables we've recently added in `jdk.internal.util.Hex|Decimal|OctalDigits`.
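
To make the trade-off concrete, here is a minimal sketch of the two styles being compared: a table-driven hex digit and a purely arithmetic one. This is not the actual `HexFormat` or `HexDigits` code; the class name `HexDigitSketch` and its methods are hypothetical and exist only for illustration.

```java
// Hypothetical illustration only; not the JDK implementation.
final class HexDigitSketch {
    // Lookup-table variant: one array load per digit, plus a static table
    // that has to be initialized and kept in cache.
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    public static char toHexDigitTable(int value) {
        return DIGITS[value & 0xF];
    }

    // Arithmetic variant: no table, just a compare and an add on the low nibble.
    public static char toHexDigitArithmetic(int value) {
        int nibble = value & 0xF;
        return (char) (nibble < 10 ? '0' + nibble : 'a' + nibble - 10);
    }

    // Formats one byte as two lowercase hex characters using the arithmetic variant.
    public static String toHex(byte b) {
        return new String(new char[] {
            toHexDigitArithmetic(b >> 4),
            toHexDigitArithmetic(b)
        });
    }
}
```

Both variants return the same digit for every nibble, so any measured difference between them comes down to how the JIT compiles and inlines the surrounding code rather than to the output.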
-------------
PR Comment: https://git.openjdk.org/jdk/pull/15591#issuecomment-1718230389
More information about the core-libs-dev mailing list