[jdk17u-dev] RFR: 8304976: Optimize DateTimeFormatterBuilder.ZoneTextPrinterParser.getTree()

Thu Jun 1 11:32:06 UTC 2023

On Thu, 1 Jun 2023 09:02:44 GMT, Oli Gillespie <ogillespie at openjdk.org> wrote:

> Hi all,
> 
> This pull request contains a backport of commit [438c969b](https://github.com/openjdk/jdk/commit/438c969b7b07eeef0158b089e5a168849e04bf56) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository.
> 
> The commit being backported was authored by Sergey Tsypanov on 29 Mar 2023 and was reviewed by Naoto Sato.
> 
> Since the original uses the new `newHashMap` and `newHashSet` methods from https://bugs.openjdk.org/browse/JDK-8284780 which are not here in 17, I've not included those changes in the backport. They are not a significant part of the original fix, the main point of the change is kept which is avoiding the creation of 2 new hash sets on the cached path. Running the included benchmark confirms the performance improvement even with the omissions:
> 
> Base:
> 
> Benchmark                                                  Mode  Cnt      Score     Error   Units
> ZonedDateTimeFormatterBenchmark.parse                      avgt   20  18248.082 ? 166.440   ns/op
> ZonedDateTimeFormatterBenchmark.parse:?gc.alloc.rate       avgt   20   1274.964 ?  11.448  MB/sec
> ZonedDateTimeFormatterBenchmark.parse:?gc.alloc.rate.norm  avgt   20  24400.418 ?   1.591    B/op
> ZonedDateTimeFormatterBenchmark.parse:?gc.count            avgt   20     44.000            counts
> ZonedDateTimeFormatterBenchmark.parse:?gc.time             avgt   20     35.000                ms
> 
> 
> Patch:
> 
> Benchmark                                                  Mode  Cnt    Score   Error   Units
> ZonedDateTimeFormatterBenchmark.parse                      avgt   20  923.943 ? 8.737   ns/op
> ZonedDateTimeFormatterBenchmark.parse:?gc.alloc.rate       avgt   20  883.450 ? 8.163  MB/sec
> ZonedDateTimeFormatterBenchmark.parse:?gc.alloc.rate.norm  avgt   20  856.021 ? 0.079    B/op
> ZonedDateTimeFormatterBenchmark.parse:?gc.count            avgt   20   32.000          counts
> ZonedDateTimeFormatterBenchmark.parse:?gc.time             avgt   20   27.000              ms

Thanks, yes `2` seems best for performance if we take it as correct that this usually has only one value (I have nothing to back that up except the comment in the original change), and we're happy to rely on knowledge of the default load factor.

Do you have a preference for:

- Use 2 with no comment
- Use 2 with comment
- Use something like `1/0.75f + 1` which is semi self-documenting

Judging from https://bugs.openjdk.org/browse/JDK-8186958, load factor is rarely considered when setting initial capacity in existing code. Which to me makes sense, because it's a private detail of the HashMap implementation, hence the new methods which handle those internals safely.

(I still prefer not specifying, but I'm happy to add whatever you think is best)

-------------

PR Comment: https://git.openjdk.org/jdk17u-dev/pull/1414#issuecomment-1571875301