RFR: 8332487: Regression in Crypto-AESGCMBench.encrypt (and others) after JDK-8328181

Tobias Hartmann thartmann at openjdk.org
Thu May 30 05:47:01 UTC 2024


On Wed, 29 May 2024 07:49:21 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

> This change reinstates the ClearArray opcode check in match_rule_supported_vector. Its absence caused performance regressions in some worklets in the Renaissance benchmark suite, since it prevented small-sized instance initialization using quadword stores, which showed better performance on non-AVX512 targets.
> 
> Our intent was to avoid code bloat from long sequences of quadword stores when InitArrayShortSize is large, and so prevent side effects on inlining decisions. An existing [benchmark](https://github.com/openjdk/jdk/blob/master/test/micro/org/openjdk/bench/vm/compiler/ClearMemory.java) does not show much performance variation.
> 
> 
> Baseline with -XX:InitArrayShortSize=100000000
> 
> Benchmark                        Mode  Cnt          Score   Error  Units
> ClearMemory.testClearMemory16K  thrpt    2    2695259.360          ops/s
> ClearMemory.testClearMemory1K   thrpt    2   48622330.474          ops/s
> ClearMemory.testClearMemory1M   thrpt    2      79546.779          ops/s
> ClearMemory.testClearMemory24B  thrpt    2  252740278.617          ops/s
> ClearMemory.testClearMemory2K   thrpt    2   24781443.547          ops/s
> ClearMemory.testClearMemory32B  thrpt    2  251588987.342          ops/s
> ClearMemory.testClearMemory32K  thrpt    2    1487427.378          ops/s
> ClearMemory.testClearMemory40B  thrpt    2  213856093.091          ops/s
> ClearMemory.testClearMemory48B  thrpt    2  193701317.101          ops/s
> ClearMemory.testClearMemory4K   thrpt    2   11961450.919          ops/s
> ClearMemory.testClearMemory56B  thrpt    2  169003238.018          ops/s
> ClearMemory.testClearMemory8K   thrpt    2    5871416.239          ops/s
> ClearMemory.testClearMemory8M   thrpt    2      10663.044          ops/s
> 
> 
> With patch and -XX:InitArrayShortSize=100000000
> 
> Benchmark                        Mode  Cnt          Score   Error  Units
> ClearMemory.testClearMemory16K  thrpt    2    3147203.987          ops/s
> ClearMemory.testClearMemory1K   thrpt    2   48225184.981          ops/s
> ClearMemory.testClearMemory1M   thrpt    2      80016.400          ops/s
> ClearMemory.testClearMemory24B  thrpt    2  253904943.981          ops/s
> ClearMemory.testClearMemory2K   thrpt    2   24664594.490          ops/s
> ClearMemory.testClearMemory32B  thrpt    2  255507231.954          ops/s
> ClearMemory.testClearMemory32K  thrpt    2    1636220.531          ops/s
> ClearMemory.testClearMemory40B  thrpt    2  220718255.832          ops/s
> ClearMemory.testClearMemory48B  thrpt    2  196294911.715          ops/s
> ClearMemory.test...

Looks good and trivial.
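
For anyone who wants to put numbers on "not much performance variation", the relative change per size can be worked out from the two quoted tables. The following is only a rough sketch: the class and method names are hypothetical, it covers only the rows visible in both tables (the patched table is truncated), and with Cnt=2 and no error column the differences may well be run-to-run noise.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the JDK or the ClearMemory benchmark.
public class ClearMemoryDelta {

    // Relative change of the patched score over the baseline score, in percent.
    static double percentChange(double baseline, double patched) {
        return (patched - baseline) / baseline * 100.0;
    }

    // Scores in ops/s as {baseline, patched}, copied from the quoted tables.
    static Map<String, double[]> quotedScores() {
        Map<String, double[]> scores = new LinkedHashMap<>();
        scores.put("testClearMemory16K", new double[] {2695259.360, 3147203.987});
        scores.put("testClearMemory1K", new double[] {48622330.474, 48225184.981});
        scores.put("testClearMemory1M", new double[] {79546.779, 80016.400});
        scores.put("testClearMemory24B", new double[] {252740278.617, 253904943.981});
        scores.put("testClearMemory2K", new double[] {24781443.547, 24664594.490});
        scores.put("testClearMemory32B", new double[] {251588987.342, 255507231.954});
        scores.put("testClearMemory32K", new double[] {1487427.378, 1636220.531});
        scores.put("testClearMemory40B", new double[] {213856093.091, 220718255.832});
        scores.put("testClearMemory48B", new double[] {193701317.101, 196294911.715});
        return scores;
    }

    public static void main(String[] args) {
        // Print one line per benchmark with the signed percentage change.
        for (Map.Entry<String, double[]> e : quotedScores().entrySet()) {
            double[] s = e.getValue();
            System.out.printf("%-20s %+.2f%%%n", e.getKey(), percentChange(s[0], s[1]));
        }
    }
}
```

Since throughput is reported in ops/s, a positive value means the patched build scored higher on that row.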

-------------

Marked as reviewed by thartmann (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/19447#pullrequestreview-2087279575

