RFR: 8348858: [leyden] Bump the default code buffer sizes to store more generated code

Aleksey Shipilev shade at openjdk.org
Tue Jan 28 14:00:23 UTC 2025


On Tue, 28 Jan 2025 13:55:13 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:

> Due to a current prototype limitation, we cannot yet store generated code whose code buffer has been expanded. I tried to address that directly, but I think relocations disagree with the whole thing, so this implementation limitation stays for a bit longer. I changed the log level of the corresponding bailout from `info` to `warning`.
> 
> On `JavacBenchApp 50`, this causes us to lose 700 (!) C2-compiled methods from the SCC! We can dodge a significant part of the hit by bumping the default code buffer sizes, which makes buffers less likely to require resizing and thus allows more code to be stored in the SCC. This also needs [JDK-8348855](https://bugs.openjdk.org/browse/JDK-8348855) from mainline to work well with G1. The current PR includes it, and I will upstream it separately first.
> 
> Additional testing:
>  - [x] Linux x86_64 server fastdebug, `runtime/cds`

Performance tests show that, realistically, this mostly hits G1, due to more barriers and thus more generated code, as well as a broken stub size estimate.

G1:


# --- 32 CPUs (C2 runs mostly in background, look at "User" time)

# Before
  Time (mean ± σ):     479.8 ms ±   4.5 ms    [User: 2325.9 ms, System: 205.2 ms]
  Range (min … max):   475.2 ms … 489.4 ms    10 runs

# G1 barrier size fix
  Time (mean ± σ):     494.3 ms ±   9.0 ms    [User: 1748.9 ms, System: 208.1 ms]
  Range (min … max):   480.9 ms … 508.8 ms    10 runs

# G1 barrier size fix + adjusting sizes
  Time (mean ± σ):     472.4 ms ±   5.6 ms    [User: 876.7 ms, System: 149.5 ms]
  Range (min … max):   461.2 ms … 480.2 ms    10 runs

# --- 2 CPUs (C2 interferes with the workload more directly)

# Before
  Time (mean ± σ):     620.9 ms ±  16.9 ms    [User: 1029.8 ms, System: 120.1 ms]
  Range (min … max):   598.0 ms … 650.7 ms    10 runs

# G1 barrier size fix
  Time (mean ± σ):     591.2 ms ±  15.2 ms    [User: 960.6 ms, System: 128.9 ms]
  Range (min … max):   565.1 ms … 608.8 ms    10 runs

# G1 barrier size fix + adjusting sizes
  Time (mean ± σ):     553.1 ms ±   9.5 ms    [User: 871.9 ms, System: 127.4 ms]
  Range (min … max):   539.8 ms … 572.6 ms    10 runs


Parallel:



# --- 32 CPUs (C2 runs mostly in background, look at "User" time)

# Before
  Time (mean ± σ):     452.6 ms ±   5.6 ms    [User: 987.1 ms, System: 167.9 ms]
  Range (min … max):   444.5 ms … 459.8 ms    10 runs

# G1 barrier size fix + adjusting sizes
  Time (mean ± σ):     450.3 ms ±   3.9 ms    [User: 966.3 ms, System: 162.3 ms]
  Range (min … max):   445.5 ms … 458.9 ms    10 runs

# --- 2 CPUs (C2 interferes with the workload more directly)

# Before
  Time (mean ± σ):     540.1 ms ±  13.7 ms    [User: 835.2 ms, System: 138.9 ms]
  Range (min … max):   519.9 ms … 561.0 ms    10 runs

# G1 barrier size fix + adjusting sizes
  Time (mean ± σ):     530.7 ms ±  14.1 ms    [User: 828.5 ms, System: 134.5 ms]
  Range (min … max):   509.5 ms … 552.4 ms    10 runs
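
For a quick read of the numbers above, the relative change in mean "User" time against each "Before" run can be computed as in the sketch below. The figures are copied from the runs above; the dictionary layout and the helper names (`USER_MS`, `pct_change`, `summarize`) are made up for illustration and are not part of the PR or of any benchmarking tool.

```python
# Mean "User" times in ms, copied from the benchmark output above.
# The data layout and all names here are illustrative assumptions.
USER_MS = {
    ("G1", 32): {
        "before": 2325.9,
        "barrier fix": 1748.9,
        "barrier fix + sizes": 876.7,
    },
    ("G1", 2): {
        "before": 1029.8,
        "barrier fix": 960.6,
        "barrier fix + sizes": 871.9,
    },
    ("Parallel", 32): {"before": 987.1, "barrier fix + sizes": 966.3},
    ("Parallel", 2): {"before": 835.2, "barrier fix + sizes": 828.5},
}


def pct_change(before, after):
    """Change of `after` relative to `before`, in percent (negative means less CPU time)."""
    return (after - before) / before * 100.0


def summarize(user_ms):
    """Return (gc, cpus, label, percent change) for every non-baseline run."""
    rows = []
    for (gc, cpus), runs in user_ms.items():
        base = runs["before"]
        for label, value in runs.items():
            if label == "before":
                continue
            rows.append((gc, cpus, label, pct_change(base, value)))
    return rows


if __name__ == "__main__":
    for gc, cpus, label, delta in summarize(USER_MS):
        print(f"{gc:8s} {cpus:2d} CPUs  {label:20s} {delta:+6.1f}% user time")
```

This only compares the reported means; with 10 runs per configuration, the small Parallel differences are likely within the noise shown by the ± σ figures.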

-------------

PR Comment: https://git.openjdk.org/leyden/pull/28#issuecomment-2619077625

