RFR: 8322692: ZGC: avoid over-unrolling due to hidden barrier size

Quan Anh Mai qamai at openjdk.org
Fri Jan 12 10:56:17 UTC 2024


On Thu, 11 Jan 2024 08:47:41 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:

> This changeset refines the C2 loop unrolling heuristic by including an estimation of the final size of (Generational) ZGC barriers in the loop size computation. These barriers are not exposed in C2's intermediate representation and are thus currently ignored by the heuristic, which can lead to over-unrolling.
> 
> #### Testing
> 
> - tier1-5, stress test, fuzzing (windows-x64, linux-x64, linux-aarch64, macosx-x64, macosx-aarch64).
> - tier6-9 (windows-x64, linux-x64, linux-aarch64, macosx-x64, macosx-aarch64, ZGC-specific tests only).
> 
> #### Performance and code size evaluation
> 
> - DaCapo, SPECjvm2008, SPECjbb2015 (linux-x64 with `-XX:+UseZGC -XX:+ZGenerational`). The changeset slightly reduces the size of the C2-generated code (around 0.5% fewer bytes per compiled bytecode for the `fop` and `luindex` DaCapo benchmarks) and has no significant overall performance effect.

src/hotspot/share/gc/z/c2/zBarrierSetC2.cpp line 334:

> 332:   // seven more nodes (CallLeaf, control Proj, memory Proj, data Proj, Region,
> 333:   // memory Phi, data Phi).
> 334:   return uncolor_or_color_size + 12;

I thought the runtime call does not lie inside the loop. Is it necessary to take its nodes into account, too?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17367#discussion_r1448602659

