RFR: 8344009: Improve compiler memory statistics

Fri Feb 14 06:42:09 UTC 2025

On Sat, 8 Feb 2025 06:56:40 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> Greetings,
> 
> This is a rewrite of the Compiler Memory Statistic. The primary new feature is the capability to track allocations by C2 phases. This will allow for a much faster, more thorough analysis of footprint issues. 
> 
> Tracking Arena memory movement is not trivial since one needs to follow the ebb and flow of allocations over nested C2 phases. A phase typically allocates more than it releases, accruing new nodes and resource area. A phase can also release more than allocated when Arenas carried over from other phases go out of scope in this phase. Finally, it can have high temporary peaks that vanish before the phase ends.
> 
> I wanted to track that information correctly and display it clearly in a way that is easy to understand.
> 
> The patch implements per-phase tracking by instrumenting the `TracePhase` stack object (thanks to @rwestrel for this idea).
> 
> The nice thing with this technique is that it also allows for quick analysis of a suspected hot spot (e.g., the inside of a loop): drop a `TracePhase` in there with a descriptive name, and you can see the allocations inside that phase.
> 
> The statistic gives us two new forms of output:
> 
> 1) At the moment the compilation memory *peaked*, we now get a detailed breakdown of that peak usage per phase:
> 
> 
> Arena Usage by Arena Type and compilation phase, at arena usage peak of 58817816:
>     Phase                         Total        ra      node      comp      type     index   reglive  regsplit     cienv     other
>     none                        1205512    155104    982984     33712         0         0         0         0         0     33712
>     parse                      11685376    720016   6578728   1899064         0         0         0         0   1832888    654680
>     optimizer                    916584         0    556416         0         0         0         0         0         0    360168
>     escapeAnalysis              1983400         0   1276392    707008         0         0         0         0         0         0
>     connectionGraph              720016         0         0    621832         0         0         0         0     98184         0
>     macroEliminate               196448         0    196448         0         0         0         0         0         0         0
>     iterGVN                      327440         0    196368    131072         0         0         0         0         0         0
>     incrementalInline           3992816         0   3043704    621832         0         0         0         0    261824...

Some additional technical information about how this statistic works:

The JVM informs the statistics about the following events:

A) When a compilation starts.

B) When a compilation ends.

C) When a new compilation phase starts. That can happen in nested form.

D) When a compilation phase ends.

E) Whenever an arena grows a new chunk (regardless of whether this was a cached chunk from the chunk pool or a newly allocated chunk).

F) When an arena sheds chunks - either by rolling back to a previous ResourceMark or because the arena itself gets deleted.

During compilation (between (A) and (B)), we keep the statistic state for this compilation in an `ArenaStatCounter` object that is attached to the current compiler thread.

When a new compilation phase starts (C), we push the phase info onto a `PhaseInfoStack`. When a phase ends (D), we pop that information.
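
The push/pop pairing can be sketched with a scoped object, in the spirit of instrumenting `TracePhase`. This is only a minimal illustration: `PhaseInfo`, `PhaseStack` and `ScopedPhase` are hypothetical names made up for this sketch and are not the classes from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-phase information kept on the stack.
struct PhaseInfo {
    int seq_no;        // number of this phase invocation
    std::string name;  // phase name, e.g. "optimizer"
};

// Hypothetical, simplified phase stack (not the real PhaseInfoStack).
class PhaseStack {
    std::vector<PhaseInfo> _stack;
    int _next_seq = 0;
public:
    void push(const std::string& name) { _stack.push_back({_next_seq++, name}); }
    void pop() { assert(!_stack.empty()); _stack.pop_back(); }
    const PhaseInfo& top() const { assert(!_stack.empty()); return _stack.back(); }
    std::size_t depth() const { return _stack.size(); }
};

// Scoped marker: construction models event (C), destruction models event (D).
class ScopedPhase {
    PhaseStack& _stack;
public:
    ScopedPhase(PhaseStack& s, const std::string& name) : _stack(s) { _stack.push(name); }
    ~ScopedPhase() { _stack.pop(); }
    ScopedPhase(const ScopedPhase&) = delete;
    ScopedPhase& operator=(const ScopedPhase&) = delete;
};
```

Because the marker is a stack object, nested phases unwind in the right order automatically, including on early returns.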

When we are informed of a new chunk allocation (E), we:
  - Set a stamp in the chunk header to mark it as being owned by this phase and this arena type
  - Adjust, in the `ArenaStatCounter` object, the global counters and the counters in a two-dimensional table (`ArenaCounterTable`) that keeps counters per arena tag and compilation phase
  - Take a snapshot of all counters as the peak state if total memory consumption for this compilation reaches a new peak
  - Handle `MemLimit` violations: if `-XX:CompileCommand=memlimit...` was enabled and the total footprint of the compilation surpasses that limit, we either end the JVM with a fatal error or bail out of the compilation, depending on the sub-option given to the command

When informed of a chunk deletion (F), we:
  - Extract the stamp from the chunk header to learn which phase and arena type the deallocation should be charged to
  - Adjust the counters for that phase and arena type in the `ArenaCounterTable`
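
The bookkeeping for (E) and (F) can be sketched roughly as follows. All names here (`ChunkStamp`, `Chunk`, `StatCounter`, the table dimensions) are assumptions made for illustration and do not mirror the real `ArenaStatCounter` or `ArenaCounterTable`. The `MemLimit` handling is left out, and copying the whole table on every new peak is a simplification.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr int kNumPhases = 4;     // hypothetical number of phases
constexpr int kNumArenaTags = 3;  // hypothetical number of arena types

// Hypothetical stamp stored in the chunk header at allocation time.
struct ChunkStamp { int phase; int arena_tag; };
struct Chunk { ChunkStamp stamp; std::size_t size; };

using CounterTable = std::array<std::array<std::size_t, kNumArenaTags>, kNumPhases>;

class StatCounter {
    CounterTable _current{};   // live bytes per (phase, arena type)
    CounterTable _at_peak{};   // snapshot of _current taken at the peak
    std::size_t _total = 0;
    std::size_t _peak = 0;
public:
    // Event (E): stamp the chunk, bump the counters, snapshot on a new peak.
    void on_chunk_alloc(Chunk& c, int phase, int tag) {
        assert(phase >= 0 && phase < kNumPhases && tag >= 0 && tag < kNumArenaTags);
        c.stamp = {phase, tag};
        _current[phase][tag] += c.size;
        _total += c.size;
        if (_total > _peak) {
            _peak = _total;
            _at_peak = _current;
        }
    }
    // Event (F): the stamp tells us which cell the release is charged to.
    void on_chunk_free(const Chunk& c) {
        std::size_t& cell = _current[c.stamp.phase][c.stamp.arena_tag];
        assert(cell >= c.size);
        cell -= c.size;
        _total -= c.size;
    }
    std::size_t total() const { return _total; }
    std::size_t peak() const { return _peak; }
    std::size_t current(int phase, int tag) const { return _current[phase][tag]; }
    std::size_t at_peak(int phase, int tag) const { return _at_peak[phase][tag]; }
};
```

Stamping the chunk at allocation time is what makes the release side cheap: no lookup is needed to find the owning phase, even if the chunk is freed long after that phase has ended.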

When a compilation phase ends (D), we adjust the "footprint timeline". The footprint timeline (`FootprintTimeline`) is a one-dimensional buffer of (phase info, counter) tuples. It represents the "flattened out" form of the phase invocation tree: an invocation of a child phase nested in a parent phase "interrupts" the parent phase, and when the child phase ends, the parent phase is "restarted" as a new entry in the timeline.

For example, let's say we execute phase "optimizer", and inside that, call the phase "iterGVN" and then "incrementalInline". Between these two phases, we allocate 1 MB from the resource area. The invocation tree looks like this:

> optimizer  1024 KB
     > iterGVN  1032 KB
< optimizer (cont.) 1032 KB + 1MB resource arena
     > incrementalInline 1032 KB + 1MB resource arena
< optimizer (cont.) 1032 KB + 1MB resource arena

The flattened-out footprint timeline will look somewhat like this:

Phase Sequence Number | Phase Name       | Footprint
5                       optimizer          1024 KB
6                       iterGVN            1032 KB
5                       optimizer          1032 KB + 1MB
7                       incrementalInline  1032 KB + 1MB
5                       optimizer          1032 KB + 1MB
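
One way to produce such a flattened sequence is sketched below. It is a simplified model under stated assumptions, not the real `FootprintTimeline`: the class `Timeline`, its methods, and the convention that each entry records the footprint at the moment the entry is closed are all choices made for this illustration. Footprints are passed in as plain KB values, and the starting footprint of 1000 KB is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One timeline entry: a (phase info, counter) tuple.
struct TimelineEntry {
    int seq_no;
    std::string name;
    std::size_t footprint_kb;
};

// Hypothetical, simplified model of a footprint timeline.
class Timeline {
    struct Phase { int seq_no; std::string name; };
    std::vector<Phase> _stack;            // currently active (nested) phases
    std::vector<TimelineEntry> _entries;  // flattened timeline
    int _next_seq;
public:
    explicit Timeline(int first_seq_no) : _next_seq(first_seq_no) {}

    // A child phase starts: close the parent's open entry, open one for the child.
    void phase_start(const std::string& name, std::size_t footprint_kb) {
        if (!_stack.empty()) {
            _entries.back().footprint_kb = footprint_kb;
        }
        _stack.push_back({_next_seq++, name});
        _entries.push_back({_stack.back().seq_no, name, footprint_kb});
    }

    // A phase ends: close its entry and "restart" the parent as a new entry.
    void phase_end(std::size_t footprint_kb) {
        assert(!_stack.empty());
        _entries.back().footprint_kb = footprint_kb;
        _stack.pop_back();
        if (!_stack.empty()) {
            _entries.push_back({_stack.back().seq_no, _stack.back().name, footprint_kb});
        }
    }

    const std::vector<TimelineEntry>& entries() const { return _entries; }
};

// Replays the example above; 1032 KB + 1 MB is written as 2056 KB.
std::vector<TimelineEntry> replay_example() {
    Timeline t(5);
    t.phase_start("optimizer", 1000);          // assumed starting footprint
    t.phase_start("iterGVN", 1024);
    t.phase_end(1032);                         // iterGVN ends
    t.phase_start("incrementalInline", 2056);  // after the 1 MB allocation
    t.phase_end(2056);                         // incrementalInline ends
    t.phase_end(2056);                         // optimizer ends
    return t.entries();
}
```

Replaying the example yields five entries whose sequence numbers run 5, 6, 5, 7, 5, which is the same shape as the table above: the parent keeps its sequence number each time it is restarted.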

Finally, when the compilation ends, we print out the statistic for it (if the suboption `print` was given with `-XX:CompileCommand=memstat`). We also save a copy of the counters to a global table that contains the N most expensive compilations. That table will be printed when one uses `jcmd <pid> Compiler.memory`. We also print it into the hs-err file.
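
A table of the N most expensive compilations could be kept along these lines. This is a hedged sketch only: `CompilationRecord` and `TopNTable` are hypothetical names, ranking by peak bytes is an assumption, and a real global table shared between compiler threads would also need synchronization, which is omitted here.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of one finished compilation.
struct CompilationRecord {
    std::string method;      // name of the compiled method
    std::size_t peak_bytes;  // peak arena usage of that compilation
};

// Keeps at most `capacity` records, sorted by peak usage, largest first.
class TopNTable {
    std::size_t _capacity;
    std::vector<CompilationRecord> _records;
public:
    explicit TopNTable(std::size_t capacity) : _capacity(capacity) {}

    void add(const CompilationRecord& r) {
        // Find the first record with a smaller peak and insert before it.
        auto pos = std::upper_bound(
            _records.begin(), _records.end(), r,
            [](const CompilationRecord& a, const CompilationRecord& b) {
                return a.peak_bytes > b.peak_bytes;
            });
        _records.insert(pos, r);
        if (_records.size() > _capacity) {
            _records.pop_back();  // drop the cheapest compilation
        }
    }

    const std::vector<CompilationRecord>& records() const { return _records; }
};
```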

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23530#issuecomment-2658400920