RFR: JDK-8297958: NMT: Display peak values [v2]

Mon Dec 5 20:08:34 UTC 2022

On Sat, 3 Dec 2022 12:25:28 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> Historical peak values can be very useful in analyzing memory footprint. 
>> 
>> For example, want to know how much memory the compiler needs during warmup? You have to get an NMT report at exactly the right time, when the compile arenas are at their largest combined extent. But if we had peak values, you'd see how much the compiler needed by just looking at its arena peak size.
>> 
>> We already collect peak values in debug builds, but never actually display them. Since we already pay for them, we might just as well print them.
>> 
>> There is also a small issue that peak size and count are updated separately. It makes little sense to treat these as independent values. Therefore this patch changes the meaning and implementation of peak count from today's "highest count" to "count at the point peak size was reached". 
>> 
>> What this looks like:
>> 
>> 
>> -                        GC (reserved=425748KB, committed=94868KB)
>>                             (malloc=38552KB #1086) (peak=38622KB #1409)   <<<<
>>                             (mmap: reserved=387196KB, committed=56316KB) 
>>  
>> -                 GCCardSet (reserved=29KB, committed=29KB)
>>                             (malloc=29KB #387) 
>>  
>> -                  Compiler (reserved=200KB, committed=200KB)
>>                             (malloc=36KB #59) (peak=37KB #74)    <<<<
>>                             (arena=165KB #5) (peak=6192KB #18)  <<<<
>> 
>> 
>> 
>> [0x00007f7e6b0866b0] Arena::grow(unsigned long, AllocFailStrategy::AllocFailEnum)+0x40
>> [0x00007f7e6c117a29] OopMap::OopMap(int, int)+0x69
>> [0x00007f7e6b2857a5] LinearScan::compute_oop_map(IntervalWalker*, LIR_Op*, CodeEmitInfo*, bool)+0x85
>> [0x00007f7e6b285f2a] LinearScan::compute_oop_map(IntervalWalker*, LIR_OpVisitState const&, LIR_Op*)+0x5a
>>                              (malloc=32KB type=Arena Chunk #1) (peak=288KB #9) <<<<
>> 
>> 
>> Notes:
>> 
>> - This RFE just adds peak to the *debug* VM. In debug, we already have all values; it's just a matter of printing them. I would love to see peak values in release builds too, but collecting them does cost one or more CAS per malloc and therefore we must analyze performance before enabling. Having them in release would also remove the remaining #ifdef ASSERT.
>> 
>> - I omit printing peak values when we are at peak. So if "peak" is missing, current peak is implied.
>> 
>> - I only print peak values in summary and detail mode, not in any of the diff modes, to keep code complexity low and because diff modes are more about baseline compare.
>> 
>> - In detail mode, there is a small display issue: call sites that have no current allocations will be omitted. A hypothetical call site that allocated a zillion bytes, then freed them all, will not be shown even though its peak value would be interesting. That is an issue for another RFE.
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
> 
>   small fix

Important addition to the memory stats in my opinion. I agree having it in release builds would be great.

> I omit printing peak values when we are at peak. So if "peak" is missing, current peak is implied.

This can be confusing for readers who are not aware of this behavior; they may think that the "peak" value is missing from the output. I believe it would be less confusing if "peak" were always displayed.
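
To make the two display options concrete, here is a minimal sketch. It is not the NMT code from the PR: PeakCounter, format_line and the always_show_peak parameter are hypothetical names invented for illustration. It assumes a counter that raises its peak size with a CAS loop and records the count at the point the peak size was reached, as the description above outlines, and it formats one malloc line either with "peak" omitted while at peak or with "peak" always shown.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical counter for illustration only; not HotSpot's implementation.
struct PeakCounter {
  std::atomic<size_t> count{0};       // current number of live allocations
  std::atomic<size_t> size{0};        // current live bytes
  std::atomic<size_t> peak_size{0};   // highest value 'size' has reached
  std::atomic<size_t> peak_count{0};  // 'count' when peak_size was last raised

  void allocate(size_t sz) {
    size_t cnt = count.fetch_add(1) + 1;
    size_t sum = size.fetch_add(sz) + sz;
    // Raise the peak with a CAS loop. On failure, compare_exchange_weak
    // reloads 'peak', so the loop exits once another thread has published
    // a peak that is at least as large as 'sum'.
    size_t peak = peak_size.load();
    while (sum > peak) {
      if (peak_size.compare_exchange_weak(peak, sum)) {
        // Stored separately from peak_size, so a concurrent reader may
        // briefly see a new peak size paired with an old peak count.
        peak_count.store(cnt);
        break;
      }
    }
  }

  void deallocate(size_t sz) {
    assert(count.load() > 0 && size.load() >= sz);
    count.fetch_sub(1);
    size.fetch_sub(sz);
  }
};

// Formats one malloc line. With always_show_peak == false, the peak is
// omitted while the counter is at its peak; with true, it is always printed.
std::string format_line(const PeakCounter& c, bool always_show_peak) {
  const size_t K = 1024;
  size_t sz = c.size.load();
  size_t pk = c.peak_size.load();
  char buf[128];
  if (!always_show_peak && sz == pk) {
    std::snprintf(buf, sizeof(buf), "(malloc=%zuKB #%zu)",
                  sz / K, c.count.load());
  } else {
    std::snprintf(buf, sizeof(buf), "(malloc=%zuKB #%zu) (peak=%zuKB #%zu)",
                  sz / K, c.count.load(), pk / K, c.peak_count.load());
  }
  return std::string(buf);
}
```

With always_show_peak set to true, a reader never has to infer whether a missing "peak" means "at peak" or "not tracked", at the cost of a redundant field on lines that are currently at their peak.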

-------------

PR: https://git.openjdk.org/jdk/pull/11497