Pause reported in MXBean

Per Liden per.liden at oracle.com
Fri Dec 11 10:58:38 UTC 2020


Hi Pedro,

On 12/7/20 12:29 PM, Viton, Pedro (Nokia - ES/Madrid) wrote:
> Hi:
> 
> A couple of years ago, I was trying out ZGC on Java 11.
> Now, I've retested it for our Java Application on 15.0.1.
> 
> What I've noticed now, is that I get the impression that the data reported in MXBeans is the total time ZGC is running and using CPU, both the pause times and the concurrent times.

That's correct. This changed with JDK-8240679. The problem is of course 
that the GarbageCollectorMXBean wasn't really designed to convey 
information from a concurrent GC, so whatever model we pick some 
information will be inaccurate or missing.

> For instance, as seen from the CLI of our Application (that reads the data from MXBeans):
> 
> [CMD] java gc stats
> 
> Collector  Count  Time(ms)
> ---------  -----  --------
> ZGC          702    415291
> 
> 
> However, if my memory is not too bad, I would swear that the time reported back in Java 11, was only the pause time, and didn't include the time of the concurrent phases.
> Somehow, that's also the behavior of G1 and old CMS GC, that in MXBeans, they only report the pause times.
> 
> I suppose that at this stage, it is not possible to change current behavior, as it could affect other people and would break backwards compatibility.
> But, alternatively, I was wondering if something could be done similar to what Shenandoah does; report 2 instances in MXBeans: pause counts & times and cycles count & times
> 
> policy-luna38> java gc stats
> 103 Multi-line response follows.
> Collector          Count  Time(ms)
> -----------------  -----  --------
> Shenandoah Pauses     20        21
> Shenandoah Cycles      5      1471
> 100 Ok.
> 
> policy-luna38> java gc force
> 100 Ok.
> 
> policy-luna38> java gc stats
> 103 Multi-line response follows.
> Collector          Count  Time(ms)
> -----------------  -----  --------
> Shenandoah Pauses     24        26
> Shenandoah Cycles      6      1769
> 100 Ok.
> 
> Though CPU time is important, what people are normally concerned about is the pause times, and avg time per pause (specially people who have traditionally used G1 and CMS)

Reporting both could be an alternative going forward. It's perhaps not a 
super great fit either, but might still be an acceptable options. There 
has also been some loose discussions about creating a new type of 
MXBean, which would be more suitable to convey information from a 
concurrent GC. But no work has been done on that so far.

Also worth noting is that starting with JDK 16, ZGC pauses will execute 
in constant time (MarkStart and RelocateStart) or be bound in time to 
200us (MarkEnd), so pause time information from ZGC is getting less 
interesting, as they are so short.

cheers,
Per

> 
> Thanks a lot,
> 	Pedro
> 
> 
>    
> 
> -----Original Message-----
> From: Viton, Pedro (Nokia - ES/Madrid) <pedro.viton at nokia.com>
> Sent: Wednesday, September 26, 2018 3:13 PM
> To: Per Liden <per.liden at oracle.com>
> Cc: zgc-dev at openjdk.java.net
> Subject: RE: Inclusion of pause time in gc.log
> 
> Hi:
> 
> Adding the pause time in the way you indicate:
>      1.302ms/0.987ms/1.211ms
> Is even better than my initial suggestion.
> 
> Thanks,
> 	Pedro
> 
> PEDRO VITON
> NOKIA
> AAA Specialist
> C/ Maria Tubau 9 (28050 Madrid - SPAIN)
> M: +34 690 964740 (ext. 2411 5746)
> pedro.viton at nokia.com
> 
> -----Original Message-----
> From: Per Liden <per.liden at oracle.com>
> Sent: Wednesday, September 26, 2018 3:04 PM
> To: Viton, Pedro (Nokia - ES/Madrid) <pedro.viton at nokia.com>; zgc-dev at openjdk.java.net
> Subject: Re: Inclusion of pause time in gc.log
> 
> Hi,
> 
> On 09/25/2018 02:41 PM, Viton, Pedro (Nokia - ES/Madrid) wrote:
>> Hi:
>>
>> I've downloaded the latest Java11 Early Access to try out ZGC.
>> I have to admit I'm really impressed with its reduced pause time, that
>> I measure with MXBeans
>>
>> I also like that the log line (included in gc.log, with default
>> logging enabled) indicates the % of the heap before and after the
>> collection [26.814s][info][gc] GC(0) Garbage Collection (Metadata GC
>> Threshold) 168M(1%)->78M(1%) [28.719s][info][gc] GC(1) Garbage
>> Collection (Metadata GC Threshold) 212M(2%)->94M(1%)
>> [37.136s][info][gc] GC(2) Garbage Collection (Warmup)
>> 3764M(31%)->1606M(13%) [45.807s][info][gc] GC(3) Garbage Collection
>> (Allocation Rate) 6468M(53%)->2530M(21%) [48.553s][info][gc] GC(4)
>> Garbage Collection (System.gc()) 2530M(21%)->1640M(13%)
>> [114.713s][info][gc] GC(5) Garbage Collection (Allocation Rate)
>> 6454M(53%)->2198M(18%) [165.234s][info][gc] GC(6) Garbage Collection
>> (Allocation Rate) 8496M(69%)->3190M(26%)
>>
>> I just have 1 suggestion:
>> Would it be possible to also include in that log line, the pause time of the collection (as the addition of the 3 pauses for each collection)?
> 
> We've discussed various ways of showing this, for example like this
> 
> "Garbage Collection (Allocation Rate) 8496M(69%)->3190M(26%), 1.302ms/0.987ms/1.211ms"
> 
> where those times would correspond the MarkStart/MarkEnd/RelocateStart pause times. It's a bit more to it though, since there are potentially more than one MarkEnd pause it's not obvious what to show there. The longest of the MarkEnd pauses is probably what most people want to see.
> 
> cheers,
> Per
> 
>>
>> Thanks,
>>                   Pedro
>>
>> PEDRO VITON
>> NOKIA
>> AAA Specialist
>> C/ Maria Tubau 9 (28050 Madrid - SPAIN)
>> M: +34 690 964740 (ext. 2411 5746)
>> pedro.viton at nokia.com<mailto:pedro.viton at alcatel-lucent.com>
>>


More information about the zgc-dev mailing list