MMU drops suddenly

Per Liden per.liden at oracle.com
Mon Nov 18 08:43:14 UTC 2019


Hi,

On 11/15/19 7:55 PM, Sundara Mohan M wrote:
> Hi,
>      Have noticed following in gc log
> [2019-11-13T19:24:13.095+0000][69629.984s][info][gc,mmu      ] GC(10952)
> MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, 100ms/90.8%
> [2019-11-13T20:12:55.339+0000][72552.228s][info][gc,mmu      ] GC(11441)
> MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, *100ms/90.8%*
> [2019-11-13T21:00:53.415+0000][75430.304s][info][gc,mmu      ] GC(11927)
> MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, *100ms/70.7%*
> [2019-11-13T21:52:46.244+0000][78543.133s][info][gc,mmu      ] GC(12450)
> MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7%
> [2019-11-13T22:40:35.887+0000][81412.776s][info][gc,mmu      ] GC(12946)
> MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7%
> [2019-11-13T23:27:23.807+0000][84220.696s][info][gc,mmu      ] GC(13410)
> MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/0.0%, *100ms/43.0%*
> 
> I was trying to understand what it means, and here is my understanding: this
> says the minimum amount of CPU available to mutator threads in the last X ms.
> 1. Is this correct?

Not quite. The MMU (Minimum Mutator Utilization) printout tells you the 
minimum percentage of time Java threads could execute within each of the 
listed time windows. Note that it's the worst case since the VM started. 
For example, 10ms/23.7% means there has been at least one 10ms window 
where Java threads could only execute for 23.7% of that time (2.37ms).
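
If it helps to see the idea in code, here's a rough sketch of how an MMU 
value for a single window size could be computed from a list of 
stop-the-world pause intervals. This is not ZGC's actual implementation, 
just an illustration (the class name and example pauses are made up, and 
it assumes a JDK with records):

    import java.util.List;

    class MmuSketch {
        // A stop-the-world pause, start/end in ms since VM start.
        record Pause(double start, double end) {}

        // Mutator utilization of the window [winStart, winStart + windowMs):
        // 1 minus the fraction of the window covered by pauses.
        static double utilization(List<Pause> pauses, double winStart,
                                  double windowMs) {
            double stw = 0;
            for (Pause p : pauses) {
                double overlap = Math.min(p.end(), winStart + windowMs)
                               - Math.max(p.start(), winStart);
                if (overlap > 0) stw += overlap;
            }
            return 1.0 - stw / windowMs;
        }

        // Minimum utilization over all windows of the given size. The worst
        // window is always aligned to a pause boundary, so it's enough to
        // check windows starting at a pause start or ending at a pause end.
        static double mmu(List<Pause> pauses, double windowMs) {
            double worst = 1.0;
            for (Pause p : pauses) {
                worst = Math.min(worst,
                        utilization(pauses, p.start(), windowMs));
                worst = Math.min(worst,
                        utilization(pauses, p.end() - windowMs, windowMs));
            }
            return worst;
        }

        public static void main(String[] args) {
            // Two 3ms pauses close together: a 10ms window covering both
            // leaves only 4ms for Java threads, i.e. MMU: 10ms/40.0%.
            List<Pause> pauses = List.of(new Pause(0, 3), new Pause(7, 10));
            System.out.printf("MMU: 10ms/%.1f%%%n", mmu(pauses, 10) * 100);
        }
    }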

> 2. Why is this suddenly dropping (100ms: 90% -> 40%)? Also, for the other
> time windows it is 0%; does that mean my application doesn't get a chance to run?

Right, 2ms/0.0% means there was at least one 2ms window where the Java 
threads didn't get a chance to run at all.

> Also I see it never goes back to a higher value.

Correct. Since it shows the worst case since the VM started, it will 
never go back to a higher value.

> 3. Does this measure indicate something good or bad?

In general, 0% is bad and 100% is good. Exactly which time window you're 
interested in depends on your response time requirements. A simplified 
example to show the principle: assume a request takes 5ms to process in 
your application, and you have a response time requirement of 10ms. Then 
10ms/60% would be good (60% of 10ms leaves 6ms of mutator time, enough 
for the 5ms of work), but 10ms/40% would not be good enough (only 4ms).
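
The same check as a tiny sketch (class name and numbers are just the 
ones from the example above):

    class MmuBudget {
        public static void main(String[] args) {
            double windowMs = 10.0;  // response time requirement
            double workMs = 5.0;     // time needed to process one request
            for (double mmu : new double[] { 0.60, 0.40 }) {
                double availableMs = windowMs * mmu;  // guaranteed mutator time
                System.out.printf("10ms/%.0f%%: %.0fms available -> %s%n",
                        mmu * 100, availableMs,
                        availableMs >= workMs ? "good" : "not good enough");
            }
        }
    }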

> 4. If this is bad, what should I look at further to get more insight?

Look at the GC pauses: how long they are and how far apart they are. The 
GC statistics printed by gc+stats show you where you're spending time in 
pauses. If the GC pauses are long, then ZGC is likely starved on CPU. If 
the GC pauses are close to each other, then ZGC is likely doing 
back-to-back GCs and needs more heap to work with.
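
For example, something like this enables the pause, MMU and statistics 
logging (MyApp is a placeholder; on JDKs where ZGC is no longer 
experimental you can drop -XX:+UnlockExperimentalVMOptions):

    java -XX:+UnlockExperimentalVMOptions -XX:+UseZGC \
         -Xlog:gc,gc+mmu,gc+stats MyApp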

cheers,
Per

