Odd G1GC behavior on 8u91

Thu Aug 25 01:18:10 UTC 2016

Vitaly,

Thanks for the gc logs.

Before reaching full gc and the entry at

2016-08-24T15:29:12.302+0000: 17776.029: [GC pause (G1 Evacuation Pause) 
(young)

...

[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 
95.2G(96.0G)->95.2G(96.0G)]

There are several young gcs with 'to-space exhausted'. For example

"2016-08-24T15:29:03.936+0000: 17767.663: [GC pause (G1 Evacuation 
Pause) (young)
Desired survivor size 1795162112 bytes, new threshold 2 (max 2)
  17767.663: [G1Ergonomics (CSet Construction) start choosing CSet, 
_pending_cards: 185708, predicted base time: 46.71 ms, remaining time: 
253.29 ms, target pause time: 300.00 ms]
  17767.663: [G1Ergonomics (CSet Construction) add young regions to 
CSet, eden: 159 regions, survivors: 0 regions, predicted young region 
time: 173.05 ms]
  17767.663: [G1Ergonomics (CSet Construction) finish choosing CSet, 
eden: 159 regions, survivors: 0 regions, old: 0 regions, predicted pause 
time: 219.76 ms, target pause time: 300.00 ms]
  17767.664: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: 
region allocation request failed, allocation request: 16340088 bytes]
  17767.664: [G1Ergonomics (Heap Sizing) expand the heap, requested 
expansion amount: 16340088 bytes, attempted expansion amount: 33554432 
bytes]
  17767.664: [G1Ergonomics (Heap Sizing) did not expand the heap, 
reason: heap already fully expanded]
...

(to-space exhausted), 8.0293588 secs]
    [Parallel Time: 6920.2 ms, GC Workers: 23]
...
    [Other: 1106.9 ms]
       [Evacuation Failure: 1095.6 ms]
       [Choose CSet: 0.0 ms]
       [Ref Proc: 4.4 ms]
       [Ref Enq: 0.4 ms]
       [Redirty Cards: 4.8 ms]
       [Humongous Register: 0.2 ms]
       [Humongous Reclaim: 0.2 ms]
       [Free CSet: 0.3 ms]
    [Eden: 5088.0M(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 
95.2G(96.0G)->95.2G(96.0G)]
  [Times: user=68.75 sys=1.66, real=8.03 secs]

Please note the high "Evacuation Failure" time. Some of those young gcs 
might collect some. Some (like this one) do not collect any heap.

(to-space exhausted) means it can not find enough old regions when 
evacuating from the young regions.

There are only 4 marking cycles, 3 of which are at the beginning due to 
meta data space. Mix gc is not trigger at that time since the heap is 
not full enough.

There is only 1 marking cycle requested due to heap occupancy, but it is 
too late. The heap is very full already. No mixed gc found in the log.
"request concurrent cycle initiation, reason: occupancy higher than 
threshold, occupancy: 88080384000 bytes, allocation request: 0 bytes, 
threshold: 77309411325 bytes (75.00 %), source: end of GC]"

To avoid this, as others suggested, you can try
1. set a reasonable pause time goal and let g1 decide the young gen size
2. if you have to set the young gen size fixed, reduce the NewSize, 
increase max tenure threshold to let objects die in young gen.
3. Reduce InitiatingHeapOccupancyPercent
The goal is to trigger more mixed gc and avoid to-space exhausted.

The message after 'to-space exhausted' might be confusing. I need to 
discuss this with dev team. For example, at time stamp
2016-08-24T15:28:05.905+0000: 17709.633: [GC pause (G1 Evacuation Pause) 
(young)
...
(to-space exhausted), 2.6149566 secs]
...

[Eden: 28.4G(28.4G)->0.0B(28.4G) Survivors: 1664.0M->1664.0M Heap: 
93.5G(96.0G)->73.9G(96.0G)]

the eden used after gc might not be true. I will do some investigation 
and get back to you.

Thanks
Jenny

On 08/24/2016 03:50 PM, Vitaly Davidovich wrote:
>
>
> On Wed, Aug 24, 2016 at 6:18 PM, Jenny Zhang <yu.zhang at oracle.com 
> <mailto:yu.zhang at oracle.com>> wrote:
>
>     More comments about the questions
>
>     Thanks
>     Jenny
>
>     On 8/24/2016 11:43 AM, Vitaly Davidovich wrote:
>
>         Right before the Full GC, ergonomics report a failure to
>         expand the heap due to an allocation request of 32 bytes.  Is
>         this implying that a mutator tried to allocate 32 bytes but
>         couldn't? How do I reconcile that with Eden+Survivor occupancy
>         reported right above that?
>
>     Yes, it means the mutator tries to allocate 32byte but can not get
>     it. Heap won't expand as it already reaches max heap.
>
>     Do you see any humongous objects allocatoin?
>
> As mentioned in my previous email, I don't see any humongous 
> allocations recorded.  The Humongous Register/Reclaim output does show 
> non-zero timing, so not sure if the log is simply missing them or 
> something else is going on.
>
>
>
>         Young gen is sized to 30GB, total heap is 96GB. Allocation
>         rate of the application is roughly 1GB/s. Am I correct in
>         assuming that allocation is outpacing concurrent marking,
>         based on the above? What tunable(s) would you advise to tweak
>         to get G1 to keep up with the allocation rate? I'm ok taking
>         some throughput hit to mitigate 90s+ pauses.
>
>     The entire log might give a better picture. Especially if the
>     marking cycle is triggered, how well the mixed gc cleans up the heap.
>
> Are there particular gc lines you're looking for? I can grep for them 
> quickly and provide that for you.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20160824/9dd2ffd4/attachment.html>