CMS large objects vs G1 humongous allocations

Amit Balode amit.balode at gmail.com
Mon Jan 30 13:19:27 UTC 2017


Vitaly, thank you.

There is no explicit line which says Full GC. I thought the line printed as
"(Heap Sizing) did not expand the heap, reason: heap already fully
expanded" was an implicit indication that G1 had taken all the actions a
Full GC would take, and that this led to the 7.7 sec pause.

Sorry, I'm a bit confused about how 'G1's concurrent cycles' and the
'parallel phases of young collections' can run concurrently. Does that mean
ConcGCThreads + ParallelGCThreads cannot be greater than the number of CPU
cores?

Good point about tuning G1MixedGCLiveThresholdPercent.

On Mon, Jan 30, 2017 at 6:36 PM, Vitaly Davidovich <vitalyd at gmail.com>
wrote:

>
>
> On Mon, Jan 30, 2017 at 7:41 AM, Amit Balode <amit.balode at gmail.com>
> wrote:
>
>> Hi Vitaly,
>>
>> JRE: 1.8u40
>> Settings: -XX:+AlwaysPreTouch -XX:ConcGCThreads=5
>> -XX:G1HeapRegionSize=33554432 -XX:InitialHeapSize=17179869184
>> -XX:MaxGCPauseMillis=40 -XX:MaxHeapSize=17179869184
>> -XX:MaxNewSize=10301210624
>>
>> Full GC reclaim: what I meant is that after the heap was exhausted, a
>> full pause was done to actively reclaim available space.
>>
> Right, but how much of the heap was reclaimed? If very little, then your
> max heap size is too low for your working set.  If it reclaimed a bunch,
> then it's a matter of concurrent GC not keeping up.  If you can paste the
> output of the Full GC phase, it would be helpful.
>
>>
>>
>> Yeah, the next thing I was going to try was increasing ConcGCThreads to
>> at least 16 (given the core count is 24). Is it possible that ConcGCThreads
>> and ParallelGCThreads might be running at the same time?
>>
> Yes, they can run concurrently.  ConcGCThreads service the concurrent
> phases of G1's concurrent cycles (e.g. concurrent marking), and
> ParallelGCThreads are used to perform the parallel phases of young
> collections.
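>
> For illustration only (hypothetical values, not a recommendation): your
> log shows "GC Workers: 18" for the STW pauses, so with ConcGCThreads
> raised to 16 the two pools would look like this:
>
> {code}
> # STW young/mixed pauses use up to ParallelGCThreads workers; concurrent
> # marking runs in the background on ConcGCThreads workers.  The two can
> # overlap, so their sum exceeding the 24 cores just means occasional
> # oversubscription, not misconfiguration.
> -XX:ParallelGCThreads=18 -XX:ConcGCThreads=16
> {code}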
>
>>
>> Could you elaborate on why you think reducing InitiatingHeapOccupancyPercent
>> might help? Reducing it means more frequent concurrent cycles, so wouldn't
>> the problem just happen sooner? Although a combination of increasing
>> ConcGCThreads and reducing InitiatingHeapOccupancyPercent might help.
>>
> A reduced IHOP makes concurrent cycles kick in sooner, and if the issue is
> that concurrent marking cannot keep up, then making it start sooner could
> help.  If you start concurrent marking earlier, it may be able to
> complete (and thus clean up) before your Java threads completely fill the
> heap and necessitate a Full GC.  That's the thinking behind that
> suggestion, but it needs to be validated against your actual
> situation.
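>
> A minimal sketch of that combination (30 is just an illustrative
> starting point, not a tested value):
>
> {code}
> # Start concurrent marking at 30% heap occupancy instead of the default
> # 45%, and give marking more threads so it can finish before the heap
> # fills up.
> -XX:InitiatingHeapOccupancyPercent=30 -XX:ConcGCThreads=16
> {code}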
>
>>
>> Pasting more logs as requested.
>> {code}
>> 2017-01-29T14:52:41.570+0000: 189073.759: [GC pause (G1 Humongous
>> Allocation) (young) (initial-mark)
>> Desired survivor size 603979776 bytes, new threshold 15 (max 15)
>> - age   1:  302507344 bytes,  302507344 total
>> - age   2:    4153408 bytes,  306660752 total
>> - age   3:    3325224 bytes,  309985976 total
>> - age   4:     482672 bytes,  310468648 total
>> - age   5:     659992 bytes,  311128640 total
>> - age   6:    1189240 bytes,  312317880 total
>> - age   7:     645256 bytes,  312963136 total
>> - age   8:     438800 bytes,  313401936 total
>> - age   9:    1603768 bytes,  315005704 total
>>  189073.759: [G1Ergonomics (CSet Construction) start choosing CSet,
>> _pending_cards: 24579, predicted base time: 18.93 ms, remaining time: 21.07
>> ms, target pause time: 40.00 ms]
>>  189073.759: [G1Ergonomics (CSet Construction) add young regions to CSet,
>> eden: 170 regions, survivors: 26 regions, predicted young region time:
>> 10.94 ms]
>>  189073.759: [G1Ergonomics (CSet Construction) finish choosing CSet,
>> eden: 170 regions, survivors: 26 regions, old: 0 regions, predicted pause
>> time: 29.87 ms, target pause time: 40.00 ms]
>> , 0.0690000 secs]
>>    [Parallel Time: 60.2 ms, GC Workers: 18]
>>       [GC Worker Start (ms): Min: 189073760.5, Avg: 189073760.7, Max:
>> 189073761.0, Diff: 0.5]
>>       [Ext Root Scanning (ms): Min: 3.7, Avg: 4.2, Max: 4.5, Diff: 0.8,
>> Sum: 75.1]
>>       [Update RS (ms): Min: 6.9, Avg: 7.2, Max: 7.6, Diff: 0.7, Sum:
>> 129.3]
>>          [Processed Buffers: Min: 42, Avg: 62.8, Max: 93, Diff: 51, Sum:
>> 1131]
>>       [Scan RS (ms): Min: 6.4, Avg: 6.8, Max: 6.9, Diff: 0.5, Sum: 122.6]
>>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
>> Sum: 0.0]
>>       [Object Copy (ms): Min: 41.4, Avg: 41.5, Max: 41.8, Diff: 0.5, Sum:
>> 746.6]
>>       [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum:
>> 1.3]
>>       [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1,
>> Sum: 0.7]
>>       [GC Worker Total (ms): Min: 59.5, Avg: 59.8, Max: 60.0, Diff: 0.5,
>> Sum: 1075.7]
>>       [GC Worker End (ms): Min: 189073820.5, Avg: 189073820.5, Max:
>> 189073820.5, Diff: 0.1]
>>    [Code Root Fixup: 0.2 ms]
>>    [Code Root Purge: 0.0 ms]
>>    [Clear CT: 0.6 ms]
>>    [Other: 8.0 ms]
>>       [Choose CSet: 0.0 ms]
>>       [Ref Proc: 3.2 ms]
>>       [Ref Enq: 0.2 ms]
>>       [Redirty Cards: 0.1 ms]
>>       [Humongous Reclaim: 1.7 ms]
>>       [Free CSet: 0.2 ms]
>>    [Eden: 5440.0M(8288.0M)->0.0B(8960.0M) Survivors: 832.0M->864.0M
>> Heap: 10.9G(16.0G)->2359.8M(16.0G)]
>>  [Times: user=1.07 sys=0.00, real=0.07 secs]
>> 2017-01-29T14:52:41.639+0000: 189073.828: [GC
>> concurrent-root-region-scan-start]
>> 2017-01-29T14:52:41.640+0000: 189073.828: Total time for which
>> application threads were stopped: 0.0742949 seconds, Stopping threads took:
>> 0.0004049 seconds
>> 2017-01-29T14:52:41.680+0000: 189073.868: [GC
>> concurrent-root-region-scan-end, 0.0402986 secs]
>> 2017-01-29T14:52:41.680+0000: 189073.868: [GC concurrent-mark-start]
>> 2017-01-29T14:52:41.892+0000: 189074.081: [GC concurrent-mark-end,
>> 0.2123457 secs]
>> 2017-01-29T14:52:41.896+0000: 189074.085: [GC remark 189074.085:
>> [Finalize Marking, 0.0012962 secs] 189074.087: [GC ref-proc, 0.0043022
>> secs] 189074.091: [Unloading, 0.0226266 secs], 0.0300066 secs]
>>  [Times: user=0.34 sys=0.01, real=0.03 secs]
>> 2017-01-29T14:52:41.927+0000: 189074.116: Total time for which
>> application threads were stopped: 0.0351083 seconds, Stopping threads took:
>> 0.0004268 seconds
>> 2017-01-29T14:52:41.931+0000: 189074.120: [GC cleanup 2630M->2630M(16G),
>> 0.0020884 secs]
>>  [Times: user=0.02 sys=0.00, real=0.00 secs]
>>
> Let's note a few things here:
> 1) Concurrent marking completes fairly quickly.
> 2) No space is reclaimed (the cleanup line shows 2630M->2630M).
>
> Another thing to consider, in light of this, is to tweak G1MixedGCLiveThresholdPercent
> (it's an experimental flag, so it needs to be unlocked via -XX:+UnlockExperimentalVMOptions).
> This value (85 by default) is the liveness threshold used when a region is
> considered for mixed collections: a region becomes a candidate only if it
> is at most X% live, where X is the flag's value.  So with the default, a
> region has to be at least 15% garbage to be considered for mixed GCs.  You
> could experiment with a higher value to see if mixed GCs can clean up a
> bit more.
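>
> A sketch of that experiment (90 is just an illustrative value):
>
> {code}
> # Unlock experimental flags, then let regions up to 90% live (i.e. at
> # least 10% garbage) become mixed-GC candidates, instead of the default
> # cutoff of 85% live.
> -XX:+UnlockExperimentalVMOptions -XX:G1MixedGCLiveThresholdPercent=90
> {code}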
>
>
>> 2017-01-29T14:52:41.933+0000: 189074.122: Total time for which
>> application threads were stopped: 0.0062040 seconds, Stopping threads took:
>> 0.0002670 seconds
>> 2017-01-29T14:52:42.939+0000: 189075.127: Total time for which
>> application threads were stopped: 0.0051245 seconds, Stopping threads took:
>> 0.0004442 seconds
>> 2017-01-29T14:52:43.944+0000: 189076.133: Total time for which
>> application threads were stopped: 0.0052855 seconds, Stopping threads took:
>> 0.0004542 seconds
>> 2017-01-29T14:52:46.950+0000: 189079.139: Total time for which
>> application threads were stopped: 0.0055019 seconds, Stopping threads took:
>> 0.0006084 seconds
>> 2017-01-29T14:52:47.955+0000: 189080.144: Total time for which
>> application threads were stopped: 0.0050989 seconds, Stopping threads took:
>> 0.0004534 seconds
>> 2017-01-29T14:52:49.961+0000: 189082.149: Total time for which
>> application threads were stopped: 0.0053802 seconds, Stopping threads took:
>> 0.0006599 seconds
>> 2017-01-29T14:52:51.966+0000: 189084.154: Total time for which
>> application threads were stopped: 0.0047913 seconds, Stopping threads took:
>> 0.0004446 seconds
>> 2017-01-29T14:52:53.970+0000: 189086.158: Total time for which
>> application threads were stopped: 0.0037714 seconds, Stopping threads took:
>> 0.0002900 seconds
>> 2017-01-29T14:52:55.974+0000: 189088.163: Total time for which
>> application threads were stopped: 0.0044301 seconds, Stopping threads took:
>> 0.0003547 seconds
>> 2017-01-29T14:52:56.979+0000: 189089.167: Total time for which
>> application threads were stopped: 0.0043758 seconds, Stopping threads took:
>> 0.0003496 seconds
>> 2017-01-29T14:52:57.983+0000: 189090.172: Total time for which
>> application threads were stopped: 0.0043504 seconds, Stopping threads took:
>> 0.0003511 seconds
>> 2017-01-29T14:53:06.989+0000: 189099.177: Total time for which
>> application threads were stopped: 0.0045634 seconds, Stopping threads took:
>> 0.0004279 seconds
>> 2017-01-29T14:53:08.993+0000: 189101.182: Total time for which
>> application threads were stopped: 0.0043401 seconds, Stopping threads took:
>> 0.0003958 seconds
>> 2017-01-29T14:53:11.998+0000: 189104.186: Total time for which
>> application threads were stopped: 0.0044074 seconds, Stopping threads took:
>> 0.0003907 seconds
>> 2017-01-29T14:53:14.770+0000: 189106.959: [GC pause (G1 Evacuation Pause)
>> (young)
>> Desired survivor size 654311424 bytes, new threshold 15 (max 15)
>> - age   1:  240262896 bytes,  240262896 total
>> - age   2:    3476760 bytes,  243739656 total
>> - age   3:    3293240 bytes,  247032896 total
>> - age   4:    3147072 bytes,  250179968 total
>> - age   5:     420832 bytes,  250600800 total
>> - age   6:     614688 bytes,  251215488 total
>> - age   7:    1139960 bytes,  252355448 total
>> - age   8:     632088 bytes,  252987536 total
>> - age   9:     425488 bytes,  253413024 total
>> - age  10:    1592608 bytes,  255005632 total
>>  189106.960: [G1Ergonomics (CSet Construction) start choosing CSet,
>> _pending_cards: 29363, predicted base time: 20.87 ms, remaining time: 19.13
>> ms, target pause time: 40.00 ms]
>>  189106.960: [G1Ergonomics (CSet Construction) add young regions to CSet,
>> eden: 276 regions, survivors: 27 regions, predicted young region time:
>> 12.11 ms]
>>  189106.960: [G1Ergonomics (CSet Construction) finish choosing CSet,
>> eden: 276 regions, survivors: 27 regions, old: 0 regions, predicted pause
>> time: 32.98 ms, target pause time: 40.00 ms]
>>  189106.961: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason:
>> region allocation request failed, allocation request: 5549208 bytes]
>>  189106.961: [G1Ergonomics (Heap Sizing) expand the heap, requested
>> expansion amount: 5549208 bytes, attempted expansion amount: 33554432 bytes]
>>  189106.961: [G1Ergonomics (Heap Sizing) did not expand the heap, reason:
>> heap already fully expanded]
>>  189114.730: [G1Ergonomics (Concurrent Cycles) do not request concurrent
>> cycle initiation, reason: still doing mixed collections, occupancy:
>> 13119782912 bytes, allocation request: 0 bytes, threshold: 7730941095 bytes
>> (45.00 %), source: end of GC]
>>  189114.730: [G1Ergonomics (Mixed GCs) start mixed GCs, reason: candidate
>> old regions available, candidate old regions: 45 regions, reclaimable:
>> 1026676456 bytes (5.98 %), threshold: 5.00 %]
>>
> So here it found ~1G reclaimable in old regions, but it's too late - it's
> hitting to-space exhaustion right below.  This is what leads me to believe
> that G1's concurrent cleanup is "losing the race" against the Java threads
> (that are concurrently allocating).  Or, it could be that your max heap is
> too small for your allocation rate and current GC params.  This is where
> it's useful to see how much memory a Full GC reclaims.
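>
> If a Full GC doesn't show up in the log on its own, one way to measure
> this on a test instance (a hedged suggestion, not something your setup
> requires) is to force a collection and compare the before->after heap
> occupancy it prints:
>
> {code}
> # Triggers System.gc(); without -XX:+ExplicitGCInvokesConcurrent this
> # runs a stop-the-world Full GC, and the GC log's before->after heap
> # figures then show the actual live set.
> jcmd <pid> GC.run
> {code}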
>
>>  (to-space exhausted), 7.7714626 secs]
>>    [Parallel Time: 7182.7 ms, GC Workers: 18]
>>       [GC Worker Start (ms): Min: 189106960.8, Avg: 189106961.0, Max:
>> 189106961.2, Diff: 0.4]
>>       [Ext Root Scanning (ms): Min: 1.0, Avg: 1.7, Max: 2.5, Diff: 1.5,
>> Sum: 30.9]
>>       [Update RS (ms): Min: 2.1, Avg: 8.2, Max: 11.5, Diff: 9.4, Sum:
>> 147.0]
>>          [Processed Buffers: Min: 15, Avg: 62.9, Max: 130, Diff: 115,
>> Sum: 1132]
>>       [Scan RS (ms): Min: 4.0, Avg: 6.1, Max: 6.4, Diff: 2.4, Sum: 109.6]
>>       [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
>> Sum: 0.0]
>>       [Object Copy (ms): Min: 7164.5, Avg: 7166.1, Max: 7171.6, Diff:
>> 7.1, Sum: 128989.0]
>>       [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum:
>> 3.9]
>>       [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
>> Sum: 0.4]
>>       [GC Worker Total (ms): Min: 7182.1, Avg: 7182.3, Max: 7182.5, Diff:
>> 0.4, Sum: 129280.9]
>>       [GC Worker End (ms): Min: 189114143.2, Avg: 189114143.3, Max:
>> 189114143.3, Diff: 0.1]
>>    [Code Root Fixup: 0.3 ms]
>>    [Code Root Purge: 0.0 ms]
>>    [Clear CT: 0.7 ms]
>>    [Other: 587.7 ms]
>>       [Evacuation Failure: 579.6 ms]
>>       [Choose CSet: 0.0 ms]
>>       [Ref Proc: 2.8 ms]
>>       [Ref Enq: 0.5 ms]
>>       [Redirty Cards: 1.4 ms]
>>       [Humongous Reclaim: 0.8 ms]
>>       [Free CSet: 0.2 ms]
>>    [Eden: 8832.0M(8960.0M)->0.0B(800.0M) Survivors: 864.0M->0.0B Heap:
>> 13.9G(16.0G)->11.6G(16.0G)]
>>  [Times: user=16.27 sys=0.35, real=7.77 secs]
>> 2017-01-29T14:53:22.542+0000: 189114.731: Total time for which
>> application threads were stopped: 7.7764915 seconds, Stopping threads took:
>> 0.0003511 seconds
>> {code}
>>
>>
>> On Mon, Jan 30, 2017 at 5:14 PM, Vitaly Davidovich <vitalyd at gmail.com>
>> wrote:
>>
>>> I don't believe region sizes >32MB are planned, at least that wasn't the
>>> case before - maybe that's changed though.
>>>
>>> Some tuning options to consider:
>>> 1) Increase max heap size (this is always a blunt but frequently
>>> effective instrument)
>>> 2) Turn InitiatingHeapOccupancyPercent down from its default 45 value.
>>> 3) Increase ConcGCThreads
>>>
>>> What version of the JRE are you running? From the short GC log snippet
>>> you pasted, it looks like G1 concurrent cycles are losing the race to your
>>> Java threads (i.e. you exhaust the heap before the concurrent phase can
>>> clean up).
>>>
>>> How much does a full GC reclaim? You may want to provide more of your GC
>>> log - someone is going to ask for that sooner or later :).
>>>
>>>
>>> On Mon, Jan 30, 2017 at 3:24 AM Amit Balode <amit.balode at gmail.com>
>>> wrote:
>>>
>>>> Hello, the only reason we decided to move from CMS to G1 was CMS's
>>>> fragmentation issues. After moving to G1, we have started seeing humongous
>>>> allocations resulting in Full GCs. We have some large objects which are
>>>> allocated in quick succession, resulting in this issue, but the use case
>>>> for continuous allocation on the application side seems genuine. The heap
>>>> is getting full, so it's understandable that a Full GC happens, although
>>>> this Full GC issue did not happen with CMS.
>>>>
>>>> a) We are currently running with the maximum 32MB region size, and
>>>> humongous allocations as high as 21MB are happening. For such large
>>>> allocations 32MB seems too small; perhaps 64MB would be appropriate (see
>>>> the sketch after (b)). Is that an option available in future G1 releases?
>>>> b) Given that the application behaviour cannot be changed much to stop
>>>> the continuous large allocations, what are some G1-specific settings to
>>>> tune to make it more resilient?
>>>>
>>>> Below is a snippet from the application running with a 16GB heap, 40ms
>>>> pause target and 32MB region size.
>>>> {code}
>>>> 2017-01-29T14:53:14.770+0000: 189106.959: [GC pause (G1 Evacuation
>>>> Pause) (young)
>>>> Desired survivor size 654311424 bytes, new threshold 15 (max 15)
>>>> - age   1:  240262896 bytes,  240262896 total
>>>> - age   2:    3476760 bytes,  243739656 total
>>>> - age   3:    3293240 bytes,  247032896 total
>>>> - age   4:    3147072 bytes,  250179968 total
>>>> - age   5:     420832 bytes,  250600800 total
>>>> - age   6:     614688 bytes,  251215488 total
>>>> - age   7:    1139960 bytes,  252355448 total
>>>> - age   8:     632088 bytes,  252987536 total
>>>> - age   9:     425488 bytes,  253413024 total
>>>> - age  10:    1592608 bytes,  255005632 total
>>>>  189106.960: [G1Ergonomics (CSet Construction) start choosing CSet,
>>>> _pending_cards: 29363, predicted base time: 20.87 ms, remaining time: 19.13
>>>> ms, target pause time: 40.00 ms]
>>>>  189106.960: [G1Ergonomics (CSet Construction) add young regions to
>>>> CSet, eden: 276 regions, survivors: 27 regions, predicted young region
>>>> time: 12.11 ms]
>>>>  189106.960: [G1Ergonomics (CSet Construction) finish choosing CSet,
>>>> eden: 276 regions, survivors: 27 regions, old: 0 regions, predicted pause
>>>> time: 32.98 ms, target pause time:
>>>> 40.00 ms]
>>>>  189106.961: [G1Ergonomics (Heap Sizing) attempt heap expansion,
>>>> reason: region allocation request failed, allocation request: 5549208 bytes]
>>>>  189106.961: [G1Ergonomics (Heap Sizing) expand the heap, requested
>>>> expansion amount: 5549208 bytes, attempted expansion amount: 33554432 bytes]
>>>>  189106.961: [G1Ergonomics (Heap Sizing) did not expand the heap,
>>>> reason: heap already fully expanded]
>>>>  189114.730: [G1Ergonomics (Concurrent Cycles) do not request
>>>> concurrent cycle initiation, reason: still doing mixed collections,
>>>> occupancy: 13119782912 bytes, allocation request
>>>> : 0 bytes, threshold: 7730941095 bytes (45.00 %), source: end of GC]
>>>> 189114.730: [G1Ergonomics (Mixed GCs) start mixed GCs, reason:
>>>> candidate old regions available, candidate old regions: 45 regions,
>>>> reclaimable: 1026676456 bytes (5.98 %), threshold: 5.00 %]
>>>>  (to-space exhausted), *7.7714626 secs*]
>>>>    [Parallel Time: 7182.7 ms, GC Workers: 18]
>>>> {code}
>>>>
>>>> --
>>>> Thanks & Regards,
>>>> Amit
>>>>
>>>>
>>> --
>>> Sent from my phone
>>>
>>
>>
>>
>> --
>> Thanks & Regards,
>> Amit.Balode
>>
>>
>


-- 
Thanks & Regards,
Amit.Balode