Odd G1GC behavior on 8u91
Jenny Zhang
yu.zhang at oracle.com
Wed Aug 24 22:11:35 UTC 2016
Hi, Vitaly,
Charlie and I were just talking about your post.
The parameters you used, -XX:MaxNewSize=32212254720 and
-XX:NewSize=32212254720 (plus -XX:-UseAdaptiveSizePolicy), effectively
fix the young gen size: 32212254720 bytes is 30 GB, which matches the
30.0G young gen capacity shown in the gc log entries.
As Eric points out, -XX:MaxTenuringThreshold=2 might promote objects to
the old gen more quickly, but that does not explain an Eden usage of 0.
Maybe all your objects are humongous objects and are allocated directly
in the old gen?
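
To make that concrete, here is a rough illustration (assuming the
default region sizing: with a 96 GB heap G1 picks the maximum 32 MB
region size, so any single object of roughly 16 MB or more is
humongous; the class name and sizes below are made up for the example):

    // Hypothetical illustration, not code from the application.
    // With 32 MB G1 regions, any single allocation of at least half a
    // region (~16 MB) is treated as humongous and is placed directly in
    // contiguous old-gen regions, bypassing eden entirely.
    public class HumongousDemo {
        public static void main(String[] args) {
            byte[] humongous = new byte[17 * 1024 * 1024]; // goes straight to old gen
            byte[] ordinary  = new byte[256 * 1024];       // regular eden allocation
            System.out.println(humongous.length + " / " + ordinary.length);
        }
    }

If the application mostly allocates objects like the first one, eden
can legitimately stay at 0 while the old gen fills up.
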
Do you mind posting the entire gc log? If it is too big, can I copy it
from somewhere?
BTW, -XX:PrintFLSStatistics=1 is for CMS, not G1.
Thanks
Jenny
On 8/24/2016 2:32 PM, Vitaly Davidovich wrote:
> Hi Eric,
>
> Thanks for the reply. Yeah, we're experimenting with reducing the
> initiating heap occupancy to 55%, hopefully letting the concurrent
> marking complete before the allocation and promotion rate results in a
> Full GC. In particular, I'm not sure how to interpret the gc log
> snippets I posted showing a young evac with an empty young gen,
> followed by a Full GC due to a heap expansion failure for a 32 byte
> allocation. I'm likely missing something in my interpretation.
>
> I think you're also on point about the max tenuring threshold being
> too low - need to see how that value was arrived at.
>
> Thanks
>
> On Wed, Aug 24, 2016 at 5:20 PM, Eric Caspole
> <eric.caspole at oracle.com> wrote:
>
> I have not used G1 in JDK 8 that much but the two trouble spots to
> me are:
>
> -XX:InitiatingHeapOccupancyPercent=75
> -XX:MaxTenuringThreshold=2
>
>     So this will tenure very quickly, filling up the old gen, and will
>     start marking relatively late at 75%. This looks pretty likely to
>     end up in a STW full GC.
>     Since you do have a huge amount of garbage getting collected in the
>     full gc, maybe try letting more of it die off in the young gen with
>     a higher tenuring threshold, and also start marking earlier than 75%.
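>
>     As a rough sketch of that direction (the values below are just
>     placeholders to illustrate, not numbers derived from your workload),
>     something like
>
>         -XX:InitiatingHeapOccupancyPercent=55 -XX:MaxTenuringThreshold=8
>
>     starts marking with more headroom and gives short-lived objects a
>     few more young collections to die in before they get promoted.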
>
> good luck,
> Eric
>
>
>
>
> On 08/24/2016 02:43 PM, Vitaly Davidovich wrote:
>
> Hi guys,
>
>             Hoping someone could shed some light on G1 behavior (as seen
>             from the gc log) that I'm having a hard time understanding.
>             The root problem is G1 enters a Full GC that takes many tens
>             of seconds, and I need some advice on what could be causing it.
>
> First, some basic info:
>             Java HotSpot(TM) 64-Bit Server VM (25.91-b14) for linux-amd64 JRE (1.8.0_91-b14), built on Apr 1 2016 00:57:21 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
>             Memory: 4k page, physical 264115728k(108464820k free), swap 0k(0k free)
> CommandLine flags: -XX:G1HeapWastePercent=5
> -XX:G1MixedGCCountTarget=4
> -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<some_path>
> -XX:InitialCodeCacheSize=104857600
> -XX:InitialHeapSize=103079215104
> -XX:InitialTenuringThreshold=2
> -XX:InitiatingHeapOccupancyPercent=75
> -XX:+ManagementServer -XX:MaxGCPauseMillis=300
> -XX:MaxHeapSize=103079215104 -XX:MaxNewSize=32212254720
> -XX:MaxTenuringThreshold=2 -XX:NewSize=32212254720
> -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy
> -XX:+PrintCommandLineFlags -XX:+PrintCompilation
> -XX:PrintFLSStatistics=1 -XX:+PrintGC
> -XX:+PrintGCApplicationStoppedTime
> -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -XX:+PrintPromotionFailure -XX:+PrintReferenceGC
> -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
> -XX:+PrintTenuringDistribution -XX:ReservedCodeCacheSize=104857600
> -XX:SurvivorRatio=9 -XX:-UseAdaptiveSizePolicy -XX:+UseG1GC
>
> Swap is disabled. THP is disabled.
>
> First issue I have a question about:
>
>             2016-08-24T15:29:12.302+0000: 17776.029: [GC pause (G1 Evacuation Pause) (young)
>             Desired survivor size 1795162112 bytes, new threshold 2 (max 2)
>              17776.029: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 0, predicted base time: 14.07 ms, remaining time: 285.93 ms, target pause time: 300.00 ms]
>              17776.029: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 0 regions, survivors: 0 regions, predicted young region time: 0.00 ms]
>              17776.029: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 0 regions, survivors: 0 regions, old: 0 regions, predicted pause time: 14.07 ms, target pause time: 300.00 ms]
>             2016-08-24T15:29:12.305+0000: 17776.033: [SoftReference, 0 refs, 0.0012417 secs]2016-08-24T15:29:12.307+0000: 17776.034: [WeakReference, 0 refs, 0.0007101 secs]2016-08-24T15:29:12.307+0000: 17776.035: [FinalReference, 0 refs, 0.0007027 secs]2016-08-24T15:29:12.308+0000: 17776.035: [PhantomReference, 0 refs, 0 refs, 0.0013585 secs]2016-08-24T15:29:12.309+0000: 17776.037: [JNI Weak Reference, 0.0000118 secs], 0.0089758 secs]
>                [Parallel Time: 3.1 ms, GC Workers: 23]
>                   [GC Worker Start (ms): Min: 17776029.2, Avg: 17776029.3, Max: 17776029.4, Diff: 0.2]
>                   [Ext Root Scanning (ms): Min: 0.8, Avg: 1.1, Max: 2.8, Diff: 1.9, Sum: 24.2]
>                   [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
>                      [Processed Buffers: Min: 0, Avg: 0.1, Max: 1, Diff: 1, Sum: 2]
>                   [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
>                   [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
>                   [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.2]
>                   [Termination (ms): Min: 0.0, Avg: 1.6, Max: 1.8, Diff: 1.8, Sum: 37.9]
>                      [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 23]
>                   [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
>                   [GC Worker Total (ms): Min: 2.7, Avg: 2.8, Max: 2.9, Diff: 0.2, Sum: 63.8]
>                   [GC Worker End (ms): Min: 17776032.0, Avg: 17776032.0, Max: 17776032.1, Diff: 0.0]
>                [Code Root Fixup: 0.2 ms]
>                [Code Root Purge: 0.0 ms]
>                [Clear CT: 0.4 ms]
>                [Other: 5.3 ms]
>                   [Choose CSet: 0.0 ms]
>                   [Ref Proc: 4.4 ms]
>                   [Ref Enq: 0.3 ms]
>                   [Redirty Cards: 0.3 ms]
>                   [Humongous Register: 0.1 ms]
>                   [Humongous Reclaim: 0.1 ms]
>                   [Free CSet: 0.0 ms]
>                *[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 95.2G(96.0G)->95.2G(96.0G)]*
>              [Times: user=0.08 sys=0.00, real=0.01 secs]
>             2016-08-24T15:29:12.311+0000: 17776.038: Total time for which application threads were stopped: 0.0103002 seconds, Stopping threads took: 0.0000566 seconds
>              17776.039: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 32 bytes]
>              17776.039: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 33554432 bytes, attempted expansion amount: 33554432 bytes]
>              17776.039: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
>             2016-08-24T15:29:12.312+0000: 17776.039: [Full GC (Allocation Failure) 2016-08-24T15:29:40.727+0000: 17804.454: [SoftReference, 5504 refs, 0.0012432 secs]2016-08-24T15:29:40.728+0000: 17804.456: [WeakReference, 1964 refs, 0.0003012 secs]2016-08-24T15:29:40.728+0000: 17804.456: [FinalReference, 3270 refs, 0.0033290 secs]2016-08-24T15:29:40.732+0000: 17804.459: [PhantomReference, 0 refs, 75 refs, 0.0000257 secs]2016-08-24T15:29:40.732+0000: 17804.459: [JNI Weak Reference, 0.0000172 secs] 95G->38G(96G), 95.5305034 secs]
>                [Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap: 95.2G(96.0G)->38.9G(96.0G)], [Metaspace: 104180K->103365K(106496K)]
>              * [Times: user=157.02 sys=0.28, real=95.54 secs] *
>
> So here we have a lengthy full GC pause that collects quite a
> bit of old
> gen (expected). Right before this is a young evac pause.
>
> Why is the heap sizing (bolded) reported after the evac pause
> showing
> empty Eden+Survivor?
>
> Why is ergonomic info reporting 0 regions selected (i.e. what's
> evacuated then)?
>
> Right before the Full GC, ergonomics report a failure to
> expand the heap
> due to an allocation request of 32 bytes. Is this implying that a
> mutator tried to allocate 32 bytes but couldn't? How do I
> reconcile that
> with Eden+Survivor occupancy reported right above that?
>
> Young gen is sized to 30GB, total heap is 96GB. Allocation
> rate of the
> application is roughly 1GB/s. Am I correct in assuming that
> allocation
> is outpacing concurrent marking, based on the above? What
> tunable(s)
> would you advise to tweak to get G1 to keep up with the
> allocation rate?
> I'm ok taking some throughput hit to mitigate 90s+ pauses.
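>
>             (Back-of-envelope, using the numbers above and assuming
>             marking really is requested around 75% occupancy: 75% of
>             96GB is ~72GB, leaving ~24GB of headroom before the heap is
>             completely full. In the worst case where all of the ~1GB/s
>             of allocation ends up consuming old-gen space, e.g. because
>             it is humongous or quickly promoted, that headroom is gone
>             in roughly 24 seconds; if only a fraction survives, marking
>             has proportionally more time.)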
>
> Let me know if any additional info is needed (I have the full
> GC log,
> and can attach that if desired).
>
> Thanks
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use