Odd G1GC behavior on 8u91

Thu Aug 25 17:01:22 UTC 2016

Hi Vitaly,

Would you be able to post all the GC events from 'initial-mark' leading
up-to the 0.0B Eden event? (Please include the PrintAdaptiveSizePolicy
information as well).

Regards,
Monica

Sent from my Huawei Mobile
Hi guys,

Hoping someone could shed some light on G1 behavior (as seen from the gc
log) that I'm having a hard time understanding.  The root problem is G1
enters a Full GC that takes many tens of seconds, and need some advice on
what could be causing it.

First, some basic info:
Java HotSpot(TM) 64-Bit Server VM (25.91-b14) for linux-amd64 JRE
(1.8.0_91-b14), built on Apr  1 2016 00:57:21 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 264115728k(108464820k free), swap 0k(0k free)
CommandLine flags: -XX:G1HeapWastePercent=5 -XX:G1MixedGCCountTarget=4
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=<some_path> -XX:InitialCodeCacheSize=104857600
-XX:InitialHeapSize=103079215104 -XX:InitialTenuringThreshold=2 -XX:
InitiatingHeapOccupancyPercent=75 -XX:+ManagementServer
-XX:MaxGCPauseMillis=300 -XX:MaxHeapSize=103079215104
-XX:MaxNewSize=32212254720 -XX:MaxTenuringThreshold=2
-XX:NewSize=32212254720 -XX:+ParallelRefProcEnabled
-XX:+PrintAdaptiveSizePolicy -XX:+PrintCommandLineFlags
-XX:+PrintCompilation -XX:PrintFLSStatistics=1 -XX:+PrintGC
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
-XX:+PrintPromotionFailure -XX:+PrintReferenceGC
-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
-XX:+PrintTenuringDistribution -XX:ReservedCodeCacheSize=104857600
-XX:SurvivorRatio=9 -XX:-UseAdaptiveSizePolicy -XX:+UseG1GC

Swap is disabled.  THP is disabled.

First issue I have a question about:

2016-08-24T15:29:12.302+0000: 17776.029: [GC pause (G1 Evacuation Pause)
(young)
Desired survivor size 1795162112 bytes, new threshold 2 (max 2)
 17776.029: [G1Ergonomics (CSet Construction) start choosing CSet,
_pending_cards: 0, predicted base time: 14.07 ms, remaining time: 285.93
ms, target pause time: 300.00 ms]
 17776.029: [G1Ergonomics (CSet Construction) add young regions to CSet,
eden: 0 regions, survivors: 0 regions, predicted young region time: 0.00 ms]
 17776.029: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 0
regions, survivors: 0 regions, old: 0 regions, predicted pause time: 14.07
ms, target pause time: 300.00 ms]
2016-08-24T15:29:12.305+0000: 17776.033: [SoftReference, 0 refs, 0.0012417
secs]2016-08-24T15:29:12.307+0000: 17776.034: [WeakReference, 0 refs,
0.0007101 secs]2016-08-24T15:29:12.307+0000: 17776.035: [FinalReference, 0
refs, 0.0007027 secs]2016-08-24T15:29:12.308+0000: 17776.035:
[PhantomReference, 0 refs, 0 refs, 0.0013585 secs]2016-08-24T15:29:12.309+0000:
17776.037: [JNI Weak Reference, 0.0000118 secs], 0.0089758 secs]
   [Parallel Time: 3.1 ms, GC Workers: 23]
      [GC Worker Start (ms): Min: 17776029.2, Avg: 17776029.3, Max:
17776029.4, Diff: 0.2]
      [Ext Root Scanning (ms): Min: 0.8, Avg: 1.1, Max: 2.8, Diff: 1.9,
Sum: 24.2]
      [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
         [Processed Buffers: Min: 0, Avg: 0.1, Max: 1, Diff: 1, Sum: 2]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
Sum: 0.0]
      [Object Copy (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 1.2]
      [Termination (ms): Min: 0.0, Avg: 1.6, Max: 1.8, Diff: 1.8, Sum: 37.9]
         [Termination Attempts: Min: 1, Avg: 1.0, Max: 1, Diff: 0, Sum: 23]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum:
0.2]
      [GC Worker Total (ms): Min: 2.7, Avg: 2.8, Max: 2.9, Diff: 0.2, Sum:
63.8]
      [GC Worker End (ms): Min: 17776032.0, Avg: 17776032.0, Max:
17776032.1, Diff: 0.0]
   [Code Root Fixup: 0.2 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.4 ms]
   [Other: 5.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 4.4 ms]
      [Ref Enq: 0.3 ms]
      [Redirty Cards: 0.3 ms]
      [Humongous Register: 0.1 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 0.0 ms]
   *[Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap:
95.2G(96.0G)->95.2G(96.0G)]*
 [Times: user=0.08 sys=0.00, real=0.01 secs]
2016-08-24T15:29:12.311+0000: 17776.038: Total time for which application
threads were stopped: 0.0103002 seconds, Stopping threads took: 0.0000566
seconds
 17776.039: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason:
allocation request failed, allocation request: 32 bytes]
 17776.039: [G1Ergonomics (Heap Sizing) expand the heap, requested
expansion amount: 33554432 bytes, attempted expansion amount: 33554432
bytes]
 17776.039: [G1Ergonomics (Heap Sizing) did not expand the heap, reason:
heap already fully expanded]
2016-08-24T15:29:12.312+0000: 17776.039: [Full GC (Allocation Failure)
2016-08-24T15:29:40.727+0000: 17804.454: [SoftReference, 5504 refs,
0.0012432 secs]2016-08-24T15:29:40.728+0000: 17804.456: [WeakReference,
1964 refs, 0.0003012 secs]2016-08-24T15:29:40.728+0000: 17804.456:
[FinalReference, 3270 refs, 0.0033290 secs]2016-08-24T15:29:40.732+0000:
17804.459: [PhantomReference, 0 refs, 75 refs, 0.0000257
secs]2016-08-24T15:29:40.732+0000: 17804.459: [JNI Weak Reference,
0.0000172 secs] 95G->38G(96G), 95.5305034 secs]
   [Eden: 0.0B(30.0G)->0.0B(30.0G) Survivors: 0.0B->0.0B Heap:
95.2G(96.0G)->38.9G(96.0G)], [Metaspace: 104180K->103365K(106496K)]
* [Times: user=157.02 sys=0.28, real=95.54 secs] *

So here we have a lengthy full GC pause that collects quite a bit of old
gen (expected).  Right before this is a young evac pause.

Why is the heap sizing (bolded) reported after the evac pause showing empty
Eden+Survivor?

Why is ergonomic info reporting 0 regions selected (i.e. what's evacuated
then)?

Right before the Full GC, ergonomics report a failure to expand the heap
due to an allocation request of 32 bytes.  Is this implying that a mutator
tried to allocate 32 bytes but couldn't? How do I reconcile that with
Eden+Survivor occupancy reported right above that?

Young gen is sized to 30GB, total heap is 96GB.  Allocation rate of the
application is roughly 1GB/s.  Am I correct in assuming that allocation is
outpacing concurrent marking, based on the above? What tunable(s) would you
advise to tweak to get G1 to keep up with the allocation rate? I'm ok
taking some throughput hit to mitigate 90s+ pauses.

Let me know if any additional info is needed (I have the full GC log, and
can attach that if desired).

Thanks

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20160826/9c65b6df/attachment.html>