Erratic(?) CMS behaviour after 5d

Bernd Eckenfels bernd-2012 at eckenfels.net
Thu Sep 27 21:16:10 UTC 2012


Hello,

I have talked about this before, however I had no good trace file. Now I
have a verbose GC log.

In the attached log you see a system where CMS and ParNew are running more
or less fine and stable. But then after 5d suddenly the CMS goes wild and
shows heavy activity. I dont think it looks like the traffic/usage pattern
has changed and the old heap is pretty empty. This has been happening
repeatedly, always after 5-6d the GC goes wild and the overall system
performance drops.

The permgen is used to 75%, so I dont think it is the issue here. Here is
a jmap -heap output of the system at the end of that GC log timeframe:

Attaching to process ID 46008, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 20.8-b03
using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC
Heap Configuration:
      MinHeapFreeRatio = 40
      MaxHeapFreeRatio = 70
      MaxHeapSize      = 51539607552 (49152.0MB)
      NewSize          = 1310720 (1.25MB)
      MaxNewSize       = 17592186044415 MB
      OldSize          = 5439488 (5.1875MB)
      NewRatio         = 1
      SurvivorRatio    = 8
      PermSize         = 104857600 (100.0MB)
      MaxPermSize      = 314572800 (300.0MB)
Heap Usage:
New Generation (Eden + 1 Survivor Space):
      capacity = 23192862720 (22118.4375MB)
      used     = 19142525576 (18255.734992980957MB)
      free     = 4050337144 (3862.702507019043MB)
      82.5362776777562% used
Eden Space:
      capacity = 20615921664 (19660.875MB)
      used     = 18511946216 (17654.367652893066MB)
      free     = 2103975448 (2006.5073471069336MB)
      89.79441481059753% used
   From Space:
      capacity = 2576941056 (2457.5625MB)
      used     = 630579360 (601.3673400878906MB)
      free     = 1946361696 (1856.1951599121094MB)
      24.47007309429122% used
To Space:
      capacity = 2576941056 (2457.5625MB)
      used     = 0 (0.0MB)
      free     = 2576941056 (2457.5625MB)
      0.0% used
concurrent mark-sweep generation:
      capacity = 25769803776 (24576.0MB)
      used     = 1704634416 (1625.6660614013672MB)
      free     = 24065169360 (22950.333938598633MB)
      6.614852137863636% used
Perm Generation:
      capacity = 314572800 (300.0MB)
      used     = 238131048 (227.0994644165039MB)
      free     = 76441752 (72.9005355834961MB)
      75.69982147216797% used


(note that "heap configuration" seems to be broken, I specified
NewRatio=1, but the NewSize and MaxNewSize displayed by jmap are way off)

Any Idea what can cause the GC go wild, here is an  excerpt (note that
ParNew happens only every 300s, after 10 of those one CMS happens, etc):


God NewGC/CMS/NewGC Cycle:

14586.770: [GC 14586.772: [ParNew: 1991511K->319572K(22649280K), 0.0489290
secs] 3229450K->1573413K(47815104K), 0.0513610 secs] [Times: user=1.29
sys=0.01, real=0.05 secs]
14586.834: [GC [1 CMS-initial-mark: 1253840K(25165824K)]
1573413K(47815104K), 0.2554070 secs] [Times: user=0.26 sys=0.00, real=0.26
secs]
14587.091: [CMS-concurrent-mark-start]
14587.506: [CMS-concurrent-mark: 0.276/0.415 secs] [Times: user=4.29
sys=0.06, real=0.41 secs]
14587.507: [CMS-concurrent-preclean-start]
14587.592: [CMS-concurrent-preclean: 0.085/0.085 secs] [Times: user=0.13
sys=0.03, real=0.09 secs]
14587.592: [CMS-concurrent-abortable-preclean-start]
    CMS: abort preclean due to time 14592.774:
[CMS-concurrent-abortable-preclean: 5.180/5.182 secs] [Times: user=7.45
sys=0.42, real=5.18 secs]
14592.786: [GC[YG occupancy: 860025 K (22649280 K)]14592.787: [Rescan
(parallel) , 0.1106510 secs]14592.898: [weak refs processing, 0.0009520
secs]14592.899: [class unloading, 0.0440260 secs]14592.943: [scrub symbol
& string tables, 0.0386660 secs] [1 CMS-remark: 1253840K(25165824K)]
2113866K(47815104K), 0.2091080 secs] [Times: user=5.65 sys=0.02, real=0.21
secs]
14592.997: [CMS-concurrent-sweep-start]
14594.081: [CMS-concurrent-sweep: 1.084/1.084 secs] [Times: user=1.43
sys=0.10, real=1.09 secs]
14594.081: [CMS-concurrent-reset-start]
14594.190: [CMS-concurrent-reset: 0.108/0.108 secs] [Times: user=0.17
sys=0.00, real=0.11 secs]
15041.297: [GC 15041.299: [ParNew: 20452429K->555085K(22649280K),
0.3904490 secs] 21375098K->1485221K(47815104K), 0.3937640 secs] [Times:
user=2.23 sys=0.04, real=0.39 secs]
15041.706: [GC 15041.707: [ParNew: 598821K->603384K(22649280K), 0.0499560
secs] 1528957K->1542532K(47815104K), 0.0520930 secs] [Times: user=1.88
sys=0.01, real=0.05 secs]
15583.968: [GC 15583.970: [ParNew: 20736120K->479921K(22649280K),
0.3162220 secs] 21675269K->1438894K(47815104K), 0.3195300 secs] [Times:
user=1.66 sys=0.02, real=0.32 secs]
15952.644: [GC 15952.646: [ParNew: 20612657K->469661K(22649280K),
0.2957950 secs] 21571630K->1429526K(47815104K), 0.2989230 secs] [Times:
user=1.73 sys=0.01, real=0.30 secs]
16324.194: [GC 16324.196: [ParNew: 20602397K->394671K(22649280K),
0.3984460 secs] 21562262K->1388556K(47815104K), 0.4017930 secs] [Times:
user=2.08 sys=0.04, real=0.40 secs]
16686.481: [GC 16686.484: [ParNew: 20527410K->515765K(22649280K),
0.2218350 secs] 21521295K->1509652K(47815104K), 0.2252750 secs] [Times:
user=1.90 sys=0.03, real=0.23 secs]
16988.880: [GC 16988.882: [ParNew: 20649063K->617925K(22649280K),
0.4101530 secs] 21642950K->1638935K(47815104K), 0.4135050 secs] [Times:
user=2.38 sys=0.02, real=0.41 secs]
16989.316: [GC 16989.317: [ParNew: 796218K->679931K(22649280K), 0.0544170
secs] 1817228K->1718567K(47815104K), 0.0566550 secs] [Times: user=1.92
sys=0.04, real=0.06 secs]
17292.082: [GC 17292.085: [ParNew: 20812672K->520352K(22649280K),
0.3049860 secs] 21851308K->1566872K(47815104K), 0.3082490 secs] [Times:
user=1.57 sys=0.02, real=0.30 secs]
17564.795: [GC 17564.797: [ParNew: 20653088K->473652K(22649280K),
0.3699550 secs] 21699608K->1572756K(47815104K), 0.3731990 secs] [Times:
user=2.03 sys=0.05, real=0.37 secs]
17974.219: [GC 17974.221: [ParNew: 20606388K->423143K(22649280K),
0.3992190 secs] 21705492K->1572876K(47815104K), 0.4026250 secs] [Times:
user=1.71 sys=0.03, real=0.40 secs]
18194.094: [GC 18194.096: [ParNew: 13619215K->538306K(22649280K),
0.2371980 secs] 14768948K->1688055K(47815104K), 0.2400000 secs] [Times:
user=2.16 sys=0.03, real=0.24 secs]


Here it becomes wild after 5d:

467312.874: [GC 467312.876: [ParNew: 20650041K->551029K(22649280K),
0.4254210 secs] 22325555K->2243659K(47815104K), 0.4291600 secs] [Times:
user=2.61 sys=0.04, real=0.44 secs]
467519.696: [GC 467519.698: [ParNew: 20683765K->513430K(22649280K),
0.2666780 secs] 22376395K->2210010K(47815104K), 0.2703630 secs] [Times:
user=2.39 sys=0.02, real=0.27 secs]
467523.989: [GC [1 CMS-initial-mark: 1696579K(25165824K)]
2728951K(47815104K), 0.7737110 secs] [Times: user=0.77 sys=0.00, real=0.78
secs]
467524.765: [CMS-concurrent-mark-start]
467525.201: [CMS-concurrent-mark: 0.420/0.436 secs] [Times: user=6.69
sys=0.08, real=0.44 secs]
467525.201: [CMS-concurrent-preclean-start]
467525.314: [CMS-concurrent-preclean: 0.111/0.113 secs] [Times: user=0.31
sys=0.01, real=0.11 secs]
467525.315: [CMS-concurrent-abortable-preclean-start]
    CMS: abort preclean due to time 467531.001:
[CMS-concurrent-abortable-preclean: 5.664/5.687 secs] [Times: user=10.99
sys=0.46, real=5.69 secs]
467531.022: [GC[YG occupancy: 1673737 K (22649280 K)]467531.024: [Rescan
(parallel) , 0.5323440 secs]467531.556: [weak refs processing, 0.0010300
secs]467531.557: [class unloading, 0.0550310 secs]467531.612: [scrub
symbol & string tables, 0.0447430 secs] [1 CMS-remark:
1696579K(25165824K)] 3370316K(47815104K), 0.6511260 secs] [Times:
user=25.71 sys=0.07, real=0.65 secs]
467531.676: [CMS-concurrent-sweep-start]
467534.168: [CMS-concurrent-sweep: 2.492/2.492 secs] [Times: user=6.17
sys=0.33, real=2.49 secs]
467534.168: [CMS-concurrent-reset-start]
467534.278: [CMS-concurrent-reset: 0.110/0.110 secs] [Times: user=0.23
sys=0.02, real=0.11 secs]
467581.647: [GC [1 CMS-initial-mark: 1352729K(25165824K)]
7579835K(47815104K), 5.2493880 secs] [Times: user=5.24 sys=0.01, real=5.25
secs]
467586.898: [CMS-concurrent-mark-start]
467587.639: [CMS-concurrent-mark: 0.455/0.740 secs] [Times: user=7.50
sys=0.17, real=0.74 secs]
467587.639: [CMS-concurrent-preclean-start]
467587.801: [CMS-concurrent-preclean: 0.153/0.162 secs] [Times: user=0.32
sys=0.02, real=0.16 secs]
467587.802: [CMS-concurrent-abortable-preclean-start]
    CMS: abort preclean due to time 467593.476:
[CMS-concurrent-abortable-preclean: 5.671/5.674 secs] [Times: user=20.61
sys=0.75, real=5.68 secs]
467593.497: [GC[YG occupancy: 7185297 K (22649280 K)]467593.499: [Rescan
(parallel) , 4.9687880 secs]467598.468: [weak refs processing, 0.0000570
secs]467598.468: [class unloading, 0.0540020 secs]467598.522: [scrub
symbol & string tables, 0.0434850 secs] [1 CMS-remark:
1352729K(25165824K)] 8538026K(47815104K), 5.0845650 secs] [Times:
user=235.75 sys=0.66, real=5.08 secs]
467598.584: [CMS-concurrent-sweep-start]
467601.429: [CMS-concurrent-sweep: 2.801/2.845 secs] [Times: user=11.65
sys=0.42, real=2.85 secs]
467601.429: [CMS-concurrent-reset-start]
467601.539: [CMS-concurrent-reset: 0.109/0.109 secs] [Times: user=0.20
sys=0.01, real=0.11 secs]
467603.561: [GC [1 CMS-initial-mark: 1352455K(25165824K)]
9474109K(47815104K), 7.1588640 secs] [Times: user=7.14 sys=0.01, real=7.16
secs]
467610.722: [CMS-concurrent-mark-start]
467611.120: [CMS-concurrent-mark: 0.397/0.398 secs] [Times: user=7.28
sys=0.16, real=0.40 secs]
467611.120: [CMS-concurrent-preclean-start]
467611.224: [CMS-concurrent-preclean: 0.102/0.104 secs] [Times: user=0.58
sys=0.03, real=0.11 secs]
467611.225: [CMS-concurrent-abortable-preclean-start]
    CMS: abort preclean due to time 467617.031:
[CMS-concurrent-abortable-preclean: 5.667/5.806 secs] [Times: user=13.49
sys=0.50, real=5.80 secs]
467617.052: [GC[YG occupancy: 8991850 K (22649280 K)]467617.053: [Rescan
(parallel) , 6.1182390 secs]467623.172: [weak refs processing, 0.0000610
secs]467623.172: [class unloading, 0.1001680 secs]467623.272: [scrub
symbol & string tables, 0.0434510 secs] [1 CMS-remark:
1352455K(25165824K)] 10344305K(47815104K), 6.2801570 secs] [Times:
user=288.78 sys=0.80, real=6.28 secs]
467623.334: [CMS-concurrent-sweep-start]
467625.074: [CMS-concurrent-sweep: 1.706/1.740 secs] [Times: user=3.86
sys=0.20, real=1.74 secs]
467625.074: [CMS-concurrent-reset-start]
467625.184: [CMS-concurrent-reset: 0.110/0.110 secs] [Times: user=0.27
sys=0.01, real=0.12 secs]
467627.206: [GC [1 CMS-initial-mark: 1352264K(25165824K)]
10644634K(47815104K), 8.0760290 secs] [Times: user=8.06 sys=0.00,
real=8.07 secs]
467635.284: [CMS-concurrent-mark-start]
467635.710: [CMS-concurrent-mark: 0.404/0.427 secs] [Times: user=6.72
sys=0.23, real=0.42 secs]
467635.711: [CMS-concurrent-preclean-start]
467635.814: [CMS-concurrent-preclean: 0.101/0.103 secs] [Times: user=0.33
sys=0.03, real=0.10 secs]
467635.814: [CMS-concurrent-abortable-preclean-start]
    CMS: abort preclean due to time 467641.531:
[CMS-concurrent-abortable-preclean: 5.690/5.717 secs] [Times: user=17.67
sys=0.85, real=5.73 secs]
467641.552: [GC[YG occupancy: 10330198 K (22649280 K)]467641.554: [Rescan
(parallel) , 7.7808590 secs]467649.335: [weak refs processing, 0.0001180
secs]467649.335: [class unloading, 0.1016250 secs]467649.437: [scrub
symbol & string tables, 0.0437740 secs] [1 CMS-remark:
1352264K(25165824K)] 11682462K(47815104K), 7.9447800 secs] [Times:
user=368.46 sys=1.14, real=7.94 secs]
467649.499: [CMS-concurrent-sweep-start]
467651.175: [CMS-concurrent-sweep: 1.595/1.676 secs] [Times: user=10.23
sys=0.41, real=1.68 secs]
467651.176: [CMS-concurrent-reset-start]
467651.285: [CMS-concurrent-reset: 0.109/0.109 secs] [Times: user=0.58
sys=0.00, real=0.11 secs]
467652.306: [GC [1 CMS-initial-mark: 1352044K(25165824K)]
12914694K(47815104K), 10.0236240 secs] [Times: user=10.01 sys=0.01,
real=10.02 secs]
467662.332: [CMS-concurrent-mark-start]

You can find the GC log (1.6MB) here:

https://mft.seeburger.de/portal-seefx/~public/46ed46f6-69f8-4f05-929a-5af9ab19b57f?download

Greetings
Bernd



More information about the hotspot-gc-dev mailing list