Very frequent CMS without obvious reason

Rainer Jung rainer.jung at kippdata.de
Thu Nov 8 07:41:47 UTC 2012


On 08.11.2012 02:35, Peter B. Kessler wrote:
> In fact, it looks like your permanent generation *is* nearly out of
> space.  746784K/811008K is 92% full, which is over the default
> CMSTriggerPermRatio of 80% (in JDK-6).  What happens if you make the
> permanent generation larger?

Thanks for the hint. I checked the latest JDK 6 sources available to me 
(6.0.32), and indeed the default of 80 for CMSTriggerPermRatio combined 
with the default of 40 for MinHeapFreeRatio yields a trigger threshold 
of (100 - 40) + 40 * 80 / 100 = 60 + 32 = 92%. So you are right: Perm is 
triggering the GC.
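
For reference, a minimal back-of-the-envelope sketch of that computation 
in Java. The flag values are the JDK 6 defaults discussed above and the 
occupancy numbers are taken from the log snippet quoted below; nothing 
is read from a live JVM, and the class name is made up:

    // Sketch of the JDK 6 CMS Perm trigger threshold described above:
    // (100 - MinHeapFreeRatio) + MinHeapFreeRatio * CMSTriggerPermRatio / 100
    public class CmsPermTriggerSketch {
        public static void main(String[] args) {
            int minHeapFreeRatio    = 40;  // JDK 6 default
            int cmsTriggerPermRatio = 80;  // JDK 6 default

            int thresholdPct = (100 - minHeapFreeRatio)
                    + minHeapFreeRatio * cmsTriggerPermRatio / 100;  // = 92

            long permUsedK  = 746784;  // from the log below
            long permTotalK = 811008;  // from the log below
            double occupancyPct = 100.0 * permUsedK / permTotalK;    // ~92.1

            System.out.printf("trigger at %d%%, Perm occupancy %.1f%%%n",
                    thresholdPct, occupancyPct);
        }
    }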

We will work on Perm. We had already increased it, but I thought I had 
seen much higher Perm occupancies before a CMS was triggered, so I 
wasn't taking Perm seriously as a possible cause. Those were likely 
situations in which CMSClassUnloadingEnabled was off; it seems that in 
that case CMS does not check Perm occupancy at all.
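
Concretely, something like the following sizing is what I have in mind 
(the 1g figure is only an illustration, we still have to check it 
against our actual class-loading footprint; CMSClassUnloadingEnabled 
stays on as before):

    -XX:PermSize=1024m -XX:MaxPermSize=1024m
    -XX:+CMSClassUnloadingEnabled

With 1024m the 92% threshold sits at roughly 942m, while the log below 
shows about 730m of Perm in use (about 71%), so a Perm-triggered CMS 
should no longer fire at the current footprint.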

Thanks for pointing me in the right direction!

Regards,

Rainer

> Rainer Jung wrote:
>> We have an application that after some time starts to run very
>> frequent CMS cycles without any reason I can see. Here is a snippet
>> from a GC log in which a young generation collection happens soon
>> after the CMS starts, so the amount of memory currently in use is
>> easy to see:
>>
>> 06.11.2012 10:48:31 65102.013: [GC [1 CMS-initial-mark:
>> 1461672K(2469888K)] 2026034K(3125248K), 0.6163870 secs] [Times:
>> user=0.62 sys=0.00, real=0.62 secs]
>> 06.11.2012 10:48:31 Total time for which application threads were
>> stopped: 0.6173730 seconds
>> 06.11.2012 10:48:31 65102.630: [CMS-concurrent-mark-start]
>> 06.11.2012 10:48:31 Application time: 0.0012740 seconds
>> 06.11.2012 10:48:31 {Heap before GC invocations=912 (full 39):
>> 06.11.2012 10:48:31  par new generation   total 655360K, used 564361K
>> [0x0000000707c00000, 0x0000000737c00000, 0x0000000737c00000)
>> 06.11.2012 10:48:31   eden space 524288K, 100% used
>> [0x0000000707c00000, 0x0000000727c00000, 0x0000000727c00000)
>> 06.11.2012 10:48:31   from space 131072K,  30% used
>> [0x000000072fc00000, 0x0000000732322670, 0x0000000737c00000)
>> 06.11.2012 10:48:31   to   space 131072K,   0% used
>> [0x0000000727c00000, 0x0000000727c00000, 0x000000072fc00000)
>> 06.11.2012 10:48:31  concurrent mark-sweep generation total 2469888K,
>> used 1461672K [0x0000000737c00000, 0x00000007ce800000,
>> 0x00000007ce800000)
>> 06.11.2012 10:48:31  concurrent-mark-sweep perm gen total 811008K,
>> used 746784K [0x00000007ce800000, 0x0000000800000000, 0x0000000800000000)
>> 06.11.2012 10:48:31 65102.633: [GC 65102.633: [ParNew
>> 06.11.2012 10:48:31 Desired survivor size 67108864 bytes, new
>> threshold 15 (max 15)
>> 06.11.2012 10:48:31 - age   1:    5295680 bytes,    5295680 total
>> 06.11.2012 10:48:31 - age   2:    2371208 bytes,    7666888 total
>> 06.11.2012 10:48:31 - age   3:     798848 bytes,    8465736 total
>> 06.11.2012 10:48:31 - age   4:     989720 bytes,    9455456 total
>> 06.11.2012 10:48:31 - age   5:    1837168 bytes,   11292624 total
>> 06.11.2012 10:48:31 - age   6:     447072 bytes,   11739696 total
>> 06.11.2012 10:48:31 - age   7:     868304 bytes,   12608000 total
>> 06.11.2012 10:48:31 - age   8:    1673520 bytes,   14281520 total
>> 06.11.2012 10:48:31 - age   9:    3109384 bytes,   17390904 total
>> 06.11.2012 10:48:31 - age  10:    1724424 bytes,   19115328 total
>> 06.11.2012 10:48:31 - age  11:    1206352 bytes,   20321680 total
>> 06.11.2012 10:48:31 - age  12:     908944 bytes,   21230624 total
>> 06.11.2012 10:48:31 - age  13:    1772568 bytes,   23003192 total
>> 06.11.2012 10:48:31 - age  14:    1310648 bytes,   24313840 total
>> 06.11.2012 10:48:31 - age  15:     867560 bytes,   25181400 total
>> 06.11.2012 10:48:31 : 564361K->30541K(655360K), 0.0292050 secs]
>> 2026034K->1493398K(3125248K), 0.0293500 secs] [Times: user=0.21
>> sys=0.00, real=0.03 secs]
>> 06.11.2012 10:48:31 Heap after GC invocations=913 (full 39):
>> 06.11.2012 10:48:31  par new generation   total 655360K, used 30541K
>> [0x0000000707c00000, 0x0000000737c00000, 0x0000000737c00000)
>> 06.11.2012 10:48:31   eden space 524288K,   0% used
>> [0x0000000707c00000, 0x0000000707c00000, 0x0000000727c00000)
>> 06.11.2012 10:48:31   from space 131072K,  23% used
>> [0x0000000727c00000, 0x00000007299d35f0, 0x000000072fc00000)
>> 06.11.2012 10:48:31   to   space 131072K,   0% used
>> [0x000000072fc00000, 0x000000072fc00000, 0x0000000737c00000)
>> 06.11.2012 10:48:31  concurrent mark-sweep generation total 2469888K,
>> used 1462856K [0x0000000737c00000, 0x00000007ce800000,
>> 0x00000007ce800000)
>> 06.11.2012 10:48:31  concurrent-mark-sweep perm gen total 811008K,
>> used 746784K [0x00000007ce800000, 0x0000000800000000, 0x0000000800000000)
>> 06.11.2012 10:48:31 }
>>
>> Since neither tenured nor perm is nearly full, I don't see a reason
>> for this CMS. After some time CMS runs very frequently, multiple
>> times per minute, until finally it runs more or less continuously
>> and the JVM needs to be restarted.
>>
>> JVM flags are:
>>
>> argv[0]: /opt/java/jdk1.6.0_35/bin/java
>> ... application system property ...
>> argv[2]: -server
>> argv[3]: -Dsun.rmi.dgc.client.gcInterval=28800000
>> ... application system property ...
>> argv[5]: -XX:+UseParNewGC
>> argv[6]: -XX:ParallelGCThreads=8
>> argv[7]: -XX:+UseConcMarkSweepGC
>> argv[8]: -XX:SurvivorRatio=4
>> argv[9]: -XX:+CMSClassUnloadingEnabled
>> argv[10]: -XX:CMSInitiatingOccupancyFraction=80
>> argv[11]: -XX:+UseCMSInitiatingOccupancyOnly
>> argv[12]: -Dsun.reflect.noInflation=true
>> argv[13]: -XX:+TraceClassLoading
>> argv[14]: -verbose:gc
>> argv[15]: -XX:+PrintGCApplicationConcurrentTime
>> argv[16]: -XX:+PrintGCApplicationStoppedTime
>> argv[17]: -XX:+PrintGCTimeStamps
>> argv[18]: -XX:+PrintGCDetails
>> argv[19]: -XX:+PrintHeapAtGC
>> argv[20]: -XX:+PrintTenuringDistribution
>> argv[21]: -Xms3052m
>> argv[22]: -Xmx3052m
>> argv[23]: -XX:PermSize=792m
>> argv[24]: -XX:MaxPermSize=792m
>> argv[25]: -XX:NewSize=640m
>> argv[26]: -XX:MaxNewSize=640m
>> ... application system properties ...
>> argv[38]: -Dcom.sun.management.jmxremote
>> argv[39]: -Dcom.sun.management.jmxremote.port=1090
>> argv[40]: -Dcom.sun.management.jmxremote.ssl=false
>> argv[41]: -Dcom.sun.management.jmxremote.authenticate=true
>> argv[42]:
>> -Dcom.sun.management.jmxremote.password.file=/.../jmxremote.password
>> argv[43]:
>> -Dcom.sun.management.jmxremote.access.file=/.../jmxremote.access
>> argv[44]: -XX:-UseBiasedLocking
>> argv[45]: -XX:-EliminateLocks
>> argv[46]: -XX:-DoEscapeAnalysis
>>
>> - There are no messages about promotion failures or concurrent mode
>> failures in the logs
>>
>> - System.gc() would produce different log messages
>>
>> - DGC would not run at intervals that get shorter and shorter, down
>> to seconds
>>
>> - perm is far from being full
>>
>> - tenured is well below the configured CMSInitiatingOccupancyFraction
>> (always below 60%)
>>
>> How can we find out what is triggering these CMS cycles?
>>
>> Note: args 44-46 are there for historical reasons (load-test
>> results). Even though it might be better to remove them for current
>> JVM releases, I don't see how they would explain the frequent CMS
>> runs.
>>
>> Thanks for any hint!
>>
>> Regards,
>>
>> Rainer
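
P.S. For anyone watching a live instance: a quick way to keep an eye on
Perm occupancy relative to that 92% line is jstat, e.g. (a sketch;
<pid> stands for the JVM process id, 10000 is the sample interval in
milliseconds):

    jstat -gcutil <pid> 10000

The P column (Perm utilization in percent) should stay well below 92
once Perm has been enlarged.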

-- 
kippdata
informationstechnologie GmbH   Tel: 0228 98549 -0
Bornheimer Str. 33a            Fax: 0228 98549 -50
53111 Bonn                     www.kippdata.de

HRB 8018 Amtsgericht Bonn (district court) / VAT ID DE 196 457 417
Managing directors: Dr. Thomas Höfer, Rainer Jung, Sven Maurmann


