understanding GC logs
Jon Masamitsu
jon.masamitsu at oracle.com
Wed Mar 31 19:21:13 UTC 2010
Shaun,
I tried redirecting the output with -Xloggc: and I still see
everything I'm expecting.
Who else out there is seeing the problems (1, 2, 3 below) with
GC logging that Shaun describes?
Jon
On 03/31/10 10:45, Shaun Hennessy wrote:
> Hmm, just looking around here -- I have 3 different applications, all
> using CMS --
> and from glancing at some automated logs across a collection of
> machines, it seems
> that all applications on all machines
> 1) Don't log full date timestamps on minor collections
> 2) No longer log the initial-mark
> 3) No longer log the "CMS-remark" in that entry.
>
> All run 6u17 now, with slightly different parameters.
> I'm pretty sure 3) was also true for me even before 6u17; I thought 2)
> did work in 6u12, and I'm unsure about 1). I'll see if I can roll back
> to confirm how 6u12 behaves for me and try stripping our options....
>
> We are mostly an x64 AMD (4140/444/4600) shop -- but I did check an
> Intel 4170
> and a SPARC v440, and the same problems seem to exist:
>
> Here's a typical example from a different app:
>
> # uname -a
> SunOS parking 5.10 Generic_139556-08 i86pc i386 i86pc
>
> 16:44:57,699 INFO [ServerInfo] Java version: 1.6.0_17,Sun
> Microsystems Inc.
> 16:44:57,699 INFO [ServerInfo] Java VM: Java HotSpot(TM) 64-Bit
> Server VM 14.3-b01,Sun Microsystems Inc.
> 16:44:57,699 INFO [ServerInfo] OS-System: SunOS 5.10,amd64
>
> argv[0]: ../../jre/bin/amd64/java
> argv[1]: -server
> argv[2]: -DSAM
> argv[3]: -XX:ThreadStackSize=512
> argv[4]: -Xms16384m
> argv[5]: -Xmx16384m
> argv[6]: -XX:PermSize=900m
> argv[7]: -XX:NewSize=4096m
> argv[8]: -XX:MaxNewSize=4096m
> argv[9]: -XX:MaxPermSize=900m
> argv[10]: -XX:+DisableExplicitGC
> argv[11]: -XX:+UseConcMarkSweepGC
> argv[12]: -XX:CMSInitiatingOccupancyFraction=75
> argv[13]: -XX:+UseParNewGC
> argv[14]: -XX:+UseCMSCompactAtFullCollection
> argv[15]: -XX:+CMSClassUnloadingEnabled
> argv[16]: -XX:CMSInitiatingPermOccupancyFraction=95
> argv[17]:
> -Djavax.xml.stream.XMLOutputFactory=com.bea.xml.stream.XMLOutputFactoryBase
> argv[18]:
> -Djavax.xml.stream.XMLInputFactory=com.bea.xml.stream.MXParserFactory
> argv[19]:
> -Djavax.xml.stream.XMLEventFactory=com.bea.xml.stream.EventFactory
> argv[20]: -Dsam.thread.dump=true
> argv[21]: -Xdebug
> argv[22]: -Xnoagent
> argv[23]: -Djava.compiler=NONE
> argv[24]: -XX:+PrintClassHistogram
> argv[25]: -XX:+HeapDumpOnOutOfMemoryError
> argv[26]: -Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=5005
> argv[27]: -Dcom.sun.management.jmxremote
> argv[28]: -Dcom.sun.management.jmxremote.authenticate=false
> argv[29]: -Dcom.sun.management.jmxremote.ssl=false
> argv[30]: -Dcom.sun.management.jmxremote.port=9999
> argv[31]: -verbose:gc
> argv[32]: -XX:+PrintGCDetails
> argv[33]: -XX:+PrintGCDateStamps
> argv[34]:
> -Djava.security.policy=../../nms/jboss/server/default/conf/server.policy
> argv[35]: -Dcom.timetra.nms.propertyFile=../../nms/config/nms-server.xml
> argv[36]:
> -Dcom.timetra.nms.propertyFileMetrics=../../nms/config/nms-metrics.xml
> argv[37]:
> -Dcom.timetra.nms.propertyJaasConfigFile=../../nms/config/SamJaasLogin.config
> argv[38]: -Djava.endorsed.dirs=../../nms/jboss/lib/endorsed
> argv[39]: -Ddrools.compiler=JANINO
> argv[40]: -Dmap.dao.descriptions.directory=../../nms/config/map
> argv[41]:
> -Xloggc:../../nms/log/server/GC_logs/GC_trace_032610-16:44:56.log
> argv[42]: -classpath
> argv[43]:
> :../../nms/lib/jdom/jdom.jar:../../nms/lib/installer/nms_installer.jar:../../nms/jboss/bin/run.jar
> argv[44]:
> com.timetra.nms.installer.appservertools.ApplicationServerStarter
> argv[45]: start
> argv[46]: default
> argv[47]: 138.120.183.17
>
>
>
> Jon Masamitsu wrote:
>> On 03/30/10 07:30, Shaun Hennessy wrote:
>>> A couple of question related to the GC logs and promotion failure
>>> messages
>>>
>>> I am running 6u17.
>>>
>>> argv[2]: -server
>>> argv[3]: -Xms14000m
>>> argv[4]: -Xmx14000m
>>> argv[5]: -XX:PermSize=800m
>>> argv[6]: -XX:NewSize=5600m
>>> argv[7]: -XX:MaxNewSize=5600m
>>> argv[8]: -XX:MaxPermSize=800m
>>> argv[9]: -XX:+DisableExplicitGC
>>> argv[10]: -XX:+UseConcMarkSweepGC
>>> argv[11]: -XX:+UseParNewGC
>>> argv[12]: -XX:+UseCMSCompactAtFullCollection
>>> argv[13]: -XX:+CMSClassUnloadingEnabled
>>> argv[28]: -verbose:gc
>>> argv[29]: -XX:+PrintGCDetails
>>> argv[30]: -XX:+PrintGCDateStamps
>>>
>>>
>>>
>>> 4112.744: [GC 4112.744: [ParNew: 4940531K->530787K(5017600K),
>>> 0.1455641 secs] 11878708K->7540012K(13619200K), 0.1457559 secs]
>>> [Times: user=1.38 sys=0.02, real=0.15 secs]
>>> 4113.780: [GC 4113.780: [ParNew: 4831587K->372801K(5017600K),
>>> 0.2093305 secs] 11840812K->7551390K(13619200K), 0.2095270 secs]
>>> [Times: user=1.34 sys=0.07, real=0.21 secs]
>>> [Times: user=0.10 sys=0.00, real=0.11 secs]
>>> 2010-03-24T16:31:56.490-0400: 4114.097: [CMS-concurrent-mark-start]
>>> 4115.261: [GC 4115.261: [ParNew: 4674075K->364108K(5017600K),
>>> 0.0755017 secs] 11852663K->7542736K(13619200K), 0.0756880 secs]
>>> [Times: user=0.93 sys=0.00, real=0.08 secs]
>>> 4115.338: [GC 4115.338: [ParNew: 420064K->323310K(5017600K),
>>> 0.1112115 secs] 7598693K->7587370K(13619200K), 0.1113667 secs]
>>> [Times: user=0.98 sys=0.02, real=0.11 secs]
>>> 2010-03-24T16:31:58.647-0400: 4116.254: [CMS-concurrent-mark:
>>> 1.909/2.157 secs] [Times: user=31.47 sys=1.55, real=2.16 secs]
>>> 2010-03-24T16:31:58.647-0400: 4116.254: [CMS-concurrent-preclean-start]
>>> 2010-03-24T16:31:58.798-0400: 4116.405: [CMS-concurrent-preclean:
>>> 0.149/0.151 secs] [Times: user=2.29 sys=0.12, real=0.15 secs]
>>> 2010-03-24T16:31:58.799-0400: 4116.406:
>>> [CMS-concurrent-abortable-preclean-start]
>>> 4116.460: [GC 4116.460: [ParNew: 4624110K->301464K(5017600K),
>>> 0.0914784 secs] 11888170K->7617401K(13619200K), 0.0916679 secs]
>>> [Times: user=0.89 sys=0.03, real=0.09 secs]
>>> *2010-03-24T16:31:59.494-0400: 4117.101:
>>> [CMS-concurrent-abortable-preclean: 0.596/0.695 secs] [Times:
>>> user=9.88 sys=0.60, real=0.70 secs]
>>> [YG occupancy: 2756990 K (5017600 K)]4117.102: [Rescan (parallel) ,
>>> 0.4648394 secs]4117.567: [weak refs processing, 0.0028851
>>> secs]4117.570: [class unloading, 0.0240174 secs]4117.594: [scrub
>>> symbol & string tables, 0.0898531 secs] [Times: user=1.72 sys=0.37,
>>> real=0.58 secs]*
>>> 2010-03-24T16:32:00.079-0400: 4117.686: [CMS-concurrent-sweep-start]
>>> 4118.116: [GC 4118.116: [ParNew: 4602264K->305089K(5017600K),
>>> 0.0712571 secs] 11891816K->7620802K(13619200K), 0.0714474 secs]
>>> [Times: user=0.75 sys=0.00, real=0.07 secs]
>>> 4119.117: [GC 4119.117: [ParNew: 4605889K->263281K(5017600K),
>>> 0.0842051 secs] 11665429K->7368947K(13619200K), 0.0843955 secs]
>>> [Times: user=0.79 sys=0.01, real=0.08 secs]
>>> 4125.941: [GC 4125.941: [ParNew: 4936868K->708251K(5017600K),
>>> 0.1426036 secs] 9789305K->5612975K(13619200K), 0.1427944 secs]
>>> [Times: user=1.56 sys=0.01, real=0.14 secs]
>>> 4126.893: [GC 4126.894: [ParNew: 5009051K->485783K(5017600K),
>>> 0.2210054 secs] 9536611K->5247528K(13619200K), 0.2211964 secs]
>>> [Times: user=1.58 sys=0.04, real=0.22 secs]
>>> 4128.102: [GC 4128.102: [ParNew: 4786583K->455386K(5017600K),
>>> 0.0748814 secs] 8588693K->4257495K(13619200K), 0.0750694 secs]
>>> [Times: user=0.94 sys=0.00, real=0.08 secs]
>>> 2010-03-24T16:32:11.951-0400: 4129.558: [CMS-concurrent-sweep:
>>> 10.777/11.872 secs] [Times: user=149.77 sys=7.25, real=11.87 secs]
>>> 2010-03-24T16:32:11.951-0400: 4129.558: [CMS-concurrent-reset-start]
>>> 2010-03-24T16:32:11.984-0400: 4129.591: [CMS-concurrent-reset:
>>> 0.033/0.033 secs] [Times: user=0.04 sys=0.00, real=0.03 secs]
>>> 4140.537: [GC 4140.537: [ParNew: 4756186K->539705K(5017600K),
>>> 0.0873384 secs] 6572247K->2355767K(13619200K), 0.0875281 secs]
>>> [Times: user=1.10 sys=0.00, real=0.09 secs]
>>>
>>>
>>> 1) I no longer seem to get any "CMS-initial-mark" - is this a
>>> change since 6u12?
>>
>> I'm running
>>
>> Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
>> Java HotSpot(TM) Server VM (build 14.3-b01, mixed mode)
>>
>> and I see entries such as
>>
>> 0.869: [GC [1 CMS-initial-mark: 22901K(229376K)] 28264K(258880K),
>> 0.0561592 secs] [Times: user=0.05 sys=0.01, real=0.06 secs]
>>
>>> 2) The rescan portion is then the only non-concurrent part, correct?
>>> So from the above, the application was only STW for 0.58 sec.
>>
>> The initial-mark is also STW.
>>
>> The rescan is part of the remark which is STW. From my run
>>
>> 1.270: [GC[YG occupancy: 15860 K (29504 K)]1.270: [Rescan (parallel)
>> , 0.1002467 secs]1.370: [weak refs processing, 0.0000167 secs] [1
>> CMS-remark: 22901K(229376K)] 38761K(258880K), 0.1004598 secs] [Times:
>> user=0.11 sys=0.03, real=0.10 secs]
>>
>>
>> In your entry, yes, it is 0.58 sec.
>>
>>> 3) This may have been a change from 1.5 to 1.6, but this line also
>>> used to display a CMS-remark, did it not?
>>
>> Yes, see my example above.
>>
>>> 4) Is there a way to have my minor collections also display the
>>> full date stamp (i.e. 2010-03-24T16:31:58.799-0400)?
>>>
>>>
>>
>> When I run with -XX:+PrintGCDateStamps I see entries such as
>>
>> 2010-03-30T16:06:55.297-0700: 2.602: [GC 2.602: [ParNew:
>> 29504K->3264K(29504K), 0.2457302 secs] 52405K->38187K(258880K),
>> 0.2460394 secs] [Times: user=0.41 sys=0.02, real=0.25 secs]
>>
>> I don't know why you're not seeing that.
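For anyone checking their own logs for the same symptom, a quick way to see which entries carry a full date stamp is to scan the file; a minimal sketch, assuming the 6u17 line format quoted in this thread:

```python
import re

# Full date stamp as printed by -XX:+PrintGCDateStamps, e.g.
# "2010-03-24T16:31:56.490-0400: 4114.097: [...]" (format taken from the logs above)
DATESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}:")

def datestamped(lines):
    """Return only the lines that begin with a full date stamp."""
    return [ln for ln in lines if DATESTAMP.match(ln)]

sample = [
    "4112.744: [GC 4112.744: [ParNew: 4940531K->530787K(5017600K), 0.1455641 secs]",
    "2010-03-24T16:31:56.490-0400: 4114.097: [CMS-concurrent-mark-start]",
]
print(datestamped(sample))  # only the CMS-concurrent-mark-start line
```

If the ParNew entries never show up in the output, the minor collections are indeed being printed without date stamps.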
>>>
>>> 1270.736: [GC 1270.736: [ParNew: 4872340K->574345K(5017600K),
>>> 0.1967262 secs] 7123984K->2972742K(13619200K), 0.1969106 secs]
>>> [Times: user=1.45 sys=0.05, real=0.20 secs]
>>> 1272.024: [GC 1272.024: [ParNew: 4875542K->653139K(5017600K),
>>> 0.1334760 secs] 7273939K->3051536K(13619200K), 0.1336582 secs]
>>> [Times: user=1.54 sys=0.01, real=0.13 secs]
>>> *1272.158: [GC 1272.159: [ParNew: 681949K->563105K(5017600K),
>>> 0.2187865 secs] 3080347K->3158904K(13619200K), 0.2189362 secs]
>>> [Times: user=1.48 sys=0.06, real=0.22 secs]*
>>> 1273.398: [GC 1273.398: [ParNew: 4863905K->535051K(5017600K),
>>> 0.1196808 secs] 7459704K->3130850K(13619200K), 0.1198694 secs]
>>> [Times: user=1.51 sys=0.00, real=0.12 secs]
>>> 1274.461: [GC 1274.461: [ParNew: 4835851K->399744K(5017600K),
>>> 0.2861376 secs] 7431650K->3249248K(13619200K), 0.2863296 secs]
>>> [Times: user=1.61 sys=0.09, real=0.29 secs]
>>>
>>> 5) Why did the middle minor collection occur? A big allocation?
>>>
>>
>> That seems rather suspicious. It may be a side effect of using JNI
>> critical sections. I don't
>> know if this is such a case, but it's the most likely explanation.
>>
>>>
>>>
>>> - Promotion Failure
>>> 4896.478: [GC 4896.478: [ParNew: 4894353K->587864K(5017600K),
>>> 0.4789909 secs] 8473688K->4268560K(13619200K), 0.4791812 secs]
>>> [Times: user=1.00 sys=0.61, real=0.48 secs]
>>> 4897.812: [GC 4897.812: [ParNew: 4888664K->545903K(5017600K),
>>> 0.4105613 secs] 8569360K->4326583K(13619200K), 0.4107560 secs]
>>> [Times: user=1.06 sys=0.55, real=0.41 secs]
>>> 4899.057: [GC 4899.058: [ParNew: 4846703K->638966K(5017600K),
>>> 0.2759734 secs] 8627383K->4496987K(13619200K), 0.2761637 secs]
>>> [Times: user=1.13 sys=0.36, real=0.28 secs]
>>> 4900.101: [GC 4900.101: [ParNew: 4939768K->630721K(5017600K),
>>> 0.5117751 secs] 8797789K->4607020K(13619200K), 0.5119662 secs]
>>> [Times: user=0.84 sys=0.66, real=0.51 secs]
>>> 4900.615: [GC 4900.615: [ParNew: 651487K->487288K(5017600K),
>>> 0.0780183 secs] 4627786K->4463587K(13619200K), 0.0781687 secs]
>>> [Times: user=0.96 sys=0.00, real=0.08 secs]
>>> *4901.581: [GC 4901.581: [ParNew (promotion failed):
>>> 4788088K->4780999K(5017600K), 2.8947499 secs]4904.476: [CMS:
>>> 4003090K->1530872K(8601600K), 7.5122451 secs]
>>> 8764387K->1530872K(13619200K), [CMS Perm :
>>> 671102K->671102K(819200K)], 10.4072102 secs] [Times: user=11.03
>>> sys=1.09, real=10.41 secs]*
>>> 4913.024: [GC 4913.024: [ParNew: 4300800K->316807K(5017600K),
>>> 0.0615917 secs] 5831672K->1847679K(13619200K), 0.0617857 secs]
>>> [Times: user=0.74 sys=0.00, real=0.06 secs]
>>> 4914.015: [GC 4914.015: [ParNew: 4617607K->475077K(5017600K),
>>> 0.0771389 secs] 6148479K->2005949K(13619200K), 0.0773290 secs]
>>> [Times: user=0.95 sys=0.00, real=0.08 secs]
>>> 4914.908: [GC 4914.908: [ParNew: 4775877K->586339K(5017600K),
>>> 0.0857102 secs] 6306749K->2117211K(13619200K), 0.0859046 secs]
>>> [Times: user=1.06 sys=0.00, real=0.09 secs]
>>> 4915.816: [GC 4915.816: [ParNew: 4887139K->476398K(5017600K),
>>> 0.1841627 secs] 6418011K->2152868K(13619200K), 0.1843556 secs]
>>> [Times: user=1.32 sys=0.07, real=0.18 secs]
>>> 6) So here I had a promotion failure; this is due to fragmentation
>>> of the tenured generation rather than lack of space?
>>
>> Fragmentation is the likely problem. When 6u20 is released try it.
>> It does a better job
>> of keeping fragmentation down.
>>
>>> 7) Do we need 1 contiguous space in tenured big enough to hold the
>>> complete list/size of all objects being promoted, or
>>> do multiple spaces get used & the pieces don't all fit?
>>
>> The free space in the tenured generation is kept in a free list, so
>> there are multiple chunks.
>> We don't need one contiguous chunk for all the promotions.
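The consequence of a chunked free list can be illustrated with a toy first-fit model (purely illustrative; HotSpot's CMS free lists and allocation policy are far more sophisticated):

```python
def promotions_fit(free_chunks_kb, objects_kb):
    """First-fit toy model: each promoted object must land inside a single
    free chunk, but different objects may use different chunks."""
    chunks = list(free_chunks_kb)
    for obj in sorted(objects_kb, reverse=True):
        for i, c in enumerate(chunks):
            if c >= obj:
                chunks[i] = c - obj
                break
        else:
            return False  # no single chunk big enough -> promotion failure
    return True

# 300K of total free space in both cases, but fragmentation matters:
print(promotions_fit([100, 100, 100], [90, 90, 90]))   # True
print(promotions_fit([100, 100, 100], [150, 50, 50]))  # False
```

The second case is the fragmentation scenario from question 6: total free space is plentiful, but no single chunk can hold the largest object being promoted.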
>>
>>> 8) What exactly is occurring during this promotion-failed
>>> collection? Based on the next example I assume
>>> it's a (successful) scavenge. What exactly is this - which
>>> thread(s): serial / ParallelGCThreads?
>>> STW? Are we simply compacting the tenured gen, or can we
>>> actually GC the tenured?
>>
>> A promotion failure is a scavenge that does not succeed because there
>> is not enough
>> space in the old gen to do all the needed promotions. The scavenge
>> is in essence
>> unwound and then a full STW compaction of the entire heap is done.
>>
>>>
>>>
>>>
>>> promotion failed, and full GC
>>> 50786.124: [GC 50786.124: [ParNew: 4606713K->338518K(5017600K),
>>> 0.0961884 secs] 12303455K->8081859K(13619200K), 0.0963907 secs]
>>> [Times: user=0.91 sys=0.01, real=0.10 secs]
>>> 50787.373: [GC 50787.373: [ParNew: 4639318K->272229K(5017600K),
>>> 0.0749353 secs] 12382659K->8053730K(13619200K), 0.0751408 secs]
>>> [Times: user=0.75 sys=0.00, real=0.08 secs]
>>> 50788.483: [GC 50788.483: [ParNew: 4573029K->393397K(5017600K),
>>> 0.0837182 secs] 12354530K->8185595K(13619200K), 0.0839321 secs]
>>> [Times: user=1.03 sys=0.00, real=0.08 secs]
>>> 50789.590: [GC 50789.590: [ParNew (promotion failed):
>>> 4694264K->4612345K(5017600K), 1.5974678 secs]
>>> 12486461K->12447305K(13619200K), 1.5976765 secs] [Times : user=2.38
>>> sys=0.20, real=1.60 secs]
>>> GC locker: Trying a full collection because scavenge failed
>>> 50791.188: [Full GC 50791.188: [CMS: 7834959K->1227325K(8601600K),
>>> 6.7102106 secs] 12447305K->1227325K(13619200K), [CMS Perm :
>>> 670478K->670478K(819200K)], 6.7103417 secs] [Times: user=6.71
>>> sys=0.00, real=6.71 secs]
>>> 50798.982: [GC 50798.982: [ParNew: 4300800K->217359K(5017600K),
>>> 0.0364557 secs] 5528125K->1444685K(13619200K), 0.0366630 secs]
>>> [Times: user=0.44 sys=0.00, real=0.04 secs]
>>> 50800.246: [GC 50800.246: [ParNew: 4518167K->198753K(5017600K),
>>> 0.0368620 secs] 5745493K->1426078K(13619200K), 0.0370604 secs]
>>> [Times: user=0.46 sys=0.01, real=0.04 secs]
>>> 9) Probably once I understand what the scavenge is doing it will help
>>> me understand this case, but the logic seems
>>> simple enough - full GC on promotion failure & scavenge failed.
>>
>> Yes, full STW compaction.
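To put a number on the cost of such events, the stop-the-world `real=` times can be summed while skipping the concurrent CMS phases (a rough sketch; the classification here is assumed, not exhaustive):

```python
import re

REAL = re.compile(r"real=(\d+\.\d+) secs")

def total_stw_seconds(lines):
    """Sum the real= times of stop-the-world entries, skipping concurrent phases."""
    total = 0.0
    for line in lines:
        if "CMS-concurrent" in line:  # concurrent phases run alongside the app
            continue
        m = REAL.search(line)
        if m:
            total += float(m.group(1))
    return total

sample = [
    "[ParNew (promotion failed): ...] [Times: user=2.38 sys=0.20, real=1.60 secs]",
    "[Full GC ...] [Times: user=6.71 sys=0.00, real=6.71 secs]",
    "[CMS-concurrent-sweep: 4.825/10.146 secs] [Times: user=16.61 sys=2.94, real=10.15 secs]",
]
print(total_stw_seconds(sample))  # ~8.31
```

On the excerpt above this correctly counts the 1.60 s failed scavenge and the 6.71 s full compaction, but not the concurrent sweep.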
>>
>>>
>>>
>>>
>>> promotion and concurrent mode failures
>>> 53494.424: [GC 53494.424: [ParNew: 4979001K->716800K(5017600K),
>>> 0.2120290 secs] 12583633K->8434774K(13619200K), 0.2122200 secs]
>>> [Times: user=2.12 sys=0.03, real=0.21 secs]
>>> 53496.131: [GC 53496.131: [ParNew: 5017600K->605278K(5017600K),
>>> 0.2761710 secs] 12735574K->8578720K(13619200K), 0.2763597 secs]
>>> [Times: user=1.94 sys=0.08, real=0.28 secs]
>>> [Times: user=0.16 sys=0.00, real=0.16 secs]
>>> 2010-03-25T06:14:58.961-0400: 53496.568: [CMS-concurrent-mark-start]
>>> 53497.688: [GC 53497.688: [ParNew: 4906078K->545999K(5017600K),
>>> 0.0989930 secs] 12879520K->8519441K(13619200K), 0.0991785 secs]
>>> [Times: user=1.21 sys=0.02, real=0.10 secs]
>>> 2010-03-25T06:15:00.188-0400: 53497.795: [CMS-concurrent-mark:
>>> 1.107/1.227 secs] [Times: user=15.14 sys=0.42, real=1.23 secs]
>>> 2010-03-25T06:15:00.188-0400: 53497.795: [CMS-concurrent-preclean-start]
>>> 2010-03-25T06:15:00.233-0400: 53497.840: [CMS-concurrent-preclean:
>>> 0.043/0.045 secs] [Times: user=0.31 sys=0.01, real=0.04 secs]
>>> 2010-03-25T06:15:00.233-0400: 53497.840:
>>> [CMS-concurrent-abortable-preclean-start]
>>> 2010-03-25T06:15:00.794-0400: 53498.401:
>>> [CMS-concurrent-abortable-preclean: 0.541/0.560 secs] [Times:
>>> user=6.11 sys=0.22, real=0.56 secs]
>>> [YG occupancy: 3222128 K (5017600 K)]53498.402: [Rescan (parallel) ,
>>> 0.4447462 secs]53498.847: [weak refs processing, 0.0028967
>>> secs]53498.850: [class unloading, 0.0248904 secs]53498.875: [scrub
>>> symbol & string tables, 0.0896937 secs] [Times: user=1.79 sys=0.35,
>>> real=0.56 secs]
>>> 2010-03-25T06:15:01.360-0400: 53498.967:
>>> [CMS-concurrent-sweep-start] 53499.350: [GC 53499.350: [ParNew
>>> (promotion failed): 4846799K->4718254K(5017600K), 5.3142493
>>> secs]53504.664: [CMS2010-03-25T06:15:11.506-0400: 53509.113:
>>> [CMS-concurrent-sweep: 4.825/10.146 secs] [Times: user=16.61
>>> sys=2.94, real=10.15 secs]
>>> (concurrent mode failure): 8087820K->1346631K(8601600K), 11.0573075
>>> secs] 12820241K->1346631K(13619200K), [CMS Perm :
>>> 670478K->670478K(819200K)], 16.3717719 secs] [Times: user=17.62
>>> sys=2.66, real=16.37 secs]
>>> 53516.713: [GC 53516.714: [ParNew: 4300800K->283359K(5017600K),
>>> 0.0498000 secs] 5647431K->1629990K(13619200K), 0.0499965 secs]
>>> [Times: user=0.62 sys=0.00, real=0.05 secs]
>>> 53517.743: [GC 53517.743: [ParNew: 4584343K->340302K(5017600K),
>>> 0.0544853 secs] 5930975K->1686933K(13619200K), 0.0546710 secs]
>>> [Times: user=0.68 sys=0.00, real=0.05 secs]
>>> 10) I think it's just that the system is allocating at such a high rate
>>> at this point in time (and we don't use InitiatingOccupancy on this
>>> app),
>>> so we get close to full on tenured and a minor collection came - no
>>> room in tenured ---- so even though we don't say "Full GC" in this one,
>>> don't you get a Full GC as part of any concurrent mode failure?
>>
>> The promotion failure that happens leads to the concurrent mode failure.
>>>
>>>
>>> promotion failed, scavenge failed, concurrent mode failure
>>> 86833.016: [GC 86833.017: [ParNew: 4769273K->453398K(5017600K),
>>> 0.1316717 secs] 12418197K->8169164K(13619200K), 0.1319220 secs]
>>> [Times: user=1.22 sys=0.03, real=0.13 secs]
>>> [Times: user=0.14 sys=0.00, real=0.15 secs]
>>> 2010-03-25T15:30:37.688-0400: 86833.295: [CMS-concurrent-mark-start]
>>> 86834.751: [GC 86834.751: [ParNew: 4754198K->513298K(5017600K),
>>> 0.1250485 secs] 12469964K->8281014K(13619200K), 0.1252553 secs]
>>> [Times: user=1.38 sys=0.01, real=0.13 secs]
>>> 2010-03-25T15:30:39.310-0400: 86834.917: [CMS-concurrent-mark:
>>> 1.453/1.621 secs] [Times: user=21.57 sys=1.15, real=1.62 secs]
>>> 2010-03-25T15:30:39.310-0400: 86834.917: [CMS-concurrent-preclean-start]
>>> 2010-03-25T15:30:39.650-0400: 86835.258: [CMS-concurrent-preclean:
>>> 0.337/0.341 secs] [Times: user=5.30 sys=0.18, real=0.34 secs]
>>> 2010-03-25T15:30:39.651-0400: 86835.258:
>>> [CMS-concurrent-abortable-preclean-start]
>>> 2010-03-25T15:30:39.864-0400: 86835.471:
>>> [CMS-concurrent-abortable-preclean: 0.211/0.214 secs] [Times:
>>> user=3.16 sys=0.19, real=0.21 secs]
>>> [YG occupancy: 3329361 K (5017600 K)]86835.500: [Rescan (parallel) ,
>>> 0.3868448 secs]86835.887: [weak refs processing, 0.0030042
>>> secs]86835.890: [class unloading, 0.0250008 secs]86835.915: [scrub
>>> symbol & string tables, 0.0904210 secs] [Times: user=1.85 sys=0.29,
>>> real=0.51 secs]
>>> 2010-03-25T15:30:40.401-0400: 86836.008: [CMS-concurrent-sweep-start]
>>> 86836.421: [GC 86836.422: [ParNew: 4814154K->680591K(5017600K),
>>> 0.2031305 secs] 12581870K->8543701K(13619200K), 0.2033332 secs]
>>> [Times: user=1.88 sys=0.04, real=0.20 secs]
>>> 86836.627: [GC 86836.627: [ParNew (promotion failed):
>>> 720747K->511306K(5017600K), 1.3076955 secs]
>>> 8583857K->8560580K(13619200K), 1.3078889 secs] [Times: user=2.66
>>> sys=0.78, real=1.31 secs]
>>> GC locker: Trying a full collection because scavenge failed
>>> 86837.935: [Full GC 86837.935: [CMS2010-03-25T15:30:46.850-0400:
>>> 86842.457: [CMS-concurrent-sweep: 4.926/6.449 secs] [Times:
>>> user=15.24 sys=1.19, real=6.45 secs]
>>> (concurrent mode failure): 8049273K->1356962K(8601600K), 9.6514031
>>> secs] 8560580K->1356962K(13619200K), [CMS Perm :
>>> 670523K->670523K(819200K)], 9.6515260 secs] [Times: user=9.65
>>> sys=0.00, real=9.65 secs]
>>> 86848.669: [GC 86848.669: [ParNew: 4301133K->201781K(5017600K),
>>> 0.0452702 secs] 5658095K->1558743K(13619200K), 0.0454738 secs]
>>> [Times: user=0.57 sys=0.00, real=0.05 secs]
>>>
>>> 11) So here our scavenge failed - this is what gave us the "Full
>>> GC" log message -- the concurrent mode failure
>>> was really just coincidence/timing? The full GC (triggered by
>>> the promotion failure) aborts the tenured CMS collection,
>>> does it not?
>>
>> The "GC locker" message says that after a JNI critical section was
>> exited, the GC wanted to
>> do a scavenge but did not think there was enough room in the old gen,
>> so it does a full
>> STW compaction. Because a CMS concurrent collection was in progress,
>> it is aborted,
>> and that abort causes the concurrent mode failure to be printed.
>> Not a coincidence -
>> just telling you that the CMS concurrent collection could not be
>> completed for some
>> reason.
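Occurrences of this pattern can be spotted mechanically by pairing each "GC locker" message with the entry that follows it (a sketch, matching on the literal strings from the logs above):

```python
def gc_locker_full_gcs(lines):
    """Return the log entries that immediately follow a 'GC locker' message."""
    hits = []
    expecting = False
    for line in lines:
        if "GC locker: Trying a full collection" in line:
            expecting = True
        elif expecting:
            hits.append(line)  # the full collection triggered by the GC locker
            expecting = False
    return hits

sample = [
    "50789.590: [GC ... [ParNew (promotion failed): ...]",
    "GC locker: Trying a full collection because scavenge failed",
    "50791.188: [Full GC 50791.188: [CMS: 7834959K->1227325K(8601600K), ...]",
]
print(gc_locker_full_gcs(sample))
```

Counting these alongside plain "(concurrent mode failure)" entries helps separate JNI-critical-section-induced full GCs from ones where CMS simply lost the race to the allocation rate.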
>>
>>
>>>
>>>
>>> thanks,
>>> Shaun
>>>
>>>
>>>
>>> ------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> hotspot-gc-use mailing list
>>> hotspot-gc-use at openjdk.java.net
>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>>
>>
>