Identifying concurrent mode failures caused by fragmentation

Tue Nov 1 06:01:38 PDT 2011

Hi Kris,

Many thanks for PrintFLSStatistics - it looks like just the sort of 
thing I'm after.  I'm mostly reading the GC logs manually anyway, so 
breaking the parsing tools isn't a big deal for me.

As I mentioned in my reply to Jon, we'll probably turn off incremental 
mode - I hadn't realised it was obsolete.  Thanks for the hint.

Jon

On 01.11.2011 04:21, Krystal Mok wrote:
> Hi Jon,
>
> It might be helpful to set -XX:PrintFLSStatistics to a value greater
> than zero, to get the stats of FreeListSpace so that you'd know the size
> of the biggest fragment. The GC log produced by -XX:+PrintGCDetails
> doesn't give enough information on fragmentation.
>
> Here's an example of using -XX:PrintFLSStatistics=1:
> https://gist.github.com/1329783
> It does make the GC log messier, and some of the GC log parsing tools
> won't cope with this, but you get to know how bad the fragmentation is.
>
> Anyway, it looks like you're using CMS in incremental mode. This mode
> should be obsolete in JDK6 already. Is there a good reason for you to be
> using it? If not, I'd suggest turning it off, though, no matter if
> you're upgrading your JDK or not.
>
> Regards,
> Kris Mok
>
> On Mon, Oct 31, 2011 at 9:06 PM, Jon Bright <jon at siliconcircus.com
> <mailto:jon at siliconcircus.com>> wrote:
>
>     Hi,
>
>     We have an application running with a 6GB heap (complete parameters
>     below).  Mostly it has a fairly low turnover of memory use, but on
>     occasion, it will come under some pressure as it reloads a large
>     in-memory data set from a database.
>
>     Sometimes in this situation, we'll see a concurrent mode failure.
>     Here's one failure:
>
>     20021.464: [GC 20021.465: [ParNew: 13093K->3939K(76672K), 0.0569240
>     secs]20021.522: [CMS20023.747: [CMS-concurrent-mark: 11.403/29.029
>     secs] [Times: user=41.11 sys=1.03, real=29.03 secs]
>       (concurrent mode failure): 3873922K->2801744K(6206272K),
>     30.7900180 secs] 3886215K->2801744K(6282944K), [CMS Perm :
>     142884K->142834K(524288K)] icms_dc=33 , 30.8473830 secs] [Times:
>     user=30.26 sys=0.71, real=30.85 secs]
>     Total time for which application threads were stopped: 30.8484460
>     seconds
>
>     (I've attached a lengthier log including the previous and subsequent
>     CMS collection.)
>
>     Am I correct in thinking that this failure can basically only be
>     caused by fragmentation?  Both young and old seem to have plenty of
>     space. There doesn't seem to be any sign that the tenured generation
>     would run out of space before CMS completes.  Fragmentation is the
>     only remaining cause that occurs to me.
>
>     We're running with 1.6.0_11, although this will be upgraded to
>     1.6.0_26 tomorrow.  I realise our current version is ancient - I'm
>     not really looking for help on the problem itself, just for advice
>     on whether the log line above indicates fragmentation.
>
>     Thanks
>
>     Jon Bright
>
>
>
>     The parameters we have set are:
>
>     -server
>     -Xmx6144M
>     -Xms6144M
>     -XX:MaxPermSize=512m
>     -XX:PermSize=512m
>     -XX:+UseConcMarkSweepGC
>     -XX:+CMSIncrementalMode
>     -XX:+CMSIncrementalPacing
>     -XX:SoftRefLRUPolicyMSPerMB=3
>     -XX:__CMSIncrementalSafetyFactor=30
>     -XX:+PrintGCDetails
>     -XX:+__PrintGCApplicationStoppedTime
>     -XX:+__PrintGCApplicationConcurrentTi__me
>     -XX:+PrintGCTimeStamps
>     -Xloggc:/home/tbmx/log/gc_`__date +%Y%m%d%H%M`.log
>
>
>     _______________________________________________
>     hotspot-gc-use mailing list
>     hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
>     http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>