Unexplained long stop when using CMS

Tue Feb 24 14:57:18 PST 2009

Hi Jim --

Yes, i'd try 5u14 (or later), or the -XX:+UseMembar workaround.
It's the most likely explanation for the pause you are seeing.

In more modern JVM's there are dtrace probes you could leverage
for this kind of debugging.

-- ramki

----- Original Message -----
From: James Nichols <jamesnichols3 at gmail.com>
Date: Tuesday, February 24, 2009 2:44 pm
Subject: Unexplained long stop when using CMS
To: "hotspot-gc-use at openjdk.java.net" <hotspot-gc-use at openjdk.java.net>

> Greetings.
> 
> I've suddenly started seeing several long stop the world pauses using 
> the
> CMS collector that I can't explain.  Below is the output from my gc log
> file.  As you can see, the application threads were stopped for 50 seconds,
> but I can't see where that time is being sent in the gc timing output.
> 
> 9872.975: [CMS-concurrent-abortable-preclean: 6.885/9.607 secs]
> Application time: 7.7035080 seconds
> 9923.339: [GC[YG occupancy: 563247 K (1100288 K)]9923.341: [Rescan
> (parallel) , 0.3760170 secs]9923.717: [weak refs processing, 0.0496640
> secs]9923.767: [class unloading,
> 0.1167260 secs]9923.884: [scrub symbol & string tables, 0.0241920 
> secs] [1
> CMS-remark: 1657038K(2936832K)] 2220286K(4037120K), 0.5834530 secs]
> Total time for which application threads were stopped: 50.9487390 seconds
> 
> 
> Due to some unrelated application problems, I had submitted a case to 
> JBoss
> support.  They had noticed the phenomenon in the GC log and suggest an
> upgrade to JDK 1.5.0_14, pointing out the following bug:
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6546278
> 
> It's crazy to take anything JBoss support ever gives you on faith, so 
> I
> thought I would check with the experts.  Unfortunately, the 
> methodology for
> measuring if this is impacting my application described in the bug is 
> really
> hard for me to implement, even though this is only occuring in a test
> environment.  I am seeing really high load for short periods of time 
> on this
> particular box, but it's hard for me to correlate the long application 
> pause
> to the load average measurement.  Is there anyway I can verify that 
> this is
> the problem I'm seeing?  Or even any insight into why there is this
> discrepency between the "application stopped" message and the rescan
> timings?  Or should I just jump on rev 14?  We've been on 12 for a while
> now.
> 
> 
> I have attached the gc.dat file containing this particular long GC.
> 
> I'm using Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_12-b04, mixed
> mode), and here are the JVM arguments:
> 
> /usr/java/jdk1.5.0_12/bin/java -Dprogram.name=run.sh -server -Xms4096m
> -Xmx4096m -XX:NewSize=1228M -XX:MaxNewSize=1228M -XX:MaxTenuringThreshold=4
> -XX:SurvivorRatio=6 -XX:+ScavengeBeforeFullGC -XX:PermSize=256M
> -XX:MaxPermSize=256M -XX:+HandlePromotionFailure -XX:+UseConcMarkSweepGC
> -XX:+UseParNewGC -XX:ParallelGCThreads=3 -XX:+CMSParallelRemarkEnabled
> -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled -verbosegc
> -XX:+DisableExplicitGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -XX:+PrintHeapAtGC -XX:+PrintGCApplicationConcurrentTime
> -XX:+PrintGCApplicationStoppedTime -Xloggc:/var/log/jboss/gc.dat
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use