We are observing that certain functions/operations appear to be getting delayed or blocked by the CMS Concurrent Mark, causing multi-second outliers. Typically our requests are processed in less than 1ms. However, we periodically see multi-second processing times that correlate perfectly with the completion of the CM. Excerpts from logs below demonstrate the strong correlation between the outliers and completion of CM.<br>
<br><div><span class="540252318-06042011"><font face="Arial" size="2">2011-04-05
13:53:05.845-INFO
- ServerGroupAdapter-JGroupsReceiverThread-End
process received messages: final queue size 35 time
10343ms<br>2011-04-05T13:53:05.845-0400: 144421.365: [CMS-concurrent-mark:
23.492/23.498 secs] [Times: user=36.52 sys=0.06, real=23.50 secs]
<br></font></span></div>
<div><span class="540252318-06042011"><font face="Arial" size="2"><br>2011-04-05
14:05:06.873-INFO
- ServerGroupAdapter-JGroupsReceiverThread-End
process received messages: final queue size 38 time
15325ms<br>2011-04-05T14:05:06.871-0400: 145142.391: [CMS-concurrent-mark:
25.652/25.746 secs] [Times: user=38.09 sys=0.16, real=25.75 secs]
<br></font></span></div>
<div><span class="540252318-06042011"><font face="Arial" size="2"><br>2011-04-05
14:05:47.562-INFO
- ServerGroupAdapter-JGroupsReceiverThread-End
process received messages: final queue size 27 time
7552ms<br>2011-04-05T14:05:47.563-0400: 145183.083: [CMS-concurrent-mark:
25.821/25.831 secs] [Times: user=37.89 sys=0.13, real=25.83 secs]
<br></font></span></div>
<div><span class="540252318-06042011"><font face="Arial" size="2"><br>2011-04-05
14:06:28.677-INFO
- ServerGroupAdapter-JGroupsReceiverThread-End
process received messages: final queue size 13 time
15781ms<br>2011-04-05T14:06:28.677-0400: 145224.197: [CMS-concurrent-mark:
26.138/26.143 secs] [Times: user=36.97 sys=0.08, real=26.14 secs]
<br></font></span></div>
<div><span class="540252318-06042011"><font face="Arial" size="2"><br>2011-04-05
14:07:05.283-INFO
- ServerGroupAdapter-JGroupsReceiverThread-End
process received messages: final queue size 8 time
6034ms<br>2011-04-05T14:07:05.284-0400: 145260.803: [CMS-concurrent-mark:
21.316/21.330 secs] [Times: user=36.52 sys=0.15, real=21.33 secs]
<br></font></span></div><br><br>I don't believe that the threads themselves are paused. CPU utilization averaged 25% during this period (2 CMS threads were busy on an 8 core box), so it's not a case of starvation. Our guess is that we're doing some operation/function that's blocked by the CM. Any ideas would be helpful/appreciated.<br>
<br><br>java version "1.6.0_21"<br>Java(TM) SE Runtime Environment (build 1.6.0_21-b06)<br>Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)<br><br>Linux <a href="http://pdk-pt-cxas-01.intcx.net">pdk-pt-cxas-01.intcx.net</a> 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05 EST 2009 x86_64 x86_64 x86_64 GNU/Linux<br>
<br>Red Hat Enterprise Linux Server release 5.3 (Tikanga)<br><br>-Xms24g -Xmx24g -Xmn1g -Xss256k -XX:PermSize=256m -XX:MaxPermSize=256m -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC <br>