<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">The options are working great and as expected (faster initial mark, and no long pauses after abortable preclean timeout). One weird thing though which I’m curious about: I’m showing some data for six JVMs (calling them nodes - they are on separate machines)<div><br></div><div>all with :<div><br></div><div>Linux version 2.6.32-431.3.1.el6.x86_64 (<a href="mailto:mockbuild@c6b10.bsys.dev.centos.org">mockbuild@c6b10.bsys.dev.centos.org</a>) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP Fri Jan 3 21:39:27 UTC 2014</div><div>JDK 1.7.0_60-b19</div><div>16 gig old gen</div><div>8 gig new (6.4 gig eden)</div><div>-XX:+UseParNewGC</div><div>-XX:+UseConcMarkSweepGC</div><div>-XX:+CMSParallelRemarkEnabled</div><div>256 gig RAM</div><div>16 cores (sandy bridge)</div><div><br></div><div>Nodes 4-6 also have </div><div><br></div><div>-XX:+CMSEdenChunksRecordAlways</div><div>-XX:+CMSParallelInitialMarkEnabled</div><div><br></div><div>There are some application level config differences (which limit amount of certain objects kept in memory before flushing to disk) - 1&4 have the same app config, 2&5 have the same app config, 3&6 have the same app config</div><div><br></div><div>This first dump, shows two days worth of total times application threads were stopped via grepping logs for <span style="background-color: rgb(0, 36, 2); color: rgb(0, 228, 56); font-family: Menlo; font-size: 11px;">Total time for which application threads were stopped</span> and summing the values. worst case 4 minutes over the day is not too bad, so this isn’t a big issue</div><div><br></div><div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 1 154.623</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 2 90.3006</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 3 75.3602</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 4 180.618</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 5 107.668</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 6 99.7783</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">-------</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 1 190.741</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 2 82.8865</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 3 90.0098</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 4 239.702</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 5 149.332</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 6 138.03</div></div><div><br></div><div>Notably however if you look via JMX/visualGC, the total GC time is actually lower on nodes 4 to 6 than the equivalent nodes 1 to 3. Now I know that biased lock revocation and other things cause safe point, so I figure something other than GC must be the cause… so I just did a count of log lines with <span style="background-color: rgb(0, 36, 2); color: rgb(0, 228, 56); font-family: Menlo; font-size: 11px;">Total time for which application threads were stopped</span> and got this:</div><div><span style="background-color: rgb(0, 36, 2); color: rgb(0, 228, 56); font-family: Menlo; font-size: 11px;"><br></span></div><div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 1 19282</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 2 6784</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 3 1275</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 4 26356</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 5 14491</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17 : 6 8402</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">-------</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 1 20943</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 2 1134</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 3 1129</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 4 30289</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 5 16508</div><div style="margin: 0px; font-size: 11px; font-family: Menlo; color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18 : 6 11459</div><div><div><br></div><div>I can’t cycle these nodes right now (to try each new parameter individually), but am curious whether you can think of why adding these parameters would have such a large effect on the number of safe point stops - e.g. 1129 vs 11459 for otherwise identically configured nodes with very similar workload. Note the ratio is highest on nodes 2 vs node 5 which spill the least into the old generation (so certainly fewer CMS cycles, and also fewer young gen collections) if that sparks any ideas.</div><div><br></div><div>Thanks,</div><div><br></div><div>Graham.</div><div><br></div><div>P.S. It is entirely possible I don’t know exactly what <span style="background-color: rgb(0, 36, 2); color: rgb(0, 228, 56); font-family: Menlo; font-size: 11px;">Total time for which application threads were stopped</span> refers to in all cases (I’m assuming it is a safe point stop)</div><div><br></div><div>On Jun 16, 2014, at 1:54 PM, graham sanderson <<a href="mailto:graham@vast.com">graham@vast.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><meta http-equiv="Content-Type" content="text/html charset=windows-1252"><div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Thanks Jon; that’s exactly what i was hoping<div><br><div><div>On Jun 16, 2014, at 1:27 PM, Jon Masamitsu <<a href="mailto:jon.masamitsu@oracle.com">jon.masamitsu@oracle.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">
<meta content="text/html; charset=windows-1252" http-equiv="Content-Type">
<div bgcolor="#FFFFFF" text="#000000">
<br>
<div class="moz-cite-prefix">On 06/12/2014 09:16 PM, graham
sanderson wrote:<br>
</div>
<blockquote cite="mid:43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com" type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
Hi, I hope this is the right list for this question:
<div><br>
</div>
<div>I was investigating abortable preclean timeouts in our app
(and associated long remark pause) so had a look at the old jdk6
code I had on my box, wondered about recording eden chunks
during certain eden slow allocation paths (I wasn’t sure if TLAB
allocation is just a CAS bump), and saw what looked perfect in
the latest code, so was excited to install 1.7.0_60-b19</div>
<div><br>
</div>
<div>I wanted to ask what you consider the stability of these two
options to be (I’m pretty sure at least the first one is new in
this release)</div>
<div><br>
</div>
<div>I have just installed locally on my mac, and am aware of <a moz-do-not-send="true" href="http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809">http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809</a> which
I could reproduce, however I wasn’t able to reproduce it without
-XX:-<span style="background-color: rgb(255, 255, 255);
white-space: pre-wrap;">UseCMSCompactAtFullCollection (is this
your understanding too?)</span></div>
</blockquote>
<br>
Yes.<br>
<br>
<blockquote cite="mid:43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com" type="cite">
<div><br>
</div>
<div>We are running our application with 8 gig young generation
(6.4g eden), on boxes with 32 cores… so parallelism is good for
short pauses</div>
<div><br>
</div>
<div>we already have</div>
<div><br>
</div>
<div>
<div>-XX:+UseParNewGC </div>
<div>-XX:+UseConcMarkSweepGC</div>
<div>-XX:+CMSParallelRemarkEnabled</div>
</div>
<div><br>
</div>
<div>we have seen a few long(isn) initial marks, so </div>
<div><br>
</div>
<div>-XX:+CMSParallelInitialMarkEnabled sounds good</div>
<div><br>
</div>
<div>as for </div>
<div><br>
</div>
<div>-XX:+CMSEdenChunksRecordAlways</div>
<div><br>
</div>
<div>my question is: what constitutes a slow path such an eden
chunk is potentially recorded… TLAB allocation, or more horrific
things; basically (and I’ll test our app
with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I’ll
actually get less samples using -XX:+CMSEdenChunksRecordAlways
in a highly multithread app than I would with sampling, or put
another way… what sort of app allocation patterns if any might
avoid the slow path altogether and might leave me with just one
chunk?</div>
</blockquote>
<br>
Fast path allocation is done from TLAB's. If you have to get<br>
a new TLAB, the call to get the new TLAB comes from compiled<br>
code but the call is into the JVM and that is the slow path where<br>
the sampling is done.<br>
<br>
Jon<br>
<br>
<blockquote cite="mid:43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com" type="cite">
<div><br>
</div>
<div>Thanks,</div>
<div><br>
</div>
<div>Graham</div>
<div><br>
</div>
<div>P.S. less relevant I think, but our old generation is 16g</div>
<div>P.P.S. I suspect the abortable preclean timeouts mostly
happen after a burst of very high allocation rate followed by an
almost complete lull… this is one of the patterns that can
happen in our application</div>
</blockquote>
<br>
</div>
</blockquote></div><br></div></div></blockquote></div><br></div></div></body></html>