<html><head><meta http-equiv="Content-Type" content="text/html charset=windows-1252"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">Ok, thanks…<div><br></div><div>The 3 before the 1129 and the 6 before the 11459 are the node numbers</div><div><br></div><div>I’ll dig around in the source for any way of finding out what the cause of safe points are (I’m not aware of a better -XX: option) … frankly I’ll probably just report this in the user hotspot-gc-use thread (it isn’t causing any real issues) and see if other people report it too.</div><div><br></div><div>Thanks,</div><div><br></div><div>Graham</div><div><br><div><div>On Jun 19, 2014, at 12:23 PM, Jon Masamitsu <<a href="mailto:jon.masamitsu@oracle.com">jon.masamitsu@oracle.com</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">
<meta content="text/html; charset=windows-1252" http-equiv="Content-Type">
<div text="#000000" bgcolor="#FFFFFF">
Graham,<br>
<br>
I don't have any guesses about what is causing the difference in<br>
the number of safepoints. As you note the messages you are<br>
counting are not specifically GC pauses. The differences you<br>
are seeing are huge (1129 vs 11459, which, by the way, I don't <br>
see those numbers in the tables; did you mean 31129 vs 611459?).<br>
If those numbers are really GC related I would expect some<br>
dramatic effect on the application behavior.<br>
<br>
Sometimes better GC means applications run faster so<br>
maybe more work is getting done in nodes 4-6.<br>
The scale of the difference is surprising though.<br>
<br>
Jon<br>
<br>
<div class="moz-cite-prefix">On 6/18/2014 8:19 PM, graham sanderson
wrote:<br>
</div>
<blockquote cite="mid:D17A9AC7-A7D0-4AE5-B8F0-2917F7480244@vast.com" type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
The options are working great and as expected (faster initial
mark, and no long pauses after abortable preclean timeout). One
weird thing though which I’m curious about: I’m showing some data
for six JVMs (calling them nodes - they are on separate machines)
<div><br>
</div>
<div>all with :
<div><br>
</div>
<div>Linux version 2.6.32-431.3.1.el6.x86_64 (<a moz-do-not-send="true" href="mailto:mockbuild@c6b10.bsys.dev.centos.org">mockbuild@c6b10.bsys.dev.centos.org</a>)
(gcc version 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) ) #1 SMP
Fri Jan 3 21:39:27 UTC 2014</div>
<div>JDK 1.7.0_60-b19</div>
<div>16 gig old gen</div>
<div>8 gig new (6.4 gig eden)</div>
<div>-XX:+UseParNewGC</div>
<div>-XX:+UseConcMarkSweepGC</div>
<div>-XX:+CMSParallelRemarkEnabled</div>
<div>256 gig RAM</div>
<div>16 cores (sandy bridge)</div>
<div><br>
</div>
<div>Nodes 4-6 also have </div>
<div><br>
</div>
<div>-XX:+CMSEdenChunksRecordAlways</div>
<div>-XX:+CMSParallelInitialMarkEnabled</div>
<div><br>
</div>
<div>There are some application level config differences (which
limit amount of certain objects kept in memory before flushing
to disk) - 1&4 have the same app config, 2&5 have the
same app config, 3&6 have the same app config</div>
<div><br>
</div>
<div>This first dump, shows two days worth of total times
application threads were stopped via grepping logs for <span style="background-color: rgb(0, 36, 2); color: rgb(0, 228,
56); font-family: Menlo; font-size: 11px;">Total time for
which application threads were stopped</span> and summing
the values. worst case 4 minutes over the day is not too bad,
so this isn’t a big issue</div>
<div><br>
</div>
<div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 1 154.623</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 2 90.3006</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 3 75.3602</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 4 180.618</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 5 107.668</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 6 99.7783</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">-------</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 1 190.741</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 2 82.8865</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 3 90.0098</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 4 239.702</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 5 149.332</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 6 138.03</div>
</div>
<div><br>
</div>
<div>Notably however if you look via JMX/visualGC, the total GC
time is actually lower on nodes 4 to 6 than the equivalent
nodes 1 to 3. Now I know that biased lock revocation and other
things cause safe point, so I figure something other than GC
must be the cause… so I just did a count of log lines with <span style="background-color: rgb(0, 36, 2); color: rgb(0, 228,
56); font-family: Menlo; font-size: 11px;">Total time for
which application threads were stopped</span> and got this:</div>
<div><span style="background-color: rgb(0, 36, 2); color: rgb(0,
228, 56); font-family: Menlo; font-size: 11px;"><br>
</span></div>
<div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 1 19282</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 2 6784</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 3 1275</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 4 26356</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 5 14491</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-17
: 6 8402</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">-------</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 1 20943</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 2 1134</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 3 1129</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 4 30289</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 5 16508</div>
<div style="margin: 0px; font-size: 11px; font-family: Menlo;
color: rgb(0, 228, 56); background-color: rgb(0, 36, 2);">2014-06-18
: 6 11459</div>
<div>
<div><br>
</div>
<div>I can’t cycle these nodes right now (to try each new
parameter individually), but am curious whether you can
think of why adding these parameters would have such a
large effect on the number of safe point stops - e.g. 1129
vs 11459 for otherwise identically configured nodes with
very similar workload. Note the ratio is highest on nodes
2 vs node 5 which spill the least into the old generation
(so certainly fewer CMS cycles, and also fewer young gen
collections) if that sparks any ideas.</div>
<div><br>
</div>
<div>Thanks,</div>
<div><br>
</div>
<div>Graham.</div>
<div><br>
</div>
<div>P.S. It is entirely possible I don’t know exactly what <span style="background-color: rgb(0, 36, 2); color: rgb(0,
228, 56); font-family: Menlo; font-size: 11px;">Total
time for which application threads were stopped</span> refers
to in all cases (I’m assuming it is a safe point stop)</div>
<div><br>
</div>
<div>On Jun 16, 2014, at 1:54 PM, graham sanderson <<a moz-do-not-send="true" href="mailto:graham@vast.com">graham@vast.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<div style="word-wrap: break-word; -webkit-nbsp-mode:
space; -webkit-line-break: after-white-space;">Thanks
Jon; that’s exactly what i was hoping
<div><br>
<div>
<div>On Jun 16, 2014, at 1:27 PM, Jon Masamitsu <<a moz-do-not-send="true" href="mailto:jon.masamitsu@oracle.com">jon.masamitsu@oracle.com</a>>
wrote:</div>
<br class="Apple-interchange-newline">
<blockquote type="cite">
<meta content="text/html; charset=windows-1252" http-equiv="Content-Type">
<div bgcolor="#FFFFFF" text="#000000"> <br>
<div class="moz-cite-prefix">On 06/12/2014 09:16
PM, graham sanderson wrote:<br>
</div>
<blockquote cite="mid:43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com" type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
Hi, I hope this is the right list for this
question:
<div><br>
</div>
<div>I was investigating abortable preclean
timeouts in our app (and associated long
remark pause) so had a look at the old jdk6
code I had on my box, wondered about
recording eden chunks during certain eden
slow allocation paths (I wasn’t sure if TLAB
allocation is just a CAS bump), and saw what
looked perfect in the latest code, so was
excited to install 1.7.0_60-b19</div>
<div><br>
</div>
<div>I wanted to ask what you consider the
stability of these two options to be (I’m
pretty sure at least the first one is new in
this release)</div>
<div><br>
</div>
<div>I have just installed locally on my mac,
and am aware of <a moz-do-not-send="true" href="http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809">http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809</a> which
I could reproduce, however I wasn’t able to
reproduce it without -XX:-<span style="background-color: rgb(255, 255,
255); white-space: pre-wrap;">UseCMSCompactAtFullCollection
(is this your understanding too?)</span></div>
</blockquote>
<br>
Yes.<br>
<br>
<blockquote cite="mid:43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com" type="cite">
<div><br>
</div>
<div>We are running our application with 8 gig
young generation (6.4g eden), on boxes with
32 cores… so parallelism is good for short
pauses</div>
<div><br>
</div>
<div>we already have</div>
<div><br>
</div>
<div>
<div>-XX:+UseParNewGC </div>
<div>-XX:+UseConcMarkSweepGC</div>
<div>-XX:+CMSParallelRemarkEnabled</div>
</div>
<div><br>
</div>
<div>we have seen a few long(isn) initial
marks, so </div>
<div><br>
</div>
<div>-XX:+CMSParallelInitialMarkEnabled sounds
good</div>
<div><br>
</div>
<div>as for </div>
<div><br>
</div>
<div>-XX:+CMSEdenChunksRecordAlways</div>
<div><br>
</div>
<div>my question is: what constitutes a slow
path such an eden chunk is potentially
recorded… TLAB allocation, or more horrific
things; basically (and I’ll test our app
with -XX:+CMSPrintEdenSurvivorChunks) is it
likely that I’ll actually get less samples
using -XX:+CMSEdenChunksRecordAlways in a
highly multithread app than I would with
sampling, or put another way… what sort of
app allocation patterns if any might avoid
the slow path altogether and might leave me
with just one chunk?</div>
</blockquote>
<br>
Fast path allocation is done from TLAB's. If
you have to get<br>
a new TLAB, the call to get the new TLAB comes
from compiled<br>
code but the call is into the JVM and that is
the slow path where<br>
the sampling is done.<br>
<br>
Jon<br>
<br>
<blockquote cite="mid:43FD20E2-774A-4F30-ACEA-F10395175C82@vast.com" type="cite">
<div><br>
</div>
<div>Thanks,</div>
<div><br>
</div>
<div>Graham</div>
<div><br>
</div>
<div>P.S. less relevant I think, but our old
generation is 16g</div>
<div>P.P.S. I suspect the abortable preclean
timeouts mostly happen after a burst of very
high allocation rate followed by an almost
complete lull… this is one of the patterns
that can happen in our application</div>
</blockquote>
<br>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</div>
</blockquote></div><br></div></body></html>