On Thu, Oct 14, 2010 at 19:00, Y. S. Ramakrishna <span dir="ltr"><<a href="mailto:y.s.ramakrishna@oracle.com">y.s.ramakrishna@oracle.com</a>></span> wrote:<br><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<br>
Hi Adam --<br>
<br>
...<div class="im"><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
I understood before that "initial" is not done in parallel. I'm curious - why not?<br>
</blockquote>
<br></div>
When it was first implemented, CMS did all its work single-threaded<br>
over a serial scavenger. It was incrementally parallelized over time<br>
but because initial-mark pauses were usually not a concern (small edens,<br>
small survivor spaces, initial mark immediately following a scavenge)<br>
it never rose high enough in priority to parallelize. Clearly we have<br>
reached a point where the old assumptions no longer hold and it's<br>
time to parallelize it. Or better still, move to G1, which is fully<br>
parallel and concurrent and has other advantages as well.<div class="im"><br></div></blockquote><div><br></div><div>Thanks for the history lesson! We did mention G1 to our customer yesterday, but I'm not yet familiar enough with its tuning knobs to be confident suggesting it for a production system. We've only done minimal testing in-house, and not yet at the scale of this customer.</div>
<div><br></div><div>More generally, for ParGC and CMS, our heuristic has been to set heap size, configure new size, and then if necessary, configure survivor spaces and maybe some other knobs to fulfill our customer requirements. I don't know what the equivalent settings are for G1. I'm curious if there's a similar "recipe" for getting it configured and tuned. When we tried earlier, we didn't have much success with it. Can anyone who's spent significant time tuning it relate their experiences? Is it worth trying on 2-4 core systems with 1-4g of RAM?</div>
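<div>For concreteness, here is the shape of our usual CMS recipe next to what I understand to be the nearest G1 setup. The sizes and pause goal are illustrative, not this customer's actual values, and the G1 side is my own untested assumption:</div>

```shell
# Our usual CMS recipe (sizes illustrative, not this customer's values):
java -Xms2g -Xmx4g \
     -XX:NewSize=512m -XX:MaxNewSize=512m -XX:SurvivorRatio=8 \
     -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
     -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -jar app.jar   # hypothetical application jar

# My (untested) understanding of the nearest G1 equivalent: G1 sizes the
# generations itself, so instead of NewSize/SurvivorRatio you mainly
# give it heap bounds and a pause-time goal:
java -Xms2g -Xmx4g \
     -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=200 \
     -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -jar app.jar   # hypothetical application jar
```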
<div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im">
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
I have CMSInitiatingOccupancyFraction=50 because I was concerned about some finalization issues in our application, and I thought I remembered that reference processing wasn't done in young GCs. After enabling PrintReferenceGC, the logs imply ParNewGC also clears references - is that true? If so, it may not be necessary for us to include that option anyway.<br>
</blockquote>
<br></div>
Yes, scavenges do process unreachable Reference objects found in the young gen.<br>
However, once these get into the old gen, you are right that you will need a<br>
CMS cycle to identify them as unreachable and to process them appropriately. </blockquote><div><br></div><div>Thanks for the confirmation.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">(1) use no survivor spaces (at the risk of larger scavenge pauses,<br>
larger remark pauses, even concurrent mode failures)<br>
(2) use a sufficiently large heap so as to be able to afford to set a<br>
mark initiation threshold above the low water-mark (after a major<br>
collection cycle). This will keep init-marks riding on the coat-tails<br>
of scavenges.<br></div><div class="im">
<br>
The customer's application appears to fit neatly in a 2.4G heap, and we have -Xmx4g, so I believe we might be able to apply (2) here. Is (1) above required along with (2), or do these workarounds address the problem independently? I ask because (a) this customer is already concerned about pause times, so I don't have a lot of room to increase remark and scavenge times, and (b) I'm concerned about eliminating survivor spaces since we've dealt with significant heap fragmentation in the past.<br>
</div></blockquote>
<br>
Precisely. The two are actually additive, but either by itself may not<br>
be sufficient, and as you pointed out, (1) may not always be feasible.</blockquote><div><br></div><div>I reduced the survivor spaces in my recommendation for today but did not completely eliminate them, and increased the old gen size. Unfortunately, the customer made a mistake in the settings that disabled -XX:+PrintGCDetails, so they failed to get new logs. They reported that their user experience was slightly worse, but without logs, I can't determine whether the GCs are the problem or something else.</div>
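<div>For reference, the shape of the change I recommended, with illustrative numbers rather than the customer's exact ones:</div>

```shell
# Before (illustrative):
#   -Xmx4g -XX:NewSize=768m -XX:MaxNewSize=768m -XX:SurvivorRatio=8
# After (illustrative): smaller survivors, larger old gen
#   -Xmx4g -XX:NewSize=512m -XX:MaxNewSize=512m -XX:SurvivorRatio=16
# Note: each survivor space is NewSize / (SurvivorRatio + 2), so raising
# the ratio shrinks the survivors without eliminating them, and shrinking
# the new gen within the same -Xmx leaves more room for the old gen.
```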
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
One other data point is that we have a large number of mostly idle threads (3826 at one count), with most of the idle threads holding onto approximately 2MB of object data. I don't know if that would significantly contribute to the initial-mark pause, but my intuition is that it would, if some of that pause is spent marking the stack locals.<br>
</blockquote>
<br></div>
Yes, that could be, but it is probably less significant than a large Eden or<br>
survivor space: when a CMS initial-mark pause comes immediately after<br>
a scavenge, it is much shorter, so the larger contribution is<br>
from the large Eden. If you pour your GC logs into GCHisto, you<br>
should probably see that the CMS initial-mark pauses increase as<br>
the most recent scavenge becomes more distant (or you could plot that<br>
via a spreadsheet and note that relationship).<br></blockquote><div><br></div><div>Ok, I checked it in GCHisto and you were exactly right. This was immediately obvious.</div><div><br></div><div>Thanks for your help again. </div>
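<div>For anyone without GCHisto handy, the same relationship can be eyeballed with a small script. This is a minimal sketch with invented log lines standing in for real -XX:+PrintGCDetails output (real lines carry more fields, so the regexes may need adjusting):</div>

```python
import re

# Invented CMS GC log excerpt, loosely in -XX:+PrintGCTimeStamps /
# -XX:+PrintGCDetails style; values are for illustration only.
log = """\
12.345: [GC 12.345: [ParNew: 818112K->12345K(943744K), 0.0213450 secs]
14.120: [GC [1 CMS-initial-mark: 1048576K(2097152K)], 0.0412340 secs]
30.010: [GC 30.010: [ParNew: 818112K->23456K(943744K), 0.0198760 secs]
95.500: [GC [1 CMS-initial-mark: 1148576K(2097152K)], 0.3341230 secs]
"""

scavenges = []   # (timestamp, pause) of ParNew scavenges
init_marks = []  # (timestamp, pause) of CMS initial-mark pauses

for line in log.splitlines():
    ts = float(re.match(r"([\d.]+):", line).group(1))
    pause = float(re.search(r"([\d.]+) secs", line).group(1))
    if "CMS-initial-mark" in line:
        init_marks.append((ts, pause))
    elif "ParNew" in line:
        scavenges.append((ts, pause))

# For each initial mark, how long ago was the most recent scavenge?
for ts, pause in init_marks:
    gap = min(ts - s for s, _ in scavenges if s <= ts)
    print("initial-mark pause %.3fs, %.1fs after last scavenge" % (pause, gap))
```

Plotting pause against gap (spreadsheet or otherwise) should show the initial-mark pauses growing as the most recent scavenge recedes, as described above.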
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
<br>
-- ramki<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<br>
<br>
<br>
<br>
Also, if using iCMS (Incremental CMS), drop the Incremental<br>
mode and revert to<br>
vanilla CMS.<br>
*** (#2 of 2): 2010-04-14 11:02:03 PDT <a href="mailto:xxxx@oracle.com" target="_blank">xxxx@oracle.com</a><br></div>
<mailto:<a href="mailto:xxxx@oracle.com" target="_blank">xxxx@oracle.com</a>><div class="im"><br>
<br>
<br>
If you have support, you can try escalating it via your support channels<br>
to get this addressed, especially if the workaround/retuning doesn't<br>
do the job.<br>
<br>
-- ramki<br>
<br>
<br>
My option seems to be to eliminate CMSInitiatingOccupancyFraction=50 and keep -Xmx4g. Would it be prudent to set -Xms4g as well?<br>
<br>
And here is a log excerpt from steady state in the application. The sigmas on pause times for young GC and remark are 17ms and 26ms respectively - they're like clockwork. The sigma for initial mark is higher, 334ms, due to the large-valued outliers.<br>
<br>
<br>
</div></blockquote>
<br>
...<br>
</blockquote></div><br>