CMSEdenChunksRecordAlways & CMSParallelInitialMarkEnabled
graham sanderson
graham at vast.com
Wed Jul 16 02:59:22 UTC 2014
I didn't know about it - I'm going to try it out on some of the nodes in a test cluster
On Jul 15, 2014, at 7:12 PM, Pas <pasthelod at gmail.com> wrote:
> Hello,
>
> I was wondering, how come no one (else) uses ParGCCardsPerStrideChunk on large heaps? It's shown to decrease (precious STW) time spent during minor collections by ParNew. ( http://blog.ragozin.info/2012/03/secret-hotspot-option-improving-gc.html - we also started to use it on 20+ GB heaps and it was helpful, though we changed more than one setting at a time, so I can't say that it was just this setting.)
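>
> For anyone who wants to try it: as far as I know it's a diagnostic option, so it needs to be unlocked first. Something along these lines (the value is just the one the blog post experiments with, not a recommendation for any particular heap):
>
> -XX:+UnlockDiagnosticVMOptions
> -XX:ParGCCardsPerStrideChunk=4096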
>
> Regards,
> Pas
>
>
> On Sat, Jun 21, 2014 at 6:52 PM, graham sanderson <graham at vast.com> wrote:
> Note this works great for us too … given formatting in this email is a bit flaky, I’ll refer you to our numbers I posted in a Cassandra issue I opened to add these flags as defaults for ParNew/CMS (on the appropriate JVMs)
>
> https://issues.apache.org/jira/browse/CASSANDRA-7432
>
> On Jun 14, 2014, at 7:05 PM, graham sanderson <graham at vast.com> wrote:
>
>> Thanks for the answer Gustav,
>>
>> The fact that you have been running in production for months makes me confident enough to try this on at least one of our nodes… (this is actually cassandra)
>>
>> Current GC related options are at the bottom - these nodes have 256G of RAM and aren't swapping; we are certainly used to a pause within the first 10 seconds or so, but the nodes haven't even joined the ring yet, so we don't really care. Yeah, Xms != Xmx is bad; we want one heap size and to stick with it.
>>
>> I will gather data via -XX:+CMSEdenChunksRecordAlways, but I'd be interested if a developer has an answer as to when to expect potential chunk recording… Otherwise I'll have to dig into the code a bit deeper - my assumption was that this call would not be in the inlined allocation code, but I had thought that even allocation of a new TLAB was inlined by the compilers - perhaps not.
>>
>> Current GC related settings are below (with a sketch of the proposed additions right after the list) - note we were running with a lower CMSInitiatingOccupancyFraction until recently; it seems to have gotten changed back by accident, but that is kind of tangential.
>>
>> -Xms24576M
>> -Xmx24576M
>> -Xmn8192M
>> -XX:+HeapDumpOnOutOfMemoryError
>> -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC
>> -XX:+CMSParallelRemarkEnabled
>> -XX:SurvivorRatio=8
>> -XX:MaxTenuringThreshold=1
>> -XX:CMSInitiatingOccupancyFraction=70
>> -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:+UseTLAB
>> -XX:+UseCondCardMark
>> -XX:+PrintGCDetails
>> -XX:+PrintGCDateStamps
>> -XX:+PrintHeapAtGC
>> -XX:+PrintTenuringDistribution
>> -XX:+PrintGCApplicationStoppedTime
>> -XX:+PrintPromotionFailure
>> -XX:PrintFLSStatistics=1
>> -Xloggc:/var/log/cassandra/gc.log
>> -XX:+UseGCLogFileRotation
>> -XX:NumberOfGCLogFiles=30
>> -XX:GCLogFileSize=20M
>> -XX:+PrintGCApplicationConcurrentTime
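>>
>> For completeness, the two flags under discussion would simply be added on top of the set above - I haven't actually run with them yet, so treat this as a sketch rather than a tested configuration:
>>
>> -XX:+CMSParallelInitialMarkEnabled
>> -XX:+CMSEdenChunksRecordAlways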
>>
>> Thanks, Graham
>>
>> P.S. Note tuning here is rather interesting since we use this cassandra cluster for lots of different data with very different usage patterns - sometimes we'll suddenly dump 50G of data in over the course of a few minutes. Also cassandra doesn't really mind a node being paused for a while due to GC, but things get a little more annoying if nodes pause at the same time… even though promotion failure can be worse for us (that is a separate issue), we've seen STW pauses of up to about 6-8 seconds in remark (presumably when things go horribly wrong and you only get one chunk). Basically I'm on a mission to minimize all pauses, since their effects can propagate (timeouts are very short in a lot of places)
>>
>> I will report back with my findings
>>
>> On Jun 14, 2014, at 6:29 PM, Gustav Åkesson <gustav.r.akesson at gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Even though I won't answer all your questions, I'd like to share my experience with these settings (plus some additional thoughts), even though I haven't yet had the time to dig into the details.
>>>
>>> We've been using these flags for several months in production (yes, Java 7 even before the latest update release) and we've seen a lot of improvement in CMS old gen STW pauses. During execution, occasional initial marks of 1.5s could occur, but with these settings combined, CMS pauses are consistently around 100ms (on a high-end machine like yours, they are 20-30ms). We're using 1GB and 2GB heaps with roughly half/half old/new. Obviously YMMV, but this is at least the behavior of this particular application - we've had nothing but positive outcomes from using these settings. Additionally, the pauses are rather deterministic.
>>>
>>> Not sure what your heap size settings are, but what I've also observed is that setting Xms != Xmx could cause an occasional long initial mark when the heap capacity is slightly increased. I had a discussion a while back ( http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2014-February/001795.html ) about this, and it seems to be an issue with CMS.
>>>
>>> Also, swapping/paging is another factor which could cause nondeterministic, occasionally long STW GCs. If you're on Linux, try swappiness=0 and see if pauses get more stable.
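>>>
>>> On a typical Linux box that would be something along these lines (the exact mechanism varies by distribution):
>>>
>>> sysctl -w vm.swappiness=0
>>>
>>> and, to persist it across reboots, set vm.swappiness=0 in /etc/sysctl.conf.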
>>>
>>>
>>> Best Regards,
>>> Gustav Åkesson
>>>
>>>
>>> On Fri, Jun 13, 2014 at 6:48 AM, graham sanderson <graham at vast.com> wrote:
>>> I was investigating abortable preclean timeouts in our app (and the associated long remark pauses), so I had a look at the old jdk6 code I had on my box, wondered about recording eden chunks during certain eden slow allocation paths (I wasn't sure if TLAB allocation is just a CAS bump), and saw what looked perfect in the latest code, so I was excited to install 1.7.0_60-b19
>>>
>>> I wanted to ask what you consider the stability of these two options to be (I’m pretty sure at least the first one is new in this release)
>>>
>>> I have just installed locally on my mac, and am aware of http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8021809 which I could reproduce, however I wasn’t able to reproduce it without -XX:-UseCMSCompactAtFullCollection (is this your understanding too?)
>>>
>>> We are running our application with 8 gig young generation (6.4g eden), on boxes with 32 cores… so parallelism is good for short pauses
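>>>
>>> (For reference, the 6.4g eden figure is consistent with -XX:SurvivorRatio=8: eden is then 8/10 of the young generation, i.e. 0.8 * 8g = 6.4g, with two 0.8g survivor spaces. So presumably the relevant part of the config looks roughly like:)
>>>
>>> -Xmn8g
>>> -XX:SurvivorRatio=8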
>>>
>>> we already have
>>>
>>> -XX:+UseParNewGC
>>> -XX:+UseConcMarkSweepGC
>>> -XX:+CMSParallelRemarkEnabled
>>>
>>> we have seen a few long(ish) initial marks, so
>>>
>>> -XX:+CMSParallelInitialMarkEnabled sounds good
>>>
>>> as for
>>>
>>> -XX:+CMSEdenChunksRecordAlways
>>>
>>> my question is: what constitutes a slow path such that an eden chunk is potentially recorded… TLAB allocation, or more horrific things? Basically (and I'll test our app with -XX:+CMSPrintEdenSurvivorChunks) is it likely that I'll actually get fewer samples using -XX:+CMSEdenChunksRecordAlways in a highly multithreaded app than I would with sampling? Or, put another way… what sort of app allocation patterns, if any, might avoid the slow path altogether and leave me with just one chunk?
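>>>
>>> Concretely, one way to compare the two (assuming the flags behave as described) would be to run the same workload twice with chunk printing enabled and count the recorded chunks in the GC log:
>>>
>>> run 1 (default sampling): -XX:+CMSPrintEdenSurvivorChunks
>>> run 2 (record on slow-path allocations): -XX:+CMSPrintEdenSurvivorChunks -XX:+CMSEdenChunksRecordAlways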
>>>
>>> Thanks,
>>>
>>> Graham
>>>
>>> P.S. less relevant I think, but our old generation is 16g
>>> P.P.S. I suspect the abortable preclean timeouts mostly happen after a burst of very high allocation rate followed by an almost complete lull… this is one of the patterns that can happen in our application
>>>
>>
>
>