CMS vs G1 - Scan RS very long
kirk at kodewerk.com
Thu Jan 31 14:47:58 UTC 2013
I'd like to add to Michal's comment that I've seen very similar results in recent tuning efforts for low latency. In this case we didn't have a lot of mutation in old gen, but I wasn't able to get young gen pause times down anywhere near what I could achieve with the CMS collector. Unfortunately I haven't been able to characterize the problem as well as you have, as we had other fish to fry and I only had a limited amount of time to look at GC. That said, I will still be able to run more experiments in the next two weeks.
What I did notice is that young gen started reducing its size, but whereas I calculated that a 15m eden was optimal, it stopped downsizing at 40m. I'd be interested if anyone has suggestions on how to get the young gen to shrink if it's not shrinking enough on its own. I'm hesitant to fix the size, as there are times when the heap should grow, but under normal load I would hope that it would return to the smaller size.
Overall I'd have to say that this is an application where I definitely would have recommended iCMS even though the hardware has 24 cores. It's very disappointing that iCMS has been deprecated even though many are still using it. I did a quick scan of my GC log DB and about 15% of the logs show an icms_dc tag.
On 2013-01-31, at 3:12 PM, "Michal Frajt" <michal at frajt.eu> wrote:
> Hi all,
> After iCMS was officially deprecated we decided to compare the G1 collector with our best tuned (i)CMS setup. Unfortunately we are not able to get the G1 young collections to run anywhere close to ParNew. We actually wanted to compare the G1 concurrent marking STW pauses with the CMS initial-mark and remark STW pauses, but the incredibly long-running G1 young collections are already unacceptable for us.
> We were able to determine that the very long G1 young collections are caused by scanning the remembered sets. There is not much documentation about G1 internals, but we understand that the size of the remembered sets is related to the number of mutating references from old regions (cards) to young regions. Unfortunately all our applications constantly mutate thousands of references from old objects to young objects.
> We are testing with the latest OpenJDK7u extended by the 7189971 patch and the CMSTriggerInterval implementation. The attached GC log files represent two nearly identical applications processing very similar data sets, one running G1, the other running the CMS collector. The OpenJDK7u has an extra output of _pending_cards (when G1TraceConcRefinement is activated) which somehow relates to the remembered sets size.
> Young Comparison (both 128m, survivor ratio 5, max tenuring 15)
> CMS - invoked every ~20 sec, avg. stop 60ms
> G1 - invoked every ~16 sec, avg. stop 410ms !!!
> Is there anything that could help us reduce the Scan RS time, or is G1 simply not targeted at applications that heavily mutate old region objects?
> CMS parameters
> -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:MaxNewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:CMSMarkStackSize=8M -XX:CMSMarkStackSizeMax=32M -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:CMSWaitDuration=60000 -XX:+CMSScavengeBeforeRemark -XX:CMSTriggerInterval=600000 -XX:+UseParNewGC -XX:ParallelGCThreads=8 -XX:ParallelCMSThreads=2
> G1 parameters (note that MaxNewSize is not specified)
> -Xmx8884m -Xms2048m -XX:NewSize=128m -XX:PermSize=128m -XX:SurvivorRatio=5 -XX:MaxTenuringThreshold=15 -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:G1MixedGCCountTarget=16 -XX:ParallelGCThreads=8 -XX:ConcGCThreads=2
> G1 log file GC young pause
> [GC pause (young) [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 23697, predicted base time: 32.88 ms, remaining time: 167.12 ms, target pause time: 200.00 ms]
> [Parallel Time: 389.8 ms, GC Workers: 8]
> [Scan RS (ms): Min: 328.8, Avg: 330.4, Max: 332.6, Diff: 3.8, Sum: 2642.9]
> [Eden: 119.0M(119.0M)->0.0B(118.0M) Survivors: 9216.0K->10.0M Heap: 1801.6M(2048.0M)->1685.7M(2048.0M)]
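
To put a number on how much of the pause goes to remembered set scanning, the two log lines quoted above can be compared directly. The following is only a minimal Java sketch: the class name ScanRsShare and its helper methods are made up for illustration, the regular expressions assume the exact line layout shown in the excerpt, and dividing the per-worker average Scan RS time by the parallel time is an approximation rather than an exact accounting. For the quoted pause (330.4 ms of 389.8 ms) it works out to roughly 85%.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any JDK or GC tooling.
public class ScanRsShare {
    // Assumed layout: "[Scan RS (ms): Min: 328.8, Avg: 330.4, Max: 332.6, ...]"
    private static final Pattern SCAN_RS_AVG =
            Pattern.compile("\\[Scan RS \\(ms\\):.*?Avg:\\s*([0-9.]+)");
    // Assumed layout: "[Parallel Time: 389.8 ms, GC Workers: 8]"
    private static final Pattern PARALLEL_TIME =
            Pattern.compile("\\[Parallel Time:\\s*([0-9.]+) ms");

    // Returns the first captured number, or fails loudly if the line does not match.
    static double extract(Pattern pattern, String line) {
        Matcher m = pattern.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no match in: " + line);
        }
        return Double.parseDouble(m.group(1));
    }

    static double scanRsAvgMs(String line) {
        return extract(SCAN_RS_AVG, line);
    }

    static double parallelTimeMs(String line) {
        return extract(PARALLEL_TIME, line);
    }

    // Fraction of the parallel phase attributed to Scan RS (approximation).
    static double share(double scanRsAvgMs, double parallelMs) {
        return scanRsAvgMs / parallelMs;
    }

    public static void main(String[] args) {
        String parallel = "[Parallel Time: 389.8 ms, GC Workers: 8]";
        String scanRs = "[Scan RS (ms): Min: 328.8, Avg: 330.4, Max: 332.6, Diff: 3.8, Sum: 2642.9]";
        double fraction = share(scanRsAvgMs(scanRs), parallelTimeMs(parallel));
        System.out.printf(Locale.ROOT, "Scan RS share of parallel time: %.1f%%%n", 100 * fraction);
    }
}
```

Running the same extraction over every young pause in a log, instead of the single pause hard-coded in main, would show whether this share is stable or only spikes on some collections.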