Spikes in the duration of the ParNew collections

Thu Jan 24 21:13:03 PST 2013

If you have a test setup where you can run some experiments,
try -XX:-ParNewGC.  There have been instances in the past
where flaws in the partitioning for parallelism has caused some
dramatic increases in the ParNew times.  This setting will use
the serial young generation collector.  It will be slow but perhaps
not have the spiking.

If that removes the spiking, it gives us some information about
the cause but probably not enough to pinpoint the problem.
If I were attacking this I'd try to profile the VM to see which
methods are consuming all that time.

Jon

On 1/18/2013 5:10 AM, Bartek Markocki wrote:
> Hello all,
>
> During tests of a new version of our application we found out that
> some of the ParNew times spike to 170ms (avg 10ms) - Java6 update 38,
> 64bit, -server with CMS.
>
> Of course the first thing that came to our mind was a spike in
> allocation rate resulting in a spike in the amount of surviving
> objects and/ or a spike in promotion rate. Unfortunately the
> collection(s) in question did not showed any abnormality in this
> matter. To make the things even more interesting, showed in the
> attached extract from the gc log, some of those long lasting ParNew
> showed smaller promotion rate comparing to the average (21k per
> collection).
>
> Before re-run of the test we enabled  -XX:+PrintSafepointStatistics
> and -XX:+TraceSafepointCleanupTime  to get better understanding of STW
> times. As a result of that we found out that almost all the time goes
> to the collection time.
> 28253.076: GenCollectForAllocation          [     382          0
>         0    ]      [     0     0     0     3   170    ]  0
>
> Additionally we noticed that user to real time ratio for a normal
> (normally long) ParNew collection is between 4 and 8. For the
> collection in question it jumps to 12 (we have 16 cores) - so not only
> the collection lasted longer but more CPU was used.
>
> For your review - I attached an extract from std out and gc log for
> the collection in question.
>
> Additionally we reran the test with the changed collector to
> ParallelOld and we did not notice comparable spikes in the young
> generation times.
> After that we took Java7 update 10 with the CMS and found out that the
> issue is still there (spikes in ParNew times) however is less
> noticeable, i.e., the max ParNew time was  113ms.
>
> The question of the day is: why it is happening? what else we can
> do/check/test to make our application run CMS on java6?
>
> Thanks in advance,
> Bartek
>
>
> $ java -version
> java version "1.6.0_38"
> Java(TM) SE Runtime Environment (build 1.6.0_38-b05)
> Java HotSpot(TM) 64-Bit Server VM (build 20.13-b02, mixed mode)
>
> $less /etc/redhat-release
> Red Hat Enterprise Linux Server release 5.5 (Tikanga)
>
> JVM options
> -server -Xms2g -Xmx2g  -XX:PermSize=64m -XX:+UseConcMarkSweepGC
> -XX:+UseParNewGC -XX:CMSInitiatingOccupancyFraction=50
> -XX:+UseCMSInitiatingOccupancyOnly -XX:NewSize=1700m
> -XX:MaxNewSize=1700m -verbose:gc -XX:+PrintGCDetails
> -XX:+PrintGCApplicationStoppedTime -XX:+PrintFlagsFinal
> -Xloggc:/apps/gcLog.log  -XX:+PrintGCDateStamps
> -XX:+PrintGCApplicationConcurrentTime -XX:PrintCMSStatistics=3
> -XX:+PrintCMSInitiationStatistics -XX:+PrintAdaptiveSizePolicy
> -XX:+PrintGCTaskTimeStamps -XX:+PrintSharedSpaces
> -XX:+PrintTenuringDistribution -XX:+PrintVMQWaitTime
> -XX:+PrintHeapAtGC -XX:+PrintSafepointStatistics
> -XX:PrintSafepointStatisticsCount=10 -XX:+TraceSafepointCleanupTime
> -XX:PrintFLSStatistics=2 -XX:+PrintReferenceGC
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130124/2950f7a7/attachment.html