unexpected full gc time spike

Thu May 16 21:39:56 PDT 2013

thanks very much indeed, hope we can see your patch soon


On 17 May 2013 12:38, Srinivas Ramakrishna <ysr1729 at gmail.com> wrote:

> Hi Leon --
>
> Yes, there are a couple of performance bugs related to promotion
> failure handling with ParNew+CMS that can cause this time to balloon.
> Here the unwind of the failed promotion took 177 s. I have at least a
> partial fix for this which I had written up a few months ago but never
> quite got around to collecting sufficient performance data to submit
> it as an official patch.
>
> I'll try and revive that patch and submit it... May be someone else
> can check if it helps sufficiently in the performance with promotion
> failure.
>
> -- ramki
>
>
> On Thu, May 16, 2013 at 9:19 PM, the.6th.month at gmail.com
> <the.6th.month at gmail.com> wrote:
> > hi, all:
> > We just had a situation that I don't quite understand with CMS gc. When I
> > examined the gc log, I found that there was a cms gc which resulted in a
> > parnew promotion failure and concurrent mode failure at the same time,
> and
> > then the full gc lasted for slightly over three minutes. Here is the gc
> log:
> > 2013-05-17T10:12:55.983+0800: 45168.774: [CMS-concurrent-mark:
> 7.056/7.860
> > secs] [Times: user=14.90 sys=0.45, real=7.86 secs]
> > 2013-05-17T10:12:55.984+0800: 45168.775: [CMS-concurrent-preclean-start]
> > 2013-05-17T10:12:56.753+0800: 45169.544: [CMS-concurrent-preclean:
> > 0.676/0.770 secs] [Times: user=0.83 sys=0.15, real=0.77 secs]
> > 2013-05-17T10:12:56.753+0800: 45169.544:
> > [CMS-concurrent-abortable-preclean-start]
> > 2013-05-17T10:12:58.460+0800: 45171.251: [GC 45171.252: [ParNew
> (promotion
> > failed)
> > Desired survivor size 67108864 bytes, new threshold 1 (max 6)
> > - age   1:   70527216 bytes,   70527216 total
> > : 917504K->917504K(917504K), 177.3558880 secs]45348.608: [CMS CMS: abort
> > preclean due to time 2013-05-17T10:15:56.197+0800: 45348.989:
> > [CMS-concurrent-abortable-preclean: 2.037/179.444 secs] [Times:
> user=44.72
> > sys=13.59, real=179.45 secs]
> >  (concurrent mode failure): 3017323K->2177093K(3047424K), 16.5528620
> secs]
> > 3879476K->2177093K(3964928K), [CMS Perm : 91333K->91008K(262144K)],
> > 193.9097970 secs] [Times: user=58.79 sys=13.55, real=193.91 secs]
> >
> > the usual cms full gc time was roughly 100ms-400ms, but this time it
> lasted
> > for 193 seconds. I understand that when there's a parnew gc happens
> during
> > cms and to space is not large enough to hold all survived objects, or the
> > remaining space in old gen cannot cope with memory allocation in old gen,
> > full gc happens. But I don't understand why it hangs so long.
> > I am using oracle jdk 1.6.0_37, and the jvm options we use are:
> > -Xms4000m -Xmx4000m -Xmn1G -Xss256k -XX:PermSize=256m -XX:SurvivorRatio=6
> > -XX:MaxTenuringThreshold=6 -XX:+DisableExplicitGC -Xnoclassgc
> -Xverify:none
> > -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
> > -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods
> > -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled
> > -XX:+UseCompressedOops -XX:CMSInitiatingOccupancyFraction=90
> > -XX:+UseCMSInitiatingOccupancyOnly
> >
> > Could it be a bug that results in the long full gc in case of promotion
> > failure or something else? Could anyone offer me some help, and I really
> > appreciate your help.
> >
> > Looking forward to any reply.
> >
> > All the best,
> > Leon
> >
> >
> > _______________________________________________
> > hotspot-gc-use mailing list
> > hotspot-gc-use at openjdk.java.net
> > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> >
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130517/1fdd6f2a/attachment.html