unexpected full gc time spike

Thu May 16 21:38:44 PDT 2013

Hi Leon --

Yes, there are a couple of performance bugs related to promotion
failure handling with ParNew+CMS that can cause this time to balloon.
Here the unwind of the failed promotion took 177 s. I have at least a
partial fix for this which I had written up a few months ago but never
quite got around to collecting sufficient performance data to submit
it as an official patch.

I'll try and revive that patch and submit it... May be someone else
can check if it helps sufficiently in the performance with promotion
failure.

-- ramki

On Thu, May 16, 2013 at 9:19 PM, the.6th.month at gmail.com
<the.6th.month at gmail.com> wrote:
> hi, all:
> We just had a situation that I don't quite understand with CMS gc. When I
> examined the gc log, I found that there was a cms gc which resulted in a
> parnew promotion failure and concurrent mode failure at the same time, and
> then the full gc lasted for slightly over three minutes. Here is the gc log:
> 2013-05-17T10:12:55.983+0800: 45168.774: [CMS-concurrent-mark: 7.056/7.860
> secs] [Times: user=14.90 sys=0.45, real=7.86 secs]
> 2013-05-17T10:12:55.984+0800: 45168.775: [CMS-concurrent-preclean-start]
> 2013-05-17T10:12:56.753+0800: 45169.544: [CMS-concurrent-preclean:
> 0.676/0.770 secs] [Times: user=0.83 sys=0.15, real=0.77 secs]
> 2013-05-17T10:12:56.753+0800: 45169.544:
> [CMS-concurrent-abortable-preclean-start]
> 2013-05-17T10:12:58.460+0800: 45171.251: [GC 45171.252: [ParNew (promotion
> failed)
> Desired survivor size 67108864 bytes, new threshold 1 (max 6)
> - age   1:   70527216 bytes,   70527216 total
> : 917504K->917504K(917504K), 177.3558880 secs]45348.608: [CMS CMS: abort
> preclean due to time 2013-05-17T10:15:56.197+0800: 45348.989:
> [CMS-concurrent-abortable-preclean: 2.037/179.444 secs] [Times: user=44.72
> sys=13.59, real=179.45 secs]
>  (concurrent mode failure): 3017323K->2177093K(3047424K), 16.5528620 secs]
> 3879476K->2177093K(3964928K), [CMS Perm : 91333K->91008K(262144K)],
> 193.9097970 secs] [Times: user=58.79 sys=13.55, real=193.91 secs]
>
> the usual cms full gc time was roughly 100ms-400ms, but this time it lasted
> for 193 seconds. I understand that when there's a parnew gc happens during
> cms and to space is not large enough to hold all survived objects, or the
> remaining space in old gen cannot cope with memory allocation in old gen,
> full gc happens. But I don't understand why it hangs so long.
> I am using oracle jdk 1.6.0_37, and the jvm options we use are:
> -Xms4000m -Xmx4000m -Xmn1G -Xss256k -XX:PermSize=256m -XX:SurvivorRatio=6
> -XX:MaxTenuringThreshold=6 -XX:+DisableExplicitGC -Xnoclassgc -Xverify:none
> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
> -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods
> -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled
> -XX:+UseCompressedOops -XX:CMSInitiatingOccupancyFraction=90
> -XX:+UseCMSInitiatingOccupancyOnly
>
> Could it be a bug that results in the long full gc in case of promotion
> failure or something else? Could anyone offer me some help, and I really
> appreciate your help.
>
> Looking forward to any reply.
>
> All the best,
> Leon
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>