unexpected full gc time spike
the.6th.month at gmail.com
the.6th.month at gmail.com
Sat May 18 02:05:49 PDT 2013
Hi, Jon & Ramki:
Sorry I can't get access to that page, the browser says the webpage is
currently unavailable. But I did look through the whole mail thread
regarding this issue. I am wondering if I get it right. If each thread
failed fast and started to fall back to single-threaded full gc
immediately, there wouldn't be such a long pause. But under the current
mechanism, there's no such a global flag to coordinate threads to fall
back, and each thread tends to retry the allocation of new plab which could
result in highly active locking contention and hence the extremely long
pause.
Is that correct?
Leon
On 18 May 2013 01:29, Jon Masamitsu <jon.masamitsu at oracle.com> wrote:
> I found it
>
> 8005060: Promotion failure code does not scale<https://jbs.oracle.com/bugs/browse/JDK-8005060>
>
>
>
> On 5/17/2013 10:05 AM, Srinivas Ramakrishna wrote:
>
> Looks like the search functionality of bugs.sun.com is no longer
> available. I tried searching the new bugzilla portal for the bug I had
> submitted around that time, but that doesn't bring up the bug when i
> use the normal search terms, so I do not know if the bug report is
> still in review or not, and whether it ever made it into the set of
> hotspot/gc bugs or not, but the Review ID i recvd was:-
>
> "Your Report (Review ID: 2391561) - Promotion failure code does not scale "
>
> I'll try and dig up the (raw, tentative) patch and send it in soon.
>
> -- ramki
>
>
> On Fri, May 17, 2013 at 9:37 AM, Srinivas Ramakrishna <ysr1729 at gmail.com> <ysr1729 at gmail.com> wrote:
>
> Hi Leon --
>
> Here's the history of that discussion, starting with this email
> (follow subject thread):
> http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2012-October/001370.html
>
> On Thu, May 16, 2013 at 10:06 PM, the.6th.month at gmail.com<the.6th.month at gmail.com> <the.6th.month at gmail.com> wrote:
>
> hi, Ramki:
> btw, could you possibly explain what the bugs are and how those bugs affect
> the fallback fullgc time? I am really curious about the reason.
> thanks very much.
>
> all the best,
> Leon
>
> On 17 May 2013 12:39, "the.6th.month at gmail.com" <the.6th.month at gmail.com> <the.6th.month at gmail.com> <the.6th.month at gmail.com>
> wrote:
>
> thanks very much indeed, hope we can see your patch soon
>
>
> On 17 May 2013 12:38, Srinivas Ramakrishna <ysr1729 at gmail.com> <ysr1729 at gmail.com> wrote:
>
> Hi Leon --
>
> Yes, there are a couple of performance bugs related to promotion
> failure handling with ParNew+CMS that can cause this time to balloon.
> Here the unwind of the failed promotion took 177 s. I have at least a
> partial fix for this which I had written up a few months ago but never
> quite got around to collecting sufficient performance data to submit
> it as an official patch.
>
> I'll try and revive that patch and submit it... May be someone else
> can check if it helps sufficiently in the performance with promotion
> failure.
>
> -- ramki
>
>
> On Thu, May 16, 2013 at 9:19 PM, the.6th.month at gmail.com<the.6th.month at gmail.com> <the.6th.month at gmail.com> wrote:
>
> hi, all:
> We just had a situation that I don't quite understand with CMS gc. When
> I
> examined the gc log, I found that there was a cms gc which resulted in
> a
> parnew promotion failure and concurrent mode failure at the same time,
> and
> then the full gc lasted for slightly over three minutes. Here is the gc
> log:
> 2013-05-17T10:12:55.983+0800: 45168.774: [CMS-concurrent-mark:
> 7.056/7.860
> secs] [Times: user=14.90 sys=0.45, real=7.86 secs]
> 2013-05-17T10:12:55.984+0800: 45168.775:
> [CMS-concurrent-preclean-start]
> 2013-05-17T10:12:56.753+0800: 45169.544: [CMS-concurrent-preclean:
> 0.676/0.770 secs] [Times: user=0.83 sys=0.15, real=0.77 secs]
> 2013-05-17T10:12:56.753+0800: 45169.544:
> [CMS-concurrent-abortable-preclean-start]
> 2013-05-17T10:12:58.460+0800: 45171.251: [GC 45171.252: [ParNew
> (promotion
> failed)
> Desired survivor size 67108864 bytes, new threshold 1 (max 6)
> - age 1: 70527216 bytes, 70527216 total
> : 917504K->917504K(917504K), 177.3558880 secs]45348.608: [CMS CMS:
> abort
> preclean due to time 2013-05-17T10:15:56.197+0800: 45348.989:
> [CMS-concurrent-abortable-preclean: 2.037/179.444 secs] [Times:
> user=44.72
> sys=13.59, real=179.45 secs]
> (concurrent mode failure): 3017323K->2177093K(3047424K), 16.5528620
> secs]
> 3879476K->2177093K(3964928K), [CMS Perm : 91333K->91008K(262144K)],
> 193.9097970 secs] [Times: user=58.79 sys=13.55, real=193.91 secs]
>
> the usual cms full gc time was roughly 100ms-400ms, but this time it
> lasted
> for 193 seconds. I understand that when there's a parnew gc happens
> during
> cms and to space is not large enough to hold all survived objects, or
> the
> remaining space in old gen cannot cope with memory allocation in old
> gen,
> full gc happens. But I don't understand why it hangs so long.
> I am using oracle jdk 1.6.0_37, and the jvm options we use are:
> -Xms4000m -Xmx4000m -Xmn1G -Xss256k -XX:PermSize=256m
> -XX:SurvivorRatio=6
> -XX:MaxTenuringThreshold=6 -XX:+DisableExplicitGC -Xnoclassgc
> -Xverify:none
> -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
> -XX:+UseCMSCompactAtFullCollection -XX:+UseFastAccessorMethods
> -XX:+CMSPermGenSweepingEnabled -XX:+CMSClassUnloadingEnabled
> -XX:+UseCompressedOops -XX:CMSInitiatingOccupancyFraction=90
> -XX:+UseCMSInitiatingOccupancyOnly
>
> Could it be a bug that results in the long full gc in case of promotion
> failure or something else? Could anyone offer me some help, and I
> really
> appreciate your help.
>
> Looking forward to any reply.
>
> All the best,
> Leon
>
>
> _______________________________________________
> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
> _______________________________________________
> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130518/9c873d4f/attachment.html
More information about the hotspot-gc-use
mailing list