Periodic long minor GC pauses
Paul Hohensee
paul.hohensee at oracle.com
Thu May 5 19:24:25 UTC 2011
A long shot, but...
If this is a recent version of Linux, perhaps you're running the Completely
Fair Scheduler and this is an artifact. If so, try reverting to the old
scheduler.
Paul
On 5/5/11 3:11 PM, Shane Cox wrote:
> Jon,
>
> Thanks for the suggestion. Interestingly enough, adding
> -XX:+AlwaysPreTouch increased the frequency of the problem by 2x.
> Although this didn't solve the problem, I think it offers a clue.
>
> Touching all of the memory pages on startup will leave fewer pages on
> the free list, no? Could fewer pages on the free list trigger an OS
> behavior that interferes with an activity in the GC prologue? All
> that comes to mind is the linux page scanner (recycling pages or
> pruning the page cache) - not sure how this could interfere with the
> GC prologue.
>
> According to the free command, swap usage remained 0 throughout the
> test ... so I'm not suspecting swap at this time.
>
> FYI, I ran the application/test on a Solaris 10 host and was unable to
> recreate the problem. So the problem appears to be linux-specific (RHEL).
>
>
> Any additional thoughts would be appreciated.
>
> Shane
>
>
> On Wed, Apr 27, 2011 at 12:34 AM, Jon Masamitsu
> <jon.masamitsu at oracle.com <mailto:jon.masamitsu at oracle.com>> wrote:
>
> Shane,
>
> Have you tried running with -XX:+AlwaysPreTouch ? We've occasionally
> seen intermittent
> long pauses as the heap grows into newly committed pages. This flag
> causes pages
> to be touched as they are committed. I don't know how this fits into
> Ramki's
> observation but it might be worth a shot.
>
> Jon
>
>
> On 4/26/2011 12:40 PM, Y. S. Ramakrishna wrote:
> > Well-spotted; it's a version of the same problem as near as
> > i can tell. Please make sure to include a sizable GC log with
> > your bug report (starting from VM start-up, so we can see if
> > there is any clue in when the problem first starts during
> > the life of the VM).
> >
> > thanks.
> > -- ramki
> >
> > On 04/26/11 11:29, Shane Cox wrote:
> >> Below is an example from a Remark. Of the total 1.3 seconds of
> elapsed
> >> time, 1.2 seconds is found between the first two timestamps.
> However,
> >> I'm not savvy enough to know whether this is the same problem
> or simply
> >> the result of a long scavenge that occurs as part of the
> Remark. Is
> >> there any way to tell?
> >>
> >> 2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG occupancy: 712500 K
> >> (943744 K)]9467.353: [Rescan (parallel) , 0.0106370
> secs]9467.374: [weak
> >> refs processing, 0.0159250 secs]9467.390: [class unloading,
> 0.0180420
> >> secs]9467.408: [scrub symbol& string tables, 0.0458500 secs] [1
> >> CMS-remark: 12520949K(24117248K)] 13233450K(25060992K),
> 0.1052950 secs]
> >> [Times: user=0.13 sys=0.01, real=1.32 secs]
> >>
> >>
> >> On Tue, Apr 26, 2011 at 2:17 PM, Y. S. Ramakrishna
> >> <y.s.ramakrishna at oracle.com
> <mailto:y.s.ramakrishna at oracle.com><mailto:y.s.ramakrishna at oracle.com
> <mailto:y.s.ramakrishna at oracle.com>>> wrote:
> >>
> >> I had a quick look and all i could find was the GC prologue
> >> code (although i didn't look all that carefully).
> >> Bascially, GC is invoked, it prints this timestamp,
> >> does a bit of global book-keeping and some initialization,
> >> and then goes over each generation in the heap and
> >> says "i am going to do a collection, do whatever you need
> >> to do before i do the collection", and the generations
> each do a bit of
> >> book-keeping and any relevant initialization.
> >>
> >> The only thing i can see in the gc prologues other than a bit
> >> of lightweight book-keeping is some reporting code that could
> >> potentially be heavyweight. But you do not have any of those
> >> enabled in your option set, so there should not be anything
> >> obviously heavyweight going on.
> >>
> >> I'd suggest filing a bug under the category of
> >> jvm/hotspot/garbage_collector
> >> so someone in support can work with you to get this
> diagnosed...
> >>
> >> Three questions when you file the bug:
> >> (1) have you seen this start happening recently? (version?)
> >> (2) can you check if the longer pauses are "random" or do
> >> they always happen "during" CMS concurrent cycles or
> >> always outside of such cycles?
> >> (3) test set-up.
> >>
> >> -- ramki
> >>
> >>
> >> On 04/26/11 10:45, Y. S. Ramakrishna wrote:
> >>
> >> The pause is definitely in the beginning, before GC
> collection code
> >> itself runs; witness the timestamps:-
> >>
> >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075:
> [ParNew:
> >> 943744K->79296K(943744K), 0.0559560 secs]
> >> 4989811K->4187520K(25060992K), 0.0563970 secs] [Times:
> user=0.31
> >> sys=0.09, real=2.45 secs]
> >>
> >> The first timestamp is 2120.686 and the next one is
> 2123.075, so
> >> we have
> >> about 2.389 s between those two. If you add to that
> the GC time
> >> of 0.056 s,
> >> you get 2.445 which is close enough to the 2.45 s
> reported.
> >>
> >> So we need to figure out what happens in the JVM
> between those two
> >> time-stamps and we can at least bound the culprit.
> >>
> >> -- ramki
> >>
> >> On 04/26/11 10:36, Shane Cox wrote:
> >>
> >> Periodically, our Java app on Linux experiences a
> long Minor
> >> GC pause that cannot be accounted for by the GC
> time in the
> >> log file. Instead, the pause is captured as
> "real" (wall
> >> clock) time and is observable in our application
> logs. An
> >> example is below. The GC completed in 56ms, but the
> >> application was paused for 2.45 seconds.
> >>
> >> 2011-04-26T12:50:41.722-0400: 2117.157: [GC 2117.157:
> >> [ParNew: 943439K->104832K(943744K), 0.0481790 secs]
> >> 4909998K->4086751K(25060992K), 0.0485110 secs] [Times:
> >> user=0.34 sys=0.03, real=0.04 secs]
> >> 2011-04-26T12:50:43.882-0400: 2119.317: [GC 2119.317:
> >> [ParNew: 942852K->104832K(943744K), 0.0738000 secs]
> >> 4924772K->4150899K(25060992K), 0.0740980 secs] [Times:
> >> user=0.45 sys=0.12, real=0.07 secs]
> >> 2011-04-26T12:50:45.251-0400: 2120.686: [GC 2123.075:
> >> [ParNew: 943744K->79296K(943744K), 0.0559560 secs]
> >> 4989811K->4187520K(25060992K), 0.0563970 secs] [Times:
> >> user=0.31 sys=0.09, *real=2.45 secs]*
> >> 2011-04-26T12:50:48.493-0400: 2123.928: [GC 2123.928:
> >> [ParNew: 918208K->81040K(943744K), 0.0396620 secs]
> >> 5026432K->4189265K(25060992K), 0.0400030 secs] [Times:
> >> user=0.32 sys=0.00, real=0.04 secs]
> >> 2011-04-26T12:50:51.010-0400: 2126.445: [GC 2126.445:
> >> [ParNew: 919952K->104832K(943744K), 0.0845070 secs]
> >> 5028177K->4268050K(25060992K), 0.0848300 secs] [Times:
> >> user=0.52 sys=0.11, real=0.09 secs]
> >>
> >>
> >> Initially I suspected swapping, but according to
> the free
> >> command, 0 bytes of swap are in use.
> >> >free -m
> >> total used free shared
> >> buffers cached
> >> Mem: 32168 28118 4050 0
> >> 824 12652
> >> -/+ buffers/cache: 14641 17527
> >> Swap: 8191 0 8191
> >>
> >>
> >> Next, I read about a problem relating to
> mprotect() on Linux
> >> that can be worked around with -XX:+UseMember. I
> tried
> >> that, but I still see the same unexplainable pauses.
> >>
> >>
> >> Any suggestions/ideas? We've upgraded to the
> latest JDK,
> >> but no luck.
> >>
> >> Thanks,
> >> Shane
> >>
> >>
> >> java version "1.6.0_25"
> >> Java(TM) SE Runtime Environment (build 1.6.0_25-b06)
> >> Java HotSpot(TM) 64-Bit Server VM (build 20.0-b11,
> mixed mode)
> >>
> >>
> >> Linux 2.6.18-128.el5 #1 SMP Wed Jan 21 08:45:05
> EST 2009
> >> x86_64 x86_64 x86_64 GNU/Linux
> >>
> >>
> >> -verbose:gc -Xms24g -Xmx24g -Xmn1g -Xss256k
> >> -XX:PermSize=256m -XX:MaxPermSize=256m
> >> -XX:+PrintTenuringDistribution -XX:+UseConcMarkSweepGC
> >> -XX:+CMSParallelRemarkEnabled
> >> -XX:CMSInitiatingOccupancyFraction=70
> >> -XX:+CMSClassUnloadingEnabled -XX:+PrintGCDetails
> >> -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC
> >> -XX:+HeapDumpOnOutOfMemoryError
> -XX:+UseCompressedStrings
> >> -XX:+UseMembar
> >>
> >>
> >>
> ------------------------------------------------------------------------
> >>
> >> _______________________________________________
> >> hotspot-gc-use mailing list
> >> hotspot-gc-use at openjdk.java.net
> <mailto:hotspot-gc-use at openjdk.java.net>
> >> <mailto:hotspot-gc-use at openjdk.java.net
> <mailto:hotspot-gc-use at openjdk.java.net>>
> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> >>
> >> _______________________________________________
> >> hotspot-gc-use mailing list
> >> hotspot-gc-use at openjdk.java.net
> <mailto:hotspot-gc-use at openjdk.java.net>
> >> <mailto:hotspot-gc-use at openjdk.java.net
> <mailto:hotspot-gc-use at openjdk.java.net>>
> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> >>
> >>
> > _______________________________________________
> > hotspot-gc-use mailing list
> > hotspot-gc-use at openjdk.java.net
> <mailto:hotspot-gc-use at openjdk.java.net>
> > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> <mailto:hotspot-gc-use at openjdk.java.net>
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20110505/e1b945aa/attachment.htm>
More information about the hotspot-gc-dev
mailing list