<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Forgot Shane...<br>
<br>
Paul<br>
<br>
On 5/5/11 3:24 PM, Paul Hohensee wrote:
<blockquote cite="mid:4DC2F969.10902@oracle.com" type="cite">
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
A long shot, but...<br>
<br>
If this is a recent version of Linux, perhaps you're running the
Completely<br>
Fair Scheduler and this is an artifact. If so, try reverting to
the old scheduler.<br>
<br>
Paul<br>
<br>
On 5/5/11 3:11 PM, Shane Cox wrote:
<blockquote
cite="mid:BANLkTikTOhmFNQeViDr0Lr=3wMAhfjCYvA@mail.gmail.com"
type="cite">Jon,<br>
<br>
Thanks for the suggestion. Interestingly enough, adding
-XX:+AlwaysPreTouch increased the frequency of the problem by
2x. Although this didn't solve the problem, I think it offers a
clue.<br>
<br>
Touching all of the memory pages on startup will leave fewer
pages on the free list, no? Could fewer pages on the free list
trigger an OS behavior that interferes with an activity in the
GC prologue? All that comes to mind is the linux page scanner
(recycling pages or pruning the page cache) - not sure how this
could interfere with the GC prologue.<br>
<br>
According to the free command, swap usage remained 0 throughout
the test ... so I'm not suspecting swap at this time.<br>
<br>
FYI, I ran the application/test on a Solaris 10 host and was
unable to recreate the problem. So the problem appears to be
linux-specific (RHEL).<br>
<br>
<br>
Any additional thoughts would be appreciated.<br>
<br>
Shane<br>
<br>
<br>
<div class="gmail_quote">On Wed, Apr 27, 2011 at 12:34 AM, Jon
Masamitsu <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:jon.masamitsu@oracle.com">jon.masamitsu@oracle.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt
0.8ex; border-left: 1px solid rgb(204, 204, 204);
padding-left: 1ex;">Shane,<br>
<br>
Have you tried running with -XX:+AlwaysPreTouch ? We've
occasionally<br>
seen intermittent<br>
long pauses as the heap grows into newly committed pages.
This flag<br>
causes pages<br>
to be touched as they are committed. I don't know how this
fits into<br>
Ramki's<br>
observation but it might be worth a shot.<br>
<font color="#888888"><br>
Jon<br>
</font>
<div>
<div class="h5"><br>
<br>
On 4/26/2011 12:40 PM, Y. S. Ramakrishna wrote:<br>
> Well-spotted; it's a version of the same problem as
near as<br>
> i can tell. Please make sure to include a sizable
GC log with<br>
> your bug report (starting from VM start-up, so we
can see if<br>
> there is any clue in when the problem first starts
during<br>
> the life of the VM).<br>
><br>
> thanks.<br>
> -- ramki<br>
><br>
> On 04/26/11 11:29, Shane Cox wrote:<br>
>> Below is an example from a Remark. Of the
total 1.3 seconds of elapsed<br>
>> time, 1.2 seconds is found between the first
two timestamps. However,<br>
>> I'm not savvy enough to know whether this is
the same problem or simply<br>
>> the result of a long scavenge that occurs as
part of the Remark. Is<br>
>> there any way to tell?<br>
>><br>
>> 2011-04-25T14:38:40.215-0400: 9466.139: [GC[YG
occupancy: 712500 K<br>
>> (943744 K)]9467.353: [Rescan (parallel) ,
0.0106370 secs]9467.374: [weak<br>
>> refs processing, 0.0159250 secs]9467.390:
[class unloading, 0.0180420<br>
>> secs]9467.408: [scrub symbol& string
tables, 0.0458500 secs] [1<br>
>> CMS-remark: 12520949K(24117248K)]
13233450K(25060992K), 0.1052950 secs]<br>
>> [Times: user=0.13 sys=0.01, real=1.32 secs]<br>
>><br>
>><br>
>> On Tue, Apr 26, 2011 at 2:17 PM, Y. S.
Ramakrishna<br>
>> <<a moz-do-not-send="true"
href="mailto:y.s.ramakrishna@oracle.com">y.s.ramakrishna@oracle.com</a><mailto:<a
moz-do-not-send="true"
href="mailto:y.s.ramakrishna@oracle.com">y.s.ramakrishna@oracle.com</a>>>
wrote:<br>
>><br>
>> I had a quick look and all i could find
was the GC prologue<br>
>> code (although i didn't look all that
carefully).<br>
>> Bascially, GC is invoked, it prints this
timestamp,<br>
>> does a bit of global book-keeping and some
initialization,<br>
>> and then goes over each generation in the
heap and<br>
>> says "i am going to do a collection, do
whatever you need<br>
>> to do before i do the collection", and the
generations each do a bit of<br>
>> book-keeping and any relevant
initialization.<br>
>><br>
>> The only thing i can see in the gc
prologues other than a bit<br>
>> of lightweight book-keeping is some
reporting code that could<br>
>> potentially be heavyweight. But you do not
have any of those<br>
>> enabled in your option set, so there
should not be anything<br>
>> obviously heavyweight going on.<br>
>><br>
>> I'd suggest filing a bug under the
category of<br>
>> jvm/hotspot/garbage_collector<br>
>> so someone in support can work with you to
get this diagnosed...<br>
>><br>
>> Three questions when you file the bug:<br>
>> (1) have you seen this start happening
recently? (version?)<br>
>> (2) can you check if the longer pauses are
"random" or do<br>
>> they always happen "during" CMS
concurrent cycles or<br>
>> always outside of such cycles?<br>
>> (3) test set-up.<br>
>><br>
>> -- ramki<br>
>><br>
>><br>
>> On 04/26/11 10:45, Y. S. Ramakrishna
wrote:<br>
>><br>
>> The pause is definitely in the
beginning, before GC collection code<br>
>> itself runs; witness the timestamps:-<br>
>><br>
>> 2011-04-26T12:50:45.251-0400:
2120.686: [GC 2123.075: [ParNew:<br>
>> 943744K->79296K(943744K), 0.0559560
secs]<br>
>> 4989811K->4187520K(25060992K),
0.0563970 secs] [Times: user=0.31<br>
>> sys=0.09, real=2.45 secs]<br>
>><br>
>> The first timestamp is 2120.686 and
the next one is 2123.075, so<br>
>> we have<br>
>> about 2.389 s between those two. If
you add to that the GC time<br>
>> of 0.056 s,<br>
>> you get 2.445 which is close enough to
the 2.45 s reported.<br>
>><br>
>> So we need to figure out what happens
in the JVM between those two<br>
>> time-stamps and we can at least bound
the culprit.<br>
>><br>
>> -- ramki<br>
>><br>
>> On 04/26/11 10:36, Shane Cox wrote:<br>
>><br>
>> Periodically, our Java app on
Linux experiences a long Minor<br>
>> GC pause that cannot be accounted
for by the GC time in the<br>
>> log file. Instead, the pause is
captured as "real" (wall<br>
>> clock) time and is observable in
our application logs. An<br>
>> example is below. The GC
completed in 56ms, but the<br>
>> application was paused for 2.45
seconds.<br>
>><br>
>> 2011-04-26T12:50:41.722-0400:
2117.157: [GC 2117.157:<br>
>> [ParNew:
943439K->104832K(943744K), 0.0481790 secs]<br>
>> 4909998K->4086751K(25060992K),
0.0485110 secs] [Times:<br>
>> user=0.34 sys=0.03, real=0.04
secs]<br>
>> 2011-04-26T12:50:43.882-0400:
2119.317: [GC 2119.317:<br>
>> [ParNew:
942852K->104832K(943744K), 0.0738000 secs]<br>
>> 4924772K->4150899K(25060992K),
0.0740980 secs] [Times:<br>
>> user=0.45 sys=0.12, real=0.07
secs]<br>
>> 2011-04-26T12:50:45.251-0400:
2120.686: [GC 2123.075:<br>
>> [ParNew:
943744K->79296K(943744K), 0.0559560 secs]<br>
>> 4989811K->4187520K(25060992K),
0.0563970 secs] [Times:<br>
>> user=0.31 sys=0.09, *real=2.45
secs]*<br>
>> 2011-04-26T12:50:48.493-0400:
2123.928: [GC 2123.928:<br>
>> [ParNew:
918208K->81040K(943744K), 0.0396620 secs]<br>
>> 5026432K->4189265K(25060992K),
0.0400030 secs] [Times:<br>
>> user=0.32 sys=0.00, real=0.04
secs]<br>
>> 2011-04-26T12:50:51.010-0400:
2126.445: [GC 2126.445:<br>
>> [ParNew:
919952K->104832K(943744K), 0.0845070 secs]<br>
>> 5028177K->4268050K(25060992K),
0.0848300 secs] [Times:<br>
>> user=0.52 sys=0.11, real=0.09
secs]<br>
>><br>
>><br>
>> Initially I suspected swapping,
but according to the free<br>
>> command, 0 bytes of swap are in
use.<br>
>> >free -m<br>
>> total used
free shared<br>
>> buffers cached<br>
>> Mem: 32168 28118
4050 0<br>
>> 824 12652<br>
>> -/+ buffers/cache: 14641
17527<br>
>> Swap: 8191 0
8191<br>
>><br>
>><br>
>> Next, I read about a problem
relating to mprotect() on Linux<br>
>> that can be worked around with
-XX:+UseMember. I tried<br>
>> that, but I still see the same
unexplainable pauses.<br>
>><br>
>><br>
>> Any suggestions/ideas? We've
upgraded to the latest JDK,<br>
>> but no luck.<br>
>><br>
>> Thanks,<br>
>> Shane<br>
>><br>
>><br>
>> java version "1.6.0_25"<br>
>> Java(TM) SE Runtime Environment
(build 1.6.0_25-b06)<br>
>> Java HotSpot(TM) 64-Bit Server VM
(build 20.0-b11, mixed mode)<br>
>><br>
>><br>
>> Linux 2.6.18-128.el5 #1 SMP Wed
Jan 21 08:45:05 EST 2009<br>
>> x86_64 x86_64 x86_64 GNU/Linux<br>
>><br>
>><br>
>> -verbose:gc -Xms24g -Xmx24g
-Xmn1g -Xss256k<br>
>> -XX:PermSize=256m
-XX:MaxPermSize=256m<br>
>> -XX:+PrintTenuringDistribution
-XX:+UseConcMarkSweepGC<br>
>> -XX:+CMSParallelRemarkEnabled<br>
>>
-XX:CMSInitiatingOccupancyFraction=70<br>
>> -XX:+CMSClassUnloadingEnabled
-XX:+PrintGCDetails<br>
>> -XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC<br>
>> -XX:+HeapDumpOnOutOfMemoryError
-XX:+UseCompressedStrings<br>
>> -XX:+UseMembar<br>
>><br>
>><br>
>>
------------------------------------------------------------------------<br>
>><br>
>>
_______________________________________________<br>
>> hotspot-gc-use mailing list<br>
>> <a moz-do-not-send="true"
href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a><br>
>> <mailto:<a
moz-do-not-send="true"
href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a>><br>
>> <a moz-do-not-send="true"
href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use"
target="_blank">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a><br>
>><br>
>>
_______________________________________________<br>
>> hotspot-gc-use mailing list<br>
>> <a moz-do-not-send="true"
href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a><br>
>> <mailto:<a moz-do-not-send="true"
href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a>><br>
>> <a moz-do-not-send="true"
href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use"
target="_blank">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a><br>
>><br>
>><br>
> _______________________________________________<br>
> hotspot-gc-use mailing list<br>
> <a moz-do-not-send="true"
href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a><br>
> <a moz-do-not-send="true"
href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use"
target="_blank">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a><br>
_______________________________________________<br>
hotspot-gc-use mailing list<br>
<a moz-do-not-send="true"
href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a><br>
<a moz-do-not-send="true"
href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use"
target="_blank">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a><br>
</div>
</div>
</blockquote>
</div>
<br>
<pre wrap=""><fieldset class="mimeAttachmentHeader"></fieldset>
_______________________________________________
hotspot-gc-use mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:hotspot-gc-use@openjdk.java.net">hotspot-gc-use@openjdk.java.net</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use">http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use</a>
</pre>
</blockquote>
</blockquote>
</body>
</html>