<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-15"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
David, I missed one of the longer pauses that you'd specifically
drawn attention to:-<br>
<br>
<blockquote cite="mid:4E5C88D1.9040401@oracle.com" type="cite"> On
8/29/2011 2:25 AM, David Tavoularis wrote:
<blockquote cite="mid:op.v0yxs0vga5r2ku@nb07-spt1.mycom-int.fr"
type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-15">
<style type="text/css">
<!--
body
{font-family:'Calibri';
font-size:18px}
--></style>
<div><strong>1. Here is the first pattern : a <u>61-second
pause</u>, but I don't see any suspicious message in GC
logs:</strong><br>
<font face="'Consolas',monospace">...<small></small><small><br>
</small><small>2011-08-24T10:25:07.776+0100: 44242.537: [GC
44301.853: [ParNew</small><small><br>
</small><small>Desired survivor size 53673984 bytes, new
threshold 1 (max 4)</small><small><br>
</small><small>- age 1: 99505080 bytes, 99505080 total</small><small><br>
</small><small>: 943744K->104832K(943744K), 0.2010508
secs] 21542906K->20852742K(50226816K), 0.2022636 secs]
<font color="#ff0000"><strong>[Times: user=5.67 sys=0.02,
real=59.52 secs]</strong></font></small><small><br>
</small></font></div>
</blockquote>
</blockquote>
<br>
If you look at the timestamps above, the GC event starts off at
44242.537 seconds, but then the GC itself does not<br>
commence until 44301.853 seconds, i.e. a full 59.32 seconds later.
So the pause is associated not with<br>
GC work itself (which is correctly reported as 202 ms), but rather
with a preamble to the GC, perhaps<br>
with bringing threads to a safepoint, I am guessing. Once again
-XX:+PrintSafepointStatistics (which<br>
i mentioned in previous email wrt the 20 s pause in the middle of
noweher) would likely provide<br>
some clues. I have heard apocryphal stories of -XX:+UseMembar having
worked to get rid of<br>
overly long safepointing pauses,. and I have heard
-XX:-UseBiasedLocking for pauses associated<br>
with bulk bias revocations. But, without +PrintSafepointStatistics
data to draw inferences<br>
from, those incantations would just constitute superstitious
mumbo-jumbo.<br>
<br>
-- ramki<br>
<br>
<blockquote cite="mid:4E5C88D1.9040401@oracle.com" type="cite">
<blockquote cite="mid:op.v0yxs0vga5r2ku@nb07-spt1.mycom-int.fr"
type="cite">
<div><font face="'Consolas',monospace"><small> </small><small>Heap
after GC invocations=1188 (full 12):</small><small><br>
</small><small>par new generation total 943744K, used
104832K [0xfffffff353c00000, 0xfffffff393c00000,
0xfffffff393c00000)</small><small><br>
</small><small>eden space 838912K, 0% used
[0xfffffff353c00000, 0xfffffff353c00000,
0xfffffff386f40000)</small><small><br>
</small><small>from space 104832K, 100% used
[0xfffffff38d5a0000, 0xfffffff393c00000,
0xfffffff393c00000)</small><small><br>
</small><small>to space 104832K, 0% used
[0xfffffff386f40000, 0xfffffff386f40000,
0xfffffff38d5a0000)</small><small><br>
</small><small>concurrent mark-sweep generation total
49283072K, used 20747910K [0xfffffff393c00000,
0xffffffff53c00000, 0xffffffff53c00000)</small><small><br>
</small><small>concurrent-mark-sweep perm gen total 524288K,
used 42905K [0xffffffff53c00000, 0xffffffff73c00000,
0xffffffff73c00000)</small><small><br>
</small><small>}</small><small><br>
</small><small><strong><font color="#ff0000">Total time for
which application threads were stopped: 61.5576519
seconds</font></strong></small><small><br>
</small><small>Application time: 0.0245838 seconds</small><small><br>
</small><small>Total time for which application threads were
stopped: 9.8331189 seconds</small><small><br>
</small><small>Application time: 0.0012626 seconds</small><small><br>
</small><small>Total time for which application threads were
stopped: 0.0090404 seconds</small><small><br>
</small><small>Application time: 0.0008943 seconds</small><small><br>
</small><small>Total time for which application threads were
stopped: 0.0020415 seconds</small><small><br>
</small><small>Application time: 0.0008181 seconds</small><small><br>
</small><small>Total time for which application threads were
stopped: 0.2338605 seconds</small><small><br>
</small><small>Application time: 0.0018822 seconds</small></font><br>
<br>
The only suspicious thing is <font
face="'Consolas',monospace"><small>"[Times: user=5.67
sys=0.02, real=59.52 secs]"</small></font>, which means
that the "real" duration is a lot higher than "user" CPU time.<br>
Because "sys" duration is low, it also means that the server
is not swapping.<br>
</div>
<div>What could explain this 61 seconds pause ? <br>
</div>
</blockquote>
</blockquote>
</body>
</html>