long unaccounted pause during concurrent mark phase of ParNew cycle
atulksh
atulksh at hotmail.com
Fri Feb 5 16:47:51 UTC 2010
Hi Jon,
We are currently still using Update 2 on our servers. We see the same kind
of pause behavior on Update 2 as well as Update 17. The pause behavior does
not seem to change between these two versions.
Thank you,
Atul
Jon Masamitsu wrote:
>
> Atul,
>
> You mentioned that you were on jdk6 update 2 initially.
> Did you see this behavior with earlier releases?
>
> Jon
>
> On 02/04/10 20:02, atulksh wrote:
>> Hi Jon,
>> I have some more (probably more precise) information for you. I have been
>> able to reproduce this issue in our test environment (up till now we were
>> only seeing it in our production setup). Along with the parameters that I
>> have specified above I added the following.
>>
>>
>>
>> After loading up the app server (it is Tomcat 6.0.14) I got a pause
>> during
>> the CMS cycle of about 32 seconds. The way I noticed the pause is like
>> this.
>> I kept on running a script that does some basic page refresh (small data
>> page) in the application continously. This is like a ping to the server.
>> Then I parallely do operations that load up the memory in the application
>> server. These are the real hits that users do in whom memory requirement
>> is
>> usually high and that drives the app server to the CMS threshold.
>>
>> In my case once the CMS kicked in, I saw that the period where the
>> applilcation was in the "pause" no requests of my script were processed.
>> They were just queued up to Tomcat - meaning Apache got those requests
>> and
>> logged that they were sent to Tomcat. But Tomcat did not register of
>> receiving them. Then after the app went out of the "pause" then
>> everything
>> came back and all these pending requests got processed - like the app
>> just
>> had been unfrozen.
>>
>> During this freeze, what I see in the GC log is that the CMS cycle is
>> almost
>> at the end of the concurrent mark phase. It is doing the GC is doing the
>> "perm space scanning". The pause exactly coincides with the time reported
>> for the "perm space scanning". I am attaching the full gc log snippt
>> right
>> from the start of the CMS cycle. Below is just the place where the pause
>> appears.
>>
>> 28676.784: [GC 28676.784: [ParNew
>> Desired survivor size 4358144 bytes, new threshold 2 (max 4)
>> - age 1: 3742608 bytes, 3742608 total
>> - age 2: 3451152 bytes, 7193760 total
>> : 76672K->8512K(76672K), 0.2045430 secs] 20262376K->20197311K(27254464K),
>> 0.2048030 secs] [Times: user=0.77 sys=0.01, real=0.21 secs]
>> Total time for which application threads were stopped: 0.2055080 seconds
>> Finished perm space scanning in 2th thread: 0.367 sec
>> Finished perm space scanning in 0th thread: 0.362 sec
>> Application time: 0.1481860 seconds
>> 28708.626: [GC 28708.627: [ParNew
>> Desired survivor size 4358144 bytes, new threshold 2 (max 4)
>> - age 1: 2079760 bytes, 2079760 total
>> - age 2: 3657248 bytes, 5737008 total
>> : 76672K->6795K(76672K), 0.1937080 secs] 20265471K->20198965K(27254464K),
>> 0.1939480 secs] [Times: user=0.74 sys=0.00, real=0.20 secs]
>> Total time for which application threads were stopped: 0.1948820 seconds
>> Finished perm space scanning in 3th thread: 32.153 sec
>> Finished perm space scanning in 1th thread: 32.097 sec
>> Finished work stealing in 1th thread: 0.000 sec
>> Finished work stealing in 3th thread: 0.002 sec
>> Finished work stealing in 2th thread: 31.791 sec
>> Finished work stealing in 0th thread: 31.791 sec
>> 28708.825: [CMS-concurrent-mark: 88.906/158.952 secs]
>> (CMS-concurrent-mark
>> yielded 454 times)
>> [Times: user=544.33 sys=4.59, real=158.95 secs]
>>
>> What is "perm space scanning" and is this where the gc would freeze the
>> application (STW type). Is this a bug or something that is expected?
>>
>> Please let me know. This has become quite critical for our application.
>> Please see the full gc log file attached. Your help very much
>> appreciated.
>>
>> Thank you,
>>
>> Atul
>>
>>
>> Jon Masamitsu wrote:
>>> Atul,
>>>
>>> Do you mean the 36 secs here?
>>>
>>> > Thu Feb 4 10:13:35 2010: [CMS-concurrent-mark: 36.272/36.715 secs]
>>> [Times:
>>> > user=56.34 sys=0.34, real=36.71 secs]
>>>
>>> That is a concurrent phase (i.e., the application threads
>>> are still running although they have to share the hardware
>>> threads with the concurrent GC threads).
>>>
>>> Jon
>>>
>>> On 02/04/10 12:37, atulksh wrote:
>>>> Hi,
>>>> We have an application that is freezing (pausing) for a very long (tens
>>>> of
>>>> seconds) time during the concurrent mark phase of the ParNew cycle.
>>>> This
>>>> pause is consistently seen when there is good amount of activity on the
>>>> server. We tried to upgrade the JVM (JRE 6) from Update 02 to Update 17
>>>> keeping all the parameters same, but the pauses did not go away (they
>>>> became
>>>> more consistent and slightly longer). We even tried to enable
>>>> CompressedOops
>>>> option, thinking that that would reduce memory requirements, but that
>>>> has
>>>> not helped.
>>>>
>>>> We are running on 64bit SUSE 10.1 platform with 32Gb of RAM and 8 core
>>>> machine. 26g is allocated to the java process. The full set of VM
>>>> parameters
>>>> are here
>>>>
>>>>
>>>> -Xmx26g
>>>> -Xms26g
>>>> -server
>>>> -XX:+UseConcMarkSweepGC
>>>> -XX:PermSize=256m
>>>> -XX:MaxPermSize=256m
>>>> -XX:+PrintTenuringDistribution
>>>> -XX:+PrintGCDetails
>>>> -XX:+PrintGCTimeStamps
>>>> -XX:CMSInitiatingOccupancyFraction=70
>>>> -XX:-CMSParallelRemarkEnabled
>>>>
>>>> The nature of the pause is like this. All the application threads that
>>>> are
>>>> being processed seem to stop responding. New requests that come to our
>>>> app
>>>> server are not processed and get queued. When the pause ends then
>>>> everything
>>>> frees up and seems to go on normally henceforth. In the GC you see a
>>>> gap
>>>> between the ParNews. The following are two examples of the pauses.
>>>> Notice
>>>> the pause from 10:13:06 to 10:13:35 (29 seconds) in the first one and
>>>> 11:29:04 to 11:29:17 (13 seconds) in the second one. These pauses only
>>>> occur
>>>> in the concurrent mark phase. They do not seem to occur at any other
>>>> time
>>>> or
>>>> anywhere else in the CMS cycle.
>>>>
>>>> ....
>>>> Thu Feb 4 10:12:59 2010: [GC [1 CMS-initial-mark:
>>>> 18976170K(27092608K)]
>>>> 18989786K(27245952K), 0.0059140 secs] [Times: user=0.01 sys=0.00,
>>>> real=0.01
>>>> secs]
>>>> Thu Feb 4 10:12:59 2010: [CMS-concurrent-mark-start]
>>>> Thu Feb 4 10:12:59 2010 [18,980,009 K]: [GC 42035.293: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 4 (max 4) - age 1:
>>>> 1833064
>>>> bytes, 1833064 total - age 2: 18472 bytes, 1851536 total :
>>>> 148315K->3839K(153344K), 0.0400230 secs]
>>>> 19124486K->18980009K(27245952K),
>>>> 0.0401580 secs] [Times: user=0.25 sys=0.00, real=0.04 secs]
>>>> Thu Feb 4 10:12:59 2010 [18,979,036 K]: [GC 42035.415: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 4 (max 4) - age 1:
>>>> 2068376
>>>> bytes, 2068376 total - age 2: 23792 bytes, 2092168 total -
>>>> age
>>>> 3: 18472 bytes, 2110640 total : 140159K->2866K(153344K),
>>>> 0.0352200
>>>> secs] 19116329K->18979036K(27245952K), 0.0353520 secs] [Times:
>>>> user=0.29
>>>> sys=0.00, real=0.03 secs]
>>>> Thu Feb 4 10:12:59 2010 [18,979,103 K]: [GC 42035.535: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 4 (max 4) - age 1:
>>>> 1782608
>>>> bytes, 1782608 total - age 2: 22032 bytes, 1804640 total -
>>>> age
>>>> 3: 23792 bytes, 1828432 total - age 4: 18472 bytes,
>>>> 1846904 total : 139186K->2933K(153344K), 0.0498480 secs]
>>>> 19115356K->18979103K(27245952K), 0.0499850 secs] [Times: user=0.24
>>>> sys=0.00,
>>>> real=0.05 secs]
>>>> Thu Feb 4 10:12:59 2010 [18,979,181 K]: [GC 42035.673: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 4 (max 4) - age 1:
>>>> 2161000
>>>> bytes, 2161000 total - age 2: 22032 bytes, 2183032 total -
>>>> age
>>>> 3: 22032 bytes, 2205064 total - age 4: 23792 bytes,
>>>> 2228856 total : 139253K->2992K(153344K), 0.0400420 secs]
>>>> 19115423K->18979181K(27245952K), 0.0401730 secs] [Times: user=0.26
>>>> sys=0.00,
>>>> real=0.04 secs]
>>>> Thu Feb 4 10:13:06 2010 [18,979,876 K]: [GC 42042.243: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 4 (max 4) - age 1:
>>>> 2113272
>>>> bytes, 2113272 total - age 2: 23792 bytes, 2137064 total -
>>>> age
>>>> 3: 22032 bytes, 2159096 total - age 4: 22032 bytes,
>>>> 2181128 total : 138855K->3662K(153344K), 0.0465500 secs]
>>>> 19115044K->18979876K(27245952K), 0.0466840 secs] [Times: user=0.27
>>>> sys=0.00,
>>>> real=0.04 secs]
>>>> Thu Feb 4 10:13:35 2010 [18,979,051 K]: [GC 42071.809: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 4 (max 4) - age 1:
>>>> 219056
>>>> bytes, 219056 total - age 2: 1867632 bytes, 2086688 total -
>>>> age
>>>> 3: 23792 bytes, 2110480 total - age 4: 22032 bytes,
>>>> 2132512 total : 139982K->2814K(153344K), 0.0426920 secs]
>>>> 19116196K->18979051K(27245952K), 0.0428200 secs] [Times: user=0.28
>>>> sys=0.00,
>>>> real=0.05 secs]
>>>> Thu Feb 4 10:13:35 2010: [CMS-concurrent-mark: 36.272/36.715 secs]
>>>> [Times:
>>>> user=56.34 sys=0.34, real=36.71 secs]
>>>> ....
>>>>
>>>>
>>>> ....
>>>> Thu Feb 4 11:28:59 2010: [GC [1 CMS-initial-mark:
>>>> 19012252K(27092608K)]
>>>> 19032719K(27245952K), 0.0279440 secs] [Times: user=0.02 sys=0.00,
>>>> real=0.03
>>>> secs]
>>>> Thu Feb 4 11:28:59 2010: [CMS-concurrent-mark-start]
>>>> Thu Feb 4 11:28:59 2010 [19,154,661 K]: [GC 46595.614: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 1 (max 4) - age 1:
>>>> 17431976
>>>> bytes, 17431976 total : 153344K->17024K(153344K), 0.9259390 secs]
>>>> 19165596K->19154661K(27245952K), 0.9260890 secs] [Times: user=1.23
>>>> sys=0.05,
>>>> real=0.93 secs]
>>>> Thu Feb 4 11:29:00 2010 [19,199,696 K]: [GC 46596.973: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 1 (max 4) - age 1:
>>>> 17448432
>>>> bytes, 17448432 total : 153344K->17024K(153344K), 0.3054460 secs]
>>>> 19290981K->19199696K(27245952K), 0.3055930 secs] [Times: user=0.79
>>>> sys=0.07,
>>>> real=0.31 secs]
>>>> Thu Feb 4 11:29:01 2010 [19,229,358 K]: [GC 46597.722: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 1 (max 4) - age 1:
>>>> 10331408
>>>> bytes, 10331408 total : 153344K->17024K(153344K), 0.1449040 secs]
>>>> 19336016K->19229358K(27245952K), 0.1450630 secs] [Times: user=0.59
>>>> sys=0.02,
>>>> real=0.15 secs]
>>>> Thu Feb 4 11:29:02 2010 [19,240,015 K]: [GC 46598.038: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 1 (max 4) - age 1:
>>>> 17354120
>>>> bytes, 17354120 total : 153344K->17024K(153344K), 0.0935440 secs]
>>>> 19365678K->19240015K(27245952K), 0.0936870 secs] [Times: user=0.59
>>>> sys=0.00,
>>>> real=0.10 secs]
>>>> Thu Feb 4 11:29:02 2010 [19,302,509 K]: [GC 46598.347: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 1 (max 4) - age 1:
>>>> 17432360
>>>> bytes, 17432360 total : 151519K->17024K(153344K), 0.4619430 secs]
>>>> 19374510K->19302509K(27245952K), 0.4620890 secs] [Times: user=1.19
>>>> sys=0.02,
>>>> real=0.46 secs]
>>>> Thu Feb 4 11:29:03 2010 [19,329,594 K]: [GC 46599.030: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 1 (max 4) - age 1:
>>>> 11933296
>>>> bytes, 11933296 total : 153344K->17024K(153344K), 0.1492670 secs]
>>>> 19438829K->19329594K(27245952K), 0.1494200 secs] [Times: user=0.77
>>>> sys=0.03,
>>>> real=0.15 secs]
>>>> Thu Feb 4 11:29:03 2010 [19,337,176 K]: [GC 46599.423: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 4 (max 4) - age 1:
>>>> 5719032
>>>> bytes, 5719032 total : 153344K->12852K(153344K), 0.0675180 secs]
>>>> 19465914K->19337176K(27245952K), 0.0676670 secs] [Times: user=0.46
>>>> sys=0.02,
>>>> real=0.07 secs]
>>>> Thu Feb 4 11:29:03 2010 [19,338,256 K]: [GC 46599.754: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 2 (max 4) - age 1:
>>>> 6783616
>>>> bytes, 6783616 total - age 2: 5570080 bytes, 12353696 total :
>>>> 149172K->13931K(153344K), 0.0650520 secs]
>>>> 19473496K->19338256K(27245952K),
>>>> 0.0652060 secs] [Times: user=0.42 sys=0.01, real=0.07 secs]
>>>> Thu Feb 4 11:29:04 2010 [19,340,264 K]: [GC 46600.144: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 2 (max 4) - age 1:
>>>> 2190136
>>>> bytes, 2190136 total - age 2: 6762648 bytes, 8952784 total :
>>>> 150251K->10375K(153344K), 0.0709470 secs]
>>>> 19474576K->19340264K(27245952K),
>>>> 0.0710920 secs] [Times: user=0.48 sys=0.01, real=0.07 secs]
>>>> Thu Feb 4 11:29:17 2010 [19,345,928 K]: [GC 46613.628: [ParNew Desired
>>>> survivor size 8716288 bytes, new threshold 4 (max 4) - age 1:
>>>> 760488
>>>> bytes, 760488 total - age 2: 2173472 bytes, 2933960 total :
>>>> 146695K->9279K(153344K), 0.0689390 secs]
>>>> 19476584K->19345928K(27245952K),
>>>> 0.0691070 secs] [Times: user=0.45 sys=0.00, real=0.07 secs]
>>>> Thu Feb 4 11:29:17 2010: [CMS-concurrent-mark: 15.853/18.260 secs]
>>>> [Times:
>>>> user=29.26 sys=0.55, real=18.26 secs]
>>>> Thu Feb 4 11:29:17 2010: [CMS-concurrent-preclean-start]
>>>> ....
>>>>
>>>> Any idea what may causing these pauses and what possible JVM tuning
>>>> would
>>>> be
>>>> neccessary to avoid these pauses? Your help greatly appreciated.
>>>>
>>>> Thank you very much.
>>>>
>>>> Atul
>>>
>> http://old.nabble.com/file/p27463304/long.pause.in.cms.gc.log
>> long.pause.in.cms.gc.log
>
>
--
View this message in context: http://old.nabble.com/long-unaccounted-pause-during-concurrent-mark-phase-of-ParNew-cycle-tp27459365p27470597.html
Sent from the OpenJDK Hotspot Garbage Collection mailing list archive at Nabble.com.
More information about the hotspot-gc-dev
mailing list