Very long young gc pause (ParNew with CMS)

Chi Ho Kwok chkwok at digibites.nl
Mon Jan 9 21:21:48 PST 2012


Hi Florian,

Uh, you might want to try sar -r as well; that reports memory usage (see
man sar for the other reporting options, and -f /var/log/sysstat/saXX,
where XX is the day of the month, is useful for looking at older data).
High page in / out rates mean the kernel is reading pages from or writing
them to disk, swap included; the usual cause here is one or more huge
background tasks / cron jobs taking up too much memory and forcing other
things to swap out. You can try reducing the size of the heap and see if
that helps if you're only slightly short on memory, but otherwise I don't
think you can solve this with VM options alone.
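
For example, something like this (a rough sketch; the exact saXX file name
and log path depend on how sysstat is set up on your box):

    # memory and swap utilization, sampled every 10 seconds, 6 samples
    sar -r 10 6

    # pages swapped in / out per second
    sar -W 10 6

    # paging statistics for an earlier day of the month (07 is only an example)
    sar -B -f /var/log/sysstat/sa07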


Here's the relevant section from the manual:

       -B     Report paging statistics. Some of the metrics below are
              available only with post 2.5 kernels. The following values
              are displayed:

              pgpgin/s
                     Total number of kilobytes the system paged in from
                     disk per second. Note: with old kernels (2.2.x) this
                     value is a number of blocks per second (and not
                     kilobytes).

              pgpgout/s
                     Total number of kilobytes the system paged out to
                     disk per second. Note: with old kernels (2.2.x) this
                     value is a number of blocks per second (and not
                     kilobytes).

              fault/s
                     Number of page faults (major + minor) made by the
                     system per second. This is not a count of page faults
                     that generate I/O, because some page faults can be
                     resolved without I/O.

              majflt/s
                     Number of major faults the system has made per
                     second, those which have required loading a memory
                     page from disk.


I'm not sure what kernel you're on, but high pgpgin / pgpgout rates are a
bad sign. Sar seems to report that nearly all faults are minor (majflt/s
is close to zero), but that conflicts with the first two columns: that
much paging I/O would normally show up as major faults.
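
If you want to cross-check sar, the kernel's own counters are a quick way
to see whether the faults are really major (a sketch; the counters below
exist in /proc/vmstat on reasonably recent kernels):

    # cumulative counts since boot: take two snapshots a minute apart and diff
    grep -E 'pgpgin|pgpgout|pgfault|pgmajfault' /proc/vmstat
    sleep 60
    grep -E 'pgpgin|pgpgout|pgfault|pgmajfault' /proc/vmstat

pgpgin / pgpgout count paging to and from disk, while pgmajfault counts
the faults that actually had to wait for I/O.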


Chi Ho Kwok

On Mon, Jan 9, 2012 at 8:47 PM, Florian Binder <java at java4.info> wrote:

>  Yes!
> You are right!
> I have a lot of page faults when the GC is taking so much time.
>
> For example (sar -B):
> 00:00:01     pgpgin/s pgpgout/s   fault/s  majflt/s
> 00:50:01         0,01     45,18    162,29      0,00
> 01:00:01         0,02     46,58    170,45      0,00
> 01:10:02     25313,71  27030,39  27464,37      0,02
> 01:20:02     23456,85  25371,28  13621,92      0,01
> 01:30:01     22778,76  22918,60  10136,71      0,03
> 01:40:11     19020,44  22723,65   8617,42      0,15
> 01:50:01         5,52     44,22    147,26      0,05
>
> What does this mean, and how can I avoid it?
>
>
> Flo
>
>
>
> Am 09.01.2012 20:33, schrieb Chi Ho Kwok:
>
> Just making sure the obvious case is covered: is it just me, or is
> 6s real > 3.5s user+sys with 13 threads just plain weird? It means that
> on average only about half a thread was actually running during that
> collection (user 3,08 + sys 0,51 = 3,59 seconds of CPU time spread over
> 6,18 seconds of wall clock).
>
>  Do a sar -B (requires the sysstat package) and see if there were any major
> page faults (or check indirectly via cacti and other monitoring tools through
> memory usage, load average etc., or even via cat /proc/vmstat and its
> pgmajfault counter). I've seen major faults cause GC times like these.
>
>
>  Chi Ho Kwok
>
>  On Mon, Jan 9, 2012 at 12:08 PM, Florian Binder <java at java4.info> wrote:
>
>> Hi everybody,
>>
>> I am using the CMS (with ParNew) GC and have very long (> 6 seconds)
>> young GC pauses.
>> As you can see in the log below, the old gen's free space consists of
>> one large block, the new generation is 256m, the collector uses 13
>> worker threads, and it has to copy 27505761 words (~210mb) directly
>> from eden to the old gen.
>> I have seen that this problem occurs only after about one week of
>> uptime, even though we do a full (compacting) GC every night.
>> Since real time > user time, I assume it might be a synchronization
>> problem. Can this be true?
>>
>> Do you have any ideas how I can speed up these GCs?
>>
>> Please let me know if you need more information.
>>
>> Thank you,
>> Flo
>>
>>
>> ##### java -version #####
>> java version "1.6.0_29"
>> Java(TM) SE Runtime Environment (build 1.6.0_29-b11)
>> Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode)
>>
>> ##### The startup parameters: #####
>> -Xms28G -Xmx28G
>> -XX:+UseConcMarkSweepGC \
>> -XX:CMSMaxAbortablePrecleanTime=10000 \
>> -XX:SurvivorRatio=8 \
>> -XX:TargetSurvivorRatio=90 \
>> -XX:MaxTenuringThreshold=31 \
>> -XX:CMSInitiatingOccupancyFraction=80 \
>> -XX:NewSize=256M \
>>
>> -verbose:gc \
>> -XX:+PrintFlagsFinal \
>> -XX:PrintFLSStatistics=1 \
>> -XX:+PrintGCDetails \
>> -XX:+PrintGCDateStamps \
>> -XX:-TraceClassUnloading \
>> -XX:+PrintGCApplicationConcurrentTime \
>> -XX:+PrintGCApplicationStoppedTime \
>> -XX:+PrintTenuringDistribution \
>> -XX:+CMSClassUnloadingEnabled \
>> -Dsun.rmi.dgc.server.gcInterval=9223372036854775807 \
>> -Dsun.rmi.dgc.client.gcInterval=9223372036854775807 \
>>
>> -Djava.awt.headless=true
>>
>> ##### From the out-file (as of +PrintFlagsFinal): #####
>> ParallelGCThreads                         = 13
>>
>> ##### The gc.log-excerpt: #####
>> Application time: 20,0617700 seconds
>> 2011-12-22T12:02:03.289+0100: [GC Before GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 1183290963
>> Max   Chunk Size: 1183290963
>> Number of Blocks: 1
>> Av.  Block  Size: 1183290963
>> Tree      Height: 1
>> Before GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 0
>> Max   Chunk Size: 0
>> Number of Blocks: 0
>> Tree      Height: 0
>> [ParNew
>> Desired survivor size 25480392 bytes, new threshold 1 (max 31)
>> - age   1:   28260160 bytes,   28260160 total
>> : 249216K->27648K(249216K), 6,1808130 secs]
>> 20061765K->20056210K(29332480K)After GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 1155785202
>> Max   Chunk Size: 1155785202
>> Number of Blocks: 1
>> Av.  Block  Size: 1155785202
>> Tree      Height: 1
>> After GC:
>> Statistics for BinaryTreeDictionary:
>> ------------------------------------
>> Total Free Space: 0
>> Max   Chunk Size: 0
>> Number of Blocks: 0
>> Tree      Height: 0
>> , 6,1809440 secs] [Times: user=3,08 sys=0,51, real=6,18 secs]
>> Total time for which application threads were stopped: 6,1818730 seconds
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>
>
>

