Very long young gc pause (ParNew with CMS)

Florian Binder java at java4.info
Wed Jan 11 09:45:28 UTC 2012


I do not know why it worked for a week.
Maybe it is because this was the xmas week ;-)

At night there are a lot of disk operations (2 TB of data is
written). Therefore the operating system caches a lot of files and tries
to free memory for this, so unused pages are moved to swap space.
I assume heap fragmentation prevents swapping, since more pages are
touched while the application is running. After a compacting gc there
is one large (free) block which is not touched until a young gc copies
objects from eden space into it. This leads the operating system to move
the pages of this one free block to swap, and at every young gc they
have to be read back from swap.
After a CMS collection the following young gcs are much faster because
the gaps in the heap have not been swapped out.
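
One way to test the swapping theory above is to look at how much of the
JVM process currently sits in swap. The following is only a minimal
sketch under assumptions not stated in this thread: it assumes a Linux
machine whose /proc/<pid>/status file reports a VmSwap field, and the
class and method names (SwapCheck, parseVmSwapKb) are made up for
illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of any JDK tool.
class SwapCheck {

    // Returns the VmSwap value in kB found in the given status lines,
    // or -1 if no VmSwap line is present (e.g. on older kernels).
    static long parseVmSwapKb(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith("VmSwap:")) {
                // Assumed line shape: "VmSwap:     2048 kB"
                String rest = line.substring("VmSwap:".length()).trim();
                return Long.parseLong(rest.split("\\s+")[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Pass the pid of the JVM to inspect; "self" inspects this process.
        String pid = args.length > 0 ? args[0] : "self";
        Path status = Paths.get("/proc", pid, "status");
        long kb = parseVmSwapKb(Files.readAllLines(status));
        System.out.println(kb < 0 ? "VmSwap not reported" : "VmSwap: " + kb + " kB");
    }
}
```

Running it with the pid of the affected JVM shortly after the nightly
compacting gc, and again after a CMS cycle, would show whether the
swapped-out amount differs between the two situations.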

Yesterday we turned off swap on this machine, and now all young
gcs take less than 200ms (instead of 6s) :-)
Thanks again to Chi Ho Kwok for giving the key hint :-)

Flo


On 11.01.2012 10:00, Srinivas Ramakrishna wrote:
>
>
> On Mon, Jan 9, 2012 at 3:08 AM, Florian Binder <java at java4.info> wrote:
>
>     ...
>     I have seen that this problem occurs only after about one week of
>     uptime, even though we run a full (compacting) gc every night.
>     Since real-time > user-time I assume it might be a synchronization
>     problem. Can this be true?
>
>
> Together with your and Chi-Ho's conclusion that this is possibly
> related to paging, a question to ponder is why this happens only after
> a week. Since your process's heap size is presumably fixed and you have
> seen multiple full GCs (from which I assume that your heap's pages have
> all been touched), have you checked to see if the size of either this
> process (i.e. its native size) or of another process on the machine has
> grown during the week so that you start swapping?
>
> I also find it interesting that you state that whenever you see this
> problem there's always a single block in the old gen, and that the
> problem seems to go away when there is more than one block in the old
> gen. That would seem to throw out the paging theory, and point the
> finger of suspicion at some kind of bottleneck in the allocation out of
> a large block. You also state that you do a compacting collection every
> night, but the bad behaviour sets in only after a week.
>
> So let me ask you whether the slow scavenge happens to be the first
> scavenge after a full gc, or whether the condition persists for a long
> time and is independent of whether a full gc has happened recently?
>
> Try turning on -XX:+PrintOldPLAB to see if it sheds any light...
>
> -- ramki
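
The "real-time > user-time" observation quoted above can be checked
mechanically over a whole gc log. The sketch below assumes the log was
written with -XX:+PrintGCDetails and contains lines of the form
"[Times: user=... sys=..., real=... secs]" with a decimal point; the
class and method names (GcTimes, waitedOffCpu) are hypothetical. A pause
whose wall-clock time exceeds its user plus sys CPU time suggests the gc
threads were waiting on something (paging, I/O or a lock) rather than
working, but it does not say which.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning a gc log read from standard input.
class GcTimes {

    // Assumed line fragment: [Times: user=0.10 sys=0.02, real=6.00 secs]
    private static final Pattern TIMES = Pattern.compile(
        "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    // True if the line has a Times entry whose real time exceeds user + sys.
    static boolean waitedOffCpu(String logLine) {
        Matcher m = TIMES.matcher(logLine);
        if (!m.find()) {
            return false; // no Times entry on this line
        }
        double user = Double.parseDouble(m.group(1));
        double sys = Double.parseDouble(m.group(2));
        double real = Double.parseDouble(m.group(3));
        return real > user + sys;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            if (waitedOffCpu(line)) {
                System.out.println(line); // print only the suspicious pauses
            }
        }
    }
}
```

Feeding the gc log to it on standard input prints only the pauses that
spent more wall-clock time than CPU time.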
