Strange long ParNew collections

Kaj Nygren kaj at mag.se
Tue May 19 04:27:40 PDT 2009


Hi,

Thanks for your help. I wanted to give an update, and I have a couple 
more questions :)

We changed to -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=4, and the 
ParNew collections are now behaving much better. We have also turned off 
incremental mode, as Jon suggested.

One problem remains, though: the CMS collector sometimes seems to get 
stuck during the parallel rescan. The output looks like this:

24004,062: [CMS-concurrent-sweep: 1,755/1,932 secs]
24004,062: [CMS-concurrent-reset-start]
24004,163: [CMS-concurrent-reset: 0,101/0,101 secs]
24006,163: [GC [1 CMS-initial-mark: 10348978K(18432000K)] 
10521837K(19353600K), 0,1055176 secs]
24006,269: [CMS-concurrent-mark-start]
24010,642: [CMS-concurrent-mark: 4,373/4,373 secs]
24010,642: [CMS-concurrent-preclean-start]
24010,730: [CMS-concurrent-preclean: 0,085/0,088 secs]
24010,730: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 24011,824: 
[CMS-concurrent-abortable-preclean: 0,792/1,094 secs]
24011,828: [GC[YG occupancy: 332774 K (921600 K)]
24011,829: [Rescan (parallel)

The JVM then hangs for more than 5 minutes, at which point our watchdog 
decides the server is hung and restarts it. This happened with iCMS, and 
apparently it still happens with plain CMS.

I searched around and found one similar report:

http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2009-January/000327.html

This is just a guess, but could we be hitting the same bug? If so, am I 
reading the bug reports correctly that it should be fixed in 1.5.0_18? 
(We are running 1.5.0_16 at the moment.)
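Just to be explicit about what we are running (the build string below is 
from memory, so treat the -bXX part as approximate):

C:\> java -version
java version "1.5.0_16"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_16-b02)
Java HotSpot(TM) 64-Bit Server VM (build 1.5.0_16-b02, mixed mode)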

/Kaj

Y.S.Ramakrishna at Sun.COM wrote:
> This may or may not be a cause of the sudden spikes in
> ParNew collection, but I'd suggest using the survivor
> spaces to age objects for at least a few collections
> before promoting to the old gen. This reduces pressure on old gen
> allocation (during scavenge), which improves scavenge times, and it
> eases the CMS collection as well.
>
> I'd suggest trying -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=4
> to start off with and iterate based on data from 
> -XX:+PrintTenuringDistribution.
>
> It is possible that the promotion of very short-lived objects fragments
> the old gen and causes old gen allocation (for promotion) to slow down.
> It can also cause lots of mutation in the old gen and adversely affect
> card-scanning times, further slowing down scavenges.
>
> -- ramki
>
> On 05/06/09 05:06, Kaj Nygren wrote:
>> Hi,
>>
>> We are having problems with occasional long ParNew pauses in our 
>> application, which runs on a JBoss server. The server runs well most 
>> of the time, but sometimes we get ParNew collections that take up to 
>> 120 seconds. When it happens, there are usually several consecutive 
>> slow collections, which more or less kill the server under high load.
>>
>> We are using JDK 1.5.0_16 on Windows Server 2003, with ParNew + CMS 
>> (concurrent mark-sweep).
>>
>> Our GC configuration is:
>>
>> wrapper.java.additional.1=-Dprogram.name=run.bat
>> wrapper.java.additional.2=-server
>> wrapper.java.additional.3=-Xms19000m
>> wrapper.java.additional.4=-Xmx19000m
>> wrapper.java.additional.5="-XX:NewSize=1000m"
>> wrapper.java.additional.6="-XX:MaxNewSize=1000m"
>> wrapper.java.additional.7="-XX:MaxPermSize=256m"
>> wrapper.java.additional.8="-XX:PermSize=256m"
>> wrapper.java.additional.11="-Xloggc:c:\mygclog.gc"
>> wrapper.java.additional.12="-XX:+UseConcMarkSweepGC"
>> wrapper.java.additional.13="-XX:+PrintGCDetails"
>> wrapper.java.additional.14="-XX:-PrintTenuringDistribution"
>> wrapper.java.additional.15="-XX:+PrintHeapAtGC"
>> wrapper.java.additional.16="-XX:+DisableExplicitGC"
>> wrapper.java.additional.17="-XX:+PrintGCTimeStamps"
>> wrapper.java.additional.18="-XX:CMSInitiatingOccupancyFraction=50"
>> wrapper.java.additional.19="-XX:+CMSIncrementalMode"
>> wrapper.java.additional.20="-XX:+CMSIncrementalPacing"
>> wrapper.java.additional.21="-verbose:gc"
>>
>> Usually ParNew is quick, as can be seen here:
>>
>> 614276,653: [ParNew: 935058K->0K(1023040K), 0,0624699 secs] 
>> 12383872K->11459948K(19455040K) icms_dc=22 Heap after gc 
>> invocations=57822:
>> par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>> 0x00000000be7f0000, 0x00000000be7f0000)
>>  eden space 1022080K,   0% used [0x000000007fff0000, 
>> 0x000000007fff0000, 0x00000000be610000)
>>  from space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>> 0x00000000be7f0000)
>>  to   space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>> 0x00000000be700000)
>> concurrent mark-sweep generation total 18432000K, used 11459948K 
>> [0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>> concurrent-mark-sweep perm gen total 262144K, used 112069K 
>> [0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>> }
>> , 0,0628860 secs]
>>
>> But suddenly it becomes really slow:
>>
>> 620659,717: [ParNew: 1022080K->0K(1023040K), 1,3313817 secs] 
>> 12688913K->11692925K(19455040K) icms_dc=10 Heap after gc 
>> invocations=58587:
>> par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>> 0x00000000be7f0000, 0x00000000be7f0000)
>>  eden space 1022080K,   0% used [0x000000007fff0000, 
>> 0x000000007fff0000, 0x00000000be610000)
>>  from space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>> 0x00000000be700000)
>>  to   space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>> 0x00000000be7f0000)
>> concurrent mark-sweep generation total 18432000K, used 11692925K 
>> [0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>> concurrent-mark-sweep perm gen total 262144K, used 112098K 
>> [0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>> }
>> , 1,3319554 secs]
>> 620663,221: [GC {Heap before gc invocations=58587:
>> par new generation   total 1023040K, used 1022080K 
>> [0x000000007fff0000, 0x00000000be7f0000, 0x00000000be7f0000)
>>  eden space 1022080K, 100% used [0x000000007fff0000, 
>> 0x00000000be610000, 0x00000000be610000)
>>  from space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>> 0x00000000be700000)
>>  to   space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>> 0x00000000be7f0000)
>> concurrent mark-sweep generation total 18432000K, used 11692925K 
>> [0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>> concurrent-mark-sweep perm gen total 262144K, used 112106K 
>> [0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>> 620663,222: [ParNew: 1022080K->0K(1023040K), 10,2909893 secs] 
>> 12715005K->11936864K(19455040K) icms_dc=10 Heap after gc 
>> invocations=58588:
>> par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>> 0x00000000be7f0000, 0x00000000be7f0000)
>>  eden space 1022080K,   0% used [0x000000007fff0000, 
>> 0x000000007fff0000, 0x00000000be610000)
>>  from space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>> 0x00000000be7f0000)
>>  to   space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>> 0x00000000be700000)
>> concurrent mark-sweep generation total 18432000K, used 11936864K 
>> [0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>> concurrent-mark-sweep perm gen total 262144K, used 112106K 
>> [0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>> }
>> , 10,2915548 secs]
>> 620676,392: [GC {Heap before gc invocations=58588:
>> par new generation   total 1023040K, used 1022080K 
>> [0x000000007fff0000, 0x00000000be7f0000, 0x00000000be7f0000)
>>  eden space 1022080K, 100% used [0x000000007fff0000, 
>> 0x00000000be610000, 0x00000000be610000)
>>  from space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>> 0x00000000be7f0000)
>>  to   space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>> 0x00000000be700000)
>> concurrent mark-sweep generation total 18432000K, used 11936864K 
>> [0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>> concurrent-mark-sweep perm gen total 262144K, used 112106K 
>> [0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>> 620676,392: [ParNew: 1022080K->0K(1023040K), 3,8905950 secs] 
>> 12958944K->12020225K(19455040K) icms_dc=10 Heap after gc 
>> invocations=58589:
>> par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>> 0x00000000be7f0000, 0x00000000be7f0000)
>>  eden space 1022080K,   0% used [0x000000007fff0000, 
>> 0x000000007fff0000, 0x00000000be610000)
>>  from space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>> 0x00000000be700000)
>>  to   space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>> 0x00000000be7f0000)
>> concurrent mark-sweep generation total 18432000K, used 12020225K 
>> [0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>> concurrent-mark-sweep perm gen total 262144K, used 112106K 
>> [0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>> }
>> , 3,8911113 secs]
>> 620682,667: [GC {Heap before gc invocations=58589:
>> par new generation   total 1023040K, used 1022080K 
>> [0x000000007fff0000, 0x00000000be7f0000, 0x00000000be7f0000)
>>  eden space 1022080K, 100% used [0x000000007fff0000, 
>> 0x00000000be610000, 0x00000000be610000)
>>  from space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>> 0x00000000be700000)
>>  to   space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>> 0x00000000be7f0000)
>> concurrent mark-sweep generation total 18432000K, used 12020225K 
>> [0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>> concurrent-mark-sweep perm gen total 262144K, used 112112K 
>> [0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>> 620682,668: [ParNew: 1022080K->0K(1023040K), 9,4186088 secs] 
>> 13042305K->12214571K(19455040K) icms_dc=10 Heap after gc 
>> invocations=58590:
>> par new generation   total 1023040K, used 0K [0x000000007fff0000, 
>> 0x00000000be7f0000, 0x00000000be7f0000)
>>  eden space 1022080K,   0% used [0x000000007fff0000, 
>> 0x000000007fff0000, 0x00000000be610000)
>>  from space 960K,   0% used [0x00000000be700000, 0x00000000be700000, 
>> 0x00000000be7f0000)
>>  to   space 960K,   0% used [0x00000000be610000, 0x00000000be610000, 
>> 0x00000000be700000)
>> concurrent mark-sweep generation total 18432000K, used 12214571K 
>> [0x00000000be7f0000, 0x00000005237f0000, 0x00000005237f0000)
>> concurrent-mark-sweep perm gen total 262144K, used 112112K 
>> [0x00000005237f0000, 0x00000005337f0000, 0x00000005337f0000)
>> }
>> , 9,4191551 secs]
>>
>> We also have occasional "promotion failed" events, which I assume are 
>> due to fragmentation. I have attached a longer log that shows this 
>> behaviour and ends with a promotion failure.
>>
>> Does anybody have a clue as to what is going wrong, and how we can 
>> change our GC configuration to reduce these problems?
>> Any help is greatly appreciated!
>>
>> Regards
>>
>> Kaj Nygren
>> kaj at mag.se
>>
>>
