64 bit CMS JDK 5.0 u14
Y.S.Ramakrishna at Sun.COM
Mon Dec 10 18:57:33 UTC 2007
Hi Keith --
Jon also talks about this and related
issues in a few of his blog entries; see:
http://blogs.sun.com/jonthecollector/date/20060413
http://blogs.sun.com/jonthecollector/date/20060306
http://blogs.sun.com/jonthecollector/date/20060404
http://blogs.sun.com/jonthecollector/date/20070622
-- ramki
Y.S.Ramakrishna at Sun.COM wrote:
> Hi Keith --
>
>> I am running some middle-tier Portal load tests in WLS 9.2 MP2 with Sun
>> JDK 5.0 u14. I am running 100 concurrent users that logon, navigate,
>> open portlets, and eventually logoff; only to logon again and repeat
>> the cycle. My testing people have established Load Runner scripts to put
>> the Portal software through an endurance test over 5 days with the 100
>> users.
>>
>> Normally on 32-bit WLS 8.1 SP6 with JDK 1.4.2_13 we run with the
>> following VM args; we attain some 3.4 million passed transactions with
>> zero failed transactions:
>>
>> -server -Xms1400m -Xmx1400m -XX:NewSize=64m -XX:MaxNewSize=64m
>> -XX:PermSize=128m -XX:MaxPermSize=128m -Xss128k -XX:-UseTLAB
>> -XX:+DisableExplicitGC
>> -Dsun.rmi.dgc.client.gcInterval=3600000
>> -Dsun.rmi.dgc.server.gcInterval=3600000 -Djava.awt.headless=true
>>
>> CMS never works very well in the 32-bit environment; failing miserably
>> above; although at JDK 1.4.2_15, we see some 2 million passed
>> transactions with 1130+ failed transactions owing to 120 seconds
>> timeouts in concurrent mode failures.
>>
>> Now, in the 64 bit environment running on AMD Windows Server 2003, I
>> can run pretty successfully with CMS:
>>
>> -Xms1500m -Xmx3500m -XX:NewSize=320m -XX:MaxNewSize=320m -Xss256k
>> -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseConcMarkSweepGC
>> -XX:CMSFullGCsBeforeCompaction=0 -XX:+UseCMSInitiatingOccupancyOnly
>> -XX:CMSInitiatingOccupancyFraction=40
>> -Dsun.rmi.dgc.client.gcInterval=3600000
>> -Dsun.rmi.dgc.server.gcInterval=3600000 -Djava.awt.headless=true
>> -Dcom.sun.management.jmxremote -verbosegc
>> -Xloggc:C:\keith\GCLogs\gc11.txt
>>
>> I can achieve some 3.4 million passed transactions, as with the
>> throughput collector in the 32-bit environment.
>>
>> But, I do note several thousand failed transactions that correlate
>> with concurrent mode failures after some 24 hrs; pauses in the 400-600
>> seconds range when Full GC takes over.
>>
>
> The fact that you start seeing the concurrent mode failures after
> 24 hours indicates to me strongly that the old generation gets slowly
> fragmented over a period of time. (Recall that the CMS collector is
> non-moving.)
>
> Can you confirm that the heap occupancy itself is constant (or nearly
> so) following CMS collection cycles, and that the full gc that follows
> a concurrent mode failure does not unload classes? Recall that CMS will
> not, by default, unload classes during concurrent cycles unless
> explicitly instructed to do so via:
>
> -XX:+CMSClassUnloadingEnabled -XX:+PermGenSweepingEnabled
>
> (the second option is needed in pre-6.0 JVMs, but not in more
> recent JVMs).
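
One way to check whether occupancy stays flat after collections is to read
the post-collection usage that each memory pool reports through the
java.lang.management API (available since JDK 5). The following is a minimal
sketch, not a definitive tool: the class name PostGcOccupancy and the helper
describe() are hypothetical, and the program reports on the JVM it runs in.
Observing the WLS server itself would presumably mean reading the same
MXBeans over the JMX connection enabled by -Dcom.sun.management.jmxremote,
or simply reading the -Xloggc output.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class; the name is illustrative only.
public class PostGcOccupancy {

    // Formats the occupancy a pool had right after its most recent collection.
    // afterGc may be null: getCollectionUsage() returns null for pools that
    // do not support this measurement.
    public static String describe(String poolName, MemoryUsage afterGc) {
        if (afterGc == null) {
            return poolName + ": no post-collection data";
        }
        long usedMb = afterGc.getUsed() / (1024 * 1024);
        long committedMb = afterGc.getCommitted() / (1024 * 1024);
        return poolName + ": " + usedMb + " MB used of "
                + committedMb + " MB committed after last GC";
    }

    public static void main(String[] args) {
        // Lists every pool rather than assuming pool names, which vary by JVM
        // and by collector.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool.getName(), pool.getCollectionUsage()));
        }
    }
}
```

If the figure for the old generation creeps upward from cycle to cycle, the
live data set is growing; if it stays level while concurrent mode failures
still appear, fragmentation becomes the more likely explanation.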
>
>> I have tried varying the CMSInitiatingOccupancyFraction to 20%, but
>> the CMS mode failures still occur.
>
> It is usually a good idea to use survivor spaces, both to reduce the
> pressure on the concurrent collector (by promoting less to the old
> gen) and to reduce the spread in object sizes and lifetimes
> of the objects that do get promoted. I'd suggest using survivor
> spaces to make sure that survivors stay in the young gen for at least
> one scavenge (MaxTenuringThreshold = 1, possibly more, as experiments
> dictate). A downside is possibly longer scavenges,
> but consider that the price for (possibly) avoiding concurrent
> mode failure.
>
> Avoiding premature promotion (besides the two points made above)
> can also reduce floating garbage and reduce CMS remark pause
> times (by reducing mutation rates in the old generation).
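
To get a feel for what enabling survivor spaces costs in eden, the
arithmetic can be sketched as below. This assumes HotSpot's usual rule that
each of the two survivor spaces is roughly young / (SurvivorRatio + 2); the
JVM also rounds sizes for alignment, so the real figures may differ
slightly. The value SurvivorRatio=8 and the class name SurvivorSizing are
hypothetical choices for illustration, not a recommendation from this
thread; the 320 MB young generation is taken from the 64-bit settings
quoted above.

```java
// Hypothetical helper class; the name is illustrative only.
public class SurvivorSizing {

    // Approximate size of ONE survivor space: young / (SurvivorRatio + 2).
    public static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    // Eden is what remains of the young generation after both survivor spaces.
    public static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        final long mb = 1024L * 1024L;
        long young = 320L * mb;   // -XX:NewSize=320m -XX:MaxNewSize=320m
        int ratio = 8;            // assumed value, for illustration only

        System.out.println("-XX:SurvivorRatio=" + ratio
                + " -XX:MaxTenuringThreshold=1");
        System.out.println("each survivor space: "
                + survivorBytes(young, ratio) / mb + " MB");
        System.out.println("eden: " + edenBytes(young, ratio) / mb + " MB");
    }
}
```

Under these assumptions a 320 MB young generation splits into two survivor
spaces of about 32 MB each and an eden of about 256 MB, so scavenges would
come somewhat more often than with no survivor spaces at all.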
>
>>
>> I am now running with the incremental mode CMS; but anticipate further
>> very long pauses.
>
> From what you described above (running CMS all the time by setting the
> initiation threshold very low), it does not look as though iCMS will
> buy you anything.
>
>>
>> The VM always recovers very well after these sporadic Full GCs, but to
>> eradicate them, should I run with an 8 GB heap or something along
>> those lines? I also read something about killing the swap file?
>>
>> My AMD 64 bit box, unfortunately, is for now restricted to 4 GB RAM; but
>> I am adding a further 4 GB soon. I am about to go to the Solaris SPARC
>> 64 bit and run the exact same scenario with a 7-8 GB heap.
>
> Increasing the heap size can indeed sometimes help you avoid
> concurrent mode failure from fragmentation. (But first make
> sure to enable survivor spaces and, if applicable, perm gen
> collection.)
>
>>
>> I read about the occupancy fraction for OG and Perm Gen; do I need to
>> apply this patch? Our Perm Gen is always set to 128 MB and only ever
>> attains 108 MB.
>
> The webrev I posted late last week should not really apply directly to
> your case (except inasmuch as, in the event that you enable perm gen
> collection, it might allow you to get away with not collecting the
> perm gen on each cycle, and thus possibly help keep CMS remark pauses
> shorter). I would not worry about this patch at the level at which you
> are tuning currently (which is mainly looking to avoid the concurrent
> mode failures).
>
> -- ramki
>
>>
>> Any feedback would help us in our endeavours to support our EBI apps
>> in a 64 bit env.
>>
>> keith
>>
>>
>> Keith R Holdaway
>> Java Development Technologies
>>
>> SAS... The Power to Know
>>
>> Carpe Diem ...
>>
>
>