Tuning advice

Y Srinivas Ramakrishna Y.S.Ramakrishna at Sun.COM
Wed Mar 12 17:41:18 UTC 2008


Hi Kurt --

> I really appreciate your advice, I've been asked a question that I can't
> answer.  Even though no one has noticed a "problem area" from the data
> I've sent we are looking at trying some of the suggestions you all sent.

The data indicates 22 scavenges @ 1.78 s each versus 776 full collections @ 2.38 s each :-

gen0t(s)        22      39.224   1.78289    2.711   0.4661
gen1t(s)       776    1845.592   2.37834    7.894   1.5820

Firstly, depending on your response-time needs, you may want to tune
the heap to reduce the scavenge pause times. Secondly, you are clearly
spending far too much time in full gc: 1846 seconds versus just 39
seconds in scavenges. If, as was discussed before, the full gc's are all
a result of RMI, the current settings for RMI are indeed very wasteful.
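
To put concrete numbers on that imbalance, using the PrintGCStats
output quoted below :-

    full gc / scavenge time  =  1845.592 s / 39.224 s     ~=  47x
    full gc / elapsed time   =  1845.592 s / 47197.783 s  ~=  3.9%

In other words, virtually all of the gc time (1845.592 s of the
1884.816 s total) is being spent in full collections.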

Recall that distributed gc requires whole-heap gc's to occur, with a
certain promptness, at each of the several jvm's communicating via RMI,
so that distributed garbage can be collected. However, as mentioned
below, that frequency may be too high by default. Try increasing the
interval via, for example :-

-Dsun.rmi.dgc.server.gcInterval=600000 -Dsun.rmi.dgc.client.gcInterval=600000

That sets both intervals to 600000 ms (10 minutes), up from the pre-6.0
default of 60000 ms (one minute). You will have to tune the value up or
down depending on heap occupancy and how well it works for your
application configuration.
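
For concreteness, a full launch line might look like the sketch below;
the heap sizes and the main class name are placeholders, not settings
from this thread. Note that each jvm participating in RMI needs its own
settings :-

    java -Xms1024m -Xmx1024m \
         -Dsun.rmi.dgc.server.gcInterval=600000 \
         -Dsun.rmi.dgc.client.gcInterval=600000 \
         com.example.YourServerMain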

As stated in:-

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6200091

the default values were increased :-

    Default values for
        sun.rmi.dgc.client.gcInterval
        sun.rmi.dgc.server.gcInterval
    are now 3600000 (one hour).
    Posted Date : 2005-07-28 22:11:37.0

as part of 6.0.
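
If you want to verify what a given jvm was launched with, here is a
minimal check; note that it only reflects an explicitly set property and
prints null when the jvm is running on its built-in default :-

    // Prints the explicitly configured DGC intervals, if any.
    // null means the jvm is using its built-in default
    // (60000 ms, i.e. one minute, pre-6.0; 3600000 ms in 6.0).
    public class ShowDgcIntervals {
        public static void main(String[] args) {
            System.out.println("client: "
                + System.getProperty("sun.rmi.dgc.client.gcInterval"));
            System.out.println("server: "
                + System.getProperty("sun.rmi.dgc.server.gcInterval"));
        }
    }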


> The question is what order if any should they be attempted in?  We have
> done two, setting the heap sizes larger and removed ParallelGCThreads 
> so
> it defaults to 4 rather than the setting it used to be of 20 (4 cpu

Setting parallel gc threads to 4 is a good first step (under the
assumption that nothing else is running on the machine that can
interfere). Setting the heap sizes larger will certainly reduce scavenge
overheads; however, you may need to tune further to meet your
pause/response-time needs.
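
If you prefer to pin the thread count explicitly rather than rely on
the default, something like the following would do it; the heap and
new-generation sizes here are illustrative only, echoing suggestion 2
quoted below :-

    java -XX:ParallelGCThreads=4 \
         -Xms1024m -Xmx1024m \
         -XX:NewSize=48m -XX:MaxNewSize=48m \
         com.example.YourServerMain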

> system).  One of our Windchill apps person is saying that we are using
> RMI and there is no way of working around the fact it is forcing a full
> GC every 60 seconds except to upgrade to java 1.6.0.  He also stated

Please see the suggestion made above (and as already alluded to in previous
discussion).

> that suggestion 1 will only add another GC routine on top of the default
> full GC currently being done.  I've attached an updated output from

That is true. Before 6.0 (which introduced
-XX:+ExplicitGCInvokesConcurrent), -XX:+UseConcMarkSweepGC alone would
not work around the RMI-induced full gc's.
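
On 6.0, the combination below makes the RMI-induced System.gc() calls
trigger a concurrent CMS cycle instead of a stop-the-world full gc
(a sketch; on 1.4.2 your only lever is the dgc-interval tuning above) :-

    java -XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent \
         com.example.YourServerMain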

HTHS.
-- ramki


> PrintGCStats at the bottom of the email.  Recap of suggestions:
> 
> 1) Set -XX:+UseConcMarkSweepGC to turn on the 'low pause'
> ConcurrentMarkSweep collector of the old area.
>  
> 2) By default this will also turn on -XX:+UseParNewGC for the new area,
> so you will see lots of smaller ParNew Collections happening. This is
> OK, a desirable frequency is perhaps 2-3 secs. You should probably
> specify NewSize, the default with a 1GB heap will be 16mb which is
> normally too small. Say try 48mb, ie -XX:NewSize=48m.
> 
> 3) I note you are using 1.4.2_13. I know there is a 'feature' in update
> 12 which causes CMS Collections to always be initiated when heap is half
> full. It is fixed in update 15, but I'm not sure about 13. This may not
> be a problem to you, except you are effectively just using approx half
> of the heap space. There is a workaround: specify
>   -XX:CMSInitiatingOccupancyFraction=nn
>   -XX:+UseCMSInitiatingOccupancyOnly.
> The default fraction is 68 (it is actually a percent).
> 
> 4) Do you use RMI? By default, RMI calls Full GC every 60 seconds for
> the correct operation of its distributed GC algorithm. The default
> behavior has been changed in 6.0 to be something less frequent (once an
> hour, I believe...).
> 
> 5)  Note that in 1.4.2_13 only the minor (young generation) collections
> are done on many processors; full collections are still serial
> collections.  If your logs show those collections to be the problem,
> you might want to try the mostly-concurrent collector
> (-XX:+UseConcMarkSweepGC) instead, but that will require different
> tunings.
> 
> 
> 
> PrintGCStats output (4 cpus / 3 JVMs / Windchill application):
> 
> what         count       total      mean      max   stddev
> gen0(s)         22      39.217   1.78260    2.710   0.4661
> gen0t(s)        22      39.224   1.78289    2.711   0.4661
> gen1t(s)       776    1845.592   2.37834    7.894   1.5820
> GC(s)          798    1884.816   2.36192    7.894   1.5649
> alloc(MB)       22   10289.001  467.68186  482.000   3.1980
> promo(MB)       22    1427.789  64.89950   88.547  16.4298
> 
> alloc/elapsed_time    =  10289.001 MB /  47197.783 s  =   0.218 MB/s
> alloc/tot_cpu_time    =  10289.001 MB / 188791.132 s  =   0.054 MB/s
> alloc/mut_cpu_time    =  10289.001 MB / 181251.870 s  =   0.057 MB/s
> promo/elapsed_time    =   1427.789 MB /  47197.783 s  =   0.030 MB/s
> promo/gc0_time        =   1427.789 MB /     39.224 s  =  36.401 MB/s
> gc_seq_load           =   7539.262 s  / 188791.132 s  =   3.993%
> gc_conc_load          =      0.000 s  / 188791.132 s  =   0.000%
> gc_tot_load           =   7539.262 s  / 188791.132 s  =   3.993%