G1GC fine tuning under heavy load

Stefan Johansson stefan.johansson at oracle.com
Thu Sep 13 12:39:33 UTC 2018


Hi,

Could you please provide the GC logs from the run as well? The reports 
give a good overview, but some details from the logs might help us give 
better advice. It will also help to confirm that, as you say, there are 
no Full GCs occurring. Some more comments inline.
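
If you don't already write the GC log to a file, something like the 
following (JDK 8 style flags, the path is just an example) in addition 
to the Print* flags you already use should be enough:

   -Xloggc:/path/to/gc.log -XX:+UseGCLogFileRotation 
   -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=20M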

On 2018-09-12 15:26, Wajih Ahmed wrote:
> Hello,
> 
> I have an application running on two nodes in a kubernetes cluster. It 
> is handling about 70 million requests per day.  I have noticed a gradual 
> decline in the throughput, so much so that in about 7 days the throughput 
> falls by about 50%.  A large part of this decline happens in the first 
> hour, followed by a more gradual decline after that.
> This graph 
> <https://drive.google.com/open?id=19pG4j2ezNj-jm69Br7HqGKtR_c7L_-r6> 
> shows this pattern.  Some of the decline I can attribute to the 
> application and use case itself. As the database starts growing rapidly, 
> the system comes under memory and CPU pressure, and the database itself 
> is also a java application.  So perhaps ignoring the decline of the first 
> hour is prudent, but I am still interested in seeing if I can tune the 
> JVM of the app so that the throughput is more linear after the first hour.
> 
> I am also providing a gceasy.io <http://gceasy.io> report 
> <https://drive.google.com/open?id=1s0akdn6ztj2-oeOHwEjFMqbDRpYOweJJ> 
> that will give the required information about GC activity.  You will 
> see I have done some rudimentary tuning already.
> 
> What I am curious about is whether the young gen size needs to be 
> reduced by tuning G1NewSizePercent, to reduce the duration of the 
> pauses, in particular the object copy stage.

This is a very hard question to answer. A smaller young gen of course 
means fewer regions to collect, but since the GCs will occur more 
frequently, fewer objects will have time to die, so for some 
applications a larger young gen is actually quicker to collect. And 
since long pause times don't seem to be your biggest problem, I wouldn't 
start the tuning here.
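
If you do want to experiment with it, note that G1NewSizePercent (and 
its companion G1MaxNewSizePercent) are experimental flags, so they need 
to be unlocked first, for example:

   -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=5 
   -XX:G1MaxNewSizePercent=60

(5 and 60 are just the defaults, shown only to illustrate the syntax.)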

> 
> Secondly, what GCEasy is calling "consecutive full gc" doesn't appear 
> to be full GCs.  It might instead be the CMS (initial-mark) activity, 
> which accounts for most of the GC activity and has some long pause 
> times.  Would increasing InitiatingHeapOccupancyPercent be recommended 
> to reduce this activity and give the application more time?
> 

Looking at the report, it looks like the old generation grows over 
time, and it might be that a lot of it is live, so the concurrent cycles 
don't free up that much and you are still above the threshold 
afterwards. If this is the case, setting a higher 
InitiatingHeapOccupancyPercent could help.
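
The default is 45, so something like

   -XX:InitiatingHeapOccupancyPercent=60

would delay the start of the concurrent cycles. The exact value (60 here 
is just an example) is something you would have to find by 
experimenting, and the logs will tell us if this is the right direction.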

It would also be helpful to know what version of Java you are running.

Cheers,
Stefan

> Any other advice will be helpful as I start to learn and unfold the 
> mysteries of GC tuning :-)
> 
> Just in case you don't want to open the pdf report these are my JVM args
> 
> -XX:G1MixedGCCountTarget=12 -XX:InitialHeapSize=7516192768 
> -XX:MaxGCPauseMillis=200 -XX:MaxHeapSize=7516192768 
> -XX:MetaspaceSize=268435456 -XX:+PrintAdaptiveSizePolicy -XX:+PrintGC 
> -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps 
> -XX:+PrintPromotionFailure -XX:+PrintTenuringDistribution 
> -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC 
> -XX:-UseNUMA
> 
> 
> Regards
> 
> 
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
> 
