G1GC heap Adaptation, Long Pauses and review request

charlie hunt charlie.hunt at oracle.com
Tue Jan 6 16:46:21 UTC 2015


Hi Shyam,

For the benefit of the many G1 GC developers on this mailing list (who can offer expertise and would like to hear your feedback), I am also posting the information I originally posted to jClarity.

I should also add that there are likely folks in this forum with extensive Java on SPARC experience.

I responded to Shyam’s post initially on jClarity since that is where it was posted first. For the others on this mailing list, here is the response I posted to jClarity.  Feel free to offer additional suggestions for Shyam.

1.)  There are some high sys times accompanying those high pause times. Possible causes include swapping to virtual memory, or, if you are on Linux, transparent huge pages being enabled (on Linux you should disable transparent huge pages).
2.)  -XX:LargePageSizeInBytes=256m implies you are on Solaris SPARC. Some Solaris SPARC platforms (e.g. T-series) are the only ones I know of that support 256m pages. Are you running on Solaris SPARC?
3.)  Even with -XX:+ParallelRefProcEnabled there are still some rather high reference processing times.  I also noticed you have SoftRefLRUPolicyMSPerMB=1, which implies the application makes heavy use of SoftReferences and you would like them processed more aggressively.  These two observations make me wonder what the application implementation looks like.  Are there a very large number of SoftReferences in use in this application, and a large number of Reference objects in general?  I am really curious what observation(s) drove you to set SoftRefLRUPolicyMSPerMB=1, and also how the application uses Reference objects, since the reference processing times are rather high even with ParallelRefProcEnabled.
4.)  Also curious what observation(s) drove you to set G1ReservePercent=15?
5.)  Lastly, what observation drove you to set HeapRegionSize=32m?  With a 6 GB Java heap at a 32m region size, you have just 192 regions for G1 to work with. The default target number of regions is 2048.
6.)  Ooh, almost forgot: back to the 256m pages (that is, if you are on SPARC where they are supported).  When a new (256m) page is first touched during GC, that page will be zeroed, which can introduce a small but noticeable jump in pause time on those GCs.  To work around that, you can add -XX:+AlwaysPreTouch to touch all pages at JVM launch time.  I would not add that command line option just yet since you have some bigger issues going on here.

Let’s now take each of these a bit further.
- 256m pages
You have offered that you are indeed running on Solaris SPARC, where 256m pages are a viable option.  One possible explanation for the larger than usual spikes in GC times could be large page coalescing at the next 256m page touch.  The observation of high sys times on those GCs tends to support that hypothesis. You could help address this issue by adding -XX:+AlwaysPreTouch to the command line options.  This touches all pages at JVM launch time, but it will increase JVM initialization time.
- 32 MB region size
You offered that the reason you did this was to accommodate large array objects. As I mentioned, a region size of 32 MB leaves just 192 regions for G1 to work with; the target number of G1 regions is 2048.  In more recent JDK releases, changes have been introduced to help address frequent large object allocations, so the more recent the JDK release you can work with, the better.  If you can migrate to the latest Java 8 JDK, that should improve your experience with G1.
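To make the region arithmetic above concrete, here is a small sketch (plain Java, nothing G1-specific; it just reproduces the math) of how region count falls out of heap size and region size, and where the humongous-object threshold lands. The power-of-two rounding note is an assumption about G1's default sizing, not something the code queries:

```java
public class G1RegionMath {
    static final long MB = 1024L * 1024;
    static final long GB = 1024L * MB;

    public static void main(String[] args) {
        long heap = 6 * GB;

        // Explicit -XX:G1HeapRegionSize=32m: only 192 regions to work with.
        long regions32m = heap / (32 * MB);
        System.out.println("regions at 32m: " + regions32m);            // 192

        // G1's default sizing aims for roughly 2048 regions, so for a
        // 6 GB heap it would pick a much smaller region size
        // (6 GB / 2048 = 3 MB, rounded to a power of two by G1).
        long targetRegionSize = heap / 2048;
        System.out.println("default-ish region size: "
                + targetRegionSize / MB + "m");                         // 3m

        // Objects at or above half a region are "humongous" and get
        // special handling; at 32 MB regions that threshold is 16 MB.
        System.out.println("humongous threshold at 32m regions: "
                + (32 * MB / 2) / MB + "m");                            // 16m
    }
}
```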

Now continuing on …

For a next step, I think it would be a good exercise to simplify your command line options and build back up to the ones you can benefit from by running a sequence of experiments: establish a baseline, then compare the performance and behavior of the application with one change at a time against that baseline until you find the best experience. That way you will have justification for each of the command line options.

Below I have listed out each of the command line options you are using, along with a comment on whether I think it is useful as a starting place.

Your JVM command line options and my comments:
-d64 
* You should not need to specify this since you are working with a 64-bit JVM.
-server 
* You do not need to specify this since 64-bit JVMs are -server JVMs only.
-Xms6g -Xmx6g
* These two are fine, assuming you have sufficiently enough available physical memory.
-XX:PermSize=2g -XX:MaxPermSize=2g
* These two are fine so long as you need that much memory for perm gen. GC logs will offer this info on Full GC events, and at JVM exit, if you use -XX:+PrintGCDetails or you are logging GC to a log file. You can also monitor with JConsole to observe perm gen occupancy. You might want to double check that you need this much space, since 2g is a bit higher than commonly seen.
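As an alternative to JConsole, perm gen occupancy can also be read in-process via the standard management API. A minimal sketch; note the pool name varies by collector and JDK (e.g. "PS Perm Gen" or "G1 Perm Gen" on JDK 7, "Metaspace" on JDK 8+):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class PermGenCheck {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Match the perm gen pool on JDK 7, or Metaspace on JDK 8+.
            if (name.contains("Perm") || name.contains("Metaspace")) {
                MemoryUsage u = pool.getUsage();
                String max = u.getMax() < 0 ? "unbounded"
                                            : (u.getMax() >> 20) + " MB";
                System.out.println(name + ": " + (u.getUsed() >> 20)
                        + " MB used, max " + max);
            }
        }
    }
}
```

Sampling this periodically (or just before exit) gives you the occupancy data to decide whether 2g is really needed.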
-XX:+HeapDumpOnOutOfMemoryError
* This one is ok. It should have no performance impact either way, and having a heap dump on an OutOfMemoryError can be useful, assuming you have sufficient file system space for a potential heap dump.
-XX:+DisableExplicitGC
* This one is ok, assuming you do not have a need for a timely compaction of the Java heap.
-XX:+UseG1GC
* I wonder if you need this one. You have a pause time target of 2 seconds (2000 ms).  You might be able to reach that pause time target using -XX:+UseParallelOldGC.  So, you might consider an experiment that compares the performance of ParallelOldGC versus G1GC.
-XX:MaxGCPauseMillis=2000
* This one is ok if you are using G1 GC. If you use ParallelOldGC, then it is not needed.
-XX:InitiatingHeapOccupancyPercent=70
* Suggest you not initially start with this command line option. Let experiments and analysis of the data drive you to tweaking it.
-XX:ConcGCThreads=12
* Suggest you not initially start with this command line option. Let experiments and analysis of the data drive you to tweaking it.
-XX:G1ReservePercent=15
* Suggest you not initially start with this command line option. Let experiments and analysis of the data drive you to tweaking it.
-XX:G1HeapRegionSize=32m
* Suggest you not initially start with this command line option. Let experiments and analysis of the data drive you to tweaking it.
-XX:-UseBiasedLocking
* Suggest you not initially start with this command line option. Let experiments and analysis of the data drive you to tweaking it. I suggest adding -XX:+PrintSafepointStatistics and looking at the output to see whether you are seeing a lot of biased lock revocation safepoints. You could also add -XX:+PrintApplicationStoppedTime to get a sense of non-GC stop-the-world events, some of which could be biased lock revocations.
-XX:LargePageSizeInBytes=256m
* Suggest you not initially start with this command line option. Start with the defaults on Solaris.  HotSpot on Solaris will automatically try to use large pages, though I don’t recall if it will automatically try to use 256m pages out of the box. SPARC does support several different page sizes.  Again, let experiments and analysis of the data drive you to setting it.  If you see similar spikes in pauses with G1 as you see now, then adding -XX:+AlwaysPreTouch should help. But adding -XX:+AlwaysPreTouch will increase JVM initialization time.
-XX:-UseThreadPriorities
* Suggest you not set this initially.  You can try an experiment with it explicitly disabled to see if it helps. I doubt it will.
-XX:ParallelGCThreads=12
* Suggest you not initially start with this command line option. Let experiments and analysis of the data drive you to tweaking it. Since SPARC has a large number of hardware threads, and you may be sharing this system with other applications, you may want to tweak this one.
-XX:SoftRefLRUPolicyMSPerMB=1
* Suggest you not initially start with this command line option. Let experiments and analysis of the data drive you to tweaking it.  When you start exploring this option, add -XX:+PrintReferenceGC so you can see how many SoftReferences are being processed per GC. If there is a large number of them, especially when your heap occupancy is high, then you can start looking at tweaking this one. If you have a lot of Reference objects to process, you will also want to enable -XX:+ParallelRefProcEnabled.
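For readers less familiar with what SoftRefLRUPolicyMSPerMB actually governs, here is a minimal sketch of SoftReference semantics. The policy numbers in the comments describe the documented HotSpot behavior, not anything the code measures:

```java
import java.lang.ref.SoftReference;

public class SoftRefSketch {
    public static void main(String[] args) {
        byte[] payload = new byte[4 << 20];              // a 4 MB cache entry
        SoftReference<byte[]> cacheEntry = new SoftReference<>(payload);

        // While a strong reference (payload) still exists, get() is
        // guaranteed to return the object.
        System.out.println(cacheEntry.get() == payload); // true

        payload = null; // now only softly reachable

        // From here on, the collector MAY clear the reference under memory
        // pressure. -XX:SoftRefLRUPolicyMSPerMB controls the policy: a
        // softly reachable object is kept for roughly that many ms per MB
        // of free heap since its last access. The default is 1000; setting
        // it to 1 (as on this command line) clears soft refs far more
        // aggressively.
        byte[] value = cacheEntry.get();
        System.out.println(value == null ? "cleared" : "still cached");
    }
}
```

With many thousands of such references alive, each GC has to discover and process them, which is where the long reference processing phases in the log come from.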
-XX:+ParallelRefProcEnabled
* Based on your existing GC logs, this one is probably of use. You could however do a comparison run with it disabled to see what happens with GC pause times and in particular the reference processing phases.

Here is what I would start with:
-Xms6g -Xmx6g
-XX:PermSize=2g -XX:MaxPermSize=2g  *** Double check that you need Perm Gen this large by looking at GC logs or tracking the app behavior with JConsole or VisualVM
-XX:+HeapDumpOnOutOfMemoryError
-XX:+DisableExplicitGC
-XX:+UseG1GC
-XX:+ParallelRefProcEnabled
-XX:+PrintReferenceGC
* NOTE: you should also add -XX:+PrintGCDetails and one of the GC timestamp options, i.e. -XX:+PrintGCTimeStamps or -XX:+PrintGCDateStamps. IIRC, you will get this information if you direct GC info to a log file.

As mentioned above, I would do a comparison with -XX:+UseParallelOldGC since your GC pause time target is pretty high. You may be able to meet that with a 6 GB Java heap.  That would mean you swap out -XX:+UseG1GC above and swap in -XX:+UseParallelOldGC.  ParallelOldGC will offer you the best throughput, so if it can meet your pause time target, it will give you better performance overall.
Once you have decided between G1 or ParallelOld GC, you can start looking at whether you can benefit from the other command line options. This is where you can start an experiment, one each for:
- With and without 256m pages, and separately, with and without -XX:+AlwaysPreTouch
- Test with -XX:+PrintSafepointStatistics to see if there is a large number of biased locking revocation safepoints; this will tell you whether adding -XX:-UseBiasedLocking is helpful
- If G1 GC is the route for you and you want, or need, to tune further, then you have some data gathering and analysis to do. I would suggest adding -XX:+PrintAdaptiveSizePolicy to get information on large object allocations and their sizes.  Looking at system performance will help you determine whether to tweak the default ParallelGCThreads and ConcGCThreads.

hths,

charlie
 

> On Jan 6, 2015, at 1:16 AM, Kirk Pepperdine <kirk at kodewerk.com> wrote:
> 
> Hi Shyam,
> 
> You’ve already asked this question on my “friends of jClarity” list and I think you’ve gotten some excellent advice there from Charlie, Gil and a few others who are well regarded experts on this subject. I’d start with that advice first.
> 
> Kind regards,
> Kirk Pepperdine
> 
> On Jan 5, 2015, at 11:19 AM, megha shyam Jakkireddy <shyam21 at gmail.com <mailto:shyam21 at gmail.com>> wrote:
> 
>> Hi
>>  
>> Recently we have done some G1 Tuning. I've added a logfile. Could somebody explain and let me know your thoughts to make it optimal. There were no full GC's observed, Heap adaptation looks good and however there were long pause times at the initial hours. And there were premature promotions as well
>> 
>> Additional to the logging, we are using the below G1 parameters..
>> 
>> -d64 -server -Xms6g -Xmx6g -XX:PermSize=2g -XX:MaxPermSize=2g -XX:+HeapDumpOnOutOfMemoryError  -XX+DisableExplicitGC -XX:+UseG1GC -XX:MaxGCPauseMillis=2000 -XX:InitiatingHeapOccupancyPercent=70 -XX:ConcGCThreads=12 -XX:G1ReservePercent=15 -XX:1HeapRegionSize=32m -XX:-UseBiasedLocking -XX:LargePageSizeInBytes=256m -XX:-UseThreadPriorities -XX:ParallelGCThreads=12 -XX:SoftRefLRUPolicyMSPerMB=1 -XX:+ParallelRefProcEnabled "
>> 
>> We are using JDK 1.7.0_45.
>> 
>> Regards
>> 
>>  
>> <G1GC.TXT>


