G1 garbage collection Ext Root Scanning time increases linearly as application runs

Srinivas Ramakrishna ysr1729 at gmail.com
Tue Feb 19 22:24:55 PST 2013


Perhaps Ashley could build an instrumented JVM with time traces around the various external root groups that are scanned serially, and the answer would be immediate?

ysr1729

On Feb 19, 2013, at 17:38, John Cuthbertson <john.cuthbertson at oracle.com> wrote:

> Hi Ashley,
> 
> Off the top of my head there's also the interned string table. I'll have to look at the code to figure out what else it could be.
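> 
> For illustration, a minimal sketch (the class name is mine, not from this thread): each distinct string passed to String.intern() adds an entry to the JVM-wide string table, so a workload that interns unique strings without bound grows that root over time:
> 
>     // Illustrative sketch: grows the interned string table without bound.
>     public class InternTableGrowth {
>         public static void main(String[] args) {
>             for (long i = 0; ; i++) {
>                 ("unique-key-" + i).intern(); // each distinct string adds a table entry
>                 if (i % 1_000_000 == 0) {
>                     System.out.println("interned " + i + " strings");
>                 }
>             }
>         }
>     }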
> 
> Thanks for the info.
> 
> JohnC
> 
> On 2/19/2013 5:11 PM, Ashley Taylor wrote:
>> Hi John
>>  
>> I reran my application with the JIT log turned on. It seems that once the application has been running for a while there is very little activity in the JIT log, but the pause times keep climbing; I ran it for 4 hours and the ‘Ext Root Scan’ time had climbed to 40ms.
>>  
>> At the 4-hour point I also performed a full GC to see how many classes would be unloaded, and it was only 50. We have around 5500 loaded classes.
>> The number of loaded classes also does not increase once the application has been running for a while.
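>> 
>> If it helps, here is a minimal sketch for watching those counts at runtime via the standard java.lang.management API rather than via a full GC (the class name is mine):
>> 
>>     import java.lang.management.ClassLoadingMXBean;
>>     import java.lang.management.ManagementFactory;
>> 
>>     // Sketch: periodically print how many classes are currently loaded
>>     // and how many have been unloaded since the JVM started.
>>     public class ClassCountMonitor {
>>         public static void main(String[] args) throws InterruptedException {
>>             ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
>>             while (true) {
>>                 System.out.printf("loaded=%d unloaded=%d totalLoaded=%d%n",
>>                         classes.getLoadedClassCount(),
>>                         classes.getUnloadedClassCount(),
>>                         classes.getTotalLoadedClassCount());
>>                 Thread.sleep(60_000L);
>>             }
>>         }
>>     }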
>>  
>> I also used jstat to see how full the permanent generation is. It is slowly climbing, and the full GC did not seem to reduce it at all; however, the full GC did fix the pause time.
>>  
>> The permanent generation is currently at 89.17% full and seems to increase by 0.01% every couple of minutes.
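>> 
>> The same occupancy can also be read in-process through the memory pool MXBeans; a minimal sketch (the loose pool-name matching and class name are mine, since the exact pool name varies by collector, e.g. "G1 Perm Gen"):
>> 
>>     import java.lang.management.ManagementFactory;
>>     import java.lang.management.MemoryPoolMXBean;
>>     import java.lang.management.MemoryUsage;
>> 
>>     // Sketch: report the permanent generation pool's occupancy.
>>     // Assumes the pool reports a defined max (getMax() != -1).
>>     public class PermGenUsage {
>>         public static void main(String[] args) {
>>             for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
>>                 if (pool.getName().contains("Perm")) {
>>                     MemoryUsage u = pool.getUsage();
>>                     System.out.printf("%s: %d / %d bytes (%.2f%% full)%n",
>>                             pool.getName(), u.getUsed(), u.getMax(),
>>                             100.0 * u.getUsed() / u.getMax());
>>                 }
>>             }
>>         }
>>     }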
>>  
>> Are there any other GC events that only happen at a full GC?
>>  
>> Cheers,
>> Ashley
>>  
>>  
>>  
>> From: hotspot-gc-use-bounces at openjdk.java.net [mailto:hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of John Cuthbertson
>> Sent: Wednesday, 20 February 2013 9:10 a.m.
>> To: hotspot-gc-use at openjdk.java.net
>> Subject: Re: G1 garbage collection Ext Root Scanning time increases linearly as application runs
>>  
>> Hi Ashley,
>> 
>> Basically, as you surmise, one of the GC worker threads is being held up when processing a single root. I've seen a similar issue that's caused by filling up the code cache (where JIT-compiled methods are held). The code cache is treated as a single root and so is claimed in its entirety by a single GC worker thread. As the code cache fills up, the thread that claims the code cache to scan starts getting held up.
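>> 
>> One way to test this hypothesis is to watch code cache occupancy while the pauses grow; HotSpot exposes it as a memory pool named "Code Cache". A minimal sketch (the class name is mine):
>> 
>>     import java.lang.management.ManagementFactory;
>>     import java.lang.management.MemoryPoolMXBean;
>>     import java.lang.management.MemoryUsage;
>> 
>>     // Sketch: periodically print code cache occupancy so its growth can
>>     // be correlated with the rising Ext Root Scanning times.
>>     public class CodeCacheWatcher {
>>         public static void main(String[] args) throws InterruptedException {
>>             MemoryPoolMXBean codeCache = null;
>>             for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
>>                 if ("Code Cache".equals(pool.getName())) {
>>                     codeCache = pool;
>>                 }
>>             }
>>             while (codeCache != null) {
>>                 MemoryUsage u = codeCache.getUsage();
>>                 System.out.printf("code cache: %d / %d bytes%n",
>>                         u.getUsed(), u.getMax());
>>                 Thread.sleep(60_000L);
>>             }
>>         }
>>     }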
>> 
>> A full GC clears the issue because that's where G1 currently does class unloading: the full GC unloads a whole bunch of classes, allowing the compiled code of any of the unloaded classes' methods to be freed by the nmethod sweeper. So after a full GC the number of compiled methods in the code cache is lower.
>> 
>> It could also be just the sheer number of loaded classes, as the system dictionary is also treated as a single claimable root.
>> 
>> I think there are a couple of existing CRs tracking this. I'll see if I can find the numbers.
>> 
>> Regards,
>> 
>> JohnC
>> 
>> On 2/19/2013 11:24 AM, Ashley Taylor wrote:
>> Hi Matt
>>  
>> It seems that the issue I'm experiencing is unrelated to JNI; I see the same issue with the JNI calls mocked out.
>> Reading that post I noticed that your GC pauses were still increasing after a full GC. In our case a full GC fixes the issue.
>> I will have to keep hunting for the cause in my application.
>>  
>> 
>> Cheers,
>> Ashley
>>  
>> From: Matt Fowles [mailto:matt.fowles at gmail.com] 
>> Sent: Tuesday, 19 February 2013 3:49 p.m.
>> To: Ashley Taylor
>> Cc: hotspot-gc-use at openjdk.java.net
>> Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs
>>  
>> Ashley~
>>  
>> The issue I was seeing was actually in CMS, not G1, but it was eventually tracked down to leaking LocalReferences in the JNI.  Each LocalRef (or likely GlobalRef) adds 4 bytes to a section that has to be scanned on every GC.  If these build up without bound, you end up with growing GC times (for example, ten million leaked references is roughly 40MB that a root-scanning thread must walk on every pause).
>>  
>> The issue that I found essentially boiled down to GetMethodID calls creating LocalRefs that were never freed.
>>  
>> You can find the full painful search here:
>>  
>> http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja
>>  
>> My minimal reproduction is here:
>>  
>> http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV
>>  
>> I sincerely hope my painful experience can save you time ;-)
>>  
>> Matt
>>  
>>  
>>  
>> 
>> On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor <ashley.taylor at sli-systems.com> wrote:
>> Hi Matt
>> Thanks for the quick response.
>>  
>> Yes we do have JNI in this setup, I will disable the JNI link and rerun the test.
>> If it is JNI, can you elaborate on what you mean by a leaked handle in a JNI thread, and how we would go about identifying and fixing that?
>>  
>> Cheers,
>> Ashley
>>  
>> From: Matt Fowles [mailto:matt.fowles at gmail.com] 
>> Sent: Tuesday, 19 February 2013 3:04 p.m.
>> To: Ashley Taylor
>> Cc: hotspot-gc-use at openjdk.java.net
>> Subject: Re: G1 garbage collection Ext Root Scanning time increases linearly as application runs
>>  
>> Ashley~
>>  
>> Do you have any JNI in the setup?  I saw a similar issue that was painstakingly tracked down to a leaked handle in a JNI thread.
>>  
>> Matt
>>  
>> 
>> On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor <ashley.taylor at sli-systems.com> wrote:
>> Hi,
>>  
>> We are testing the performance of the G1 garbage collector.
>> Our goal is to be able to remove the full GC pause that eventually happens when we use CMS.
>>  
>> We have noticed that the garbage collection pause time starts off really well but over time it keeps climbing.
>>  
>> Looking at the logs, we see that the section that is increasing linearly with time is Ext Root Scanning.
>> Here is a root scanning snapshot 1 hour into the run, where the total GC pause is around 80ms:
>>   [Ext Root Scanning (ms):  11.5  0.8  1.5  1.8  1.6  4.8  1.2  1.5  1.2  1.4  1.1  1.6  1.2  1.1  1.1  1.1  1.2  1.2
>>        Avg:   2.1, Min:   0.8, Max:  11.5, Diff:  10.7]
>>  
>>  
>> Here is a snapshot after 19 hours, where the pause is around 280ms:
>>       [Ext Root Scanning (ms):  1.2  184.7  1.3  1.3  1.8  6.3  1.7  1.2  1.5  1.2  1.2  1.1  1.2  1.1  1.2  1.1  1.2  1.2
>>        Avg:  11.8, Min:   1.1, Max: 184.7, Diff: 183.6]
>>  
>> It seems that some task is increasing linearly with time, and it only affects one thread.
>>  
>> After manually firing a full GC, the total pause time returns to around 80ms.
>>  
>> After the full GC:
>> [Ext Root Scanning (ms):  2.4  1.7  4.5  2.6  4.6  2.1  2.1  1.7  2.1  1.8  1.8  2.2  0.6  0.0  0.0  0.0  0.0  0.0
>>        Avg:   1.7, Min:   0.0, Max:   4.6, Diff:   4.6]
>>  
>>  
>> The test is run with a constant load applied to the application that should hold the machine at around a load of 6.
>> We have around 3GB of data within the heap which will very rarely become garbage; the lifetime of these objects is several hours to days.
>> The rest will only live for tens of milliseconds.
>> The JVM memory usage floats between 4 and 6GB.
>>  
>> I have checked a thread dump; there are no threads with very large stack traces.
>> What could cause these increasing pause durations? Is there any way to get more information about what that thread is actually trying to do, or any tuning options?
>>  
>>  
>> Environment
>>  
>> JVM Arguments
>> -Xms8g
>> -Xmx8g
>> -XX:+UseG1GC
>> -XX:InitiatingHeapOccupancyPercent=0  # we found that having this at zero greatly reduced the frequency of GC pauses over 500ms, and the overhead is not noticeable to our application
>> -XX:MaxGCPauseMillis=70
>> -XX:+UseLargePages
>>  
>>  
>> Java Version
>> java version "1.7.0_13"
>> Java(TM) SE Runtime Environment (build 1.7.0_13-b20)
>> Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
>>  
>>  
>> Operating System
>> Red Hat 5.8.
>> The machine has 12 cores / 24 threads and 48GB of RAM.
>>  
>>  
>>  
>> Cheers,
>> Ashley Taylor
>> Software Engineer
>> Email: ashley.taylor at sli-systems.com
>> Website: www.sli-systems.com
>> Blog: blog.sli-systems.com
>> Podcast: EcommercePodcast.com
>> Twitter: www.twitter.com/slisystems
>>  
>>  
>>  
>> 
> 