G1 garbage collection Ext Root Scanning time increase linearly as application runs

John Cuthbertson john.cuthbertson at oracle.com
Tue Feb 19 12:01:54 PST 2013


Hi Ashley,

Basically as you surmise one the GC worker threads is being held up when 
processing a single root. I've seen s similar issue that's caused by 
filling up the code cache (where JIT compiled methods are held). The 
code cache is treated as a single root and so is claimed in its entirety 
by a single GC worker thread. As a the code cache fills up, the thread 
that claims the code cache to scan starts getting held up.

A full GC clears the issue because that's where G1 currently does class 
unloading: the full GC unloads a whole bunch of classes allowing any the 
compiled code of any of the unloaded classes' methods to be freed by the 
nmethod sweeper. So after a a full GC the number of compiled methods in 
the code cache is less.

It could also be the just the sheer number of loaded classes as the 
system dictionary is also treated as a single claimable root.

I think there's a couple existing CRs to track this. I'll see if I can 
find the numbers.

Regards,

JohnC

On 2/19/2013 11:24 AM, Ashley Taylor wrote:
>
> Hi Matt
>
> Seems that the issue I'm experiencing is unrelated to JNI same issue 
> with JNI calls mocked.
>
> Reading that post I noticed that your gc pauses where still increasing 
> after a full gc. In our case a full gc will fix the issue.
>
> Will have to keep hunting for the cause in my application.
>
>
> Cheers,
>
> Ashley
>
> *From:*Matt Fowles [mailto:matt.fowles at gmail.com]
> *Sent:* Tuesday, 19 February 2013 3:49 p.m.
> *To:* Ashley Taylor
> *Cc:* hotspot-gc-use at openjdk.java.net
> *Subject:* Re: G1 garbage collection Ext Root Scanning time increase 
> linearly as application runs
>
> Ashley~
>
> The issue I was seeing was actually in CMS not G1, but it was 
> eventually tracked down to leaking LocalReferences in the JNI.  Each 
> LocalRef (or likely GlobalRef) adds 4 bytes to a section that has to 
> be scanned every GC.  If these build up without bound, you end up with 
> growing GC times.
>
> The issue that I found essentially boiled down to GetMethodID calls 
> creating a LocalRef and not being freed.
>
> You can find the full painful search here:
>
> http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja
>
> My minimal reproduction is
>
> http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV
>
> I sincerely hope my painful experience can save you time ;-)
>
> Matt
>
> On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor 
> <ashley.taylor at sli-systems.com <mailto:ashley.taylor at sli-systems.com>> 
> wrote:
>
> Hi Matt
>
> Thanks for the quick response.
>
> Yes we do have JNI in this setup, I will disable the JNI link and 
> rerun the test.
>
> If it is JNI can you elaborate what you mean by leaked handle in a JNI 
> thread and how we would go about identifying and fixing that.
>
> Cheers,
>
> Ashley
>
> *From:*Matt Fowles [mailto:matt.fowles at gmail.com 
> <mailto:matt.fowles at gmail.com>]
> *Sent:* Tuesday, 19 February 2013 3:04 p.m.
> *To:* Ashley Taylor
> *Cc:* hotspot-gc-use at openjdk.java.net 
> <mailto:hotspot-gc-use at openjdk.java.net>
> *Subject:* Re: G1 garbage collection Ext Root Scanning time increase 
> linearly as application runs
>
> Ashley~
>
> Do you have any JNI in the setup?  I saw a similar issue that was 
> painstakingly tracked down to a leaked handle in a JNI thread.
>
> Matt
>
> On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor 
> <ashley.taylor at sli-systems.com <mailto:ashley.taylor at sli-systems.com>> 
> wrote:
>
> Hi,
>
> We are testing the performance of the G1 garbage collection.
>
> Our goal is to be able to remove the full gc pause that eventually 
> happens when we CMS.
>
> We have noticed that the garbage collection pause time starts off 
> really well but over time it keeps climbing.
>
> Looking at the logs we see that the section that is increasing 
> linearly with time is the Ext Root Scanning
>
> Here is a Root Scanning 1 Hour into the application here the total gc 
> pause is around 80ms
>
> [Ext Root Scanning (ms):  11.5  0.8  1.5 1.8  1.6  4.8  1.2  1.5  1.2  
> 1.4  1.1 1.6  1.2  1.1  1.1  1.1  1.2  1.2
>
> Avg:   2.1, Min:   0.8, Max:  11.5, Diff: 10.7]
>
> Here is a snap shot after 19 hours. Here the pause is around 280ms
>
>       [Ext Root Scanning (ms):  1.2  184.7  1.3  1.3 1.8  6.3  1.7  
> 1.2  1.5  1.2  1.2  1.1 1.2  1.1  1.2  1.1  1.2  1.2
>
> Avg:  11.8, Min:   1.1, Max: 184.7, Diff: 183.6]
>
> It seems that some task is linearly increasing with time, which only 
> effects one thread.
>
> After manually firing a full gc the total pause time returns back to 
> around 80ms
>
> After full GC
>
> [Ext Root Scanning (ms):  2.4  1.7  4.5  2.6 4.6  2.1  2.1  1.7  2.1  
> 1.8  1.8  2.2 0.6  0.0  0.0  0.0  0.0  0.0
>
> Avg:   1.7, Min:   0.0, Max:   4.6, Diff:   4.6]
>
> The test is run with a constant load applied on the application that 
> should hold the machine at around load 6.
>
> We have around 3GB of data within the heap which  will very rarely 
> become garbage, life of these objects would be several hours to days.
>
> the rest will only live for 10s of milliseconds.
>
> The JVM memory usage floats between 4-6gb.
>
> Have checked a thread dump. There are no threads that have very large 
> stack traces.
>
> What could cause this increasing pause durations? Is there any way to 
> get more information out of what that thread is actually trying to do, 
> or any tuning options?
>
> Environment
>
> JVM Arguments
>
> -Xms8g
>
> -Xmx8g
>
> -XX:+UseG1GC
>
> -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero 
> has greatly reduced the frequency of GC pause over 500ms and the 
> overhead is not that noticeable to our application
>
> -XX:MaxGCPauseMillis=70
>
> -XX:+UseLargePages
>
> Environment
>
> java version "1.7.0_13"
>
> Java(TM) SE Runtime Environment (build 1.7.0_13-b20)
>
> Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
>
> Operating System
>
> redhat 5.8 machine.
>
> The machine has 12 cores/ 24threads and 48gb of ram.
>
> Cheers,
>
> *Ashley Taylor*
>
> Software Engineer
>
> Email:ashley.taylor at sli-systems.com <mailto:ashley.taylor at sli-systems.com>
>
> Website: www.sli-systems.com <http://www.sli-systems.com/>
>
> Blog: blog.sli-systems.com <http://blog.sli-systems.com/>
>
> Podcast: EcommercePodcast.com <http://ecommercepodcast.com/>
>
> Twitter: www.twitter.com/slisystems <http://www.twitter.com/slisystems>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use at openjdk.java.net>
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130219/a0e6cb98/attachment-0001.html 


More information about the hotspot-gc-use mailing list