G1 garbage collection Ext Root Scanning time increase linearly as application runs

Matt Fowles matt.fowles at gmail.com
Mon Feb 18 18:49:00 PST 2013


Ashley~

The issue I was seeing was actually in CMS not G1, but it was eventually
tracked down to leaking LocalReferences in the JNI.  Each LocalRef (or
likely GlobalRef) adds 4 bytes to a section that has to be scanned every
GC.  If these build up without bound, you end up with growing GC times.

The issue that I found essentially boiled down to GetMethodID calls
creating a LocalRef and not being freed.

You can find the full painful search here:

http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja

My minimal reproduction is

http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV

I sincerely hope my painful experience can save you time ;-)

Matt




On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor <
ashley.taylor at sli-systems.com> wrote:

>  Hi Matt****
>
> Thanks for the quick response.****
>
> ** **
>
> Yes we do have JNI in this setup, I will disable the JNI link and rerun
> the test.****
>
> If it is JNI can you elaborate what you mean by leaked handle in a JNI
> thread and how we would go about identifying and fixing that.****
>
> ** **
>
> Cheers,****
>
> Ashley****
>
> ** **
>
> *From:* Matt Fowles [mailto:matt.fowles at gmail.com]
> *Sent:* Tuesday, 19 February 2013 3:04 p.m.
> *To:* Ashley Taylor
> *Cc:* hotspot-gc-use at openjdk.java.net
> *Subject:* Re: G1 garbage collection Ext Root Scanning time increase
> linearly as application runs****
>
> ** **
>
> Ashley~****
>
> ** **
>
> Do you have any JNI in the setup?  I saw a similar issue that was
> painstakingly tracked down to a leaked handle in a JNI thread.****
>
> ** **
>
> Matt****
>
> ** **
>
> On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor <
> ashley.taylor at sli-systems.com> wrote:****
>
> Hi,****
>
>  ****
>
> We are testing the performance of the G1 garbage collection.****
>
> Our goal is to be able to remove the full gc pause that eventually happens
> when we CMS.****
>
>  ****
>
> We have noticed that the garbage collection pause time starts off really
> well but over time it keeps climbing.****
>
>  ****
>
> Looking at the logs we see that the section that is increasing linearly
> with time is the Ext Root Scanning****
>
> Here is a Root Scanning 1 Hour into the application here the total gc
> pause is around 80ms****
>
>   [Ext Root Scanning (ms):  11.5  0.8  1.5  1.8  1.6  4.8  1.2  1.5  1.2
> 1.4  1.1  1.6  1.2  1.1  1.1  1.1  1.2  1.2****
>
>        Avg:   2.1, Min:   0.8, Max:  11.5, Diff:  10.7]****
>
>  ****
>
>  ****
>
> Here is a snap shot after 19 hours. Here the pause is around 280ms ****
>
>       [Ext Root Scanning (ms):  1.2  184.7  1.3  1.3  1.8  6.3  1.7  1.2
> 1.5  1.2  1.2  1.1  1.2  1.1  1.2  1.1  1.2  1.2****
>
>        Avg:  11.8, Min:   1.1, Max: 184.7, Diff: 183.6]****
>
>  ****
>
> It seems that some task is linearly increasing with time, which only
> effects one thread.****
>
>  ****
>
> After manually firing a full gc the total pause time returns back to
> around 80ms****
>
>  ****
>
> After full GC****
>
> [Ext Root Scanning (ms):  2.4  1.7  4.5  2.6  4.6  2.1  2.1  1.7  2.1
> 1.8  1.8  2.2  0.6  0.0  0.0  0.0  0.0  0.0****
>
>        Avg:   1.7, Min:   0.0, Max:   4.6, Diff:   4.6]****
>
>  ****
>
>  ****
>
> The test is run with a constant load applied on the application that
> should hold the machine at around load 6.****
>
> We have around 3GB of data within the heap which  will very rarely become
> garbage, life of these objects would be several hours to days.****
>
> the rest will only live for 10s of milliseconds.****
>
> The JVM memory usage floats between 4-6gb.****
>
>  ****
>
> Have checked a thread dump. There are no threads that have very large
> stack traces.****
>
> What could cause this increasing pause durations? Is there any way to get
> more information out of what that thread is actually trying to do, or any
> tuning options?****
>
>  ****
>
>  ****
>
> Environment****
>
>  ****
>
> JVM Arguments****
>
> -Xms8g****
>
> -Xmx8g ****
>
> -XX:+UseG1GC ****
>
> -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero has
> greatly reduced the frequency of GC pause over 500ms and the overhead is
> not that noticeable to our application****
>
> -XX:MaxGCPauseMillis=70****
>
> -XX:+UseLargePages****
>
>  ****
>
>  ****
>
> Environment****
>
> java version "1.7.0_13"****
>
> Java(TM) SE Runtime Environment (build 1.7.0_13-b20)****
>
> Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)****
>
>  ****
>
>  ****
>
> Operating System****
>
> redhat 5.8 machine.****
>
> The machine has 12 cores/ 24threads and 48gb of ram.****
>
>  ****
>
>  ****
>
>  ****
>
> Cheers,****
>
> *Ashley Taylor*****
>
> Software Engineer****
>
> Email: ashley.taylor at sli-systems.com****
>
> Website: www.sli-systems.com****
>
> Blog: blog.sli-systems.com****
>
> Podcast: EcommercePodcast.com <http://ecommercepodcast.com/>****
>
> Twitter: www.twitter.com/slisystems****
>
>  ****
>
> ****
>
>  ****
>
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use****
>
>  ** **
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130218/ee77bd33/attachment-0001.html 


More information about the hotspot-gc-use mailing list