From john.cuthbertson at oracle.com Fri Feb 1 14:17:50 2013
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Fri, 01 Feb 2013 14:17:50 -0800
Subject: java 1.7.0u4 GarbageCollectionNotificationInfo API
In-Reply-To:
References:
Message-ID: <510C3F0E.9030400@oracle.com>

Hi Taras,

I'm going to cc the serviceability alias. I think they might be best suited to answer some of your questions. I believe they own the API and the GC provides the data.

Answer 1: It should be milliseconds, but there was a bug (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7087969) that is now fixed in hs24 and you could be running into that.

Answer 2: This sounds like a bug. Do you have a test case you can share?

Answer 3: I'll leave that to the serviceability guys.

Regards,

JohnC

On 1/28/2013 1:11 PM, Taras Tielkes wrote:
>
> Hi,
>
> I'm playing around with the new(ish) GarbageCollectionNotificationInfo API. We're using ParNew+CMS in all our systems, and my first goal is a comparison between -XX:+PrintGCDetails -verbose:gc output and the actual data coming through the notification API. I'm using Java 1.7.0u6 for the experiments.
>
> So far, I have a number of questions:
>
> 1) duration times
>
> The javadoc for gcInfo.getDuration() describes the returned value as expressed in milliseconds. However, the values differ from the gc logs by several orders of magnitude. How are they calculated?
>
> On a 1-core Linux x64 VM, the values actually look like microseconds, but on a Win32 machine I still can't figure out any resemblance to gc log timings.
>
> Apart from the unit, what should the value represent? Real time or user time?
>
> 2) CMS events with cause "No GC"
>
> How exactly do the phases of CMS map to the notifications emitted for the CMS collector?
>
> I sometimes get events with cause "No GC". Does this indicate a background CMS cycle being initiated by hitting the occupancy fraction threshold?
>
> 3) Eden/Survivor
>
> It seems that the MemoryUsage API treats Eden and Survivor separately, i.e. survivor is not a subset of eden. This is different from the gc log presentation. Is my understanding correct?
>
> In general, I think it would be useful to have a code sample for the GC notification API that generates output as close as possible to -XX:+PrintGCDetails -verbose:gc, as far as the data required to do so is available.
>
> The API looks quite promising, it seems it could really benefit from a bit of documentation love :)
>
> Thanks,
> -tt
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130201/37dca1ab/attachment.html
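Taras asks above for a code sample for the GC notification API. As a reference point only (this is not code from the thread; the class name and output format are made up, and it assumes JDK 7u4 or later where the com.sun.management notification classes are available), a minimal sketch of subscribing to the notifications and printing the duration, cause, and per-pool usage might look like this. getDuration() is the value question 1 is about, and Eden and Survivor show up as separate pools, as noted in question 3.

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryUsage;
    import java.util.Map;

    import javax.management.Notification;
    import javax.management.NotificationEmitter;
    import javax.management.NotificationListener;
    import javax.management.openmbean.CompositeData;

    import com.sun.management.GarbageCollectionNotificationInfo;
    import com.sun.management.GcInfo;

    public class GcNotificationSample {
        public static void main(String[] args) throws Exception {
            NotificationListener listener = new NotificationListener() {
                public void handleNotification(Notification n, Object handback) {
                    if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION
                            .equals(n.getType())) {
                        return;
                    }
                    GarbageCollectionNotificationInfo info =
                            GarbageCollectionNotificationInfo.from((CompositeData) n.getUserData());
                    GcInfo gc = info.getGcInfo();
                    // getDuration() is documented as milliseconds; see the discussion of
                    // bug 7087969 above for the ticks/milliseconds issue in older builds.
                    System.out.println(info.getGcName() + " (" + info.getGcCause() + ", "
                            + info.getGcAction() + "): " + gc.getDuration() + " ms");
                    // Eden and Survivor are reported as separate pools here, unlike the
                    // nested presentation in the -XX:+PrintGCDetails log.
                    for (Map.Entry<String, MemoryUsage> e : gc.getMemoryUsageAfterGc().entrySet()) {
                        MemoryUsage before = gc.getMemoryUsageBeforeGc().get(e.getKey());
                        System.out.println("  " + e.getKey() + ": " + before.getUsed()
                                + " -> " + e.getValue().getUsed() + " bytes");
                    }
                }
            };
            // In HotSpot, each GarbageCollectorMXBean is also a NotificationEmitter.
            for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
                ((NotificationEmitter) bean).addNotificationListener(listener, null, null);
            }
            // Allocate some garbage so a few collections (and notifications) happen.
            for (int i = 0; i < 5_000_000; i++) {
                byte[] garbage = new byte[128];
            }
            Thread.sleep(2000); // notifications are delivered asynchronously
        }
    }

Running this alongside -XX:+PrintGCDetails -verbose:gc makes it easy to compare the two outputs, keeping bug 7087969 in mind when comparing durations on pre-hs24 builds.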
From reachbach at yahoo.com Sun Feb 3 23:26:58 2013
From: reachbach at yahoo.com (Bharath R)
Date: Sun, 3 Feb 2013 23:26:58 -0800 (PST)
Subject: G1 status in JDK1.6 Vs JDK1.7
In-Reply-To: <1359962096.18794.YahooMailNeo@web162101.mail.bf1.yahoo.com>
References: <1359962096.18794.YahooMailNeo@web162101.mail.bf1.yahoo.com>
Message-ID: <1359962818.10581.YahooMailNeo@web162103.mail.bf1.yahoo.com>

Hi,

Is the G1 GC 1.6 port on par with the 1.7 in terms of stability / quality? If that is true, I intend to begin experimenting with it in production and gradually roll it out across our deployment based on the outcome. On a related note, we intend to use G1 for an online system with a very low pause time requirement (<10ms). The hardware is heterogeneous in terms of memory (ranges between 12G - 32G available to the application process) with comparable CPU configuration. CMS required considerable tuning to achieve acceptable results and I'm hoping G1 would fare better without myriad config options or overrides. I'd like to know of comparisons / experience operating G1 in production under such conditions. Thanks in advance.

-Bharath

P.S: Using RTJ is not an option for us :)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130203/14664a74/attachment.html
From jesper.wilhelmsson at oracle.com Wed Feb 6 16:21:20 2013
From: jesper.wilhelmsson at oracle.com (Jesper Wilhelmsson)
Date: Thu, 07 Feb 2013 01:21:20 +0100
Subject: G1 status in JDK1.6 Vs JDK1.7
In-Reply-To: <1359962818.10581.YahooMailNeo@web162103.mail.bf1.yahoo.com>
References: <1359962096.18794.YahooMailNeo@web162101.mail.bf1.yahoo.com> <1359962818.10581.YahooMailNeo@web162103.mail.bf1.yahoo.com>
Message-ID: <5112F380.5040403@oracle.com>

Hi Bharath,

The first supported release of G1 was with 7u4. The 7u4 version came with significant improvements and I do not recommend doing performance evaluations with earlier versions. If you decide to move to JDK 7 and try G1 please share your experiences.
/Jesper

On 4/2/13 8:26 AM, Bharath R wrote:
> Hi,
>
> Is the G1 GC 1.6 port on par with the 1.7 in terms of stability / quality? If that is true, I intend to begin experimenting with it in production and gradually roll it out across our deployment based on the outcome. On a related note, we intend to use G1 for an online system with a very low pause time requirement (<10ms). The hardware is heterogeneous in terms of memory (ranges between 12G - 32G available to the application process) with comparable CPU configuration. CMS required considerable tuning to achieve acceptable results and I'm hoping G1 would fare better without myriad config options or overrides.
> I'd like to know of comparisons / experience operating G1 in production under such conditions. Thanks in advance.
>
> -Bharath
>
> P.S: Using RTJ is not an option for us :)
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
From dahouet at gmail.com Wed Feb 13 04:40:24 2013
From: dahouet at gmail.com (Nicolas VIAL)
Date: Wed, 13 Feb 2013 13:40:24 +0100
Subject: ParNew Allocation Failure
Message-ID:

Hello

I'm trying to use HugePages on a high memory server.
Seems to be working fine except for this kind of error message:

ParNew occured at 2013-02-13 13:34:57.469, took 77ms (Allocation Failure) eden(-838912) old(+103897)

JVM started with:
-d64 -XX:+UseCompressedOops -server -Xms30720M -Xmx30720M -XX:+UseLargePages -XX:PermSize=512m -XX:MaxPermSize=512m -XX:NewSize=1024m -XX:MaxNewSize=1024m -XX:InitialCodeCacheSize=256m -XX:ReservedCodeCacheSize=256m -XX:CompileThreshold=1000 -XX:+UseParNewGC -XX:+PrintGCDetails

Statistics:
gc(ParNew)[count=20, time=2839], gc(MarkSweepCompact)[count=2, time=472], eden[used=405995, commited=838912], survivor[used=104832, commited=104832], old[used=2771853, commited=30408704], perm[used=126512, commited=1048576], code[used=24987, commited=262144], compile[count=7492, time=69911, invalidated=0, failed=4, threads=2], threads[count=391, daemon=25, total=397, internal=15], class[loaded=18682, unloaded=0, initialized=11504, loadtime=8524, inittime=2783, veriftime=4048], descriptors[open=335], os[loadavg=0%, physicalfree=4557952, swapfree=4194296, virtual=39611076], cpu[load=0%], disk[rate=75319, used=35%]

Using:
java version "1.7.0_13"
Java(TM) SE Runtime Environment (build 1.7.0_13-b20)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Hope I can get some help

Thanks
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130213/1ed10867/attachment.html
From bernd.eckenfels at googlemail.com Wed Feb 13 05:07:23 2013
From: bernd.eckenfels at googlemail.com (Bernd Eckenfels)
Date: Wed, 13 Feb 2013 14:07:23 +0100
Subject: ParNew Allocation Failure
In-Reply-To:
References:
Message-ID:

Am 13.02.2013, 13:40 Uhr, schrieb Nicolas VIAL :

> ParNew occured at 2013-02-13 13:34:57.469, took 77ms (Allocation Failure)
> eden(-838912) old(+103897)

77ms does not seem like very long. How often do you see them? If you want to reduce that, you will need to reduce NewSize. Do you mean the deadline is only violated with UseLargePages but not without? Did you check for memory pressure on heap memory?

Is that a 32GB system? 30GB looks a bit large for that.

Gruss
Bernd

--
https://plus.google.com/u/1/108084227682171831683/about
From taras.tielkes at gmail.com Sun Feb 17 03:15:07 2013
From: taras.tielkes at gmail.com (Taras Tielkes)
Date: Sun, 17 Feb 2013 12:15:07 +0100
Subject: java 1.7.0u4 GarbageCollectionNotificationInfo API
In-Reply-To: <510C3F0E.9030400@oracle.com>
References: <510C3F0E.9030400@oracle.com>
Message-ID:

Hi John,

Thanks for the feedback. The milliseconds/ticks issue indeed seems to be bug 7087969. Will the upcoming 7u14 contain hs24, and the fix?

Regarding the "No GC" cause, I think http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8006954 might be the underlying issue. If I understand correctly, it's now fixed for hs24 as well, and will hopefully be part of 7u14.

I'll post any follow-up questions regarding the GC Notification API to the serviceability-dev mailing list.

Kind regards,
-tt

On Fri, Feb 1, 2013 at 11:17 PM, John Cuthbertson <john.cuthbertson at oracle.com> wrote:

> Hi Taras,
>
> I'm going to cc the serviceability alias. I think they might be best suited to answer some of your questions. I believe they own the API and the GC provides the data.
>
> Answer 1: It should be milliseconds, but there was a bug (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7087969) that is now fixed in hs24 and you could be running into that.
>
> Answer 2: This sounds like a bug. Do you have a test case you can share?
> > Answer 3: I'll leave that to the serviceability guys. > > Regards, > > JohnC > > > On 1/28/2013 1:11 PM, Taras Tielkes wrote: > > > Hi, > > I'm playing around with the new(ish) GarbageCollectionNotificationInfo > API. We're using ParNew+CMS in all our systems, and my first goal is a > comparison between -XX:+PrintGCDetails -verbose:gc output and the actual > data coming through the notification API. I'm using Java 1.7.0u6 for the > experiments. > > So far, I have a number of questions: > 1) duration times > > The javadoc for gcInfo.getDuration() describes the returned value as > expressed in milliseconds. However, the values differ to the gc logs by > several orders of magnitude. How are they calculated? > > On a 1-core Linux x64 VM, the values actually look like microseconds, > but on a Win32 machines I still can't figure out any resemblance to gc log > timings. > > Apart from the unit, what should the value represent? Real time or user > time? > > 2) CMS events with cause "No GC" > > How exactly do the phases of CMS map to the notifications emitted for > the CMS collector? > > I sometimes get events with cause "No GC". Does this indicate a > background CMS cycle being initiated by hitting the occupancy fraction > threshold? > > 3) Eden/Survivor > > It seems that the MemoryUsage API treats Eden and Survivor separately, > i.e. survivor is not a subset of eden. This is different from the gc log > presentation. Is my understanding correct? > > In general, I think it would be useful to have a code sample for the GC > notification API that generates output as close as possible to > -XX:+PrintGCDetails -verbose:gc, as far as the data required to do so is > available. > > The API looks quite promising, it seems it could really benefit from a > bit of documentation love :) > > Thanks, > -tt > > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130217/3b971380/attachment.html From ashley.taylor at sli-systems.com Mon Feb 18 17:12:11 2013 From: ashley.taylor at sli-systems.com (Ashley Taylor) Date: Tue, 19 Feb 2013 01:12:11 +0000 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs Message-ID: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> Hi, We are testing the performance of the G1 garbage collection. Our goal is to be able to remove the full gc pause that eventually happens when we CMS. We have noticed that the garbage collection pause time starts off really well but over time it keeps climbing. Looking at the logs we see that the section that is increasing linearly with time is the Ext Root Scanning Here is a Root Scanning 1 Hour into the application here the total gc pause is around 80ms [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 1.2 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2 Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7] Here is a snap shot after 19 hours. Here the pause is around 280ms [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 1.7 1.2 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2 Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6] It seems that some task is linearly increasing with time, which only effects one thread. 
After manually firing a full gc the total pause time returns back to around 80ms After full GC [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 2.1 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0 Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6] The test is run with a constant load applied on the application that should hold the machine at around load 6. We have around 3GB of data within the heap which will very rarely become garbage, life of these objects would be several hours to days. the rest will only live for 10s of milliseconds. The JVM memory usage floats between 4-6gb. Have checked a thread dump. There are no threads that have very large stack traces. What could cause this increasing pause durations? Is there any way to get more information out of what that thread is actually trying to do, or any tuning options? Environment JVM Arguments -Xms8g -Xmx8g -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero has greatly reduced the frequency of GC pause over 500ms and the overhead is not that noticeable to our application -XX:MaxGCPauseMillis=70 -XX:+UseLargePages Environment java version "1.7.0_13" Java(TM) SE Runtime Environment (build 1.7.0_13-b20) Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) Operating System redhat 5.8 machine. The machine has 12 cores/ 24threads and 48gb of ram. Cheers, Ashley Taylor Software Engineer Email: ashley.taylor at sli-systems.com Website: www.sli-systems.com Blog: blog.sli-systems.com Podcast: EcommercePodcast.com Twitter: www.twitter.com/slisystems [sli_logo_2011] -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130219/2b6e9da6/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 8602 bytes Desc: image001.png Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130219/2b6e9da6/image001.png From matt.fowles at gmail.com Mon Feb 18 18:03:38 2013 From: matt.fowles at gmail.com (Matt Fowles) Date: Mon, 18 Feb 2013 21:03:38 -0500 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs In-Reply-To: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> Message-ID: Ashley~ Do you have any JNI in the setup? I saw a similar issue that was painstakingly tracked down to a leaked handle in a JNI thread. Matt On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor < ashley.taylor at sli-systems.com> wrote: > Hi,**** > > ** ** > > We are testing the performance of the G1 garbage collection.**** > > Our goal is to be able to remove the full gc pause that eventually happens > when we CMS.**** > > ** ** > > We have noticed that the garbage collection pause time starts off really > well but over time it keeps climbing.**** > > ** ** > > Looking at the logs we see that the section that is increasing linearly > with time is the Ext Root Scanning**** > > Here is a Root Scanning 1 Hour into the application here the total gc > pause is around 80ms**** > > [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 1.2 > 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2**** > > Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7]**** > > ** ** > > ** ** > > Here is a snap shot after 19 hours. 
Here the pause is around 280ms **** > > [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 1.7 1.2 > 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2**** > > Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6]**** > > ** ** > > It seems that some task is linearly increasing with time, which only > effects one thread.**** > > ** ** > > After manually firing a full gc the total pause time returns back to > around 80ms**** > > ** ** > > After full GC**** > > [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 2.1 > 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0**** > > Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6]**** > > ** ** > > ** ** > > The test is run with a constant load applied on the application that > should hold the machine at around load 6.**** > > We have around 3GB of data within the heap which will very rarely become > garbage, life of these objects would be several hours to days.**** > > the rest will only live for 10s of milliseconds.**** > > The JVM memory usage floats between 4-6gb.**** > > ** ** > > Have checked a thread dump. There are no threads that have very large > stack traces.**** > > What could cause this increasing pause durations? Is there any way to get > more information out of what that thread is actually trying to do, or any > tuning options?**** > > ** ** > > ** ** > > Environment**** > > ** ** > > JVM Arguments**** > > -Xms8g**** > > -Xmx8g **** > > -XX:+UseG1GC **** > > -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero has > greatly reduced the frequency of GC pause over 500ms and the overhead is > not that noticeable to our application**** > > -XX:MaxGCPauseMillis=70**** > > -XX:+UseLargePages**** > > ** ** > > ** ** > > Environment**** > > java version "1.7.0_13"**** > > Java(TM) SE Runtime Environment (build 1.7.0_13-b20)**** > > Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)**** > > ** ** > > ** ** > > Operating System**** > > redhat 5.8 machine.**** > > The machine has 12 cores/ 24threads and 48gb of ram.**** > > ** ** > > ** ** > > ** ** > > Cheers,**** > > *Ashley Taylor* > > Software Engineer**** > > Email: ashley.taylor at sli-systems.com**** > > Website: www.sli-systems.com**** > > Blog: blog.sli-systems.com**** > > Podcast: EcommercePodcast.com **** > > Twitter: www.twitter.com/slisystems**** > > ** ** > > [image: sli_logo_2011]** > > ** ** > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130218/a1e85188/attachment-0001.html From ashley.taylor at sli-systems.com Mon Feb 18 18:29:36 2013 From: ashley.taylor at sli-systems.com (Ashley Taylor) Date: Tue, 19 Feb 2013 02:29:36 +0000 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs In-Reply-To: References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> Message-ID: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> Hi Matt Thanks for the quick response. Yes we do have JNI in this setup, I will disable the JNI link and rerun the test. If it is JNI can you elaborate what you mean by leaked handle in a JNI thread and how we would go about identifying and fixing that. Cheers, Ashley From: Matt Fowles [mailto:matt.fowles at gmail.com] Sent: Tuesday, 19 February 2013 3:04 p.m. 
To: Ashley Taylor Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Ashley~ Do you have any JNI in the setup? I saw a similar issue that was painstakingly tracked down to a leaked handle in a JNI thread. Matt On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor > wrote: Hi, We are testing the performance of the G1 garbage collection. Our goal is to be able to remove the full gc pause that eventually happens when we CMS. We have noticed that the garbage collection pause time starts off really well but over time it keeps climbing. Looking at the logs we see that the section that is increasing linearly with time is the Ext Root Scanning Here is a Root Scanning 1 Hour into the application here the total gc pause is around 80ms [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 1.2 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2 Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7] Here is a snap shot after 19 hours. Here the pause is around 280ms [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 1.7 1.2 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2 Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6] It seems that some task is linearly increasing with time, which only effects one thread. After manually firing a full gc the total pause time returns back to around 80ms After full GC [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 2.1 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0 Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6] The test is run with a constant load applied on the application that should hold the machine at around load 6. We have around 3GB of data within the heap which will very rarely become garbage, life of these objects would be several hours to days. the rest will only live for 10s of milliseconds. The JVM memory usage floats between 4-6gb. Have checked a thread dump. There are no threads that have very large stack traces. What could cause this increasing pause durations? Is there any way to get more information out of what that thread is actually trying to do, or any tuning options? Environment JVM Arguments -Xms8g -Xmx8g -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero has greatly reduced the frequency of GC pause over 500ms and the overhead is not that noticeable to our application -XX:MaxGCPauseMillis=70 -XX:+UseLargePages Environment java version "1.7.0_13" Java(TM) SE Runtime Environment (build 1.7.0_13-b20) Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) Operating System redhat 5.8 machine. The machine has 12 cores/ 24threads and 48gb of ram. Cheers, Ashley Taylor Software Engineer Email: ashley.taylor at sli-systems.com Website: www.sli-systems.com Blog: blog.sli-systems.com Podcast: EcommercePodcast.com Twitter: www.twitter.com/slisystems _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130219/44a34a6e/attachment.html From matt.fowles at gmail.com Mon Feb 18 18:49:00 2013 From: matt.fowles at gmail.com (Matt Fowles) Date: Mon, 18 Feb 2013 21:49:00 -0500 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs In-Reply-To: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> Message-ID: Ashley~ The issue I was seeing was actually in CMS not G1, but it was eventually tracked down to leaking LocalReferences in the JNI. Each LocalRef (or likely GlobalRef) adds 4 bytes to a section that has to be scanned every GC. If these build up without bound, you end up with growing GC times. The issue that I found essentially boiled down to GetMethodID calls creating a LocalRef and not being freed. You can find the full painful search here: http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja My minimal reproduction is http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV I sincerely hope my painful experience can save you time ;-) Matt On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor < ashley.taylor at sli-systems.com> wrote: > Hi Matt**** > > Thanks for the quick response.**** > > ** ** > > Yes we do have JNI in this setup, I will disable the JNI link and rerun > the test.**** > > If it is JNI can you elaborate what you mean by leaked handle in a JNI > thread and how we would go about identifying and fixing that.**** > > ** ** > > Cheers,**** > > Ashley**** > > ** ** > > *From:* Matt Fowles [mailto:matt.fowles at gmail.com] > *Sent:* Tuesday, 19 February 2013 3:04 p.m. > *To:* Ashley Taylor > *Cc:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: G1 garbage collection Ext Root Scanning time increase > linearly as application runs**** > > ** ** > > Ashley~**** > > ** ** > > Do you have any JNI in the setup? I saw a similar issue that was > painstakingly tracked down to a leaked handle in a JNI thread.**** > > ** ** > > Matt**** > > ** ** > > On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor < > ashley.taylor at sli-systems.com> wrote:**** > > Hi,**** > > **** > > We are testing the performance of the G1 garbage collection.**** > > Our goal is to be able to remove the full gc pause that eventually happens > when we CMS.**** > > **** > > We have noticed that the garbage collection pause time starts off really > well but over time it keeps climbing.**** > > **** > > Looking at the logs we see that the section that is increasing linearly > with time is the Ext Root Scanning**** > > Here is a Root Scanning 1 Hour into the application here the total gc > pause is around 80ms**** > > [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 1.2 > 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2**** > > Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7]**** > > **** > > **** > > Here is a snap shot after 19 hours. 
Here the pause is around 280ms **** > > [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 1.7 1.2 > 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2**** > > Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6]**** > > **** > > It seems that some task is linearly increasing with time, which only > effects one thread.**** > > **** > > After manually firing a full gc the total pause time returns back to > around 80ms**** > > **** > > After full GC**** > > [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 2.1 > 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0**** > > Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6]**** > > **** > > **** > > The test is run with a constant load applied on the application that > should hold the machine at around load 6.**** > > We have around 3GB of data within the heap which will very rarely become > garbage, life of these objects would be several hours to days.**** > > the rest will only live for 10s of milliseconds.**** > > The JVM memory usage floats between 4-6gb.**** > > **** > > Have checked a thread dump. There are no threads that have very large > stack traces.**** > > What could cause this increasing pause durations? Is there any way to get > more information out of what that thread is actually trying to do, or any > tuning options?**** > > **** > > **** > > Environment**** > > **** > > JVM Arguments**** > > -Xms8g**** > > -Xmx8g **** > > -XX:+UseG1GC **** > > -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero has > greatly reduced the frequency of GC pause over 500ms and the overhead is > not that noticeable to our application**** > > -XX:MaxGCPauseMillis=70**** > > -XX:+UseLargePages**** > > **** > > **** > > Environment**** > > java version "1.7.0_13"**** > > Java(TM) SE Runtime Environment (build 1.7.0_13-b20)**** > > Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)**** > > **** > > **** > > Operating System**** > > redhat 5.8 machine.**** > > The machine has 12 cores/ 24threads and 48gb of ram.**** > > **** > > **** > > **** > > Cheers,**** > > *Ashley Taylor***** > > Software Engineer**** > > Email: ashley.taylor at sli-systems.com**** > > Website: www.sli-systems.com**** > > Blog: blog.sli-systems.com**** > > Podcast: EcommercePodcast.com **** > > Twitter: www.twitter.com/slisystems**** > > **** > > **** > > **** > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use**** > > ** ** > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130218/ee77bd33/attachment-0001.html From ashley.taylor at sli-systems.com Tue Feb 19 11:24:16 2013 From: ashley.taylor at sli-systems.com (Ashley Taylor) Date: Tue, 19 Feb 2013 19:24:16 +0000 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs In-Reply-To: References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> Message-ID: <407A2CFDD3D8024187AFF7A7A4CC34344C5F03F9@ex-nz1.globalbrain.net> Hi Matt Seems that the issue I'm experiencing is unrelated to JNI same issue with JNI calls mocked. Reading that post I noticed that your gc pauses where still increasing after a full gc. In our case a full gc will fix the issue. Will have to keep hunting for the cause in my application. 
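A minimal sketch (not from the thread; the pool name "Code Cache" is HotSpot-specific and the logging interval is arbitrary) of one low-effort way to keep hunting: periodically log two of the structures that this HotSpot generation scans from a single external root, code cache occupancy and the loaded class count, both of which come up later in the thread, so their growth can be lined up against the Ext Root Scanning times in the GC log.

    import java.lang.management.ClassLoadingMXBean;
    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;

    public class RootSuspectLogger {
        public static void main(String[] args) throws Exception {
            ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
            MemoryPoolMXBean codeCache = null;
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                if ("Code Cache".equals(pool.getName())) { // HotSpot-specific pool name
                    codeCache = pool;
                }
            }
            while (true) {
                long codeCacheUsed = (codeCache == null) ? -1 : codeCache.getUsage().getUsed();
                System.out.println(System.currentTimeMillis()
                        + " codeCacheUsed=" + codeCacheUsed
                        + " loadedClasses=" + classes.getLoadedClassCount()
                        + " totalLoaded=" + classes.getTotalLoadedClassCount()
                        + " unloaded=" + classes.getUnloadedClassCount());
                Thread.sleep(60_000); // once a minute; line up timestamps with the GC log
            }
        }
    }

In a real application this would more naturally run as a daemon thread; a standalone main() is used here only to keep the sketch self-contained.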
Cheers, Ashley From: Matt Fowles [mailto:matt.fowles at gmail.com] Sent: Tuesday, 19 February 2013 3:49 p.m. To: Ashley Taylor Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Ashley~ The issue I was seeing was actually in CMS not G1, but it was eventually tracked down to leaking LocalReferences in the JNI. Each LocalRef (or likely GlobalRef) adds 4 bytes to a section that has to be scanned every GC. If these build up without bound, you end up with growing GC times. The issue that I found essentially boiled down to GetMethodID calls creating a LocalRef and not being freed. You can find the full painful search here: http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja My minimal reproduction is http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV I sincerely hope my painful experience can save you time ;-) Matt On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor > wrote: Hi Matt Thanks for the quick response. Yes we do have JNI in this setup, I will disable the JNI link and rerun the test. If it is JNI can you elaborate what you mean by leaked handle in a JNI thread and how we would go about identifying and fixing that. Cheers, Ashley From: Matt Fowles [mailto:matt.fowles at gmail.com] Sent: Tuesday, 19 February 2013 3:04 p.m. To: Ashley Taylor Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Ashley~ Do you have any JNI in the setup? I saw a similar issue that was painstakingly tracked down to a leaked handle in a JNI thread. Matt On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor > wrote: Hi, We are testing the performance of the G1 garbage collection. Our goal is to be able to remove the full gc pause that eventually happens when we CMS. We have noticed that the garbage collection pause time starts off really well but over time it keeps climbing. Looking at the logs we see that the section that is increasing linearly with time is the Ext Root Scanning Here is a Root Scanning 1 Hour into the application here the total gc pause is around 80ms [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 1.2 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2 Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7] Here is a snap shot after 19 hours. Here the pause is around 280ms [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 1.7 1.2 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2 Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6] It seems that some task is linearly increasing with time, which only effects one thread. After manually firing a full gc the total pause time returns back to around 80ms After full GC [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 2.1 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0 Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6] The test is run with a constant load applied on the application that should hold the machine at around load 6. We have around 3GB of data within the heap which will very rarely become garbage, life of these objects would be several hours to days. the rest will only live for 10s of milliseconds. The JVM memory usage floats between 4-6gb. Have checked a thread dump. There are no threads that have very large stack traces. What could cause this increasing pause durations? Is there any way to get more information out of what that thread is actually trying to do, or any tuning options? 
Environment

JVM Arguments
-Xms8g
-Xmx8g
-XX:+UseG1GC
-XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero has greatly reduced the frequency of GC pause over 500ms and the overhead is not that noticeable to our application
-XX:MaxGCPauseMillis=70
-XX:+UseLargePages

Environment
java version "1.7.0_13"
Java(TM) SE Runtime Environment (build 1.7.0_13-b20)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

Operating System
redhat 5.8 machine.
The machine has 12 cores/ 24threads and 48gb of ram.

Cheers,
Ashley Taylor
Software Engineer
Email: ashley.taylor at sli-systems.com
Website: www.sli-systems.com
Blog: blog.sli-systems.com
Podcast: EcommercePodcast.com
Twitter: www.twitter.com/slisystems

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130219/62cca774/attachment-0001.html
From john.cuthbertson at oracle.com Tue Feb 19 12:01:54 2013
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Tue, 19 Feb 2013 12:01:54 -0800
Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs
In-Reply-To: <407A2CFDD3D8024187AFF7A7A4CC34344C5F03F9@ex-nz1.globalbrain.net>
References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5F03F9@ex-nz1.globalbrain.net>
Message-ID: <5123DA32.806@oracle.com>

Hi Ashley,

Basically, as you surmise, one of the GC worker threads is being held up when processing a single root. I've seen a similar issue that's caused by filling up the code cache (where JIT compiled methods are held). The code cache is treated as a single root and so is claimed in its entirety by a single GC worker thread. As the code cache fills up, the thread that claims the code cache to scan starts getting held up.

A full GC clears the issue because that's where G1 currently does class unloading: the full GC unloads a whole bunch of classes, allowing the compiled code of any of the unloaded classes' methods to be freed by the nmethod sweeper. So after a full GC the number of compiled methods in the code cache is less.

It could also be just the sheer number of loaded classes, as the system dictionary is also treated as a single claimable root.

I think there's a couple of existing CRs to track this. I'll see if I can find the numbers.

Regards,

JohnC

On 2/19/2013 11:24 AM, Ashley Taylor wrote:
>
> Hi Matt
>
> Seems that the issue I'm experiencing is unrelated to JNI same issue with JNI calls mocked.
>
> Reading that post I noticed that your gc pauses where still increasing after a full gc. In our case a full gc will fix the issue.
>
> Will have to keep hunting for the cause in my application.
>
> Cheers,
>
> Ashley
>
> *From:* Matt Fowles [mailto:matt.fowles at gmail.com]
> *Sent:* Tuesday, 19 February 2013 3:49 p.m.
> *To:* Ashley Taylor
> *Cc:* hotspot-gc-use at openjdk.java.net
> *Subject:* Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs
>
> Ashley~
>
> The issue I was seeing was actually in CMS not G1, but it was eventually tracked down to leaking LocalReferences in the JNI. Each LocalRef (or likely GlobalRef) adds 4 bytes to a section that has to
> be scanned every GC.
If these build up without bound, you end up with > growing GC times. > > The issue that I found essentially boiled down to GetMethodID calls > creating a LocalRef and not being freed. > > You can find the full painful search here: > > http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja > > My minimal reproduction is > > http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV > > I sincerely hope my painful experience can save you time ;-) > > Matt > > On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor > > > wrote: > > Hi Matt > > Thanks for the quick response. > > Yes we do have JNI in this setup, I will disable the JNI link and > rerun the test. > > If it is JNI can you elaborate what you mean by leaked handle in a JNI > thread and how we would go about identifying and fixing that. > > Cheers, > > Ashley > > *From:*Matt Fowles [mailto:matt.fowles at gmail.com > ] > *Sent:* Tuesday, 19 February 2013 3:04 p.m. > *To:* Ashley Taylor > *Cc:* hotspot-gc-use at openjdk.java.net > > *Subject:* Re: G1 garbage collection Ext Root Scanning time increase > linearly as application runs > > Ashley~ > > Do you have any JNI in the setup? I saw a similar issue that was > painstakingly tracked down to a leaked handle in a JNI thread. > > Matt > > On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor > > > wrote: > > Hi, > > We are testing the performance of the G1 garbage collection. > > Our goal is to be able to remove the full gc pause that eventually > happens when we CMS. > > We have noticed that the garbage collection pause time starts off > really well but over time it keeps climbing. > > Looking at the logs we see that the section that is increasing > linearly with time is the Ext Root Scanning > > Here is a Root Scanning 1 Hour into the application here the total gc > pause is around 80ms > > [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 1.2 > 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2 > > Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7] > > Here is a snap shot after 19 hours. Here the pause is around 280ms > > [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 1.7 > 1.2 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2 > > Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6] > > It seems that some task is linearly increasing with time, which only > effects one thread. > > After manually firing a full gc the total pause time returns back to > around 80ms > > After full GC > > [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 2.1 > 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0 > > Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6] > > The test is run with a constant load applied on the application that > should hold the machine at around load 6. > > We have around 3GB of data within the heap which will very rarely > become garbage, life of these objects would be several hours to days. > > the rest will only live for 10s of milliseconds. > > The JVM memory usage floats between 4-6gb. > > Have checked a thread dump. There are no threads that have very large > stack traces. > > What could cause this increasing pause durations? Is there any way to > get more information out of what that thread is actually trying to do, > or any tuning options? 
> > Environment > > JVM Arguments > > -Xms8g > > -Xmx8g > > -XX:+UseG1GC > > -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero > has greatly reduced the frequency of GC pause over 500ms and the > overhead is not that noticeable to our application > > -XX:MaxGCPauseMillis=70 > > -XX:+UseLargePages > > Environment > > java version "1.7.0_13" > > Java(TM) SE Runtime Environment (build 1.7.0_13-b20) > > Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) > > Operating System > > redhat 5.8 machine. > > The machine has 12 cores/ 24threads and 48gb of ram. > > Cheers, > > *Ashley Taylor* > > Software Engineer > > Email:ashley.taylor at sli-systems.com > > Website: www.sli-systems.com > > Blog: blog.sli-systems.com > > Podcast: EcommercePodcast.com > > Twitter: www.twitter.com/slisystems > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130219/a0e6cb98/attachment-0001.html From ashley.taylor at sli-systems.com Tue Feb 19 17:11:04 2013 From: ashley.taylor at sli-systems.com (Ashley Taylor) Date: Wed, 20 Feb 2013 01:11:04 +0000 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs In-Reply-To: <5123DA32.806@oracle.com> References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5F03F9@ex-nz1.globalbrain.net> <5123DA32.806@oracle.com> Message-ID: <407A2CFDD3D8024187AFF7A7A4CC34344C5F085D@ex-nz1.globalbrain.net> Hi John I reran my application with the JIT log turned on. It seems that once the application has been running for a while there is very little activity within the JIT log but the pause times keep climbing, I ran it for 4 hours and the 'Ext Root Scan' had climbed to 40ms. At the 4 hour point I also performed a full gc to see how many classes would be unload and it was only 50. We have around 5500 loaded classes. The number of loaded classes also does not increase once the application has run for a while. I also used jstat to see how full the permanent memory region is, it is slowly climbing the full gc did not seem to reduce it at all, however the full gc did fix the pause time. The permanent region is currently at 89.17% and seems to increase by 0.01% every couple of minutes. Is there any other GC events that only happen at a full gc? Cheers, Ashley From: hotspot-gc-use-bounces at openjdk.java.net [mailto:hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of John Cuthbertson Sent: Wednesday, 20 February 2013 9:10 a.m. To: hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Hi Ashley, Basically as you surmise one the GC worker threads is being held up when processing a single root. I've seen s similar issue that's caused by filling up the code cache (where JIT compiled methods are held). The code cache is treated as a single root and so is claimed in its entirety by a single GC worker thread. 
As a the code cache fills up, the thread that claims the code cache to scan starts getting held up. A full GC clears the issue because that's where G1 currently does class unloading: the full GC unloads a whole bunch of classes allowing any the compiled code of any of the unloaded classes' methods to be freed by the nmethod sweeper. So after a a full GC the number of compiled methods in the code cache is less. It could also be the just the sheer number of loaded classes as the system dictionary is also treated as a single claimable root. I think there's a couple existing CRs to track this. I'll see if I can find the numbers. Regards, JohnC On 2/19/2013 11:24 AM, Ashley Taylor wrote: Hi Matt Seems that the issue I'm experiencing is unrelated to JNI same issue with JNI calls mocked. Reading that post I noticed that your gc pauses where still increasing after a full gc. In our case a full gc will fix the issue. Will have to keep hunting for the cause in my application. Cheers, Ashley From: Matt Fowles [mailto:matt.fowles at gmail.com] Sent: Tuesday, 19 February 2013 3:49 p.m. To: Ashley Taylor Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Ashley~ The issue I was seeing was actually in CMS not G1, but it was eventually tracked down to leaking LocalReferences in the JNI. Each LocalRef (or likely GlobalRef) adds 4 bytes to a section that has to be scanned every GC. If these build up without bound, you end up with growing GC times. The issue that I found essentially boiled down to GetMethodID calls creating a LocalRef and not being freed. You can find the full painful search here: http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja My minimal reproduction is http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV I sincerely hope my painful experience can save you time ;-) Matt On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor > wrote: Hi Matt Thanks for the quick response. Yes we do have JNI in this setup, I will disable the JNI link and rerun the test. If it is JNI can you elaborate what you mean by leaked handle in a JNI thread and how we would go about identifying and fixing that. Cheers, Ashley From: Matt Fowles [mailto:matt.fowles at gmail.com] Sent: Tuesday, 19 February 2013 3:04 p.m. To: Ashley Taylor Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Ashley~ Do you have any JNI in the setup? I saw a similar issue that was painstakingly tracked down to a leaked handle in a JNI thread. Matt On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor > wrote: Hi, We are testing the performance of the G1 garbage collection. Our goal is to be able to remove the full gc pause that eventually happens when we CMS. We have noticed that the garbage collection pause time starts off really well but over time it keeps climbing. Looking at the logs we see that the section that is increasing linearly with time is the Ext Root Scanning Here is a Root Scanning 1 Hour into the application here the total gc pause is around 80ms [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 1.2 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2 Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7] Here is a snap shot after 19 hours. 
Here the pause is around 280ms [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 1.7 1.2 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2 Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6] It seems that some task is linearly increasing with time, which only effects one thread. After manually firing a full gc the total pause time returns back to around 80ms After full GC [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 2.1 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0 Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6] The test is run with a constant load applied on the application that should hold the machine at around load 6. We have around 3GB of data within the heap which will very rarely become garbage, life of these objects would be several hours to days. the rest will only live for 10s of milliseconds. The JVM memory usage floats between 4-6gb. Have checked a thread dump. There are no threads that have very large stack traces. What could cause this increasing pause durations? Is there any way to get more information out of what that thread is actually trying to do, or any tuning options? Environment JVM Arguments -Xms8g -Xmx8g -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero has greatly reduced the frequency of GC pause over 500ms and the overhead is not that noticeable to our application -XX:MaxGCPauseMillis=70 -XX:+UseLargePages Environment java version "1.7.0_13" Java(TM) SE Runtime Environment (build 1.7.0_13-b20) Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) Operating System redhat 5.8 machine. The machine has 12 cores/ 24threads and 48gb of ram. Cheers, Ashley Taylor Software Engineer Email: ashley.taylor at sli-systems.com Website: www.sli-systems.com Blog: blog.sli-systems.com Podcast: EcommercePodcast.com Twitter: www.twitter.com/slisystems _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130220/a0c199a3/attachment-0001.html From john.cuthbertson at oracle.com Tue Feb 19 17:38:24 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Tue, 19 Feb 2013 17:38:24 -0800 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs In-Reply-To: <407A2CFDD3D8024187AFF7A7A4CC34344C5F085D@ex-nz1.globalbrain.net> References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5F03F9@ex-nz1.globalbrain.net> <5123DA32.806@oracle.com> <407A2CFDD3D8024187AFF7A7A4CC34344C5F085D@ex-nz1.globalbrain.net> Message-ID: <51242910.8080604@oracle.com> Hi Ashely, Off the top of my head there's also the intern string table. I'll have to look at the code to figure out what else it could be. Thanks for the info. JohnC On 2/19/2013 5:11 PM, Ashley Taylor wrote: > > Hi John > > I reran my application with the JIT log turned on. It seems that once > the application has been running for a while there is very little > activity within the JIT log but the pause times keep climbing, I ran > it for 4 hours and the 'Ext Root Scan' had climbed to 40ms. 
> > At the 4 hour point I also performed a full gc to see how many classes > would be unload and it was only 50. We have around 5500 loaded classes. > > The number of loaded classes also does not increase once the > application has run for a while. > > I also used jstat to see how full the permanent memory region is, it > is slowly climbing the full gc did not seem to reduce it at all, > however the full gc did fix the pause time. > > The permanent region is currently at 89.17% and seems to increase by > 0.01% every couple of minutes. > > Is there any other GC events that only happen at a full gc? > > Cheers, > > Ashley > > *From:*hotspot-gc-use-bounces at openjdk.java.net > [mailto:hotspot-gc-use-bounces at openjdk.java.net] *On Behalf Of *John > Cuthbertson > *Sent:* Wednesday, 20 February 2013 9:10 a.m. > *To:* hotspot-gc-use at openjdk.java.net > *Subject:* Re: G1 garbage collection Ext Root Scanning time increase > linearly as application runs > > Hi Ashley, > > Basically as you surmise one the GC worker threads is being held up > when processing a single root. I've seen s similar issue that's caused > by filling up the code cache (where JIT compiled methods are held). > The code cache is treated as a single root and so is claimed in its > entirety by a single GC worker thread. As a the code cache fills up, > the thread that claims the code cache to scan starts getting held up. > > A full GC clears the issue because that's where G1 currently does > class unloading: the full GC unloads a whole bunch of classes allowing > any the compiled code of any of the unloaded classes' methods to be > freed by the nmethod sweeper. So after a a full GC the number of > compiled methods in the code cache is less. > > It could also be the just the sheer number of loaded classes as the > system dictionary is also treated as a single claimable root. > > I think there's a couple existing CRs to track this. I'll see if I can > find the numbers. > > Regards, > > JohnC > > On 2/19/2013 11:24 AM, Ashley Taylor wrote: > > Hi Matt > > Seems that the issue I'm experiencing is unrelated to JNI same > issue with JNI calls mocked. > > Reading that post I noticed that your gc pauses where still > increasing after a full gc. In our case a full gc will fix the issue. > > Will have to keep hunting for the cause in my application. > > > Cheers, > > Ashley > > *From:*Matt Fowles [mailto:matt.fowles at gmail.com] > *Sent:* Tuesday, 19 February 2013 3:49 p.m. > *To:* Ashley Taylor > *Cc:* hotspot-gc-use at openjdk.java.net > > *Subject:* Re: G1 garbage collection Ext Root Scanning time > increase linearly as application runs > > Ashley~ > > The issue I was seeing was actually in CMS not G1, but it was > eventually tracked down to leaking LocalReferences in the JNI. > Each LocalRef (or likely GlobalRef) adds 4 bytes to a section > that has to be scanned every GC. If these build up without bound, > you end up with growing GC times. > > The issue that I found essentially boiled down to GetMethodID > calls creating a LocalRef and not being freed. > > You can find the full painful search here: > > http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja > > My minimal reproduction is > > http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV > > I sincerely hope my painful experience can save you time ;-) > > Matt > > On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor > > wrote: > > Hi Matt > > Thanks for the quick response. 
> > Yes we do have JNI in this setup, I will disable the JNI link and > rerun the test. > > If it is JNI can you elaborate what you mean by leaked handle in a > JNI thread and how we would go about identifying and fixing that. > > Cheers, > > Ashley > > *From:*Matt Fowles [mailto:matt.fowles at gmail.com > ] > *Sent:* Tuesday, 19 February 2013 3:04 p.m. > *To:* Ashley Taylor > *Cc:* hotspot-gc-use at openjdk.java.net > > *Subject:* Re: G1 garbage collection Ext Root Scanning time > increase linearly as application runs > > Ashley~ > > Do you have any JNI in the setup? I saw a similar issue that was > painstakingly tracked down to a leaked handle in a JNI thread. > > Matt > > On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor > > wrote: > > Hi, > > We are testing the performance of the G1 garbage collection. > > Our goal is to be able to remove the full gc pause that eventually > happens when we CMS. > > We have noticed that the garbage collection pause time starts off > really well but over time it keeps climbing. > > Looking at the logs we see that the section that is increasing > linearly with time is the Ext Root Scanning > > Here is a Root Scanning 1 Hour into the application here the total > gc pause is around 80ms > > [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 > 1.2 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2 > > Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7] > > Here is a snap shot after 19 hours. Here the pause is around 280ms > > [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 > 1.7 1.2 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2 > > Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6] > > It seems that some task is linearly increasing with time, which > only effects one thread. > > After manually firing a full gc the total pause time returns back > to around 80ms > > After full GC > > [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 > 2.1 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0 > > Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6] > > The test is run with a constant load applied on the application > that should hold the machine at around load 6. > > We have around 3GB of data within the heap which will very rarely > become garbage, life of these objects would be several hours to days. > > the rest will only live for 10s of milliseconds. > > The JVM memory usage floats between 4-6gb. > > Have checked a thread dump. There are no threads that have very > large stack traces. > > What could cause this increasing pause durations? Is there any way > to get more information out of what that thread is actually trying > to do, or any tuning options? > > Environment > > JVM Arguments > > -Xms8g > > -Xmx8g > > -XX:+UseG1GC > > -XX:InitiatingHeapOccupancyPercent=0 #found that having this at > zero has greatly reduced the frequency of GC pause over 500ms and > the overhead is not that noticeable to our application > > -XX:MaxGCPauseMillis=70 > > -XX:+UseLargePages > > Environment > > java version "1.7.0_13" > > Java(TM) SE Runtime Environment (build 1.7.0_13-b20) > > Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) > > Operating System > > redhat 5.8 machine. > > The machine has 12 cores/ 24threads and 48gb of ram. 
> > Cheers, > > *Ashley Taylor* > > Software Engineer > > Email:ashley.taylor at sli-systems.com > > > Website: www.sli-systems.com > > Blog: blog.sli-systems.com > > Podcast: EcommercePodcast.com > > Twitter: www.twitter.com/slisystems > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > _______________________________________________ > > hotspot-gc-use mailing list > > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130219/0c13f81c/attachment-0001.html From ysr1729 at gmail.com Tue Feb 19 22:24:55 2013 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Tue, 19 Feb 2013 22:24:55 -0800 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs In-Reply-To: <51242910.8080604@oracle.com> References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5F03F9@ex-nz1.globalbrain.net> <5123DA32.806@oracle.com> <407A2CFDD3D8024187AFF7A7A4CC34344C5F085D@ex-nz1.globalbrain.net> <51242910.8080604@oracle.com> Message-ID: <147D1F53-A270-4CBA-8490-A242477EB1E1@gmail.com> Perhaps Ashley could build an instrumented jvm with time trace around the various external root groups scanned serially and the answer would be immediate? ysr1729 On Feb 19, 2013, at 17:38, John Cuthbertson wrote: > Hi Ashely, > > Off the top of my head there's also the intern string table. I'll have to look at the code to figure out what else it could be. > > Thanks for the info. > > JohnC > > On 2/19/2013 5:11 PM, Ashley Taylor wrote: >> Hi John >> >> I reran my application with the JIT log turned on. It seems that once the application has been running for a while there is very little activity within the JIT log but the pause times keep climbing, I ran it for 4 hours and the ?Ext Root Scan? had climbed to 40ms. >> >> At the 4 hour point I also performed a full gc to see how many classes would be unload and it was only 50. We have around 5500 loaded classes. >> The number of loaded classes also does not increase once the application has run for a while. >> >> I also used jstat to see how full the permanent memory region is, it is slowly climbing the full gc did not seem to reduce it at all, however the full gc did fix the pause time. >> >> The permanent region is currently at 89.17% and seems to increase by 0.01% every couple of minutes. >> >> Is there any other GC events that only happen at a full gc? >> >> Cheers, >> Ashley >> >> >> >> From: hotspot-gc-use-bounces at openjdk.java.net [mailto:hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of John Cuthbertson >> Sent: Wednesday, 20 February 2013 9:10 a.m. >> To: hotspot-gc-use at openjdk.java.net >> Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs >> >> Hi Ashley, >> >> Basically as you surmise one the GC worker threads is being held up when processing a single root. I've seen s similar issue that's caused by filling up the code cache (where JIT compiled methods are held). The code cache is treated as a single root and so is claimed in its entirety by a single GC worker thread. 
As a the code cache fills up, the thread that claims the code cache to scan starts getting held up. >> >> A full GC clears the issue because that's where G1 currently does class unloading: the full GC unloads a whole bunch of classes allowing any the compiled code of any of the unloaded classes' methods to be freed by the nmethod sweeper. So after a a full GC the number of compiled methods in the code cache is less. >> >> It could also be the just the sheer number of loaded classes as the system dictionary is also treated as a single claimable root. >> >> I think there's a couple existing CRs to track this. I'll see if I can find the numbers. >> >> Regards, >> >> JohnC >> >> On 2/19/2013 11:24 AM, Ashley Taylor wrote: >> Hi Matt >> >> Seems that the issue I?m experiencing is unrelated to JNI same issue with JNI calls mocked. >> Reading that post I noticed that your gc pauses where still increasing after a full gc. In our case a full gc will fix the issue. >> Will have to keep hunting for the cause in my application. >> >> >> Cheers, >> Ashley >> >> From: Matt Fowles [mailto:matt.fowles at gmail.com] >> Sent: Tuesday, 19 February 2013 3:49 p.m. >> To: Ashley Taylor >> Cc: hotspot-gc-use at openjdk.java.net >> Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs >> >> Ashley~ >> >> The issue I was seeing was actually in CMS not G1, but it was eventually tracked down to leaking LocalReferences in the JNI. Each LocalRef (or likely GlobalRef) adds 4 bytes to a section that has to be scanned every GC. If these build up without bound, you end up with growing GC times. >> >> The issue that I found essentially boiled down to GetMethodID calls creating a LocalRef and not being freed. >> >> You can find the full painful search here: >> >> http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja >> >> My minimal reproduction is >> >> http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV >> >> I sincerely hope my painful experience can save you time ;-) >> >> Matt >> >> >> >> >> On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor wrote: >> Hi Matt >> Thanks for the quick response. >> >> Yes we do have JNI in this setup, I will disable the JNI link and rerun the test. >> If it is JNI can you elaborate what you mean by leaked handle in a JNI thread and how we would go about identifying and fixing that. >> >> Cheers, >> Ashley >> >> From: Matt Fowles [mailto:matt.fowles at gmail.com] >> Sent: Tuesday, 19 February 2013 3:04 p.m. >> To: Ashley Taylor >> Cc: hotspot-gc-use at openjdk.java.net >> Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs >> >> Ashley~ >> >> Do you have any JNI in the setup? I saw a similar issue that was painstakingly tracked down to a leaked handle in a JNI thread. >> >> Matt >> >> >> On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor wrote: >> Hi, >> >> We are testing the performance of the G1 garbage collection. >> Our goal is to be able to remove the full gc pause that eventually happens when we CMS. >> >> We have noticed that the garbage collection pause time starts off really well but over time it keeps climbing. 
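If the single slow root is the system dictionary, as suggested above, the loaded-class count should drift upwards in step with the pause times. A minimal sketch of logging that from inside the application with the standard ClassLoadingMXBean follows; the class name and the one-minute interval are arbitrary, and the same bean is also reachable remotely over JMX.

    import java.lang.management.ClassLoadingMXBean;
    import java.lang.management.ManagementFactory;

    public class ClassCountWatcher {
        public static void main(String[] args) throws InterruptedException {
            ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
            while (true) {
                System.out.println(System.currentTimeMillis()
                        + " loaded=" + cl.getLoadedClassCount()
                        + " totalLoaded=" + cl.getTotalLoadedClassCount()
                        + " unloaded=" + cl.getUnloadedClassCount());
                Thread.sleep(60000);   // one sample a minute, easy to line up with GC log timestamps
            }
        }
    }

A flat loaded-class count over a run where Ext Root Scanning keeps growing, which is what Ashley reports above, would point at one of the other serially scanned roots instead.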
>> >> Looking at the logs we see that the section that is increasing linearly with time is the Ext Root Scanning >> Here is a Root Scanning 1 Hour into the application here the total gc pause is around 80ms >> [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 1.2 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2 >> Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7] >> >> >> Here is a snap shot after 19 hours. Here the pause is around 280ms >> [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 1.7 1.2 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2 >> Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6] >> >> It seems that some task is linearly increasing with time, which only effects one thread. >> >> After manually firing a full gc the total pause time returns back to around 80ms >> >> After full GC >> [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 2.1 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0 >> Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6] >> >> >> The test is run with a constant load applied on the application that should hold the machine at around load 6. >> We have around 3GB of data within the heap which will very rarely become garbage, life of these objects would be several hours to days. >> the rest will only live for 10s of milliseconds. >> The JVM memory usage floats between 4-6gb. >> >> Have checked a thread dump. There are no threads that have very large stack traces. >> What could cause this increasing pause durations? Is there any way to get more information out of what that thread is actually trying to do, or any tuning options? >> >> >> Environment >> >> JVM Arguments >> -Xms8g >> -Xmx8g >> -XX:+UseG1GC >> -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero has greatly reduced the frequency of GC pause over 500ms and the overhead is not that noticeable to our application >> -XX:MaxGCPauseMillis=70 >> -XX:+UseLargePages >> >> >> Environment >> java version "1.7.0_13" >> Java(TM) SE Runtime Environment (build 1.7.0_13-b20) >> Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) >> >> >> Operating System >> redhat 5.8 machine. >> The machine has 12 cores/ 24threads and 48gb of ram. >> >> >> >> Cheers, >> Ashley Taylor >> Software Engineer >> Email: ashley.taylor at sli-systems.com >> Website: www.sli-systems.com >> Blog: blog.sli-systems.com >> Podcast: EcommercePodcast.com >> Twitter: www.twitter.com/slisystems >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130219/dd586629/attachment-0001.html From john.cuthbertson at oracle.com Wed Feb 20 10:56:07 2013 From: john.cuthbertson at oracle.com (John Cuthbertson) Date: Wed, 20 Feb 2013 10:56:07 -0800 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs In-Reply-To: <147D1F53-A270-4CBA-8490-A242477EB1E1@gmail.com> References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5F03F9@ex-nz1.globalbrain.net> <5123DA32.806@oracle.com> <407A2CFDD3D8024187AFF7A7A4CC34344C5F085D@ex-nz1.globalbrain.net> <51242910.8080604@oracle.com> <147D1F53-A270-4CBA-8490-A242477EB1E1@gmail.com> Message-ID: <51251C47.6090009@oracle.com> Hi Ramki, This is what I was thinking. An internal group has also seen the same problem and has offered to run with an instrumented build. If Ashley is willing I could supply a temporary patch. JohnC On 2/19/2013 10:24 PM, Srinivas Ramakrishna wrote: > Perhaps Ashley could build an instrumented jvm with time trace around > the various external root groups scanned serially and the answer would > be immediate? > > ysr1729 > > On Feb 19, 2013, at 17:38, John Cuthbertson > > wrote: > >> Hi Ashely, >> >> Off the top of my head there's also the intern string table. I'll >> have to look at the code to figure out what else it could be. >> >> Thanks for the info. >> >> JohnC >> >> On 2/19/2013 5:11 PM, Ashley Taylor wrote: >>> >>> Hi John >>> >>> I reran my application with the JIT log turned on. It seems that >>> once the application has been running for a while there is very >>> little activity within the JIT log but the pause times keep >>> climbing, I ran it for 4 hours and the ?Ext Root Scan? had climbed >>> to 40ms. >>> >>> At the 4 hour point I also performed a full gc to see how many >>> classes would be unload and it was only 50. We have around 5500 >>> loaded classes. >>> >>> The number of loaded classes also does not increase once the >>> application has run for a while. >>> >>> I also used jstat to see how full the permanent memory region is, it >>> is slowly climbing the full gc did not seem to reduce it at all, >>> however the full gc did fix the pause time. >>> >>> The permanent region is currently at 89.17% and seems to increase by >>> 0.01% every couple of minutes. >>> >>> Is there any other GC events that only happen at a full gc? >>> >>> Cheers, >>> >>> Ashley >>> >>> *From:*hotspot-gc-use-bounces at openjdk.java.net >>> [mailto:hotspot-gc-use-bounces at openjdk.java.net] *On Behalf Of *John >>> Cuthbertson >>> *Sent:* Wednesday, 20 February 2013 9:10 a.m. >>> *To:* hotspot-gc-use at openjdk.java.net >>> *Subject:* Re: G1 garbage collection Ext Root Scanning time increase >>> linearly as application runs >>> >>> Hi Ashley, >>> >>> Basically as you surmise one the GC worker threads is being held up >>> when processing a single root. I've seen s similar issue that's >>> caused by filling up the code cache (where JIT compiled methods are >>> held). The code cache is treated as a single root and so is claimed >>> in its entirety by a single GC worker thread. As a the code cache >>> fills up, the thread that claims the code cache to scan starts >>> getting held up. 
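The code-cache theory above is also cheap to check from inside the JVM: the JIT code cache is exposed as a non-heap memory pool, so its occupancy can be logged next to the GC log and compared with the growing Ext Root Scanning maxima. A rough sketch is below; the pool is normally named "Code Cache" on HotSpot 7, and the class name and sampling interval are made up.

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryUsage;

    public class CodeCacheWatcher {
        public static void main(String[] args) throws InterruptedException {
            while (true) {
                for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                    if (pool.getName().contains("Code Cache")) {
                        MemoryUsage u = pool.getUsage();
                        System.out.println(System.currentTimeMillis()
                                + " code cache used=" + (u.getUsed() >> 10) + "KB"
                                + " committed=" + (u.getCommitted() >> 10) + "KB"
                                + " max=" + (u.getMax() >> 10) + "KB");
                    }
                }
                Thread.sleep(60000);   // sample once a minute alongside the GC log
            }
        }
    }

If the used figure levels off while the root-scanning max keeps climbing, the code cache is probably not the root that the unlucky worker thread is stuck on.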
>>> >>> A full GC clears the issue because that's where G1 currently does >>> class unloading: the full GC unloads a whole bunch of classes >>> allowing any the compiled code of any of the unloaded classes' >>> methods to be freed by the nmethod sweeper. So after a a full GC the >>> number of compiled methods in the code cache is less. >>> >>> It could also be the just the sheer number of loaded classes as the >>> system dictionary is also treated as a single claimable root. >>> >>> I think there's a couple existing CRs to track this. I'll see if I >>> can find the numbers. >>> >>> Regards, >>> >>> JohnC >>> >>> On 2/19/2013 11:24 AM, Ashley Taylor wrote: >>> >>> Hi Matt >>> >>> Seems that the issue I?m experiencing is unrelated to JNI same >>> issue with JNI calls mocked. >>> >>> Reading that post I noticed that your gc pauses where still >>> increasing after a full gc. In our case a full gc will fix the >>> issue. >>> >>> Will have to keep hunting for the cause in my application. >>> >>> >>> Cheers, >>> >>> Ashley >>> >>> *From:*Matt Fowles [mailto:matt.fowles at gmail.com] >>> *Sent:* Tuesday, 19 February 2013 3:49 p.m. >>> *To:* Ashley Taylor >>> *Cc:* hotspot-gc-use at openjdk.java.net >>> >>> *Subject:* Re: G1 garbage collection Ext Root Scanning time >>> increase linearly as application runs >>> >>> Ashley~ >>> >>> The issue I was seeing was actually in CMS not G1, but it was >>> eventually tracked down to leaking LocalReferences in the JNI. >>> Each LocalRef (or likely GlobalRef) adds 4 bytes to a section >>> that has to be scanned every GC. If these build up without >>> bound, you end up with growing GC times. >>> >>> The issue that I found essentially boiled down to GetMethodID >>> calls creating a LocalRef and not being freed. >>> >>> You can find the full painful search here: >>> >>> http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja >>> >>> My minimal reproduction is >>> >>> http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV >>> >>> I sincerely hope my painful experience can save you time ;-) >>> >>> Matt >>> >>> On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor >>> >> > wrote: >>> >>> Hi Matt >>> >>> Thanks for the quick response. >>> >>> Yes we do have JNI in this setup, I will disable the JNI link >>> and rerun the test. >>> >>> If it is JNI can you elaborate what you mean by leaked handle in >>> a JNI thread and how we would go about identifying and fixing that. >>> >>> Cheers, >>> >>> Ashley >>> >>> *From:*Matt Fowles [mailto:matt.fowles at gmail.com >>> ] >>> *Sent:* Tuesday, 19 February 2013 3:04 p.m. >>> *To:* Ashley Taylor >>> *Cc:* hotspot-gc-use at openjdk.java.net >>> >>> *Subject:* Re: G1 garbage collection Ext Root Scanning time >>> increase linearly as application runs >>> >>> Ashley~ >>> >>> Do you have any JNI in the setup? I saw a similar issue that >>> was painstakingly tracked down to a leaked handle in a JNI thread. >>> >>> Matt >>> >>> On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor >>> >> > wrote: >>> >>> Hi, >>> >>> We are testing the performance of the G1 garbage collection. >>> >>> Our goal is to be able to remove the full gc pause that >>> eventually happens when we CMS. >>> >>> We have noticed that the garbage collection pause time starts >>> off really well but over time it keeps climbing. 
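The interned-string table mentioned earlier in the thread is another root that is scanned serially, and it is easy to stress in isolation to see whether table growth alone reproduces climbing Ext Root Scanning numbers. The toy driver below is purely a sketch; the class name, count and sleep values are invented, and it is meant to be run with the same G1 flags while watching the GC log.

    public class InternStress {
        public static void main(String[] args) throws InterruptedException {
            int n = args.length > 0 ? Integer.parseInt(args[0]) : 1000000;
            for (int i = 0; i < n; i++) {
                // Every unique string interned here adds an entry to the VM-wide string table,
                // which G1 walks as one of the external roots during each pause.
                ("ext-root-stress-" + i).intern();
                if (i % 100000 == 0) {
                    Thread.sleep(10);   // let a few young GCs happen while the table grows
                }
            }
            System.out.println("interned " + n + " strings - check Ext Root Scanning in the GC log");
            Thread.sleep(60000);        // keep the VM alive so later pauses are logged too
        }
    }

The jmap figures Ashley posts further down show the application's own intern table growing only about 10% over 15 hours, so in this particular case the string table looks like a less likely culprit than the other serial roots, but the experiment is a quick way to rule it in or out.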
>>> >>> Looking at the logs we see that the section that is increasing >>> linearly with time is the Ext Root Scanning >>> >>> Here is a Root Scanning 1 Hour into the application here the >>> total gc pause is around 80ms >>> >>> [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 >>> 1.2 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2 >>> >>> Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7] >>> >>> Here is a snap shot after 19 hours. Here the pause is around 280ms >>> >>> [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 >>> 1.7 1.2 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2 >>> >>> Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6] >>> >>> It seems that some task is linearly increasing with time, which >>> only effects one thread. >>> >>> After manually firing a full gc the total pause time returns >>> back to around 80ms >>> >>> After full GC >>> >>> [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 >>> 2.1 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0 >>> >>> Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6] >>> >>> The test is run with a constant load applied on the application >>> that should hold the machine at around load 6. >>> >>> We have around 3GB of data within the heap which will very >>> rarely become garbage, life of these objects would be several >>> hours to days. >>> >>> the rest will only live for 10s of milliseconds. >>> >>> The JVM memory usage floats between 4-6gb. >>> >>> Have checked a thread dump. There are no threads that have very >>> large stack traces. >>> >>> What could cause this increasing pause durations? Is there any >>> way to get more information out of what that thread is actually >>> trying to do, or any tuning options? >>> >>> Environment >>> >>> JVM Arguments >>> >>> -Xms8g >>> >>> -Xmx8g >>> >>> -XX:+UseG1GC >>> >>> -XX:InitiatingHeapOccupancyPercent=0 #found that having this at >>> zero has greatly reduced the frequency of GC pause over 500ms >>> and the overhead is not that noticeable to our application >>> >>> -XX:MaxGCPauseMillis=70 >>> >>> -XX:+UseLargePages >>> >>> Environment >>> >>> java version "1.7.0_13" >>> >>> Java(TM) SE Runtime Environment (build 1.7.0_13-b20) >>> >>> Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) >>> >>> Operating System >>> >>> redhat 5.8 machine. >>> >>> The machine has 12 cores/ 24threads and 48gb of ram. >>> >>> Cheers, >>> >>> *Ashley Taylor* >>> >>> Software Engineer >>> >>> Email:ashley.taylor at sli-systems.com >>> >>> >>> Website: www.sli-systems.com >>> >>> Blog: blog.sli-systems.com >>> >>> Podcast: EcommercePodcast.com >>> >>> Twitter: www.twitter.com/slisystems >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> >>> _______________________________________________ >>> >>> hotspot-gc-use mailing list >>> >>> hotspot-gc-use at openjdk.java.net >>> >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130220/1ac5eedd/attachment-0001.html From ashley.taylor at sli-systems.com Wed Feb 20 11:12:54 2013 From: ashley.taylor at sli-systems.com (Ashley Taylor) Date: Wed, 20 Feb 2013 19:12:54 +0000 Subject: G1 garbage collection Ext Root Scanning time increase linearly as application runs In-Reply-To: <51251C47.6090009@oracle.com> References: <407A2CFDD3D8024187AFF7A7A4CC34344C5EE597@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5EE6C7@ex-nz1.globalbrain.net> <407A2CFDD3D8024187AFF7A7A4CC34344C5F03F9@ex-nz1.globalbrain.net> <5123DA32.806@oracle.com> <407A2CFDD3D8024187AFF7A7A4CC34344C5F085D@ex-nz1.globalbrain.net> <51242910.8080604@oracle.com> <147D1F53-A270-4CBA-8490-A242477EB1E1@gmail.com> <51251C47.6090009@oracle.com> Message-ID: <407A2CFDD3D8024187AFF7A7A4CC34344C5F1EF3@ex-nz1.globalbrain.net> Hi John, I would be willing to run the instrumented build. I ran the test for 15 hours, watching the intern string table using jmap. It grew by about 10%: Start: 10917 interned Strings occupying 951752 bytes. End: 11801 interned Strings occupying 1031976 bytes. Over the same test the Ext Root Scanning pauses increased to 130ms. Cheers, Ashley From: John Cuthbertson [mailto:john.cuthbertson at oracle.com] Sent: Thursday, 21 February 2013 7:56 a.m. To: Srinivas Ramakrishna Cc: Ashley Taylor; hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Hi Ramki, This is what I was thinking. An internal group has also seen the same problem and has offered to run with an instrumented build. If Ashley is willing I could supply a temporary patch. JohnC On 2/19/2013 10:24 PM, Srinivas Ramakrishna wrote: Perhaps Ashley could build an instrumented jvm with time trace around the various external root groups scanned serially and the answer would be immediate? ysr1729 On Feb 19, 2013, at 17:38, John Cuthbertson > wrote: Hi Ashely, Off the top of my head there's also the intern string table. I'll have to look at the code to figure out what else it could be. Thanks for the info. JohnC On 2/19/2013 5:11 PM, Ashley Taylor wrote: Hi John I reran my application with the JIT log turned on. It seems that once the application has been running for a while there is very little activity within the JIT log but the pause times keep climbing, I ran it for 4 hours and the ?Ext Root Scan? had climbed to 40ms. At the 4 hour point I also performed a full gc to see how many classes would be unload and it was only 50. We have around 5500 loaded classes. The number of loaded classes also does not increase once the application has run for a while. I also used jstat to see how full the permanent memory region is, it is slowly climbing the full gc did not seem to reduce it at all, however the full gc did fix the pause time. The permanent region is currently at 89.17% and seems to increase by 0.01% every couple of minutes. Is there any other GC events that only happen at a full gc? Cheers, Ashley From: hotspot-gc-use-bounces at openjdk.java.net [mailto:hotspot-gc-use-bounces at openjdk.java.net] On Behalf Of John Cuthbertson Sent: Wednesday, 20 February 2013 9:10 a.m. To: hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Hi Ashley, Basically as you surmise one the GC worker threads is being held up when processing a single root. 
I've seen s similar issue that's caused by filling up the code cache (where JIT compiled methods are held). The code cache is treated as a single root and so is claimed in its entirety by a single GC worker thread. As a the code cache fills up, the thread that claims the code cache to scan starts getting held up. A full GC clears the issue because that's where G1 currently does class unloading: the full GC unloads a whole bunch of classes allowing any the compiled code of any of the unloaded classes' methods to be freed by the nmethod sweeper. So after a a full GC the number of compiled methods in the code cache is less. It could also be the just the sheer number of loaded classes as the system dictionary is also treated as a single claimable root. I think there's a couple existing CRs to track this. I'll see if I can find the numbers. Regards, JohnC On 2/19/2013 11:24 AM, Ashley Taylor wrote: Hi Matt Seems that the issue I?m experiencing is unrelated to JNI same issue with JNI calls mocked. Reading that post I noticed that your gc pauses where still increasing after a full gc. In our case a full gc will fix the issue. Will have to keep hunting for the cause in my application. Cheers, Ashley From: Matt Fowles [mailto:matt.fowles at gmail.com] Sent: Tuesday, 19 February 2013 3:49 p.m. To: Ashley Taylor Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Ashley~ The issue I was seeing was actually in CMS not G1, but it was eventually tracked down to leaking LocalReferences in the JNI. Each LocalRef (or likely GlobalRef) adds 4 bytes to a section that has to be scanned every GC. If these build up without bound, you end up with growing GC times. The issue that I found essentially boiled down to GetMethodID calls creating a LocalRef and not being freed. You can find the full painful search here: http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja My minimal reproduction is http://web.archiveorange.com/archive/v/Dp7Rf33tij5BFBNRpVja#YnJRjM4IVyt54TV I sincerely hope my painful experience can save you time ;-) Matt On Mon, Feb 18, 2013 at 9:29 PM, Ashley Taylor > wrote: Hi Matt Thanks for the quick response. Yes we do have JNI in this setup, I will disable the JNI link and rerun the test. If it is JNI can you elaborate what you mean by leaked handle in a JNI thread and how we would go about identifying and fixing that. Cheers, Ashley From: Matt Fowles [mailto:matt.fowles at gmail.com] Sent: Tuesday, 19 February 2013 3:04 p.m. To: Ashley Taylor Cc: hotspot-gc-use at openjdk.java.net Subject: Re: G1 garbage collection Ext Root Scanning time increase linearly as application runs Ashley~ Do you have any JNI in the setup? I saw a similar issue that was painstakingly tracked down to a leaked handle in a JNI thread. Matt On Mon, Feb 18, 2013 at 8:12 PM, Ashley Taylor > wrote: Hi, We are testing the performance of the G1 garbage collection. Our goal is to be able to remove the full gc pause that eventually happens when we CMS. We have noticed that the garbage collection pause time starts off really well but over time it keeps climbing. Looking at the logs we see that the section that is increasing linearly with time is the Ext Root Scanning Here is a Root Scanning 1 Hour into the application here the total gc pause is around 80ms [Ext Root Scanning (ms): 11.5 0.8 1.5 1.8 1.6 4.8 1.2 1.5 1.2 1.4 1.1 1.6 1.2 1.1 1.1 1.1 1.2 1.2 Avg: 2.1, Min: 0.8, Max: 11.5, Diff: 10.7] Here is a snap shot after 19 hours. 
Here the pause is around 280ms [Ext Root Scanning (ms): 1.2 184.7 1.3 1.3 1.8 6.3 1.7 1.2 1.5 1.2 1.2 1.1 1.2 1.1 1.2 1.1 1.2 1.2 Avg: 11.8, Min: 1.1, Max: 184.7, Diff: 183.6] It seems that some task is linearly increasing with time, which only effects one thread. After manually firing a full gc the total pause time returns back to around 80ms After full GC [Ext Root Scanning (ms): 2.4 1.7 4.5 2.6 4.6 2.1 2.1 1.7 2.1 1.8 1.8 2.2 0.6 0.0 0.0 0.0 0.0 0.0 Avg: 1.7, Min: 0.0, Max: 4.6, Diff: 4.6] The test is run with a constant load applied on the application that should hold the machine at around load 6. We have around 3GB of data within the heap which will very rarely become garbage, life of these objects would be several hours to days. the rest will only live for 10s of milliseconds. The JVM memory usage floats between 4-6gb. Have checked a thread dump. There are no threads that have very large stack traces. What could cause this increasing pause durations? Is there any way to get more information out of what that thread is actually trying to do, or any tuning options? Environment JVM Arguments -Xms8g -Xmx8g -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=0 #found that having this at zero has greatly reduced the frequency of GC pause over 500ms and the overhead is not that noticeable to our application -XX:MaxGCPauseMillis=70 -XX:+UseLargePages Environment java version "1.7.0_13" Java(TM) SE Runtime Environment (build 1.7.0_13-b20) Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode) Operating System redhat 5.8 machine. The machine has 12 cores/ 24threads and 48gb of ram. Cheers, Ashley Taylor Software Engineer Email: ashley.taylor at sli-systems.com Website: www.sli-systems.com Blog: blog.sli-systems.com Podcast: EcommercePodcast.com Twitter: www.twitter.com/slisystems _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130220/e5959e52/attachment-0001.html From reachbach at yahoo.com Fri Feb 22 01:26:58 2013 From: reachbach at yahoo.com (Bharath R) Date: Fri, 22 Feb 2013 01:26:58 -0800 (PST) Subject: G1 status in JDK1.6 Vs JDK1.7 In-Reply-To: <5112F380.5040403@oracle.com> References: <1359962096.18794.YahooMailNeo@web162101.mail.bf1.yahoo.com> <1359962818.10581.YahooMailNeo@web162103.mail.bf1.yahoo.com> <5112F380.5040403@oracle.com> Message-ID: <1361525218.18687.YahooMailNeo@web162102.mail.bf1.yahoo.com> Jesper, Thanks for the clarification. I'm now running benchmarks against JDK7. -Bharath ________________________________ From: Jesper Wilhelmsson To: Bharath R Cc: "hotspot-gc-use at openjdk.java.net" Sent: Thursday, February 7, 2013 5:51 AM Subject: Re: G1 status in JDK1.6 Vs JDK1.7 Hi Bharath, The first supported release of G1 was with 7u4. The 7u4 version came with significant improvements and I do not recommend doing performance evaluations with earlier versions. If you decide to move to JDK 7 and try G1 please share your experiences. 
/Jesper On 4/2/13 8:26 AM, Bharath R wrote: > Hi, > > Is the G1 GC 1.6 port on par with the 1.7 in terms of stability / > quality? If that is true, I intend to begin experimenting with it in > production and gradually roll it out across our deployment based on the > outcome. On a related note, we intend to use G1 for an online system > with a very low pause time requirement ( <10ms). The hardware is > heterogeneous in terms of memory (ranges between 12G - 32G available to > the application process) with comparable CPU configuration. CMS required > considerable tuning to achieve acceptable results and I'm hoping G1 > would fare better without myraid config options or overrides. > I'd like to know of comparisons / experience operating G1 in production > under such conditions. Thanks in advance. > > -Bharath > > P.S: Using RTJ is not an option for us :) > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20130222/ee654e59/attachment.html
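For the kind of JDK 7 G1 benchmarking Bharath describes, the standard GarbageCollectorMXBean gives a cheap cross-check on the GC logs when comparing collectors or JDK builds: collection counts and accumulated pause time per collector. The sketch below is only illustrative; bean names depend on the collector in use, for example "G1 Young Generation" and "G1 Old Generation" under G1, and the class name and interval are arbitrary.

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcPauseSummary {
        public static void main(String[] args) throws InterruptedException {
            while (true) {
                for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                    System.out.println(gc.getName()
                            + " collections=" + gc.getCollectionCount()
                            + " totalTimeMs=" + gc.getCollectionTime());
                }
                Thread.sleep(60000);
            }
        }
    }

getCollectionTime() is cumulative, so it only gives a coarse average pause; for a sub-10ms pause target the detailed GC log remains the primary source.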