From Andreas.Mueller at mgm-tp.com  Thu Nov  1 14:51:42 2012
From: Andreas.Mueller at mgm-tp.com (Andreas Müller)
Date: Thu, 1 Nov 2012 21:51:42 +0000
Subject: G1 issue: falling over to Full GC
Message-ID: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de>

Hi all,

I have tested G1 for our portal (using Java 7u7 on Solaris 10/SPARC). The JVM uses a rather small heap of 1 GB and the amount of garbage is moderate (in the range of 30-35 MB/s). ParallelGC and CMS have no problem coping with that load, but to get rid of the Full GC pauses (around 4 s with ParallelGC) and to avoid any fragmentation risk (uptime is many weeks) I tried G1, too. The good news is that G1 has improved a lot since Java 6 and now looks ready to compete with the proven collectors.

The good-case JPEG (I'll send it in a second mail) shows the GC pauses (in seconds) as a function of time when I ran with the following heap and GC settings:

-Xms1024m -Xmx1024m -XX:NewSize=400m -XX:MaxNewSize=400m -XX:SurvivorRatio=18 -XX:+UseG1GC -XX:MaxGCPauseMillis=500

As a result, after an outlier during startup the longest GC pauses are shorter than with ParallelGC and the average pause is clearly below the 500 ms target. Some pauses are still in the 1-2 s range, and I hoped to eliminate them by fine-tuning the settings.

Now here the bad news starts: fine-tuning proved more difficult than expected. I hoped to also halve the longest pauses by reducing the pause time target to 250 ms and therefore applied the following settings (leaving Xms, Xmx and NewSize unchanged):

-XX:SurvivorRatio=6 -XX:MaxGCPauseMillis=250 -XX:GCPauseIntervalMillis=2000 -XX:InitiatingHeapOccupancyPercent=80

Making survivor spaces larger had proven beneficial with ParallelGC and CMS, so I used it here as well. I also wanted to make better use of the available heap and therefore set the occupancy threshold to 80 percent. The result was kind of a disaster: after a benign start, pause times quickly rose to the range of 1-4(!) s, and later G1 fell over to Full GCs (have a glance at the bad-case JPEG).

Here are my questions:

- What did I do wrong? Which setting was my biggest error, and why?
- What settings would you suggest to reach my goal of having very few pauses above 1 s?
- Does SurvivorRatio have the same meaning for G1 as for the traditional collectors?
- With G1, is it suitable to set the occupancy threshold to values similar to those used with CMS? (80 worked fine with CMS in the same test.)
- Why are Full GC pauses with the failed G1 so much longer than with ParallelGC?
- I noticed in the logs that Full GC pauses sometimes take 50 s of real time but only 12 s of usr time. How come? I have never seen the other collectors idle like that during a collection.

I will attach a GC log file with GCDetails to a third mail to avoid breaking the 100k limit on mails to this list.

Thank you
Andreas
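P.S.: For reference, the complete launch command of the good case looked roughly like the following. Only the heap and G1 flags above are the real settings; the main class and GC log path are placeholders, and the logging flags are simply what I use to produce the logs mentioned above:

  java -Xms1024m -Xmx1024m -XX:NewSize=400m -XX:MaxNewSize=400m \
       -XX:SurvivorRatio=18 -XX:+UseG1GC -XX:MaxGCPauseMillis=500 \
       -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
       -Xloggc:portal-gc.log com.example.PortalServer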
[Attachment: G1BadCase.jpg, image/jpeg, 52114 bytes]

From Andreas.Mueller at mgm-tp.com  Thu Nov  1 14:53:47 2012
From: Andreas.Mueller at mgm-tp.com (Andreas Müller)
Date: Thu, 1 Nov 2012 21:53:47 +0000
Subject: AW: G1 issue: falling over to Full GC
Message-ID: <46FF8393B58AD84D95E444264805D98F28F8498F@edata02.mgm-edv.de>

Please find attached the good-case plot (it looks good compared to the bad case, doesn't it?).

[Attachment: G1GoodCase.jpg, image/jpeg, 41614 bytes]

From sbordet at intalio.com  Thu Nov  1 15:27:33 2012
From: sbordet at intalio.com (Simone Bordet)
Date: Thu, 1 Nov 2012 23:27:33 +0100
Subject: G1 issue: falling over to Full GC
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de>
Message-ID:

Hi,

On Thu, Nov 1, 2012 at 10:51 PM, Andreas Müller wrote:
> Hi all,
>
> I have tested G1 for our portal (using Java 7u7 on Solaris 10/SPARC).
> [...]
> Here are my questions:
>
> - What did I do wrong? Which setting was my biggest error, and why?

When you specify the eden size manually, you effectively disable G1's pause-time target: the target is ignored because, by fixing the eden size, you are telling G1 that you know better than its heuristics.
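As a first experiment, a minimal line for your case could look something like this (just a sketch that keeps only your heap size and pause target, and leaves everything else to G1's defaults):

  java -Xms1024m -Xmx1024m -XX:+UseG1GC -XX:MaxGCPauseMillis=500 ...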
Have you tried *not* specifying NewSize or SurvivorRatio at all, and letting G1 do its work?

> - What settings would you suggest to reach my goal of having very few
> pauses above 1 s?
>
> - Does SurvivorRatio have the same meaning for G1 as for the
> traditional collectors?
>
> - With G1, is it suitable to set the occupancy threshold to values
> similar to those used with CMS? (80 worked fine with CMS in the same test.)

No. In CMS the threshold applies to the old generation only; in G1 it applies to the whole heap. 80% is probably a bit too high.

> - Why are Full GC pauses with the failed G1 so much longer than with
> ParallelGC?

I /think/ G1's full GCs are single-threaded.

> - I noticed in the logs that Full GC pauses sometimes take 50 s of
> real time but only 12 s of usr time. How come? I have never seen the other
> collectors idle like that during a collection.

Swapping?

> I will attach a GC log file with GCDetails to a third mail to avoid breaking
> the 100k limit on mails to this list.

That would help. I'm using:

-XX:+PrintGCDetails -XX:+PrintAdaptiveSizePolicy

to print out interesting G1 logs.

Simon

--
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.
----
Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz

From sbordet at intalio.com  Thu Nov  1 17:52:42 2012
From: sbordet at intalio.com (Simone Bordet)
Date: Fri, 2 Nov 2012 01:52:42 +0100
Subject: G1 issue: falling over to Full GC
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de>
Message-ID:

Hi,

On Fri, Nov 2, 2012 at 12:04 AM, Andreas Müller wrote:
> No, I haven't tried that. I usually tuned the heap sizes first using ParallelGC (-XX:-UseAdaptiveSizePolicy) or ParNewGC and then kept them when I switched to CMS, because I had found that this gave me the most robust GC behavior. So far, I haven't been convinced that automatic heap sizing delivers the best results. Usually survivor spaces are too small and promotion rates too high when you rely on it. It is good for reasonable out-of-the-box behavior, though.
> How come the heap size settings worked well in my good case?

My experience so far is that G1 works better when I don't specify young generation sizes. I got weird behaviors when I tried (especially with older G1 versions). Also, G1 differs from the other collectors in the way it treats generations, so the logic that applies to the other collectors does not necessarily apply to G1.

>> No. In CMS the threshold applies to the old generation only; in G1 it applies to the whole heap. 80% is probably a bit too high.
> I suspected that. But when you use the default (45) and your live heap is 50%, the concurrent thread will be running all the time. Does that make sense?

Sure, and that's why you need to tune it, making it 60% for example. With 80% you risk that a concurrent marking cycle does not finish in time (this can be detected from the logs).

Will try to have a look at your logs tomorrow.

Simon

--
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.
----
Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless.
Victoria Livschitz

From sbordet at intalio.com  Fri Nov  2 03:25:15 2012
From: sbordet at intalio.com (Simone Bordet)
Date: Fri, 2 Nov 2012 11:25:15 +0100
Subject: G1 issue: falling over to Full GC
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de>
Message-ID:

Hi,

You have one Full GC at 3573.851, and just afterwards a concurrent-mark-abort. A concurrent marking was going on, started at 3514.480, but it had not finished even after ~60 s. The same goes for the second Full GC at 5393.957, which had a concurrent-mark-start at 5267.191 and aborted. While concurrent marking was running, you had several young GCs until the heap was full, which triggered the Full GC.

Now, *why* concurrent marking of a 1 GB heap takes more than 60 s, I don't know; that seems an incredibly long time even for a completely full old generation. I can only think the GC thread was starved, but perhaps there are other explanations.

There was only one concurrent mark that completed (between the two Full GCs), followed by a:

4972.437: [GC pause (partial), 1.89505180 secs]

that I cannot decipher (to Monica - what does "partial" mean?), and no mixed GCs, which seems unusual as well.

Are you sure you are actually using 1.7.0_u7?

Simon

--
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.
----
Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz

From Andreas.Mueller at mgm-tp.com  Fri Nov  2 03:46:42 2012
From: Andreas.Mueller at mgm-tp.com (Andreas Müller)
Date: Fri, 2 Nov 2012 10:46:42 +0000
Subject: AW: G1 issue: falling over to Full GC
In-Reply-To:
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de>
Message-ID: <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de>

Hi Simone,

> 4972.437: [GC pause (partial), 1.89505180 secs]
> that I cannot decipher (to Monica - what does "partial" mean?), and no mixed GCs, which seems unusual as well.

Oops, now I understand: 'partial' used to be what 'mixed' is now!

Our portal usually runs on Java 6u33. For the G1 tests I switched to 7u7 because I had learned that G1 is far from mature in 6u33. But automatic deployments can overwrite the start script and thus switch back to 6u33.

> Are you sure you are actually using 1.7.0_u7?

I have checked the archived start scripts and the answer, unfortunately, is: no. The 'good case' was actually running on 7u7 (that's why it was good), but the 'bad case' was unwittingly run on 6u33 again. That is the true reason why its results were so much worse and so incomprehensible.

Thank you very much for looking at the log and for asking good questions! I'll try to repeat the test and post the results on this list.
Regards
Andreas

From bernd.eckenfels at googlemail.com  Fri Nov  2 03:55:52 2012
From: bernd.eckenfels at googlemail.com (Bernd Eckenfels)
Date: Fri, 2 Nov 2012 11:55:52 +0100
Subject: AW: G1 issue: falling over to Full GC
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de>
Message-ID:

BTW: IBM GC logs contain a header which lists the JVM version and settings. That is of great value - could it be added to HotSpot as well?

Bernd

Am 02.11.2012 um 11:46 schrieb Andreas Müller:

> Hi Simone,
>
>> 4972.437: [GC pause (partial), 1.89505180 secs]
>> that I cannot decipher (to Monica - what does "partial" mean?), and no mixed GCs, which seems unusual as well.
> Oops, now I understand: 'partial' used to be what 'mixed' is now!
> Our portal usually runs on Java 6u33. For the G1 tests I switched to 7u7 because I had learned that G1 is far from mature in 6u33.
>
>> Are you sure you are actually using 1.7.0_u7?

From sbordet at intalio.com  Fri Nov  2 04:07:52 2012
From: sbordet at intalio.com (Simone Bordet)
Date: Fri, 2 Nov 2012 12:07:52 +0100
Subject: AW: G1 issue: falling over to Full GC
In-Reply-To:
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de>
Message-ID:

Hi,

On Fri, Nov 2, 2012 at 11:55 AM, Bernd Eckenfels wrote:
> BTW: IBM GC logs contain a header which lists the JVM version and settings. That is of great value - could it be added to HotSpot as well?

I use -showversion and -XX:+PrintCommandLineFlags, but I would not mind having those included in the GC logs by default.

Simon

--
http://cometd.org
http://webtide.com
Developer advice, training, services and support
from the Jetty & CometD experts.
----
Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz

From chunt at salesforce.com  Fri Nov  2 05:34:21 2012
From: chunt at salesforce.com (Charlie Hunt)
Date: Fri, 2 Nov 2012 05:34:21 -0700
Subject: G1 issue: falling over to Full GC
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de>
Message-ID: <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com>

Jumping in a bit late ...

Strongly suggest to anyone evaluating G1 not to use anything prior to 7u4. Even better, use (as of this writing) 7u9, or the latest production Java 7 HotSpot VM.

Fwiw, I really like what I am seeing in 7u9, with the exception of one issue (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7143858), which is currently slated to be backported to a future Java 7 update (thanks Monica, John Cuthbertson and Bengt for tackling this!).

From looking at your observations and the other comments so far, my initial reaction is that with a 1 GB Java heap you might get the best results with -XX:+UseParallelOldGC. Are you using -XX:+UseParallelGC, or -XX:+UseParallelOldGC? Or are you not setting a GC at all? Not until 7u4 is -XX:+UseParallelOldGC automatically set for what's called "server class" machines when you don't specify a GC.
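One quick way to check which collector you actually ended up with - just the generic flag-printing approach, and the exact output format varies a bit between releases - is something like:

  java -XX:+PrintFlagsFinal -version | grep UseParallel

and then look at the values reported for UseParallelGC and UseParallelOldGC.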
The lengthy concurrent mark could be a result of the G1 implementation in 6u*, or it could be that your system is swapping. Could you check whether your system is swapping? On Solaris you can monitor this using vmstat, observing not only free memory but also sr == scan rate, along with pi == page in and po == page out. Seeing sr (page scan activity) together with low free memory and pi & po activity is a strong indication of swapping. Seeing low free memory but no sr activity is fine, i.e. no swapping.

Additionally, you are right: "partial" was changed to "mixed" in the GC logs. For those interested in a bit of history: this change was made because we felt "partial" was misleading. "Partial" was intended to mean a partial old gen collection, which did occur. But that same GC event also included a young gen GC. As a result, we changed the event name to "mixed", since the event is really a combination of a young gen GC and a portion of an old gen GC.

Simone also has a good suggestion with including -XX:+PrintFlagsFinal and -showversion as part of the GC log data to collect, especially with G1 continuing to improve and evolve.

Look forward to seeing your GC logs!

hths,

charlie ....

On Nov 2, 2012, at 5:46 AM, Andreas Müller wrote:

> Hi Simone,
>
> Oops, now I understand: 'partial' used to be what 'mixed' is now!
> [...]
> I'll try to repeat the test and post the results on this list.
>
> Regards
> Andreas

From vitalyd at gmail.com  Fri Nov  2 06:04:48 2012
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Fri, 2 Nov 2012 09:04:48 -0400
Subject: G1 issue: falling over to Full GC
In-Reply-To: <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de> <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com>
Message-ID:

Hi Charlie,

Out of curiosity, is UseParallelOldGC advisable on, say, 6u23? It's off by default, as you say, until 7u4, so I'm unsure whether that is for some good/specific reason or not.

Thanks

Sent from my phone

On Nov 2, 2012 8:36 AM, "Charlie Hunt" wrote:

> Jumping in a bit late ...
> [...]
From Andreas.Mueller at mgm-tp.com  Fri Nov  2 06:48:08 2012
From: Andreas.Mueller at mgm-tp.com (Andreas Müller)
Date: Fri, 2 Nov 2012 13:48:08 +0000
Subject: AW: G1 issue: falling over to Full GC
In-Reply-To: <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de> <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com>
Message-ID: <46FF8393B58AD84D95E444264805D98F28F84B8A@edata02.mgm-edv.de>

Hello Charlie,

> Strongly suggest to anyone evaluating G1 not to use anything prior to 7u4. Even better, use (as of this writing) 7u9, or the latest production Java 7 HotSpot VM.

I agree, and this very issue (which turned out to be a Java 6 issue) confirms that again.

> Fwiw, I really like what I am seeing in 7u9, with the exception of one issue (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7143858), which is currently slated to be backported to a future Java 7 update (thanks Monica, John Cuthbertson and Bengt for tackling this!).

Good to know. 7u7 was looking promising already.

> From looking at your observations and the other comments so far, my initial reaction is that with a 1 GB Java heap you might get the best results with -XX:+UseParallelOldGC.

For the time being, I have suggested that my project stick with -XX:+UseParallelGC (the default), because the results are in fact good. It's only that the CPUs on our system are not very fast (8x1800 MHz) and we have these Full GC runs which take 5 s on average. By tuning the heap sizes well (400 MB NewGen, 50 MB survivors) I managed to make them so infrequent that this is not a problem. But our customer has bought some T3s (1650 MHz) without asking development, and in the future we will have to handle more users and more applications. As a result, there will be a need to move towards larger heaps, maybe 2 GB next year. Before we end up with Full GC pauses >10 s, I started looking for alternatives:
- CMS gives good results as long as I add enough extra headroom. If not, fragmentation can hit very suddenly and fatally, as I found out (we have uptimes of many weeks).
- I tried G1 out of curiosity, as a future solution to both problems (remembering what our former Sun colleague Tony Printezis told the audience many years ago at a customer meeting).

> Are you using -XX:+UseParallelGC, or -XX:+UseParallelOldGC? Or are you not setting a GC at all? Not until 7u4 is -XX:+UseParallelOldGC automatically set for what's called "server class" machines when you don't specify a GC.

I use the default, which is UseParallelGC, but I do not know for sure whether this also includes ParallelOldGC. I read that it should as of JDK 6, but the CPU statistics (usr_time vs real_time) for Full GC runs suggest this is not the case (I always see usr_time = real_time). I tried -XX:+UseParallelOldGC explicitly, but it made no difference. To sum up: ParallelGC, ParallelOldGC and ParNewGC gave the same results once I got rid of the AdaptiveSizePolicy in ParallelGC (which was in fact important).

> The lengthy concurrent mark could be a result of the G1 implementation in 6u*,

Exactly. We found out that the bad case was actually the 6u* case and the good case was 7u7, because some automatism decided to switch back to 6u33 between my two tests.

> or it could be that your system is swapping. Could you check whether your system is swapping? On Solaris you can monitor this using vmstat [...]

I looked at pi and po using vmstat and was therefore pretty sure there was no swapping.

> Additionally, you are right: "partial" was changed to "mixed" in the GC logs. [...]

It's a pity that I didn't notice the word 'partial' in that curious log on my own and identify the problem earlier. It took Simon's help to stick my nose into it.

> Look forward to seeing your GC logs!

Are you serious? We had better forget about G1 in 6u* and look forward to its brighter future. It looks promising and could soon be where Tony wanted it to be.

Regards
Andreas

From chunt at salesforce.com  Fri Nov  2 07:15:39 2012
From: chunt at salesforce.com (Charlie Hunt)
Date: Fri, 2 Nov 2012 07:15:39 -0700
Subject: G1 issue: falling over to Full GC
In-Reply-To:
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de> <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com>
Message-ID: <47962653-1D8D-45B0-B1E8-186F1429B191@salesforce.com>

Yes, I'd recommend +UseParallelOldGC on 6u23 even though it's not auto-enabled.

hths,

charlie ...

On Nov 2, 2012, at 8:04 AM, Vitaly Davidovich wrote:

> Hi Charlie,
>
> Out of curiosity, is UseParallelOldGC advisable on, say, 6u23?
> [...]
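P.S. For anyone who wants to run the swap check mentioned earlier in the thread: on Solaris it is just vmstat with a sampling interval, e.g.

  vmstat 5

and then watch the 'free' column under "memory" together with the 'sr', 'pi' and 'po' columns under "page" (the exact column layout may differ slightly between Solaris releases). Sustained sr activity combined with low free memory and pi/po traffic points to swapping; low free memory with sr at zero is fine.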
From chunt at salesforce.com  Fri Nov  2 07:29:43 2012
From: chunt at salesforce.com (Charlie Hunt)
Date: Fri, 2 Nov 2012 07:29:43 -0700
Subject: G1 issue: falling over to Full GC
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F84B8A@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de> <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com> <46FF8393B58AD84D95E444264805D98F28F84B8A@edata02.mgm-edv.de>
Message-ID: <54F30191-904E-40B5-BEDE-F8B7478615FF@salesforce.com>

Hi Andreas,

Couple of comments embedded below.

charlie ...

On Nov 2, 2012, at 8:48 AM, Andreas Müller wrote:

> Hello Charlie,
>
>> Strongly suggest to anyone evaluating G1 not to use anything prior to 7u4. [...]
> I agree, and this very issue (which turned out to be a Java 6 issue) confirms that again.
>> Fwiw, I really like what I am seeing in 7u9 [...]
> Good to know. 7u7 was looking promising already.

:-)

>> From looking at your observations and the other comments so far, my initial reaction is that with a 1 GB Java heap you might get the best results with -XX:+UseParallelOldGC.
> For the time being, I have suggested that my project stick with -XX:+UseParallelGC (the default), because the results are in fact good.
> [...]
> - CMS gives good results as long as I add enough extra headroom. If not, fragmentation can hit very suddenly and fatally, as I found out (we have uptimes of many weeks).
> - I tried G1 out of curiosity, as a future solution to both problems (remembering what our former Sun colleague Tony Printezis told the audience many years ago at a customer meeting).

IMO, looking at G1 is the right thing to do. The usual recommendation I communicate is to start with +UseParallelOldGC (so you get the multi-threaded full GC), and if you observe full GCs too frequently, or their duration is too long, then move to G1. Of course, you may have to do some fine-tuning of ParallelOld GC, like what you've done. Then, as you move to G1, throw away the fine-tuning of eden, survivor spaces and young gen heap sizing. Start with G1 by using the same -Xms & -Xmx, and optionally set a pause time target (-XX:MaxGCPauseMillis), along with possibly fine-tuning InitiatingHeapOccupancyPercent (which, as Simone pointed out, refers to overall heap occupancy). There are a couple of additional G1 tuning switches you should investigate. See the G1 tuning session Monica & I gave at JavaOne; if you search for the JavaOne 2012 session content, I think you can find the slides and a recording. IIRC, Oracle may have made these generally available.

Btw, if it's not obvious: -XX:+UseParallelOldGC will automatically enable -XX:+UseParallelGC (you get both a multi-threaded young GC and a multi-threaded full GC). With +UseParallelGC you get only a multi-threaded young GC and a single-threaded full GC. Ooh, and while I'm thinking of it: full GC in G1 is single-threaded. There's an RFE to multi-thread it, but to date it has not been implemented.

>> Are you using -XX:+UseParallelGC, or -XX:+UseParallelOldGC? Or are you not setting a GC at all? [...]
> I use the default, which is UseParallelGC, but I do not know for sure whether this also includes ParallelOldGC. I read that it should as of JDK 6, but the CPU statistics (usr_time vs real_time) for Full GC runs suggest this is not the case (I always see usr_time = real_time). I tried -XX:+UseParallelOldGC explicitly, but it made no difference.
> To sum up: ParallelGC, ParallelOldGC and ParNewGC gave the same results once I got rid of the AdaptiveSizePolicy in ParallelGC (which was in fact important).

As hinted above, +UseParallelGC does not enable +UseParallelOldGC, but +UseParallelOldGC does auto-enable +UseParallelGC. I would suggest you switch to +UseParallelOldGC instead of +UseParallelGC. +UseParallelOldGC with the same fine-tuning, i.e. -XX:-UseAdaptiveSizePolicy, should result in lower full GC pause times.

>> The lengthy concurrent mark could be a result of the G1 implementation in 6u*,
> Exactly. We found out that the bad case was actually the 6u* case and the good case was 7u7, because some automatism decided to switch back to 6u33 between my two tests.
>> or it could be that your system is swapping. [...]
> I looked at pi and po using vmstat and was therefore pretty sure there was no swapping.

Great. :-)

>> Additionally, you are right: "partial" was changed to "mixed" in the GC logs. [...]
> It's a pity that I didn't notice the word 'partial' in that curious log on my own and identify the problem earlier. It took Simon's help to stick my nose into it.

It happens. ;-)

>> Look forward to seeing your GC logs!
> Are you serious? We had better forget about G1 in 6u* and look forward to its brighter future.
> It looks promising and could soon be where Tony wanted it to be.

Sorry if I confused you. I meant that I'm looking forward to seeing your 7u4+ (i.e. 7u7 or 7u9) G1 GC logs.

hths,

charlie ...

From john.cuthbertson at oracle.com  Fri Nov  2 09:35:29 2012
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Fri, 02 Nov 2012 09:35:29 -0700
Subject: G1 issue: falling over to Full GC
In-Reply-To: <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de> <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com>
Message-ID: <5093F651.2030405@oracle.com>

Hi Charlie,

I'm jumping in here late as well; I'll try to answer Andreas' questions in a separate email later today.

I just wanted to let you know what's happening with 7143858. The fix for this CR is already in hs24, which, I believe, is intended for jdk7u12, so fortunately no backporting is needed. With this fix, about 90% of the premature evacuations due to the GC locker are eliminated (in one workload they went from around 30 to 3). The remainder are being tracked under 7181612. Looking at the code I can see a possible scenario that might result in an unexpected evacuation pause, but I haven't been able to prove it - yet.
Regards,

JohnC

On 11/2/2012 5:34 AM, Charlie Hunt wrote:
> Jumping in a bit late ...
> [...]
From chunt at salesforce.com  Fri Nov  2 12:26:46 2012
From: chunt at salesforce.com (Charlie Hunt)
Date: Fri, 2 Nov 2012 12:26:46 -0700
Subject: G1 issue: falling over to Full GC
In-Reply-To: <5093F651.2030405@oracle.com>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de> <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com> <5093F651.2030405@oracle.com>
Message-ID: <0A87BD5E-813D-442E-ACD9-E3E6EDDB8D75@salesforce.com>

Thanks for the update, John! Great news on isolating and fixing the issue.

I haven't poked around just yet ... is there an OpenJDK workspace for HotSpot 24, the one that'll be included in 7u12? I'd be happy to build HotSpot from that repository and try out the changes, to see whether they get rid of the premature evacuations I've observed with 7u9.

Love the progress you, Bengt and Monica are making with G1!

charlie ...

On Nov 2, 2012, at 11:35 AM, John Cuthbertson wrote:
> Hi Charlie,
>
> I just wanted to let you know what's happening with 7143858. The fix for this CR is already in hs24, which, I believe, is intended for jdk7u12, so fortunately no backporting is needed.
> [...]
From vitalyd at gmail.com  Fri Nov  2 15:42:32 2012
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Fri, 2 Nov 2012 18:42:32 -0400
Subject: G1 issue: falling over to Full GC
In-Reply-To: <47962653-1D8D-45B0-B1E8-186F1429B191@salesforce.com>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de> <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com> <47962653-1D8D-45B0-B1E8-186F1429B191@salesforce.com>
Message-ID:

Thanks Charlie. At a quick glance, I didn't see it benefit my case today (~5 GB old gen): wall clock time was roughly the same as single-threaded, but user time was quite high (7 s wall, 37 s user). This is on an 8-way Xeon Linux server. I vaguely recall reading that parallel old sometimes performs worse than single-threaded old in some cases, perhaps due to contention between GC threads. Anyway, I'll keep monitoring.

Thanks

Sent from my phone

On Nov 2, 2012 10:15 AM, "Charlie Hunt" wrote:
> Yes, I'd recommend +UseParallelOldGC on 6u23 even though it's not auto-enabled.
> [...]
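P.S. Doing the arithmetic on those numbers: 37 s of user time over 7 s of wall clock works out to roughly 37/7 ~ 5.3 threads busy on average, so the parallel GC threads clearly are running - they just aren't shortening the pause on this workload.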
From chunt at salesforce.com  Fri Nov  2 16:13:59 2012
From: chunt at salesforce.com (Charlie Hunt)
Date: Fri, 2 Nov 2012 16:13:59 -0700
Subject: G1 issue: falling over to Full GC
In-Reply-To:
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de> <46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de> <80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com> <47962653-1D8D-45B0-B1E8-186F1429B191@salesforce.com>
Message-ID: <2904F4E2-62A4-4C77-917E-E2B0FCC3E444@salesforce.com>

Do you have GC logs you could share?

We are probably going to need more info on what's going on within ParallelOld. We might get some additional information from +PrintGCTaskTimeStamps or +PrintParallelOldGCPhaseTimes. I don't recall how intrusive they are, though, and if you've got a lot of threads we'll probably get a lot of data too. But hopefully there's something in there that lends a clue to the issue. If there's contention, that suggests to me contention in work stealing. IIRC, there's a way to get work-stealing info out of +UseParallelOldGC, but my mind is drawing a blank. :-|

Just off the top of my head: do you know if this app makes heavy use of Reference objects, i.e. <Weak | Soft | Phantom | Final> References? Adding +PrintReferenceGC will tell us what kind of overhead you're experiencing from reference processing. If you're seeing high reference-processing times, you'll probably want to add -XX:+ParallelRefProcEnabled. I'd look at reference processing first, before looking at +PrintParallelOldGCPhaseTimes or +PrintGCTaskTimeStamps.

Ooh, another thought: are there other Java apps running on the same system? If so, how many GC threads and application threads tend to be active at any given time?

hths,

charlie ...

On Nov 2, 2012, at 5:42 PM, Vitaly Davidovich wrote:

> Thanks Charlie. At a quick glance, I didn't see it benefit my case today (~5 GB old gen)
> [...]
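P.S. Putting those suggestions together, a diagnostic run could use a flag set along these lines. This is just a sketch: the -Xloggc path is illustrative, and depending on the exact build, PrintGCTaskTimeStamps / PrintParallelOldGCPhaseTimes may need -XX:+UnlockDiagnosticVMOptions:

  java -XX:+UseParallelOldGC \
       -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
       -XX:+PrintReferenceGC -XX:+ParallelRefProcEnabled \
       -Xloggc:gc-diagnostic.log ...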
From vitalyd at gmail.com  Fri Nov  2 16:29:52 2012
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Fri, 2 Nov 2012 19:29:52 -0400
Subject: G1 issue: falling over to Full GC
In-Reply-To: <2904F4E2-62A4-4C77-917E-E2B0FCC3E444@salesforce.com>
References: <46FF8393B58AD84D95E444264805D98F28F8497C@edata02.mgm-edv.de>
	<46FF8393B58AD84D95E444264805D98F28F849CE@edata02.mgm-edv.de>
	<46FF8393B58AD84D95E444264805D98F28F84B04@edata02.mgm-edv.de>
	<80BC16AF-0F80-403E-932A-0C5E2C42C54B@salesforce.com>
	<47962653-1D8D-45B0-B1E8-186F1429B191@salesforce.com>
	<2904F4E2-62A4-4C77-917E-E2B0FCC3E444@salesforce.com>
Message-ID: 

To be honest, I didn't dig in yet as I got the setup running in our plant
towards the end of the day, and only casually looked at basic GC
timestamps for the full GCs.

We do use some weak refs (no soft/phantom though), but I wouldn't call it
heavy (or even medium) for that matter. However, I'd have to look at what
GC reports, as you mention, to make sure, but I'm pretty confident that
it's not heavy. :)

The server is dedicated to this sole Java process, and nothing else of
significance (mem or cpu) is running on there.

I'll try to investigate next week to see if anything sticks out. Regular
old GC is sufficient for my use case now, so I'm merely trying to see if I
can get some really cheap gains purely by enabling the parallel
collector. :)

Generally speaking though, what sort of (ballpark) speedup is expected for
parallel old vs single threaded? Let's say on a machine with a modest CPU
count (8-16 hardware threads). I'd imagine any contention would
significantly reduce the speedup factor for hugely parallel machines, but
I'm curious about the modest space. Are there any known issues/scenarios
that would nullify its benefit, other than what you've already mentioned?

Thanks for all the advice and info.

Sent from my phone
On Nov 2, 2012 7:14 PM, "Charlie Hunt" wrote: [...]
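Vitaly's numbers above are worth a quick back-of-the-envelope check,
because the user/real ratio is the telltale here: 37 s of user time
against 7 s of wall clock is 37/7 = 5.3, so on his 8-way box the parallel
collector really did keep five-plus workers busy, yet the pause was no
shorter than single threaded. That pattern points at parallel work that
does not shorten the critical path (e.g. the work-stealing contention
discussed above), not at the collector silently running single-threaded,
which would show a ratio near 1.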
From ysr1729 at gmail.com  Sat Nov  3 00:03:30 2012
From: ysr1729 at gmail.com (Srinivas Ramakrishna)
Date: Sat, 3 Nov 2012 00:03:30 -0700
Subject: Parallel vs Serial Old (was Re: G1 issue: falling over to Full GC)
Message-ID: 

[Edited subject line to show actual subject of discussion in last few
emails in the thread]

One issue I have found with ParallelOld vs Serial for sufficiently large
heaps is that if there are large oop-rich objects, the deferred updates
phase, which is single-threaded and slow, greatly dominates the pause
time. There's discussion of this in an earlier thread (late last year or
early this year), and I promised to work on a patch although I never got
around to it. We partially worked around it by preventing full compaction
(i.e. compaction below the dense prefix), but that doesn't work for all
cases, for instance when an application churns large oop-rich objects
(i.e. object arrays) through the old generation. Don't know if a CR was
filed tracking that sighting and discussion.

Other than those anomalies, I have usually seen user/elapsed time ratios
of 10-12 using 18 worker threads in the cases I recall. That does not,
however, mean a speedup of 10-12x versus serial. More like 5-6x. YMMV of
course.

-- ramki

On Fri, Nov 2, 2012 at 4:29 PM, Vitaly Davidovich wrote: [...]
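The 18-worker figure connects to a concrete knob: the parallel collectors
size their GC worker pool from the machine's hardware thread count unless
it is pinned explicitly. A sketch of pinning it (the thread count here is
illustrative, not a recommendation from the thread):

    $ java -XX:+UseParallelOldGC -XX:ParallelGCThreads=18 ...

On a box with only a couple of hardware threads the ergonomic default is
already tiny, which is one reason parallel old can be a wash there, a
point Andreas raises below.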
From vitalyd at gmail.com  Sat Nov  3 07:42:51 2012
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Sat, 3 Nov 2012 10:42:51 -0400
Subject: Parallel vs Serial Old (was Re: G1 issue: falling over to Full GC)
In-Reply-To: 
References: 
Message-ID: 

Thanks Ramki (and thanks for moving the thread to a new subject -
should've done that myself to avoid conflating it). Hopefully I'll have
time next week to investigate further. If I do and find anything of
interest, I'll be sure to report back.

Have a good weekend,

Vitaly

Sent from my phone
On Nov 3, 2012 3:03 AM, "Srinivas Ramakrishna" wrote: [...]
From Andreas.Mueller at mgm-tp.com  Sun Nov  4 00:13:34 2012
From: Andreas.Mueller at mgm-tp.com (Andreas Müller)
Date: Sun, 4 Nov 2012 07:13:34 +0000
Subject: AW: Parallel vs Serial Old (was Re: G1 issue: falling over to Full GC)
In-Reply-To: 
References: 
Message-ID: <46FF8393B58AD84D95E444264805D98F28F84C44@edata02.mgm-edv.de>

Hi,

I can confirm Vitaly's observation that ParallelOldGC in many cases does
not bring about much benefit.
Sometimes I saw the usr/real time ratio stay close to 1, and sometimes it
was higher but with very little effect on the Full GC pause times.
BTW, do you expect much effect with that option on a 2-CPU machine? What
percentage range?

I also found that presentation again which claimed "-XX:+UseParallelOldGC
(on by default with ParallelGC in JDK 6)":
http://www.austinjug.org/presentations/JDK6PerfUpdate_Dec2009.pdf
which had confused me for a while because I could not get usr/real > 1
during Full GC runs without adding -XX:+UseParallelOldGC explicitly.

Best regards
Andreas

From: Srinivas Ramakrishna [mailto:ysr1729 at gmail.com]
Sent: Saturday, 3 November 2012 08:04
To: Vitaly Davidovich
Cc: Charlie Hunt; Andreas Müller; hotspot-gc-use; Simone Bordet
Subject: Parallel vs Serial Old (was Re: G1 issue: falling over to Full GC)

[...]
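Andreas' usr/real check reads straight off the GC log: with
-XX:+PrintGCDetails each full-GC line ends in a timing triple, and user
divided by real gives the effective GC parallelism. An illustration with
made-up numbers (not taken from his logs):

    [Full GC [PSYoungGen: ...] [ParOldGen: ...] ...
        [Times: user=7.31 sys=0.12, real=1.48 secs]

    user/real = 7.31 / 1.48 = 4.9  -> roughly five workers doing useful work
    user/real close to 1           -> the full GC effectively ran single-threaded

As a side check, the old-gen tag itself shows which collector ran:
ParOldGen appears when parallel old is in effect, PSOldGen when the
parallel scavenger is paired with the serial mark-sweep-compact old gen.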
From rednaxelafx at gmail.com  Sun Nov  4 01:14:49 2012
From: rednaxelafx at gmail.com (Krystal Mok)
Date: Sun, 4 Nov 2012 16:14:49 +0800
Subject: Parallel vs Serial Old (was Re: G1 issue: falling over to Full GC)
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F84C44@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F84C44@edata02.mgm-edv.de>
Message-ID: 

Hi Andreas,

UseParallelOldGC is turned on by default starting from 6679764 [1]. You
should find it working in JDK7u4 and above. Just ran a test with JDK7u6
and it worked.
This change was never backported to JDK6.

- Kris

[1]: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-February/004045.html

On Sun, Nov 4, 2012 at 3:13 PM, Andreas Müller wrote: [...]
From ysr1729 at gmail.com  Sun Nov  4 10:27:11 2012
From: ysr1729 at gmail.com (Srinivas Ramakrishna)
Date: Sun, 4 Nov 2012 10:27:11 -0800
Subject: Parallel vs Serial Old (was Re: G1 issue: falling over to Full GC)
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F84C44@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F84C44@edata02.mgm-edv.de>
Message-ID: 

Hi Andreas --

Great to hear from you; it's been a while! Hope you are doing well!

On Sun, Nov 4, 2012 at 12:13 AM, Andreas Müller wrote:

> BTW, do you expect much effect with that option on a 2-CPU machine? What
> percentage range?

For 2 virtual CPUs, I've found it a wash. But anything above that
definitely improves average performance for whole-heap GCs, in my
experience. The occasional longer whole-heap GC can, however, make the
overall experience negative because that pause is sometimes worse than a
serial GC pause (see remarks in my previous email).

> I also found that presentation again which claimed "-XX:+UseParallelOldGC
> (on by default with ParallelGC in JDK 6)":
> http://www.austinjug.org/presentations/JDK6PerfUpdate_Dec2009.pdf
> which had confused me for a while because I could not get usr/real > 1
> during Full GC runs without adding -XX:+UseParallelOldGC explicitly.

Yes, that presentation was probably written when there may have been some
debate on the default, and the change of defaults never made it because of
a performance anomaly caught late in the release cycle. I can confirm, for
example, that with 7u5 ParallelOld is the default, and with 6u29 it is not
the default.
$ /usr/lib/jvm/jdk1.6.0_29/bin/java -XX:+PrintFlagsFinal -version | grep ParallelOldGC
     bool PrintParallelOldGCPhaseTimes              = false           {product}
     bool TraceParallelOldGCTasks                   = false           {product}
     bool UseParallelOldGC                          = false           {product}
     bool UseParallelOldGCCompacting                = true            {product}
     bool UseParallelOldGCDensePrefix               = true            {product}
java version "1.6.0_29"
Java(TM) SE Runtime Environment (build 1.6.0_29-b11)
Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02, mixed mode)

$ /usr/lib/jvm/jdk1.7.0_05/bin/java -XX:+PrintFlagsFinal -version | grep ParallelOldGC
     bool PrintParallelOldGCPhaseTimes              = false           {product}
     bool TraceParallelOldGCTasks                   = false           {product}
     bool UseParallelOldGC                          = true            {product}
java version "1.7.0_05"
Java(TM) SE Runtime Environment (build 1.7.0_05-b06)
Java HotSpot(TM) 64-Bit Server VM (build 23.1-b03, mixed mode)

best regards.
-- ramki
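When chasing default-flag differences like this between two JDK builds,
the whole -XX:+PrintFlagsFinal output can also be compared in one shot. A
small bash sketch (paths as in the example above; the 2>&1 folds the
-version banner, which goes to stderr, into the comparison):

    $ diff <(/usr/lib/jvm/jdk1.6.0_29/bin/java -XX:+PrintFlagsFinal -version 2>&1) \
           <(/usr/lib/jvm/jdk1.7.0_05/bin/java -XX:+PrintFlagsFinal -version 2>&1)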
Are there any known issues/scenarios that > would nullify its benefit, other than what you've already mentioned?**** > > Thanks for all the advice and info.**** > > Sent from my phone**** > > On Nov 2, 2012 7:14 PM, "Charlie Hunt" wrote:**** > > Do you have GC logs you could share?**** > > ** ** > > We probably are gonna need more info on what's going on within > ParallelOld. We might get some additional info from > +PrintGCTaskTimeStamps or +PrintParallelOldGCPhaseTimes. I don't recall > how intrusive they are though. If you've got a lot of threads, we'll > probably get a lot of data too. But, hopefully there's something in there > that lends a clue as to issue. If there's contention, that suggests to me > some contention in work stealing. IIRC, there's a way to get work stealing > info in +ParallelOld GC. But, my mind is drawing a blank. :-|**** > > ** ** > > Just off the top of my head, do you know if this app makes heavy use of > Reference objects, i.e. < Weak | Soft | Phantom | Final > References?**** > > ** ** > > Adding +PrintReferenceGC will tell us what kind of overhead you're > experiencing with reference processing. If you're seeing high values of > reference processing, then you'll probably want to add > -XX:+ParallelRefProcEnabled.**** > > ** ** > > I'd look at reference processing first before looking at the > +PrintParallelOldGCPhaseTimes or +PrintGCTaskTimeStamps.**** > > ** ** > > Ooh, another thought, are there other Java apps running on the same > system? If so, how many GC threads and application threads tend to be > active at any given time?**** > > ** ** > > hths,**** > > ** ** > > charlie ...**** > > ** ** > > On Nov 2, 2012, at 5:42 PM, Vitaly Davidovich wrote:**** > > > > **** > > Thanks Charlie. At a quick glance, I didn't see it benefit my case today > (~5gb old) - wall clock time was roughly same as single threaded, but user > time was quite high (7 secs wall, 37 sec user). This is on an 8 way Xeon > Linux server.**** > > I seem to vaguely recall reading that parallel old sometimes performs > worse than single threaded old in some cases, perhaps due to some > contention between GC threads.**** > > Anyway, I'll keep monitoring though.**** > > Thanks**** > > Sent from my phone**** > > On Nov 2, 2012 10:15 AM, "Charlie Hunt" wrote:**** > > Yes, I'd recommend +UseParallelOldGC on 6u23 even though it's not > auto-enabled.**** > > ** ** > > hths,**** > > ** ** > > charlie ...**** > > ** ** > > On Nov 2, 2012, at 8:04 AM, Vitaly Davidovich wrote:**** > > > > **** > > Hi Charlie,**** > > Out of curiosity, is UseParallelOldGC advisable on, say, 6u23? It's off by > default, as you say, until 7u4 so I'm unsure if that's for some > good/specific reason or not.**** > > Thanks**** > > Sent from my phone**** > > On Nov 2, 2012 8:36 AM, "Charlie Hunt" wrote:**** > > Jumping in a bit late ... > > Strongly suggest to anyone evaluating G1 to not use anything prior to 7u4. > And, even better if you use (as of this writing) 7u9, or the latest > production Java 7 HotSpot VM. > > Fwiw, I'm really liking what I am seeing in 7u9 with the exception on one > issue, (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7143858), > which is currently slated to be back ported to a future Java 7, (thanks > Monica, John Cuthbertson and Bengt tackling this!). > > >From looking at your observations and others comments thus far, my > initial reaction is that with a 1G Java heap, you might get the best > results with -XX:+UseParallelOldGC. 
> Are you using -XX:+UseParallelGC, or -XX:+UseParallelOldGC? Or, are you not setting a GC? Not until 7u4 is -XX:+UseParallelOldGC automatically set for what's called "server class" machines when you don't specify a GC.
>
> The lengthy concurrent mark could be the result of the implementation of G1 in 6u*, or it could be that your system is swapping. Could you check if your system is swapping? On Solaris you can monitor this using vmstat and observing, not only just free memory, but also sr == scan rate along with pi == page in and po == page out. Seeing sr (page scan activity) along with low free memory along with pi & po activity are strong suggestions of swapping. Seeing low free memory and no sr activity is ok, i.e. no swapping.
>
> Additionally, you are right. "partial" was changed to "mixed" in the GC logs. For those interested in a bit of history .... this change was made since we felt "partial" was misleading. What partial was intended to mean was a partial old gen collection, which did occur. But, on that same GC event it also included a young gen GC. As a result, we changed the GC event name to "mixed" since that GC event was really a combination of both a young gen GC and a portion of an old gen GC.
>
> Simone also has a good suggestion with including -XX:+PrintFlagsFinal and -showversion as part of the GC log data to collect, especially with G1 continuing to improve and evolve.
>
> Look forward to seeing your GC logs!
>
> hths,
>
> charlie ....
>
> On Nov 2, 2012, at 5:46 AM, Andreas Müller wrote:
>
> > Hi Simone,
> >
> >> 4972.437: [GC pause (partial), 1.89505180 secs]
> >> that I cannot decipher (to Monica - what does "partial" mean?), and no mixed GCs, which seems unusual as well.
> > Oops, I understand that now: 'partial' used to be what 'mixed' is now!
> > Our portal usually runs on Java 6u33. For the G1 tests I switched to 7u7 because I had learned that G1 is far from mature in 6u33.
> > But automatic deployments can overwrite the start script and thus switch back to 6u33.
> >
> >> Are you sure you are actually using 1.7.0_u7 ?
> > I have checked that in the archived start scripts and the result, unfortunately, is: no.
> > The 'good case' was actually running on 7u7 (that's why it was good), but the 'bad case' was unwittingly run on 6u33 again.
> > That's the true reason why the results were so much worse and so incomprehensible.
> > Thank you very much for looking at the log and for asking good questions!
> >
> > I'll try to repeat the test and post the results on this list.
> >
> > Regards
> > Andreas
> > _______________________________________________
> > hotspot-gc-use mailing list
> > hotspot-gc-use at openjdk.java.net
> > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121104/4a21bd1a/attachment-0001.html

From the.6th.month at gmail.com  Mon Nov 5 00:43:15 2012
From: the.6th.month at gmail.com (the.6th.month at gmail.com)
Date: Mon, 5 Nov 2012 16:43:15 +0800
Subject: young gc time
Message-ID:

hi, all:

I am using ParNew gc in our prod env, and I notice that the young gc time is consistently high. I am wondering whether there is a way to lower young gc time.
Through monitoring JMX garbage collection metrics, we found that the par new gc time was roughly 150~200ms per par new gc, and the par new gc happened every 15-20 seconds.
I paste a snippet of gc log below:

: 864507K->87635K(917504K), 0.1231380 secs] 3699003K->2931949K(4476928K), 0.1236430 secs] [Times: user=0.43 sys=0.00, real=0.12 secs]
2012-11-05T16:23:23.949+0800: 236482.439: [GC 236482.439: [ParNew
Desired survivor size 67108864 bytes, new threshold 6 (max 6)
- age   1:   17253464 bytes,   17253464 total
- age   2:   14360944 bytes,   31614408 total
- age   3:   25430792 bytes,   57045200 total
- age   4:    2902680 bytes,   59947880 total
- age   5:    5009480 bytes,   64957360 total
- age   6:    5909656 bytes,   70867016 total
: 874067K->90642K(917504K), 0.1477390 secs] 3718381K->2938612K(4476928K), 0.1481920 secs] [Times: user=0.53 sys=0.00, real=0.15 secs]
2012-11-05T16:23:44.631+0800: 236503.121: [GC 236503.121: [ParNew
Desired survivor size 67108864 bytes, new threshold 6 (max 6)
- age   1:   15747824 bytes,   15747824 total
- age   2:   15988992 bytes,   31736816 total
- age   3:   10270928 bytes,   42007744 total
- age   4:   20606448 bytes,   62614192 total
- age   5:    2049256 bytes,   64663448 total
- age   6:    4744976 bytes,   69408424 total
: 877074K->100618K(917504K), 0.1410630 secs] 3725044K->2954405K(4476928K), 0.1414400 secs] [Times: user=0.48 sys=0.00, real=0.14 secs]
2012-11-05T16:24:15.290+0800: 236533.780: [GC 236533.781: [ParNew
Desired survivor size 67108864 bytes, new threshold 5 (max 6)
- age   1:   19065656 bytes,   19065656 total
- age   2:   12890184 bytes,   31955840 total
- age   3:   15095912 bytes,   47051752 total
- age   4:    8298736 bytes,   55350488 total
- age   5:   15055264 bytes,   70405752 total
- age   6:    1900328 bytes,   72306080 total
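(A side note on those [Times] lines, for whatever it is worth: user is about 0.43-0.53 s against 0.12-0.15 s real, i.e. a user/real ratio of roughly 0.43/0.12 ≈ 3.6, so three to four GC threads' worth of CPU does seem to be used in parallel during those pauses.)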
and here is the output from jmap -heap:

Server compiler detected.
JVM version is 20.1-b02

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 4718592000 (4500.0MB)
   NewSize          = 1073741824 (1024.0MB)
   MaxNewSize       = 1073741824 (1024.0MB)
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 6
   PermSize         = 205520896 (196.0MB)
   MaxPermSize      = 205520896 (196.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 939524096 (896.0MB)
   used     = 668354776 (637.3927841186523MB)
   free     = 271169320 (258.60721588134766MB)
   71.13758751324245% used
Eden Space:
   capacity = 805306368 (768.0MB)
   used     = 602192224 (574.2952575683594MB)
   free     = 203114144 (193.70474243164062MB)
   74.77802832921346% used
From Space:
   capacity = 134217728 (128.0MB)
   used     = 66162552 (63.09752655029297MB)
   free     = 68055176 (64.90247344970703MB)
   49.29494261741638% used
To Space:
   capacity = 134217728 (128.0MB)
   used     = 0 (0.0MB)
   free     = 134217728 (128.0MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 3644850176 (3476.0MB)
   used     = 3004386160 (2865.2059173583984MB)
   free     = 640464016 (610.7940826416016MB)
   82.42824848556957% used
Perm Generation:
   capacity = 205520896 (196.0MB)
   used     = 110269304 (105.16100311279297MB)
   free     = 95251592 (90.83899688720703MB)
   53.65357301673111% used

Given the card-table mechanism hotspot adopts for young gc, as our old gen is around 3.5 gigabytes, each young gc should go through about 7m card table entries; I am wondering whether that's the reason behind the relatively slow young gc?
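(My arithmetic, assuming HotSpot's usual 512-byte card size: 3644850176 bytes of CMS old gen / 512 bytes per card = 7118848, i.e. about 7.1 million cards - that is where the "about 7m" above comes from.)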
Any advice? Thanks

All the best,
Leon

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121105/cb912cce/attachment.html

From sbordet at intalio.com  Mon Nov 5 06:54:08 2012
From: sbordet at intalio.com (Simone Bordet)
Date: Mon, 5 Nov 2012 15:54:08 +0100
Subject: G1 log output interpretation
Message-ID:

Hi,

I am using 1.7.0_09 with:

-XX:InitialHeapSize=1073741824
-XX:MaxHeapSize=1073741824
-XX:MaxPermSize=268435456
-XX:ReservedCodeCacheSize=67108864
-XX:InitiatingHeapOccupancyPercent=60
-XX:MaxGCPauseMillis=200
-XX:+PrintAdaptiveSizePolicy
-XX:+PrintCommandLineFlags
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+UseCompressedOops
-XX:+UseG1GC

and I get this log, from time to time:

2012-11-05T12:09:02.422+0100: [GC pause (young) 2341.660: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 65.42 ms, remaining time: 134.58 ms, target pause time: 200.00 ms]
2341.660: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 198 regions, survivors: 6 regions, predicted young region time: 27.03 ms]
2341.660: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 198 regions, survivors: 6 regions, old: 0 regions, predicted pause time: 92.45 ms, target pause time: 200.00 ms]
2341.739: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 11.81 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]
, 0.07907900 secs]

The relevant section is the last line, marked as G1Ergonomics (Heap Sizing).

Can someone shed some light on what it means? It seems an attempt to expand, but it calculates to 0?
I remember G1 has some "extra" allocation space that can be used beyond -Xmx; do I remember well?

Thanks !

Simon
--
http://cometd.org
http://webtide.com
Developer advice, training, services and support from the Jetty & CometD experts.
----
Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless.   Victoria Livschitz

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121105/5e168967/attachment.html

From java at java4.info  Mon Nov 5 07:52:50 2012
From: java at java4.info (Florian Binder)
Date: Mon, 05 Nov 2012 16:52:50 +0100
Subject: young gc time
In-Reply-To:
References:
Message-ID: <5097E0D2.90309@java4.info>

Hi Leon,

there are always about 70mb in your survivor space which are copied on every young collection. I think this is the most expensive operation. You can reduce this by reducing the new gen size or the max tenuring threshold. Both will result in more objects (soon garbage) getting promoted into the old generation. Since you are using the CMS collector, this may mean more fragmentation and a little more overhead (more old collections).

I would suggest reducing the new gen size, because this also increases the old gen size, which reduces fragmentation.
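For example (the concrete numbers are only an illustration and would have to be tested): -XX:NewSize=512m -XX:MaxNewSize=512m instead of the current 1024m, or -XX:MaxTenuringThreshold=2 instead of the current maximum of 6.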
/Flo

Am 05.11.2012 09:43, schrieb the.6th.month at gmail.com:
> hi, all:
>
> I am using ParNew gc in our prod env, and I notice that the young gc time is consistently high. I am wondering whether there is a way to lower young gc time.
> [...]
> Given the card-table mechanism hotspot adopts for young gc, as our old gen is around 3.5 gigabytes, each young gc should go through about 7m card table entries; I am wondering whether that's the reason behind the relatively slow young gc?
>
> Any advice? Thanks
>
> All the best,
> Leon

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121105/3c359a4c/attachment-0001.html

From the.6th.month at gmail.com  Mon Nov 5 19:19:30 2012
From: the.6th.month at gmail.com (the.6th.month at gmail.com)
Date: Tue, 6 Nov 2012 11:19:30 +0800
Subject: young gc time
In-Reply-To: <5097E0D2.90309@java4.info>
References: <5097E0D2.90309@java4.info>
Message-ID:

Hi, Florian:

Yeah, as I observe it, the cms full gc frequency is quite low (it happens at an interval of 40 mins or 1 hr), and hence reducing the young gen size is acceptable. But I don't quite understand why copying 70m of data from one survivor to another would be such an expensive operation.

All the best,
Leon

On 5 November 2012 23:52, Florian Binder wrote:
> Hi Leon,
>
> there are always about 70mb in your survivor space which are copied on every young collection. I think this is the most expensive operation. You can reduce this by reducing the new gen size or the max tenuring threshold. Both will result in more objects (soon garbage) getting promoted into the old generation. Since you are using the CMS collector, this may mean more fragmentation and a little more overhead (more old collections).
>
> I would suggest reducing the new gen size, because this also increases the old gen size, which reduces fragmentation.
>
> /Flo
>
> Am 05.11.2012 09:43, schrieb the.6th.month at gmail.com:
> hi, all:
> I am using ParNew gc in our prod env, and I notice that the young gc time is consistently high. I am wondering whether there is a way to lower young gc time.
> Through monitoring JMX garbage collection metrics, we found that the par > new gc time was roughly 150~200ms per par new gc, and the par new gc > happened every 15-20 seconds. > I paste a snippet of gc log below: > : 864507K->87635K(917504K), 0.1231380 secs] > 3699003K->2931949K(4476928K), 0.1236430 secs] [Times: user=0.43 sys=0.00, > real=0.12 secs] > 2012-11-05T16:23:23.949+0800: 236482.439: [GC 236482.439: [ParNew > Desired survivor size 67108864 bytes, new threshold 6 (max 6) > - age 1: 17253464 bytes, 17253464 total > - age 2: 14360944 bytes, 31614408 total > - age 3: 25430792 bytes, 57045200 total > - age 4: 2902680 bytes, 59947880 total > - age 5: 5009480 bytes, 64957360 total > - age 6: 5909656 bytes, 70867016 total > : 874067K->90642K(917504K), 0.1477390 secs] 3718381K->2938612K(4476928K), > 0.1481920 secs] [Times: user=0.53 sys=0.00, real=0.15 secs] > 2012-11-05T16:23:44.631+0800: 236503.121: [GC 236503.121: [ParNew > Desired survivor size 67108864 bytes, new threshold 6 (max 6) > - age 1: 15747824 bytes, 15747824 total > - age 2: 15988992 bytes, 31736816 total > - age 3: 10270928 bytes, 42007744 total > - age 4: 20606448 bytes, 62614192 total > - age 5: 2049256 bytes, 64663448 total > - age 6: 4744976 bytes, 69408424 total > : 877074K->100618K(917504K), 0.1410630 secs] 3725044K->2954405K(4476928K), > 0.1414400 secs] [Times: user=0.48 sys=0.00, real=0.14 secs] > 2012-11-05T16:24:15.290+0800: 236533.780: [GC 236533.781: [ParNew > Desired survivor size 67108864 bytes, new threshold 5 (max 6) > - age 1: 19065656 bytes, 19065656 total > - age 2: 12890184 bytes, 31955840 total > - age 3: 15095912 bytes, 47051752 total > - age 4: 8298736 bytes, 55350488 total > - age 5: 15055264 bytes, 70405752 total > - age 6: 1900328 bytes, 72306080 total > > and here is the output from jmap -heap: > Server compiler detected. > JVM version is 20.1-b02 > > using parallel threads in the new generation. > using thread-local object allocation. > Concurrent Mark-Sweep GC > > Heap Configuration: > MinHeapFreeRatio = 40 > MaxHeapFreeRatio = 70 > MaxHeapSize = 4718592000 (4500.0MB) > NewSize = 1073741824 (1024.0MB) > MaxNewSize = 1073741824 (1024.0MB) > OldSize = 5439488 (5.1875MB) > NewRatio = 2 > SurvivorRatio = 6 > PermSize = 205520896 (196.0MB) > MaxPermSize = 205520896 (196.0MB) > > Heap Usage: > New Generation (Eden + 1 Survivor Space): > capacity = 939524096 (896.0MB) > used = 668354776 (637.3927841186523MB) > free = 271169320 (258.60721588134766MB) > 71.13758751324245% used > Eden Space: > capacity = 805306368 (768.0MB) > used = 602192224 (574.2952575683594MB) > free = 203114144 (193.70474243164062MB) > 74.77802832921346% used > From Space: > capacity = 134217728 (128.0MB) > used = 66162552 (63.09752655029297MB) > free = 68055176 (64.90247344970703MB) > 49.29494261741638% used > To Space: > capacity = 134217728 (128.0MB) > used = 0 (0.0MB) > free = 134217728 (128.0MB) > 0.0% used > concurrent mark-sweep generation: > capacity = 3644850176 (3476.0MB) > used = 3004386160 (2865.2059173583984MB) > free = 640464016 (610.7940826416016MB) > 82.42824848556957% used > Perm Generation: > capacity = 205520896 (196.0MB) > used = 110269304 (105.16100311279297MB) > free = 95251592 (90.83899688720703MB) > 53.65357301673111% use > > Given the card-table mechanism hotspot adopts for young gc, as our old > gen is around 3.5 gigabytes, each young gc should go through 7m card table > scans, I am wondering whether that's the reason behind the relatively slow > young gc? > > Any advice? 
> Thanks
>
> All the best,
> Leon
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121106/7d282e32/attachment.html

From john.cuthbertson at oracle.com  Wed Nov 7 09:24:50 2012
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Wed, 07 Nov 2012 09:24:50 -0800
Subject: G1 log output interpretation
In-Reply-To:
References:
Message-ID: <509A9962.5020600@oracle.com>

Hi Simone,

Monica asked the same thing recently. The message:

2341.739: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 11.81 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 bytes (20.00 %)]

indicates that we would like to expand the heap because the GC overhead to date (11.81%) is larger than the desired threshold (10%) - but the amount of uncommitted space left in the heap is 0 bytes, so we can't really expand the heap and we don't actually perform the attempt. If an actual expansion attempt were performed, there would be another message similar to:

24.941: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 1048576 bytes, attempted expansion amount: 2097152 bytes]

A way of interpreting this message is: should we attempt to expand the heap, and by how much (based upon the percentage of the uncommitted space - 20%).

One issue I see in the code is that we can and should make the min expansion amount in expansion_amount() be MAX(1*M, RegionSize). A new format could be:

24.941: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 11.23 %, threshold: 10.00 %]
24.941: [G1Ergonomics (Heap Sizing) uncommitted: 2097152 bytes, min expansion amount: 2097152 bytes, max expansion amount: 2097152 bytes]
24.941: [G1Ergonomics (Heap Sizing) calculated expansion amount: 419430 bytes (20.00 %), actual expansion amount: 2097152 bytes]

Then:

24.941: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 2097152 bytes, attempted expansion amount: 2097152 bytes]

Then:

24.941: [G1Ergonomics (Heap Sizing) heap expansion {passed | failed}, capacity before: XXXXX bytes, capacity after: XXXXX bytes]

Would that clear up any confusion?
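(To spell out the arithmetic in that example: 20% of the 2097152 uncommitted bytes is 419430 bytes, which is then rounded up to the minimum expansion amount of one region, 2097152 bytes - hence the calculated amount of 419430 bytes versus the actual amount of 2097152 bytes.)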
Regards, JohnC On 11/5/2012 6:54 AM, Simone Bordet wrote: > Hi, > > I am using 1.7.0_09 with: > > -XX:InitialHeapSize=1073741824 > -XX:MaxHeapSize=1073741824 > -XX:MaxPermSize=268435456 > -XX:ReservedCodeCacheSize=67108864 > -XX:InitiatingHeapOccupancyPercent=60 > -XX:MaxGCPauseMillis=200 > -XX:+PrintAdaptiveSizePolicy > -XX:+PrintCommandLineFlags > -XX:+PrintGCDateStamps > -XX:+PrintGCDetails > -XX:+UseCompressedOops > -XX:+UseG1GC > > and I get this log, from time to time: > > 2012-11-05T12:09:02.422+0100: [GC pause (young) 2341.660: > [G1Ergonomics (CSet Construction) start choosing CSet, predicted base > time: 65.42 ms, remaining time: 134.58 ms, target pause time: 200.00 ms] > 2341.660: [G1Ergonomics (CSet Construction) add young regions to > CSet, eden: 198 regions, survivors: 6 regions, predicted young region > time: 27.03 ms] > 2341.660: [G1Ergonomics (CSet Construction) finish choosing CSet, > eden: 198 regions, survivors: 6 regions, old: 0 regions, predicted > pause time: 92.45 ms, target pause time: 200.00 ms] > 2341.739: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: > recent GC overhead higher than threshold after GC, recent GC overhead: > 11.81 %, threshold: 10.00 %, uncommitted: 0 bytes, calculated > expansion amount: 0 bytes (20.00 %)] > , 0.07907900 secs] > > The relevant section is the last line, marked as G1Ergonomics (Heap > Sizing). > > Can someone shed a light on what it means ? Seems an attempt to > expand, but calculates to 0 ? > I remember G1 has some "extra" allocation space that can be used > beyond -Xmx, do I remember well ? > > Thanks ! > > Simon > -- > http://cometd.org > http://webtide.com > Developer advice, training, services and support > from the Jetty & CometD experts. > ---- > Finally, no matter how good the architecture and design are, > to deliver bug-free software with optimal performance and reliability, > the implementation technique must be flawless. Victoria Livschitz > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121107/5a300609/attachment.html From sbordet at intalio.com Wed Nov 7 14:34:45 2012 From: sbordet at intalio.com (Simone Bordet) Date: Wed, 7 Nov 2012 23:34:45 +0100 Subject: G1 log output interpretation In-Reply-To: <509A9962.5020600@oracle.com> References: <509A9962.5020600@oracle.com> Message-ID: Hi, On Wed, Nov 7, 2012 at 6:24 PM, John Cuthbertson wrote: > Hi Simone, > > Monica asked the same thing recently. The message: > > > 2341.739: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent > GC overhead higher than threshold after GC, recent GC overhead: 11.81 %, > threshold: 10.00 %, uncommitted: 0 bytes, calculated expansion amount: 0 > bytes (20.00 %)] > > indicates that we would like to expand the heap because the current GC > overhead to date (11.81%) is larger than the desired threshold (10%)...but > the amount of uncommitted bytes left in the heap is 0 and we can't really > expand the heap and we don't actually perform the attempt. 
> If an actual expansion attempt were performed, there would be another message similar to:
>
> 24.941: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 1048576 bytes, attempted expansion amount: 2097152 bytes]
>
> A way of interpreting this message is: should we attempt to expand the heap, and by how much (based upon the percentage of the uncommitted space - 20%).
>
> One issue I see in the code is that we can and should make the min expansion amount in expansion_amount() be MAX(1*M, RegionSize). A new format could be:
>
> 24.941: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent GC overhead higher than threshold after GC, recent GC overhead: 11.23 %, threshold: 10.00 %]
> 24.941: [G1Ergonomics (Heap Sizing) uncommitted: 2097152 bytes, min expansion amount: 2097152 bytes, max expansion amount: 2097152 bytes]
> 24.941: [G1Ergonomics (Heap Sizing) calculated expansion amount: 419430 bytes (20.00 %), actual expansion amount: 2097152 bytes]

I'd still be confused by this line, where the calculated expansion amount is different from the actual expansion amount...
My problem was in the interpretation of "uncommitted", which I did not fully understand (my fault).

> Then:
> 24.941: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 2097152 bytes, attempted expansion amount: 2097152 bytes]
>
> Then:
> 24.941: [G1Ergonomics (Heap Sizing) heap expansion {passed | failed}, capacity before: XXXXX bytes, capacity after: XXXXX bytes]
>
> Would that clear up any confusion?

Clearer, yes.

While we're at it, may I also ask about this other output:

[from the previous young GC just after a concurrent marking ended]
29235.349: [G1Ergonomics (Mixed GCs) start mixed GCs, reason: candidate old regions available, candidate old regions: 184 regions, reclaimable: 91408328 bytes (8.51 %), threshold: 1.00 %]
[Eden: 192M(191M)->0B(198M) Survivors: 13M->6144K Heap: 520M(1024M)->321M(1024M)]
[Times: user=0.17 sys=0.00, real=0.11 secs]

2012-11-07T19:32:49.391+0100: [GC pause (mixed) 29255.887: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 89.44 ms, remaining time: 110.56 ms, target pause time: 200.00 ms]
29255.887: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 198 regions, survivors: 6 regions, predicted young region time: 78.11 ms]
29255.888: [G1Ergonomics (CSet Construction) finish adding old regions to CSet, reason: predicted time is too high, predicted time: 1.54 ms, remaining time: -5.19 ms, old: 46 regions, min: 46 regions]
29255.888: [G1Ergonomics (CSet Construction) added expensive regions to CSet, reason: old CSet region num not reached min, old: 46 regions, expensive: 4 regions, min: 46 regions, remaining time: -5.19 ms]
29255.888: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 198 regions, survivors: 6 regions, old: 46 regions, predicted pause time: 205.19 ms, target pause time: 200.00 ms]
, 0.14591800 secs]
...
29256.034: [G1Ergonomics (Mixed GCs) continue mixed GCs, reason: candidate old regions available, candidate old regions: 138 regions, reclaimable: 48771256 bytes (4.54 %), threshold: 1.00 %]
[Eden: 198M(198M)->0B(197M) Survivors: 6144K->7168K Heap: 519M(1024M)->281M(1024M)]
[Times: user=0.22 sys=0.00, real=0.15 secs]

In particular, what is the exact meaning of "expensive", why there are all these negative times, and why a predicted time of 1.54 ms is too high.
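(My own attempt at the arithmetic, for what it's worth: 89.44 ms predicted base time + 78.11 ms predicted young region time = 167.55 ms, leaving 32.45 ms of the 200 ms budget; the 46 mandatory old regions were apparently predicted at about 37.6 ms in total, which is how the remaining time ends up at -5.19 ms and the predicted pause time at 205.19 ms.)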
The logs seem to hint that there are expensive and cheap old regions, and that a decision has been made about how many of each to include in the mixed GC, but it's not clear why the last line always reports all old regions.
Also, the message "added expensive regions to CSet, old CSet region num not reached min, old: 46 regions, expensive: 4 regions, min: 46 regions" seems to hint that a minimum has not been reached, yet it says the minimum is 46, so it *has* been reached.
I understand that after the concurrent marking, 186 old regions have been detected to be candidates for collection. Since G1 uses 4 mixed GCs, that is 186/4 = 46.5 regions per collection. On this GC, 46 regions have been selected, of which 4 are "expensive" (in the next 3 mixed GCs, all 46 regions are "expensive"). Am I guessing right?

I also noticed that after a concurrent marking ends, there is always a young GC (not a mixed one), followed by the 4 mixed GCs. So the first GC after the concurrent marking is never a mixed GC, but always a young GC. Is this normal behavior?

Thanks !

Simon
--
http://cometd.org
http://webtide.com
Developer advice, training, services and support from the Jetty & CometD experts.
----
Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless.   Victoria Livschitz

From Andreas.Mueller at mgm-tp.com  Tue Nov 13 03:00:19 2012
From: Andreas.Mueller at mgm-tp.com (=?iso-8859-1?Q?Andreas_M=FCller?=)
Date: Tue, 13 Nov 2012 11:00:19 +0000
Subject: Good G1 result, was: G1 issue: falling over to Full GC
Message-ID: <46FF8393B58AD84D95E444264805D98F28F854F5@edata02.mgm-edv.de>

Hi all,

a few days ago I promised to redo the G1 test on this application using JDK 7 and share the results with you all.
I now had an opportunity on the weekend to re-run the tests with JDK7u9/32bit on Solaris SPARC.
Testing your recommendation not to fix NewSize and SurvivorRatio with G1 but only the total heap size, I did one test with the minimal configuration
-Xms1g -Xmx1g -XX:+UseG1GC
The result is truly promising and in no way comparable to JDK 6, as you can see from the attached pause time plot and quantitative analysis:
- GC pauses were never longer than 0.8s (very few above 0.5s)
- behavior was stable during the test run of more than 26 hours
I do not know for sure whether there is a default pause time target implied by these settings, but it looks like most pauses are cut off around 250 millis. Pauses for young and mixed runs (summarized under "Old Gen GC Pauses" in my tool) behave the same way.
I also did a short run setting explicitly -XX:MaxGCPauseMillis=200, and found that in this case pauses are much more sharply cut off at 200 millis, with an average of just 211 millis, while in the shown plot the average pause duration was around 270 millis with more outliers. So it looks reasonable, but not certain, that a default pause target of 250 millis was implied. Can anybody confirm this? I had in mind that the default once used to be 100 millis.
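(I suppose the definitive answer is whatever java -XX:+UseG1GC -XX:+PrintFlagsFinal -version | grep MaxGCPauseMillis prints for this particular build.)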
The price to pay for this very nice G1 behaviour in terms of pauses is also obvious from the evaluation: G1 consumes a rather large share of the available CPU power (~7%) through accumulated pauses, which is a factor of 5.5 more than ParallelGC needed in a previous test.
I wonder what this would mean for the maximum amount of garbage G1 could cope with in this application. During other tests I had seen short-term (1 min) garbage creation rates of up to 425 MB/s, which ParallelGC digested without causing any problems. So far, I doubt that G1 could swallow that, given the rather high overhead it already causes at 51 MB/s.
I also did more runs where I set (Max)NewSize and SurvivorRatio to fixed values. The results were:
- pause time targets were clearly missed, either for the better or for the worse
- G1 behaviour was more unstable and sometimes missed the pause time target for something like an hour by a factor of 5 before it returned to values below the target.
So I can confirm Simone's and Charlie's suggestion that these values should be left open for the pause time target to make sense. As fluctuations of the kind observed are highly unwelcome in G1's core use case (GC pause control), it looks very undesirable to use such settings in addition to the pause time target. G1's adjustment control seems to have trouble dealing with these conflicting targets. Why then support them both as allowed settings?

Regards
Andreas

P.S.: Sorry for the low quality JPEG, but I had to reduce it even more to make it pass under the 100 kB email limit. It now displays a size of 68 kB and should finally go through (last time it was 89 kB and did not, however).

________________________________________
Von: Charlie Hunt [chunt at salesforce.com]
Gesendet: Freitag, 2. November 2012 13:34
An: Andreas Müller
Cc: Simone Bordet; 'hotspot-gc-use at openjdk.java.net' (hotspot-gc-use at openjdk.java.net)
Betreff: Re: G1 issue: falling over to Full GC

Jumping in a bit late ...

Strongly suggest to anyone evaluating G1 to not use anything prior to 7u4. And, even better if you use (as of this writing) 7u9, or the latest production Java 7 HotSpot VM.

Fwiw, I'm really liking what I am seeing in 7u9 with the exception of one issue, (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7143858), which is currently slated to be back ported to a future Java 7, (thanks Monica, John Cuthbertson and Bengt for tackling this!).

From looking at your observations and others' comments thus far, my initial reaction is that with a 1G Java heap, you might get the best results with -XX:+UseParallelOldGC. Are you using -XX:+UseParallelGC, or -XX:+UseParallelOldGC? Or, are you not setting a GC? Not until 7u4 is -XX:+UseParallelOldGC automatically set for what's called "server class" machines when you don't specify a GC.

The lengthy concurrent mark could be the result of the implementation of G1 in 6u*, or it could be that your system is swapping. Could you check if your system is swapping? On Solaris you can monitor this using vmstat and observing, not only just free memory, but also sr == scan rate along with pi == page in and po == page out. Seeing sr (page scan activity) along with low free memory along with pi & po activity are strong suggestions of swapping. Seeing low free memory and no sr activity is ok, i.e. no swapping.

Additionally, you are right. "partial" was changed to "mixed" in the GC logs. For those interested in a bit of history .... this change was made since we felt "partial" was misleading. What partial was intended to mean was a partial old gen collection, which did occur. But, on that same GC event it also included a young gen GC. As a result, we changed the GC event name to "mixed" since that GC event was really a combination of both a young gen GC and a portion of an old gen GC.
Simone also has a good suggestion with including -XX:+PrintFlagsFinal and -showversion as part of the GC log data to collect, especially with G1 continuing to improve and evolve.

Look forward to seeing your GC logs!

hths,

charlie ....

On Nov 2, 2012, at 5:46 AM, Andreas Müller wrote:

> Hi Simone,
>
>> 4972.437: [GC pause (partial), 1.89505180 secs]
>> that I cannot decipher (to Monica - what does "partial" mean?), and no mixed GCs, which seems unusual as well.
> Oops, I understand that now: 'partial' used to be what 'mixed' is now!
> Our portal usually runs on Java 6u33. For the G1 tests I switched to 7u7 because I had learned that G1 is far from mature in 6u33.
> But automatic deployments can overwrite the start script and thus switch back to 6u33.
>
>> Are you sure you are actually using 1.7.0_u7 ?
> I have checked that in the archived start scripts and the result, unfortunately, is: no.
> The 'good case' was actually running on 7u7 (that's why it was good), but the 'bad case' was unwittingly run on 6u33 again.
> That's the true reason why the results were so much worse and so incomprehensible.
> Thank you very much for looking at the log and for asking good questions!
>
> I'll try to repeat the test and post the results on this list.
>
> Regards
> Andreas
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
A non-text attachment was scrubbed...
Name: G1MinimalConfigJDK7u9.jpg
Type: image/jpeg
Size: 70392 bytes
Desc: G1MinimalConfigJDK7u9.jpg
Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121113/0e5e66e8/G1MinimalConfigJDK7u9-0001.jpg

From monica.beckwith at oracle.com  Tue Nov 13 05:52:58 2012
From: monica.beckwith at oracle.com (Monica Beckwith)
Date: Tue, 13 Nov 2012 07:52:58 -0600
Subject: Good G1 result, was: G1 issue: falling over to Full GC
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F854F5@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F854F5@edata02.mgm-edv.de>
Message-ID: <50A250BA.3040605@oracle.com>

Hi Andreas,

Thank you for updating to a newer jdk7 for your testing with G1GC. I read your email but haven't yet looked at your data. I wanted to quickly highlight one point and then share a J1 presentation link -

1. The default value for MaxGCPauseMillis is 200ms. You can check that with -XX:+PrintFlagsFinal and grepping for "MaxGCPause":

java version "1.7.0_09"
Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) Server VM (build 23.5-b02, mixed mode)
    uintx MaxGCPauseMillis = 200 {product}

2. Charlie and I did a presentation on G1 Performance Tuning at JavaOne this year. I have attached a link to it here. I think you will find the analysis useful: https://oracleus.activeevents.com/connect/sessionDetail.ww?SESSION_ID=6583
3. If possible, can you please share with me your CPU power numbers (where G1 is 5.5x more than ParGC)?

Please feel free to ask more questions... Also, please feel free to email me your GC logs.

Regards,
Monica Beckwith

On 11/13/2012 5:00 AM, Andreas Müller wrote:
> Testing your recommendation not to fix NewSize and SurvivorRatio with G1 but only the total heap size I did one test with a minimal configuration of > -Xms1g -Xmx1g -XX:+UseG1GC > The result is truly promising and in no way comparable to JDK6 as you might see from the attached pause time plot and quantitative analysis: > - GC pauses were never longer than 0.8s (very few above 0.5s) > - behavior was stable during the test run of more than 26 hours > I do not know for sure whether there is a default pause time target implied with these settings, but it looks like most pauses are cut off around 250 millis. > Pauses for young and mixed runs (summarized under "Old Gen GC Pauses" in my tool) behave the same way. > I also did a short run setting explicitly -XX:MaxGCPauseMillis=200 to find out that in this case pauses are much more sharply cut off at 200 millis with an average of just 211 millis while in the shown plot average pause duration was around 270 millis with more outliers. So it looks reasonable but not certain that there was a default pause target of 250 millis implied. Can anybody confirm this because I had in mind the default once used to be 100 millis? > The price to pay for this very nice G1 behaviour in terms of pauses is also obvious from the evaluation: > G1 consumes a rather large share of the available CPU power (~7%) through accumulated pauses which is a factor of 5.5 more than ParallelGC used in a previous test). > I wonder what this would mean for the maximum amount of garbage G1 could cope with in this application. During other tests I had seen short term (1 min) garbage creation rates of up to 425 MB/s which ParallelGC digested without causing any problems. So far, I doubt that G1 could swallow that given the rather high overhead it causes already at 51 MB/s. > I also did more runs where I set (Max)NewSize and SurvivorRatio to fixed values. The results were: > - pause time targets were clearly missed either to the better or to the worse > - G1 behaviour was more instable and sometimes missed the pause time target for something like an hour by a factor of 5 before it returned to values below the target. > So I can confirm Simone's and Charlie's suggestion that these values should be left open for the pause time target to make sense. As fluctuations of the kind observed are very undesirable in the application area of G1 (GC pause control) it looks very undesirable to use such settings in addition to the pause time target. G1 adjustment control seems to have trouble dealing with these conflicting targets. Why then support them both as allowed settings? > > Regards > Andreas > > P.S.: Sorry for the low quality JPEG but I had to reduce it even more to make it pass under the 100 kB email limit. It now displays a size of 68 kB and should finally go through (Last time it was 89 kB and did not, however) > ________________________________________ > Von: Charlie Hunt [chunt at salesforce.com] > Gesendet: Freitag, 2. November 2012 13:34 > An: Andreas M?ller > Cc: Simone Bordet; 'hotspot-gc-use at openjdk.java.net' (hotspot-gc-use at openjdk.java.net) > Betreff: Re: G1 issue: falling over to Full GC > > Jumping in a bit late ... > > Strongly suggest to anyone evaluating G1 to not use anything prior to 7u4. And, even better if you use (as of this writing) 7u9, or the latest production Java 7 HotSpot VM. 
> > Fwiw, I'm really liking what I am seeing in 7u9 with the exception on one issue, (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7143858), which is currently slated to be back ported to a future Java 7, (thanks Monica, John Cuthbertson and Bengt tackling this!). > > > From looking at your observations and others comments thus far, my initial reaction is that with a 1G Java heap, you might get the best results with -XX:+UseParallelOldGC. Are you using -XX:+UseParallelGC, or -XX:+UseParallelOldGC? Or, are you not setting a GC? Not until 7u4 is -XX:+UseParallelOldGC automatically set for what's called "server class" machines when you don't specify a GC. > > The lengthy concurrent mark could be the result of the implementation of G1 in 6u*, or it could be that your system is swapping. Could you check if your system is swapping? On Solaris you can monitor this using vmstat and observing, not only just free memory, but also sr == scan rate along with pi == page in and po == page out. Seeing sr (page scan activity) along with low free memory along with pi& po activity are strong suggestions of swapping. Seeing low free memory and no sr activity is ok, i.e. no swapping. > > Additionally, you are right. "partial" was changed to "mixed" in the GC logs. For those interested in a bit of history .... this change was made since we felt "partial" was misleading. What partial was intended to mean was a partial old gen collection, which did occur. But, on that same GC event it also included a young gen GC. As a result, we changed the GC event name to "mixed" since that GC event was really a combination of both a young gen GC and portion of old gen GC. > > Simone also has a good suggestion with including -XX:+PrintFlagsFinal and -showversion as part of the GC log data to collect, especially with G1 continuing to be improve and evolve. > > Look forward to seeing your GC logs! > > hths, > > charlie .... > > On Nov 2, 2012, at 5:46 AM, Andreas M?ller wrote: > >> Hi Simone, >> >>> 4972.437: [GC pause (partial), 1.89505180 secs] >>> that I cannot decypher (to Monica - what "partial" means ?), and no mixed GCs, which seems unusual as well. >> Oops, I understand that now: 'partial' used to be what 'mixed' is now! >> Our portal usually runs on Java 6u33. For the G1 tests I switched to 7u7 because I had learned that G1 is far from mature in 6u33. >> But automatic deployments can overwrite the start script and thus switch back to 6u33. >> >>> Are you sure you are actually using 1.7.0_u7 ? >> I have checked that in the archived start scripts and the result, unfortunetaley, is: no. >> The 'good case' was actually running on 7u7 (that's why it was good), but the 'bad case' was unwittingly run on 6u33 again. >> That's the true reason why the results were so much worse and so incomprehensible. >> Thank you very much for looking at the log and for asking good questions! >> >> I'll try to repeat the test and post the results on this list. 
--
Oracle
Monica Beckwith | Java Performance Engineer
VOIP: +1 512 401 1274 Texas
Green Oracle: Oracle is committed to developing practices and products that help protect the environment

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121113/77168527/attachment.html
-------------- next part --------------
A non-text attachment was scrubbed...
Name: oracle_sig_logo.gif
Type: image/gif
Size: 658 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121113/77168527/oracle_sig_logo.gif
-------------- next part --------------
A non-text attachment was scrubbed...
Name: green-for-email-sig_0.gif
Type: image/gif
Size: 356 bytes
Desc: not available
Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121113/77168527/green-for-email-sig_0.gif

From Andreas.Mueller at mgm-tp.com  Tue Nov 13 06:36:39 2012
From: Andreas.Mueller at mgm-tp.com (=?iso-8859-1?Q?Andreas_M=FCller?=)
Date: Tue, 13 Nov 2012 14:36:39 +0000
Subject: AW: Good G1 result, was: G1 issue: falling over to Full GC
In-Reply-To: <50A250BA.3040605@oracle.com>
References: <46FF8393B58AD84D95E444264805D98F28F854F5@edata02.mgm-edv.de>, <50A250BA.3040605@oracle.com>
Message-ID: <46FF8393B58AD84D95E444264805D98F28F859DF@edata02.mgm-edv.de>

Hi Monica,

thank you for replying and the link to the presentation.
In the gauge given in that presentation, throughput was 98.7% with ParallelGC and about 92.9% with G1. That seems very much in line with the targets for both collectors given in that same presentation (99% vs 90%). I usually rather consider the GC overhead, which is 100% - 98.7% = 1.3% for ParallelGC and 100% - 92.9% = 7.1% for G1, and thus end up with my factor of 5.5 = 7.1/1.3.
This was on a 4-CPU machine with SPARC64 V CPUs at 1100 MHz in a load test environment. Production runs on 8x SPARC64 V at 1800 MHz and will be migrated to a T3 at 1650 MHz in the next months.
I would be happy to mail you the gc logs if of interest, but they are rather straightforward in this case, as everything worked smoothly and targets were met. I would happily send you the gc logs for the less stable case (where I fixed NewSize and SurvivorRatio to the values optimized for ParallelGC) for analysis. But as my tool cannot parse the very verbose output of -XX:+PrintGCDetails for G1 yet, these logs are short and will probably not give you enough details to understand where the pause time fluctuations came from. But anyway, if you are interested, I will send them both to you, but not to the mailing list (they are bulky, >>100kB).

Best regards
Andreas

________________________________
Von: Monica Beckwith [monica.beckwith at oracle.com]
Gesendet: Dienstag, 13. November 2012 14:52
An: Andreas Müller
Cc: 'hotspot-gc-use at openjdk.java.net' (hotspot-gc-use at openjdk.java.net); Peter.Kessler at oracle.com
Betreff: Re: Good G1 result, was: G1 issue: falling over to Full GC

Hi Andreas,

Thank you for updating to a newer jdk7 for your testing with G1GC. I read your email but haven't yet looked at your data.
I wanted to quickly highlight one point and then share a J1 presentation link -

1. The default value for MaxGCPauseMillis is 200ms. [...]
That's the true reason why the results were so much worse and so incomprehensible.

Thank you very much for looking at the log and for asking good questions! I'll try to repeat the test and post the results on this list.

Regards
Andreas

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

--
Monica Beckwith | Java Performance Engineer
VOIP: +1 512 401 1274 Texas

From chunt at salesforce.com Tue Nov 13 07:47:18 2012
From: chunt at salesforce.com (Charlie Hunt)
Date: Tue, 13 Nov 2012 07:47:18 -0800
Subject: Good G1 result, was: G1 issue: falling over to Full GC
In-Reply-To: <50A250BA.3040605@oracle.com>
References: <46FF8393B58AD84D95E444264805D98F28F854F5@edata02.mgm-edv.de> <50A250BA.3040605@oracle.com>
Message-ID:

To add to Monica's comments, and short of seeing the GC logs (hint, hint) ;-)

You mentioned: "G1 behaviour was more unstable and sometimes missed the pause time target for something like an hour by a factor of 5 before it returned to values below the target."

Does this application go through some rather dramatic phase shifts which result in wide swings in allocation rates and/or object lifetimes? Perhaps I should just ask whether the application goes through some pretty significant behavioral changes. IIRC, there may have been discussions on the hotspot-gc-dev mailing list about integrating some enhancements to the G1 pause prediction policy (or I'm being a moron and confusing that with something else). The G1 pause prediction model may be missing something --- GC logs would help isolate that.

The slides that Monica mentioned have a couple of tuning command line options, in particular -XX:G1MixedGCCountTarget= and -XX:G1OldCSetRegionLiveThresholdPercent=. These two will help if it is (mostly) mixed GCs whose pause times are near that factor of 5. That may not be the case, since I wouldn't expect to see continuous mixed GCs for an hour: I'd expect to see some sequence of young GCs, followed by a marking cycle (perhaps one young GC), then some mixed GCs, and then a repeat of that general sequence.

hths,

charlie ...

On Nov 13, 2012, at 5:52 AM, Monica Beckwith wrote:

> Hi Andreas,
>
> [snipped -- full message quoted above]
From danchidanchi at gmail.com Tue Nov 13 14:20:06 2012
From: danchidanchi at gmail.com (Sekiya Nobuhiko)
Date: Wed, 14 Nov 2012 07:20:06 +0900
Subject: G1GC success examples (compared to CMS) ?
Message-ID:

Hi,

Has anybody found a successful test case for G1GC compared to CMS? I searched this mailing list but couldn't find anybody commenting on such a case.

If you know any good cases, I would like to know about that test case. If you know that there are really no good examples at present, I would appreciate that information too.

I'm testing G1 with 7u7, but really could not get good results.

Best Regards,
Nobuhiko

From chunt at salesforce.com Wed Nov 14 07:01:52 2012
From: chunt at salesforce.com (Charlie Hunt)
Date: Wed, 14 Nov 2012 07:01:52 -0800
Subject: Good G1 result, was: G1 issue: falling over to Full GC
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F859DF@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F854F5@edata02.mgm-edv.de>, <50A250BA.3040605@oracle.com> <46FF8393B58AD84D95E444264805D98F28F859DF@edata02.mgm-edv.de>
Message-ID: <88FA0B4B-B26D-4D54-8383-675F5D9860B8@salesforce.com>

For those who may be interested .... a little trick you can use when looking at G1 pause time stuff in GChisto: if you parse the GC logs and grab the timestamp or datestamp, the type of stop-the-world pause and the duration of each GC event, you can load these into GChisto[1] as a "Simple GC Log" (in the lower left hand corner of GChisto there is a dropbox where you can set that option). The parsed output looks something like this for G1:

YoungGC 10243.416 0.14423000
YoungGC 10247.089 0.15201100
InitialMarkGC 10249.273 0.14927300
RemarkGC 10250.839 0.1071420
CleanupGC 10250.948 0.0116570
YoungGC 10253.459 0.17408200
MixedGC 10255.392 0.18000200

*Note, the type of (stop-the-world) GC comes first, then the timestamp and then the duration. For G1, the possible stop-the-world GC pauses are: young, initial-mark, remark, cleanup, mixed GC, to-space overflow and Full. The actual text in the log may not exactly match what I've listed here, though. The above can be loaded directly into GChisto as a "Simple GC Log". If you are well versed in awk or perl, you can probably hack something together pretty quickly. ;-)

charlie ...

[1] - GChisto - http://java.net/projects/gchisto --- you'll need to grab the source and build it.
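As a rough illustration of the quick hack Charlie describes, here is a minimal converter in Java rather than awk/perl. The regular expression is only a guess against 7u9 pause lines of the form "10255.392: [GC pause (mixed), 0.18000200 secs]" and classifies only young vs. mixed pauses; real logs (especially with -XX:+PrintGCDetails) will need more patterns:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Sketch: turn G1 pause lines such as
    //   "10255.392: [GC pause (mixed), 0.18000200 secs]"
    // into GChisto "Simple GC Log" lines such as
    //   "MixedGC 10255.392 0.18000200".
    public class ToSimpleGCLog {
        // Guessed against 7u9 output with -XX:+PrintGCTimeStamps; adjust as needed.
        private static final Pattern PAUSE = Pattern.compile(
                "^(\\d+\\.\\d+): \\[GC pause \\((young|mixed)\\).*?(\\d+\\.\\d+) secs\\]");

        public static void main(String[] args) throws IOException {
            // args[0] is the GC log file to convert.
            BufferedReader in = new BufferedReader(new FileReader(args[0]));
            String line;
            while ((line = in.readLine()) != null) {
                Matcher m = PAUSE.matcher(line);
                if (m.find()) {
                    String type = "young".equals(m.group(2)) ? "YoungGC" : "MixedGC";
                    System.out.println(type + " " + m.group(1) + " " + m.group(3));
                }
            }
            in.close();
        }
    }

Run it as "java ToSimpleGCLog gc.log > simple.log" and load simple.log into GChisto via the "Simple GC Log" option.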
On Nov 13, 2012, at 6:36 AM, Andreas Müller wrote:

Hi Monica,

thank you for replying and for the link to the presentation. In the gauge given in that presentation, throughput was 98.7% with ParallelGC and about 92.9% with G1. That seems very much in line with the targets for both collectors given in that same presentation (99% vs 90%). I usually consider the GC overhead instead, which is 100%-98.7%=1.3% for ParallelGC and 100%-92.9%=7.1% for G1, and thus end up with my factor of 5.5=7.1/1.3.

This was on a 4-CPU machine with SPARC64 V CPUs at 1100 MHz in a load test environment. Production runs on 8x SPARC64 V at 1800 MHz and will be migrated to a T3 at 1650 MHz in the next months.

I would be happy to mail you the GC logs if they are of interest, but they are rather straightforward in this case, as everything worked smoothly and targets were met. I would happily send you the GC logs for the less stable case (where I fixed NewSize and SurvivorRatio to the values optimized for ParallelGC) for analysis. But as my tool cannot yet parse the very verbose output of -XX:+PrintGCDetails for G1, these logs are short and will probably not give you enough details to understand where the pause time fluctuations came from. But anyway, if you are interested, I will send them both to you, though not to the mailing list (they are bulky, >>100kB).

Best regards
Andreas

________________________________
From: Monica Beckwith [monica.beckwith at oracle.com]
Sent: Tuesday, 13 November 2012 14:52
To: Andreas Müller
Cc: 'hotspot-gc-use at openjdk.java.net' (hotspot-gc-use at openjdk.java.net); Peter.Kessler at oracle.com
Subject: Re: Good G1 result, was: G1 issue: falling over to Full GC

Hi Andreas,

[snipped -- full message quoted above]
From chunt at salesforce.com Wed Nov 14 07:21:14 2012
From: chunt at salesforce.com (Charlie Hunt)
Date: Wed, 14 Nov 2012 07:21:14 -0800
Subject: G1GC success examples (compared to CMS) ?
In-Reply-To:
References:
Message-ID: <0E105ABE-2969-4BC3-8926-8FD27512FF97@salesforce.com>

Hi Nobuhiko,

I had done some comparisons with CMS (and with Parallel[Old]GC). What I did was induce the undesirable behaviors of heap fragmentation and object allocation spikes into an application running with CMS via -javaagent.

* Fyi, -javaagent essentially allows you to run another Java program in parallel with your Java app.
(Thanks Vladimir Ivanov, cc'd, for suggesting the idea to use -javaagent.) I also did the same experiment against G1.

What those experiments illustrated was that G1, in the presence of either heap fragmentation or wide swings in object allocation rates, offered much more predictable behavior than CMS, in particular avoiding (the painful) full GCs.

To date, the focus with G1 has been more on larger Java heaps than on smaller Java heaps with pause times lower than, say, 100ms - 150ms. The folks at Oracle may be able to offer additional comments.

If I were to summarize where I see G1 wrt CMS today: if you are experiencing full GCs with CMS, due to fragmentation or to "losing the race" [1], or you have a desire to run your Java application with a smaller Java heap, then G1 will likely help. If you are able to run with CMS today, experience no full GCs, and are happy with the amount of memory your Java heap is using, then you will probably (at this point in time) have better results with CMS. I would additionally add that since G1 is expected to one day be the replacement for CMS (which won't happen until it's pretty well shown that G1 can do better than CMS on those apps running CMS today), if you're using CMS today you probably should periodically take a look at G1 and offer feedback to this mailing list on your observations. And, most importantly, if you can, offer both the CMS and G1 logs (please include -XX:+PrintGC[Date|Time]Stamps and -XX:+PrintGCDetails) along with the JDK/JRE version (i.e. java -version) and the set of command line options you are using for both G1 and CMS.

If you happen to have those GC logs today, I'd be happy to take a look.

hths,

charlie ...

[1] "losing the race" -- a situation where objects are promoted to the old generation faster than the CMS collector can collect them, which leads to a compaction of the old generation via a full GC ... in other words, the old generation concurrent collector lost the race with the promotion rate and the JVM needed to do a full GC as a corrective action.

On Nov 13, 2012, at 2:20 PM, Sekiya Nobuhiko wrote:

> Hi,
>
> [snipped -- full message quoted above]
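To make the -javaagent technique concrete, below is a minimal sketch of what such an allocation-spike agent could look like. This is not the agent actually used in those experiments (that code was never posted to the list); the class name, array sizes and sleep intervals are all invented for illustration:

    import java.lang.instrument.Instrumentation;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    // Hypothetical sketch of an allocation-spike agent. Package it in a jar
    // whose manifest contains "Premain-Class: AllocSpikeAgent" and start the
    // application under test with -javaagent:allocspike.jar.
    public class AllocSpikeAgent {
        public static void premain(String agentArgs, Instrumentation inst) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    Random rnd = new Random();
                    List<byte[]> retained = new ArrayList<byte[]>();
                    while (true) {
                        // Allocation burst of oddly sized arrays; retain ~10%
                        // so some of them get promoted to the old generation.
                        for (int i = 0; i < 1000; i++) {
                            byte[] b = new byte[1024 + rnd.nextInt(64 * 1024)];
                            if (rnd.nextInt(10) == 0) {
                                retained.add(b);
                            }
                        }
                        // Release a chunk of the retained arrays now and then,
                        // leaving holes among the promoted objects.
                        if (retained.size() > 20000) {
                            retained.subList(0, 10000).clear();
                        }
                        try {
                            Thread.sleep(rnd.nextInt(5000)); // irregular spikes
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                }
            }, "alloc-spike");
            t.setDaemon(true);
            t.start();
        }
    }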
From Andreas.Mueller at mgm-tp.com Thu Nov 15 01:46:46 2012
From: Andreas.Mueller at mgm-tp.com (Andreas Müller)
Date: Thu, 15 Nov 2012 09:46:46 +0000
Subject: G1GC success examples (compared to CMS) ?
Message-ID: <46FF8393B58AD84D95E444264805D98F28F85C1C@edata02.mgm-edv.de>

Hi Nobuhiko,

>Has anybody found successful test case in G1GC compared to CMS?

Just two or three days ago I reported an example of a portal application on this list; the subject was "Good G1 result". I have also extensively tested the same portal using ParallelGC and CMS. Let me just summarize the pros and cons of these 3 algorithms:

1. ParallelGC

Pros:
- very good throughput (98.7%)
- young gen pauses are very short (<100ms)
- it is possible (with my application!) to make Full GC pauses very infrequent by proper generation sizing
- long term stability (days during load test ~ weeks in production) is very good

Cons:
- Full GC pauses are long (5s on hardware with low single-thread performance) and will get worse when we are forced to grow the heap beyond the current 1 GB
- to make generation sizing work you have to switch off adaptive sizing (-XX:-UseAdaptiveSizePolicy; this is a small con compared to ParNewGC, which is in other respects just as good as ParallelGC)

2. CMS

Pros:
- throughput is almost as good as with ParallelGC
- young gen pauses are more or less as short as with ParallelGC if you apply the same generation sizing (this is actually a feature of ParNewGC, which runs by default with CMS)
- CMS runs are as infrequent as Full GC runs with ParallelGC if you apply the same generation sizing
- CMS pauses (initial mark and remark) were always below 1s (100-800 millis)

Cons:
- I ran into very bad heap fragmentation after 1 day of full load with only 1 GB of heap and untuned generation sizes (the promotion rate was too high). I solved that, but do not know on what timescale such problems could reappear in production (a risk remains!)
- I had to increase the heap size from 1 GB to 1.5 GB to solve the fragmentation issue

3. G1

Pros:
- the longest pauses were no longer than the CMS pauses
- long term stability was as good as with CMS once I did the configuration properly, as suggested on this list (do not set new gen and survivor sizes!)
- lowest footprint: the same 1 GB heap size as with ParallelGC was by far enough, and even less should work well
- stability seems much better now than it was in the early days of CMS (Java 1.4.2/1.5 some years ago, with lots of very bad memory violation failures!)

Cons:
- average pause times (200-500ms depending on your target) are much longer than with either ParallelGC or CMS, because there is basically only one breed of pauses; if you reduce the duration of these pauses you increase the GC overhead (see below)
- throughput was much lower (93%)
- the much higher GC overhead means that G1 does not work well when the garbage rate increases far beyond the 50-60 MB/s I had in this test (while ParallelGC/CMS cleaned up to 420 MB/s in peak situations of other tests; with other applications on other hardware I have even seen ParallelGC/CMS cleaning 1 GB/s in a stable and efficient way)
- G1 is not usable on JDK 6 and therefore on older appservers like OC4J

Conclusion:
1.) with heap sizes up to 1-2 GB I stick with ParallelGC, because it is the most robust and efficient algorithm, and rare (once every hour or few hours) Full GC pauses are acceptable.
2.) with larger productive heaps I currently look at CMS and manage the fragmentation issue by proper tuning, intensive testing and close production monitoring
3.) I keep an eye on G1 because it seems to live up to its promises and may be able to replace CMS in the next years.

Best regards
Andreas

From Andreas.Loew at oracle.com Thu Nov 15 04:09:54 2012
From: Andreas.Loew at oracle.com (Andreas Loew)
Date: Thu, 15 Nov 2012 13:09:54 +0100
Subject: Help against CMS old gen fragmentation (was: G1GC success examples (compared to CMS) ?)
In-Reply-To: <46FF8393B58AD84D95E444264805D98F28F85C1C@edata02.mgm-edv.de>
References: <46FF8393B58AD84D95E444264805D98F28F85C1C@edata02.mgm-edv.de>
Message-ID: <50A4DB92.6030903@oracle.com>

Hi Andreas,

On 15.11.2012 10:46, Andreas Müller wrote:
> 2. CMS
> Cons:
> - I ran into very bad heap fragmentation after 1 day of full load with
> only 1 GB of heap and untuned gen sizes (promotion rate was too high).
> I solved that but do not know at what timescale such problems could
> reappear in production (risk remains!)

as you noticed, there are multiple things you can do in case you run into CMS old gen fragmentation issues:

* firstly, monitor the old gen promotion rate of your application by using -XX:PrintFLSStatistics=n with n=1 or n=2 to collect some data
* if you notice object allocation requests into the old gen that are particularly large, try to find out which ones, and possibly look at whether the application can be redesigned to use smaller objects
* increase both the young gen and the survivor space size in order to garbage collect more of the short- and medium-lived objects during the tenuring process while still in the young gen (i.e. reduce the overall promotion rate and object size from young into old)
* have the CMS GC kick in earlier (i.e. lower the initiating occupancy fraction and set -XX:+UseCMSInitiatingOccupancyOnly)
* or increase the old gen heap - if there is more space available, the remaining fragments of free space will automatically become larger.

> - I had to increase heap size from 1GB to 1.5 GB to solve the
> fragmentation issue

Maybe applying some more of the above options could even allow you to get away with a less drastic increase in heap size (50% is a lot)...

Hope this helps & best regards,
Andreas

--
Andreas Loew | Senior Java Architect
Oracle Advanced Customer Services
ORACLE Germany

From danchidanchi at gmail.com Thu Nov 15 18:14:51 2012
From: danchidanchi at gmail.com (Sekiya Nobuhiko)
Date: Fri, 16 Nov 2012 11:14:51 +0900
Subject: G1GC success examples (compared to CMS) ?
In-Reply-To: <0E105ABE-2969-4BC3-8926-8FD27512FF97@salesforce.com>
References: <0E105ABE-2969-4BC3-8926-8FD27512FF97@salesforce.com>
Message-ID:

(resending to mailing list)

Hi Charlie,

Thanks for your response. (I've read your JavaOne presentations and your Java Performance book. I liked it very much.)

I guess fragmentation is the key word here. I have done a test that aimed at producing fragmentation, but after running for some time with CMS, heap usage did not seem to increase and there were no Full GCs. Maybe my way of allocating objects was not good (or maybe I did not run it long enough).

What I did: I ran a web application with jmeter threads each doing 3 requests. The first request adds byte[]s of 60 kbytes * 10 (so 600 kbytes) to the HttpSession (so they last until session timeout), the second request adds byte[]s of 80 kbytes * 10, the third 100 kbytes * 10. I did that for about 3000 user sessions (= 3000 jmeter threads), eventually increasing the number of threads. There is an interval of about 3 seconds between each of the three requests. So, when a session reaches its timeout, I hoped to get free fragments of 600 kbytes, 800 kbytes and 1000 kbytes.

In this case, compared to G1, CMS was slightly better in response time (compared by the 90% line): CMS 46 msec to G1 56 msec. This is jmeter response time, not GC time. No Full GC happened with either. This was on Windows with -Xmx5g -Xms5g, JDK 7u7.
CMS: -Xmx5g -Xms5g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:NewRatio=4
G1: -Xmx5g -Xms5g -XX:+UseG1GC -XX:MaxGCPauseMillis=50

Maybe I should first test with a simpler application to create fragmentation. Any suggestions on that? Also, I would love to know a good way to monitor fragmentation; I'm not sure if I'm really creating fragmentation with my application.

Regards,
Nobuhiko

2012/11/15 Charlie Hunt

> Hi Nobuhiko,
>
> [snipped -- full message quoted above]
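On the "simpler application" question, a standalone driver along the following lines might do it. This is an untested sketch with invented sizes, counts and sleeps, meant only to promote mixed-size arrays into the old gen and then punch holes between the survivors:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Random;

    // Untested sketch of a minimal fragmentation driver. It retains arrays
    // of mixed sizes long enough for them to be promoted, then frees every
    // other one so the old gen free space ends up interleaved with live
    // objects. Tune all numbers to your heap.
    public class FragmentationDriver {
        public static void main(String[] args) throws InterruptedException {
            Random rnd = new Random();
            List<byte[]> retained = new ArrayList<byte[]>();
            while (true) {
                // Fill: a batch of 100KB-1MB chunks, kept live across young GCs.
                for (int i = 0; i < 1000; i++) {
                    retained.add(new byte[100 * 1024 + rnd.nextInt(900 * 1024)]);
                }
                // Punch holes: drop every other chunk.
                for (int i = retained.size() - 1; i >= 0; i -= 2) {
                    retained.remove(i);
                }
                // Cap total live data so the heap doesn't simply fill up.
                while (retained.size() > 3000) {
                    retained.remove(0);
                }
                Thread.sleep(1000);
            }
        }
    }

Watching -XX:PrintFLSStatistics=1 output (max chunk size vs. total free space) while this runs is one way to see whether the holes actually show up.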
From alexey.ragozin at gmail.com Fri Nov 16 02:51:27 2012
From: alexey.ragozin at gmail.com (Alexey Ragozin)
Date: Fri, 16 Nov 2012 14:51:27 +0400
Subject: Help against CMS old gen fragmentation (was: G1GC success examples (compared to CMS) ?)
Message-ID:

Hi Andreas,

Unfortunately -XX:PrintFLSStatistics= produces so much noise in the logs that it is impractical for a production environment. Exposing the largest free heap chunk size via JMX would be much more useful for monitoring purposes. The lack of any way to monitor fragmentation in production is very unfortunate from an operational perspective (especially for 24x7 applications). Fragmentation builds up slowly, and if it were possible to detect it (size of the largest chunk below a threshold), ops could restart the JVM gracefully before an imminent full GC.

To the other Andreas: besides increasing heap size, tweaking PLABs can reduce fragmentation in certain cases. Here ( http://blog.ragozin.info/2011/10/cms-heap-fragmentation-follow-up-1.html ) you can find some results from my experiments.

Regards,
Alexey

> Hi Andreas,
>
> [snipped -- full message quoted above]

From chkwok at digibites.nl Fri Nov 16 03:31:04 2012
From: chkwok at digibites.nl (Chi Ho Kwok)
Date: Fri, 16 Nov 2012 12:31:04 +0100
Subject: Help against CMS old gen fragmentation (was: G1GC success examples (compared to CMS) ?)
In-Reply-To:
References:
Message-ID:

On Fri, Nov 16, 2012 at 11:51 AM, Alexey Ragozin wrote:
> Besides increasing heap size, tweaking PLABs can reduce fragmentation
> in certain cases.
> Here ( http://blog.ragozin.info/2011/10/cms-heap-fragmentation-follow-up-1.html )
> you can find some results from my experiments.

Whoa, I didn't know there was a big fragmentation issue in 1.6.0_17-24... And guess what, I just checked and found some 1.6.0_17s in production; time to schedule an upgrade. Thanks for the great blog post!

Regards,

From john.cuthbertson at oracle.com Tue Nov 20 17:57:50 2012
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Tue, 20 Nov 2012 17:57:50 -0800
Subject: G1 log output interpretation
In-Reply-To: References: <509A9962.5020600@oracle.com>
Message-ID: <50AC351E.80707@oracle.com>

Hi Simone,

Apologies. For some reason this ended up in my junk email folder (and no, it wasn't deliberate) :-)

I've submitted JDK-8003731 - "G1: Improve PrintAdaptiveSizePolicy (ErgoVerbose) output" - to track your issues with the ErgoVerbose output. I'll try to answer your questions inline....

On 11/07/12 14:34, Simone Bordet wrote:

>> 24.941: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: recent
>> GC overhead higher than threshold after GC, recent GC overhead: 11.23 %,
>> threshold: 10.00 %]
>> 24.941: [G1Ergonomics (Heap Sizing) uncommitted: 2097152 bytes, min
>> expansion amount: 2097152 bytes, max expansion amount: 2097152 bytes]
>> 24.941: [G1Ergonomics (Heap Sizing) calculated expansion amount: 419430
>> bytes (20.00 %), actual expansion amount: 2097152 bytes]
>
> I'd still be confused by this line, where the calculated expansion
> amount is different from the actual expansion amount...
> My problem was in the interpretation of "uncommitted", which I did not
> fully understand (my fault).

When G1 expands the heap it needs to round the calculated (or desired) expansion amount (based upon the percentage flag) up to a multiple of the heap region size. In the example above the calculated amount was much less than the region size (2 MB). Unfortunately we can't expand the heap by a fraction of a region.

>> Then:
>> 24.941: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion
>> amount: 2097152 bytes, attempted expansion amount: 2097152 bytes]
>>
>> Then:
>> 24.941: [G1Ergonomics (Heap Sizing) heap expansion {passed | failed},
>> capacity before: XXXXX bytes, capacity after: XXXXX bytes]
>>
>> Would that clear up any confusion?
>
> Clearer, yes.
> While we're at it, may I also ask about this other output:
>
> [from the previous young GC just after a concurrent marking ended]
> 29235.349: [G1Ergonomics (Mixed GCs) start mixed GCs, reason:
> candidate old regions available, candidate old regions: 184 regions,
> reclaimable: 91408328 bytes (8.51 %), threshold: 1.00 %]
> [Eden: 192M(191M)->0B(198M) Survivors: 13M->6144K Heap:
> 520M(1024M)->321M(1024M)]
> [Times: user=0.17 sys=0.00, real=0.11 secs]
>
> 2012-11-07T19:32:49.391+0100: [GC pause (mixed) 29255.887:
> [G1Ergonomics (CSet Construction) start choosing CSet, predicted base
> time: 89.44 ms, remaining time: 110.56 ms, target pause time: 200.00 ms]
> 29255.887: [G1Ergonomics (CSet Construction) add young regions to
> CSet, eden: 198 regions, survivors: 6 regions, predicted young region
> time: 78.11 ms]
> 29255.888: [G1Ergonomics (CSet Construction) finish adding old
> regions to CSet, reason: predicted time is too high, predicted time:
> 1.54 ms, remaining time: -5.19 ms, old: 46 regions, min: 46 regions]
> 29255.888: [G1Ergonomics (CSet Construction) added expensive regions
> to CSet, reason: old CSet region num not reached min, old: 46 regions,
> expensive: 4 regions, min: 46 regions, remaining time: -5.19 ms]
> 29255.888: [G1Ergonomics (CSet Construction) finish choosing CSet,
> eden: 198 regions, survivors: 6 regions, old: 46 regions, predicted
> pause time: 205.19 ms, target pause time: 200.00 ms]
> , 0.14591800 secs]
> ...
> 29256.034: [G1Ergonomics (Mixed GCs) continue mixed GCs, reason:
> candidate old regions available, candidate old regions: 138 regions,
> reclaimable: 48771256 bytes (4.54 %), threshold: 1.00 %]
> [Eden: 198M(198M)->0B(197M) Survivors: 6144K->7168K Heap:
> 519M(1024M)->281M(1024M)]
> [Times: user=0.22 sys=0.00, real=0.15 secs]
>
> In particular, what is the exact meaning of "expensive", why are there
> all these negative times, and why is a predicted time of 1.54 ms too high?

We predict how long it will take to collect an old region based upon the amount of live data that has to be evacuated, the size of the region's remembered set, and a couple of other terms that we track. This yields the predicted time for a region. When we're determining which old regions to add to the collection set, we keep track of how much time we have left (against the pause time goal) after subtracting the sum of the predicted collection times for the regions already in the collection set. A region that is "too expensive" is one whose predicted time doesn't fit in the time remaining. At the end of marking we sort the candidate old regions by the ratio of live data vs predicted collection time (i.e. a measure of GC efficiency), and we try to collect the old regions with the most "bang for the buck" early in the mixed GC phase.

As you have seen in your logs, the prediction code gets it wrong: I think we're over-predicting young regions and under-predicting old regions. I'm hoping to fix that.

> I also noticed that after a concurrent marking ends, there is always a
> young GC (not a mixed one), followed by the 4 mixed GCs.
> So the first GC after the concurrent marking is never a mixed GC, but
> always a young GC. Is this normal behavior?

Yes it is. The first GC after marking completed had its eden sized before marking completed, and so it is sized based upon expecting to collect only young regions. Adding some old regions to that collection set could exceed the goal.

Hope this helps.
JohnC

From sbordet at intalio.com Wed Nov 21 05:47:50 2012
From: sbordet at intalio.com (Simone Bordet)
Date: Wed, 21 Nov 2012 14:47:50 +0100
Subject: G1 log output interpretation
In-Reply-To: <50AC351E.80707@oracle.com>
References: <509A9962.5020600@oracle.com> <50AC351E.80707@oracle.com>
Message-ID:

Hi,

On Wed, Nov 21, 2012 at 2:57 AM, John Cuthbertson wrote:
> Hi Simone,
>
> Apologies. For some reason this ended up in my junk email folder (and no, it
> wasn't deliberate) :-) .
>
> I've submitted: JDK-8003731 - G1: Improve PrintAdaptiveSizePolicy
> (ErgoVerbose) output.
>
> to track your issues with the ErgoVerbose output.

Thanks for this (although I think the bug is not visible from outside) and for your explanations.

Simon

--
http://cometd.org
http://webtide.com
Developer advice, training, services and support from the Jetty & CometD experts.
----
Finally, no matter how good the architecture and design are, to deliver bug-free software with optimal performance and reliability, the implementation technique must be flawless. Victoria Livschitz

From brian.williams at mayalane.com Mon Nov 26 11:17:02 2012
From: brian.williams at mayalane.com (Brian Williams)
Date: Mon, 26 Nov 2012 14:17:02 -0500
Subject: Discrepancy in ParNew pause times between 1.6 and 1.7 CMS
Message-ID: <969A3845-84DB-46AE-98B2-DE76FC0CD59B@mayalane.com>

We're in the process of doing some performance testing with 1.7, and we're seeing much longer ParNew times in 1.7 than we saw in 1.6. We could use some help isolating the cause.

Running an identical performance test of an hour's length, with 1.6u37 we're seeing only 0.009% of pauses over 100ms and just 0.09% over 10ms, but with 1.7u9 we're seeing 64% of pauses over 100ms.

Here is a typical comparison of a ParNew between the two. The test was run with -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintCommandLineFlags -XX:PrintFLSStatistics=1 -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime.

1.6u37:

2012-11-25T16:51:13.145-0600: 1498.096: [GC Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 2132560492
Max Chunk Size: 2132560492
Number of Blocks: 1
Av. Block Size: 2132560492
Tree Height: 1
Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 512
Max Chunk Size: 512
Number of Blocks: 1
Av. Block Size: 512
Tree Height: 1
1498.096: [ParNew
Desired survivor size 107347968 bytes, new threshold 4 (max 4)
- age 1: 614064 bytes, 614064 total
- age 2: 39344 bytes, 653408 total
- age 3: 322432 bytes, 975840 total
- age 4: 1208 bytes, 977048 total
: 1679253K->1224K(1887488K), 0.0015070 secs] 1792056K->114027K(52219136K)After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 2132560412
Max Chunk Size: 2132560412
Number of Blocks: 1
Av. Block Size: 2132560412
Tree Height: 1
After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 512
Max Chunk Size: 512
Number of Blocks: 1
Av. Block Size: 512
Tree Height: 1
, 0.0016110 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]
Total time for which application threads were stopped: 0.0022330 seconds

1.7u9:

2012-11-25T19:10:40.869-0600: 1498.458: [GC Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 2136954070
Max Chunk Size: 2136954070
Number of Blocks: 1
Av. Block Size: 2136954070
Tree Height: 1
Before GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
1498.458: [ParNew
Desired survivor size 107347968 bytes, new threshold 6 (max 6)
- age 1: 1122960 bytes, 1122960 total
- age 2: 1560 bytes, 1124520 total
- age 3: 2232 bytes, 1126752 total
- age 4: 324760 bytes, 1451512 total
- age 5: 2232 bytes, 1453744 total
- age 6: 728 bytes, 1454472 total
: 1680121K->2393K(1887488K), 0.1132030 secs] 1759235K->81509K(52219136K)After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 2136954070
Max Chunk Size: 2136954070
Number of Blocks: 1
Av. Block Size: 2136954070
Tree Height: 1
After GC:
Statistics for BinaryTreeDictionary:
------------------------------------
Total Free Space: 0
Max Chunk Size: 0
Number of Blocks: 0
Tree Height: 0
, 0.1133210 secs] [Times: user=1.45 sys=0.00, real=0.11 secs]
Total time for which application threads were stopped: 0.1142450 seconds

Notice the 2ms application stop time vs 114ms. We can point you all at full GC logs.

For what it's worth, we repeated this test on 1.7 with our standard set of 1.6 arguments that we've had success with in the past:

-d64 -server -Xmx50g -Xms50g -XX:MaxNewSize=2g -XX:NewSize=2g -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSParallelRemarkEnabled -XX:+CMSParallelSurvivorRemarkEnabled -XX:+CMSScavengeBeforeRemark -XX:RefDiscoveryPolicy=1 -XX:ParallelCMSThreads=3 -XX:CMSMaxAbortablePrecleanTime=3600000 -XX:CMSInitiatingOccupancyFraction=83 -XX:+UseParNewGC

as well as these minimal CMS args:

-d64 -server -Xmx50g -Xms50g -XX:MaxNewSize=2g -XX:NewSize=2g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC

The results were more or less the same. Are there some additional JVM args that we can enable to shed some light on the big discrepancy in pause times?

Thanks.

From alexey.ragozin at gmail.com Mon Nov 26 21:58:43 2012
From: alexey.ragozin at gmail.com (Alexey Ragozin)
Date: Tue, 27 Nov 2012 09:58:43 +0400
Subject: Discrepancy in ParNew pause times between 1.6 and 1.7 CMS
Message-ID:

Hi Brian,

I have found a similar regression when testing young GC performance. After investigation I came to the conclusion that the culprit is the -XX:+BlockOffsetArrayUseUnallocatedBlock VM option (which requires -XX:+UnlockDiagnosticVMOptions). This option is true in 1.6 and false in 1.7; the commit comment mentions some concurrency issues with that flag.

But you should not blame the JVM for the discrepancy in your test. The problem is within your test, which is using only a small fraction of the 50GiB. BlockOffsetArrayUseUnallocatedBlock allows the JVM to exclude not-yet-used memory from collection, so in 1.6 your benchmark effectively measures a heap size of 1-4 GiB. IMHO you should probably adjust your test to use a more realistic memory pattern (I expect 1.7 to be a bit slower even then; at least that was the outcome of my testing).
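For reference, a quick A/B experiment re-enabling the 1.6 behaviour on 1.7 might look like this (note that -XX:+UnlockDiagnosticVMOptions must come before the diagnostic flag; "YourApp" is a stand-in for the real main class, and the remaining flags are just the minimal set quoted above):

    java -XX:+UnlockDiagnosticVMOptions -XX:+BlockOffsetArrayUseUnallocatedBlock \
         -d64 -server -Xmx50g -Xms50g -XX:MaxNewSize=2g -XX:NewSize=2g \
         -XX:+UseConcMarkSweepGC -XX:+UseParNewGC YourApp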
If you are interested in minimizing young GC time, take a look at http://blog.ragozin.info/2012/03/secret-hotspot-option-improving-gc.html

Regards,
Alexey

> We're in the process of doing some performance testing with 1.7, and we're
> seeing much longer ParNew times in 1.7 than we saw in 1.6.
>
> [snipped -- full message quoted above]
> We can point you all at full GC logs.
>
> For what it's worth, we repeated this test on 1.7 with our standard set of 1.6 arguments that we've had success with in the past:
>
> -d64 -server -Xmx50g -Xms50g -XX:MaxNewSize=2g -XX:NewSize=2g -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSParallelRemarkEnabled -XX:+CMSParallelSurvivorRemarkEnabled -XX:+CMSScavengeBeforeRemark -XX:RefDiscoveryPolicy=1 -XX:ParallelCMSThreads=3 -XX:CMSMaxAbortablePrecleanTime=3600000 -XX:CMSInitiatingOccupancyFraction=83 -XX:+UseParNewGC
>
> as well as these minimal CMS args:
>
> -d64 -server -Xmx50g -Xms50g -XX:MaxNewSize=2g -XX:NewSize=2g -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>
> The results were more or less the same.
>
> Are there some additional JVM args that we can enable to shed some light on the big discrepancy in pause times?
>
> Thanks.

From brian.williams at mayalane.com  Wed Nov 28 13:21:36 2012
From: brian.williams at mayalane.com (Brian Williams)
Date: Wed, 28 Nov 2012 16:21:36 -0500
Subject: Discrepancy in ParNew pause times between 1.6 and 1.7 CMS
In-Reply-To:
References:
Message-ID: <8CE1ADA1-D422-4EE9-80D2-7A5C1DE4DA25@mayalane.com>

Thanks Alexey. We re-ran our tests separately with each of your recommendations. Reducing the heap size and running with BlockOffsetArrayUseUnallocatedBlock each dramatically improved performance. It's not quite to where 1.6 is, but it's much closer. Thanks also for the ParGCCardsPerStrideChunk suggestion. We saw a small but noticeable benefit from using it.

You're absolutely right that a 50GB heap is way too big for this test. We were testing with a much smaller dataset than is typical. In the past, we've seen benefits to having more memory available for the JVM to prolong promotion failures. Now, we'll be on the lookout that bigger isn't always better.

On Nov 27, 2012, at 12:58 AM, Alexey Ragozin wrote:

> [Alexey's reply and the quoted GC logs snipped; see the messages above]
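Concretely, a re-run combining all three suggestions would look something like the sketch below. This is illustrative only, not the exact invocation used in the tests above: the smaller heap stands in for "a heap sized to what the test actually touches" (per Alexey's 1-4 GiB observation), YourBenchmark is a placeholder, and 4096 is simply the stride value discussed later in this thread:

    java -d64 -server -Xmx4g -Xms4g \
         -XX:MaxNewSize=2g -XX:NewSize=2g \
         -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
         -XX:+UnlockDiagnosticVMOptions \
         -XX:+BlockOffsetArrayUseUnallocatedBlock \
         -XX:ParGCCardsPerStrideChunk=4096 \
         YourBenchmark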
From ysr1729 at gmail.com  Wed Nov 28 17:21:13 2012
From: ysr1729 at gmail.com (Srinivas Ramakrishna)
Date: Wed, 28 Nov 2012 17:21:13 -0800
Subject: Discrepancy in ParNew pause times between 1.6 and 1.7 CMS
In-Reply-To:
References:
Message-ID:

Impressive. Sounds like the stride size should be set ergonomically, based on heap size and the number of GC workers, rather than being a static value? If the optimum varies slowly (i.e. performance is not sensitive above a specific value, such as your 4K), perhaps that should be the default for now, with suitable ergonomic downsizing for smaller heap sizes, if needed.

Copying gc-dev, requesting the filing of a CR for that ergo enhancement, or at least for a change in the default value.

PS: BTW, do you have the CR# for the unallocated block change, just to refresh collective memory? I would have thought that empty unallocated blocks would be taken care of quickly, because they would have no dirty cards. A large regression seems a bit surprising.

-- ramki

On Mon, Nov 26, 2012 at 9:58 PM, Alexey Ragozin wrote:

> [Alexey's reply and the quoted GC logs snipped; see the messages above]
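To make the ergonomics idea concrete, here is one possible shape for it, written as a standalone Java sketch rather than HotSpot code. The 512-byte card size is standard, but CHUNKS_PER_WORKER and the clamping bounds are invented purely for illustration:

    // Sketch of ergonomic card-table stride sizing (illustrative only, not HotSpot code).
    public final class StrideErgonomics {
        private static final int CARD_SIZE_BYTES = 512;   // bytes of heap covered per card
        private static final int CHUNKS_PER_WORKER = 64;  // invented load-balancing factor
        private static final int MIN_STRIDE = 256;        // invented lower clamp (cards)
        private static final int MAX_STRIDE = 8192;       // invented upper clamp (cards)

        // Derive a stride from heap size and worker count, so each worker
        // gets several chunks of work and stealing can balance the load.
        static int strideCards(long heapBytes, int gcWorkers) {
            long cards = heapBytes / CARD_SIZE_BYTES;
            long stride = cards / ((long) gcWorkers * CHUNKS_PER_WORKER);
            return (int) Math.max(MIN_STRIDE, Math.min(stride, MAX_STRIDE));
        }

        public static void main(String[] args) {
            // 50g heap, 18 GC workers (worker count chosen just as an example)
            System.out.println(strideCards(50L << 30, 18)); // prints 8192 (hits the upper clamp)
        }
    }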
We can point you all at > full GC logs > > > > For what it's worth we repeated this test on 1.7 with our standard set > of 1.6 arguments that we've had success with in the past: > > > > -d64 -server -Xmx50g -Xms50g -XX:MaxNewSize=2g -XX:NewSize=2g > -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled > -XX:+CMSParallelRemarkEnabled -XX:+CMSParallelSurvivorRemarkEnabled > -XX:+CMSScavengeBeforeRemark -XX:RefDiscoveryPolicy=1 > -XX:ParallelCMSThreads=3 -XX:CMSMaxAbortablePrecleanTime=3600000 > -XX:CMSInitiatingOccupancyFraction=83 -XX:+UseParNewGC > > > > as well as these minimal CMS args: > > > > -d64 -server -Xmx50g -Xms50g -XX:MaxNewSize=2g -XX:NewSize=2g > -XX:+UseConcMarkSweepGC -XX:+UseParNewGC > > > > The results were more or less the same. > > > > Are there some additional JVM args that we can enable to shed some light > on the big discrepancy in pause times? > > > > Thanks. > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121128/87273a2b/attachment.html From ryebrye at gmail.com Thu Nov 29 19:29:02 2012 From: ryebrye at gmail.com (Ryan Gardner) Date: Thu, 29 Nov 2012 22:29:02 -0500 Subject: G1 to-space-overflow on one server (but not on another identical server under identical load) Message-ID: We're doing some load testing of an instance of a solr search that has a pretty frequent replication... the object allocation rates are all over the map, and tuning it with CMS was very difficult and we weren't able to meet our latency targets - so we decided to try G1. I'm running some tests now, and I have two identical servers that are getting identical traffic. on one server, here's an excerpt from the gc log: Here are the VM arguments that I'm using: -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+ExplicitGCInvokesConcurrent -XX:MaxGCPauseMillis=150 -XX:+AggressiveOpts -XX:+UseG1GC -Xmx24G -Xms24G Is there some option I can set to stack the deck in favor of avoiding a "to-space-overflow"? 
If anyone is interested, I can provide the full GC logs from both of the servers.

2012-11-30T01:42:21.857+0000: 4571.106: [GC pause (young), 0.08561600 secs]
   [Parallel Time: 62.0 ms]
      [GC Worker Start (ms): 4571107.0 4571107.0 4571107.0 4571107.0 4571107.1 4571107.1 4571107.2 4571107.2 4571107.2 4571107.3 4571107.3 4571107.3 4571107.4 4571107.4 4571107.4 4571107.5 4571107.5 4571107.5 Avg: 4571107.2, Min: 4571107.0, Max: 4571107.5, Diff: 0.5]
      [Ext Root Scanning (ms): 1.2 1.2 1.1 1.1 1.1 1.0 1.0 1.0 1.0 1.0 0.9 1.2 0.8 0.9 1.0 0.9 0.8 0.8 Avg: 1.0, Min: 0.8, Max: 1.2, Diff: 0.5]
      [Update RS (ms): 8.1 8.3 8.1 8.1 8.0 8.1 8.2 7.9 8.1 8.1 8.2 7.9 8.0 8.0 7.8 8.1 8.1 8.9 Avg: 8.1, Min: 7.8, Max: 8.9, Diff: 1.1]
         [Processed Buffers : 13 14 15 13 15 11 17 15 11 14 14 11 12 13 13 11 11 11 Sum: 234, Avg: 13, Min: 11, Max: 17, Diff: 6]
      [Scan RS (ms): 0.4 0.3 0.5 0.5 0.7 0.6 0.6 0.7 0.6 0.5 0.5 0.5 0.7 0.6 0.6 0.4 0.6 0.1 Avg: 0.5, Min: 0.1, Max: 0.7, Diff: 0.6]
      [Object Copy (ms): 47.2 47.1 46.9 47.0 47.0 48.6 46.9 47.0 46.9 46.9 47.0 46.9 47.0 47.7 47.1 47.0 46.9 48.3 Avg: 47.2, Min: 46.9, Max: 48.6, Diff: 1.7]
      [Termination (ms): 1.9 1.8 1.9 1.9 1.9 0.3 1.8 1.8 1.9 1.9 1.8 1.8 1.9 1.2 1.7 1.9 1.8 0.0 Avg: 1.6, Min: 0.0, Max: 1.9, Diff: 1.9]
         [Termination Attempts : 9 8 14 13 16 16 8 1 17 20 12 14 18 11 18 18 17 18 Sum: 248, Avg: 13, Min: 1, Max: 20, Diff: 19]
      [GC Worker End (ms): 4571166.1 4571165.7 4571165.7 4571166.0 4571166.0 4571165.9 4571165.8 4571166.0 4571165.7 4571165.7 4571165.7 4571165.7 4571165.7 4571165.9 4571166.1 4571166.2 4571165.7 4571165.9 Avg: 4571165.8, Min: 4571165.7, Max: 4571166.2, Diff: 0.5]
      [GC Worker (ms): 59.1 58.7 58.7 58.9 58.9 58.8 58.6 58.8 58.5 58.4 58.4 58.4 58.3 58.6 58.7 58.7 58.2 58.4 Avg: 58.6, Min: 58.2, Max: 59.1, Diff: 0.9]
      [GC Worker Other (ms): 3.3 3.4 3.4 3.4 3.4 3.5 3.6 3.6 3.6 3.7 3.7 3.7 3.7 3.8 3.8 3.8 3.9 3.9 Avg: 3.6, Min: 3.3, Max: 3.9, Diff: 0.5]
   [Clear CT: 2.3 ms]
   [Other: 21.2 ms]
      [Choose CSet: 0.2 ms]
      [Ref Proc: 5.8 ms]
      [Ref Enq: 0.1 ms]
      [Free CSet: 13.6 ms]
   [Eden: 6728M(6728M)->0B(4472M) Survivors: 376M->616M Heap: 10758M(24576M)->4264M(24576M)]
   [Times: user=1.08 sys=0.00, real=0.08 secs]
2012-11-30T01:42:27.493+0000: 4576.742: [GC pause (young), 0.06992900 secs]
   [Parallel Time: 52.4 ms]
      [GC Worker Start (ms): 4576742.7 4576742.8 4576742.8 4576742.8 4576742.9 4576742.9 4576743.0 4576743.0 4576743.1 4576743.1 4576743.1 4576743.2 4576743.2 4576743.2 4576743.3 4576743.3 4576743.3 4576743.3 Avg: 4576743.1, Min: 4576742.7, Max: 4576743.3, Diff: 0.6]
      [Ext Root Scanning (ms): 1.4 1.3 1.4 1.1 1.0 1.2 1.1 1.0 1.2 0.9 1.1 1.1 1.2 1.1 1.0 1.0 0.8 0.7 Avg: 1.1, Min: 0.7, Max: 1.4, Diff: 0.7]
      [Update RS (ms): 0.8 0.7 0.6 0.8 0.7 2.4 0.7 0.8 0.7 1.1 0.7 0.8 0.4 0.4 0.4 0.7 0.8 1.2 Avg: 0.8, Min: 0.4, Max: 2.4, Diff: 2.0]
         [Processed Buffers : 2 2 1 7 4 1 6 2 1 1 2 1 4 1 1 1 1 1 Sum: 39, Avg: 2, Min: 1, Max: 7, Diff: 6]
      [Scan RS (ms): 0.1 0.4 0.3 0.3 0.3 0.1 0.4 0.2 0.0 0.0 0.1 0.0 0.2 0.3 0.4 0.1 0.2 0.0 Avg: 0.2, Min: 0.0, Max: 0.4, Diff: 0.3]
      [Object Copy (ms): 47.2 47.4 47.3 47.3 47.4 45.7 47.4 47.4 47.2 47.4 47.4 47.1 47.5 47.2 47.2 47.1 47.4 47.0 Avg: 47.2, Min: 45.7, Max: 47.5, Diff: 1.8]
      [Termination (ms): 0.3 0.1 0.2 0.2 0.3 0.3 0.0 0.0 0.2 0.0 0.1 0.3 0.0 0.2 0.2 0.3 0.1 0.3 Avg: 0.2, Min: 0.0, Max: 0.3, Diff: 0.3]
         [Termination Attempts : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Sum: 18, Avg: 1, Min: 1, Max: 1, Diff: 0]
      [GC Worker End (ms): 4576792.6 4576792.6 4576792.6 4576792.7 4576792.8 4576792.7 4576792.8 4576792.9 4576792.8 4576792.7 4576792.6 4576792.6 4576792.8 4576792.6 4576792.6 4576792.6 4576792.6 4576792.6 Avg: 4576792.7, Min: 4576792.6, Max: 4576792.9, Diff: 0.3]
      [GC Worker (ms): 49.8 49.8 49.8 49.9 49.9 49.8 49.8 49.8 49.7 49.5 49.5 49.4 49.6 49.3 49.3 49.3 49.3 49.3 Avg: 49.6, Min: 49.3, Max: 49.9, Diff: 0.6]
      [GC Worker Other (ms): 2.6 2.6 2.6 2.7 2.7 2.8 2.8 2.9 3.0 3.0 3.0 3.0 3.1 3.1 3.1 3.1 3.2 3.2 Avg: 2.9, Min: 2.6, Max: 3.2, Diff: 0.6]
   [Clear CT: 1.6 ms]
   [Other: 15.9 ms]
      [Choose CSet: 0.2 ms]
      [Ref Proc: 3.5 ms]
      [Ref Enq: 0.1 ms]
      [Free CSet: 10.0 ms]
   [Eden: 4472M(4472M)->0B(16704M) Survivors: 616M->288M Heap: 8736M(24576M)->4279M(24576M)]
   [Times: user=0.92 sys=0.01, real=0.07 secs]
2012-11-30T01:43:06.366+0000: 4615.615: [GC pause (young) (to-space overflow), 7.55314600 secs]
   [Parallel Time: 5994.5 ms]
      [GC Worker Start (ms): 4615616.3 4615616.4 4615616.4 4615616.4 4615616.4 4615616.4 4615616.5 4615616.5 4615616.5 4615616.6 4615616.6 4615616.7 4615616.7 4615616.7 4615616.8 4615616.8 4615616.8 4615616.9 Avg: 4615616.6, Min: 4615616.3, Max: 4615616.9, Diff: 0.6]
      [Ext Root Scanning (ms): 1.2 1.3 1.3 1.2 1.2 1.2 1.6 1.0 1.1 1.0 0.9 0.9 1.4 1.1 0.9 0.9 0.9 1.0 Avg: 1.1, Min: 0.9, Max: 1.6, Diff: 0.8]
      [Update RS (ms): 9.8 10.1 9.5 9.5 10.0 9.6 10.4 9.5 9.8 9.7 10.9 9.7 9.1 9.9 10.2 9.7 9.5 9.6 Avg: 9.8, Min: 9.1, Max: 10.9, Diff: 1.8]
         [Processed Buffers : 16 17 14 19 14 13 14 14 15 18 13 13 12 14 13 12 14 14 Sum: 259, Avg: 14, Min: 12, Max: 19, Diff: 7]
      [Scan RS (ms): 1.1 0.6 1.3 1.3 0.8 1.0 0.1 1.3 1.0 1.1 0.1 1.2 1.3 0.7 0.5 1.0 1.2 0.9 Avg: 0.9, Min: 0.1, Max: 1.3, Diff: 1.2]
      [Object Copy (ms): 5979.6 5978.9 5979.2 5979.1 5979.5 5979.3 5978.7 5979.6 5979.2 5979.5 5979.2 5979.3 5978.9 5979.0 5979.1 5979.2 5979.0 5979.5 Avg: 5979.2, Min: 5978.7, Max: 5979.6, Diff: 1.0]
      [Termination (ms): 0.1 0.7 0.2 0.5 0.1 0.3 0.6 0.2 0.3 0.0 0.2 0.2 0.6 0.5 0.4 0.3 0.6 0.0 Avg: 0.3, Min: 0.0, Max: 0.7, Diff: 0.7]
         [Termination Attempts : 3 1 1 3 3 5 3 4 4 4 4 1 3 3 3 5 6 4 Sum: 60, Avg: 3, Min: 1, Max: 6, Diff: 5]
      [GC Worker End (ms): 4621608.7 4621608.0 4621608.0 4621608.7 4621608.4 4621608.5 4621608.0 4621608.3 4621608.6 4621608.0 4621608.2 4621608.0 4621608.0 4621608.1 4621608.3 4621608.4 4621608.0 4621608.1 Avg: 4621608.2, Min: 4621608.0, Max: 4621608.7, Diff: 0.7]
      [GC Worker (ms): 5992.4 5991.6 5991.6 5992.3 5992.0 5992.1 5991.5 5991.8 5992.1 5991.4 5991.5 5991.3 5991.3 5991.3 5991.6 5991.6 5991.2 5991.2 Avg: 5991.7, Min: 5991.2, Max: 5992.4, Diff: 1.2]
      [GC Worker Other (ms): 2.9 2.9 3.0 3.0 3.0 3.0 3.0 3.0 3.1 3.1 3.2 3.2 3.2 3.3 3.3 3.3 3.4 3.5 Avg: 3.1, Min: 2.9, Max: 3.5, Diff: 0.6]
   [Clear CT: 3.3 ms]
   [Other: 1555.4 ms]
      [Choose CSet: 0.4 ms]
      [Ref Proc: 57.3 ms]
      [Ref Enq: 0.3 ms]
      [Free CSet: 19.4 ms]
   [Eden: 16704M(16704M)->0B(2784M) Survivors: 288M->2128M Heap: 21362M(24576M)->22239M(24576M)]
   [Times: user=35.37 sys=0.33, real=7.55 secs]
2012-11-30T01:43:16.740+0000: 4625.989: [GC pause (young) (to-space overflow) (initial-mark), 23.61614000 secs]
   [Parallel Time: 21781.4 ms]
      [GC Worker Start (ms): 4625989.9 4625990.0 4625990.1 4625990.1 4625990.2 4625990.2 4625990.2 4625990.3 4625990.3 4625990.3 4625990.4 4625990.4 4625990.4 4625990.4 4625990.5 4625990.5 4625990.5 4625990.5 Avg: 4625990.3, Min: 4625989.9, Max: 4625990.5, Diff: 0.6]
      [Ext Root Scanning (ms): 1.7 2.0 1.4 1.2 1.4 1.4 1.4 1.4 1.1 1.2 1.2 1.0 1.1 1.1 1.1 0.9 1.1 1.1 Avg: 1.3, Min: 0.9, Max: 2.0, Diff: 1.1]
      [Update RS (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0.4 0.5 0.6 0.6 1.4 0.5 0.9 0.6 0.7 0.5 0.7 Avg: 0.4, Min: 0.0, Max: 1.4, Diff: 1.4]
         [Processed Buffers : 0 0 0 0 0 0 16 4 11 22 30 39 30 26 9 62 21 9 Sum: 279, Avg: 15, Min: 0, Max: 62, Diff: 62]
      [Scan RS (ms): 0.1 0.1 0.1 0.1 0.1 0.1 0.8 0.6 0.6 0.5 0.6 0.0 0.8 0.4 0.5 0.7 0.6 0.4 Avg: 0.4, Min: 0.0, Max: 0.8, Diff: 0.7]
      [Object Copy (ms): 21776.6 21776.1 21776.5 21776.6 21776.2 21776.4 21775.0 21774.7 21775.3 21775.3 21775.0 21774.8 21775.4 21774.6 21775.2 21774.6 21774.7 21775.0 Avg: 21775.5, Min: 21774.6, Max: 21776.6, Diff: 2.1]
      [Termination (ms): 0.0 0.3 0.4 0.4 0.5 0.3 0.7 1.0 0.6 0.4 0.6 0.7 0.2 1.0 0.5 1.0 1.0 0.7 Avg: 0.6, Min: 0.0, Max: 1.0, Diff: 1.0]
         [Termination Attempts : 1 1 1 2 2 1 1 1 3 1 1 1 1 1 1 2 2 1 Sum: 24, Avg: 1, Min: 1, Max: 3, Diff: 2]
      [GC Worker End (ms): 4647768.4 4647768.5 4647768.5 4647768.4 4647768.4 4647768.7 4647768.8 4647768.5 4647768.5 4647768.4 4647768.7 4647768.5 4647768.4 4647768.4 4647768.6 4647768.6 4647768.4 4647768.4 Avg: 4647768.5, Min: 4647768.4, Max: 4647768.8, Diff: 0.3]
      [GC Worker (ms): 21778.5 21778.5 21778.4 21778.3 21778.3 21778.5 21778.5 21778.2 21778.2 21778.1 21778.3 21778.1 21778.0 21778.0 21778.1 21778.1 21777.9 21777.9 Avg: 21778.2, Min: 21777.9, Max: 21778.5, Diff: 0.6]
      [GC Worker Other (ms): 2.9 3.0 3.0 3.1 3.1 3.2 3.2 3.2 3.3 3.3 3.3 3.3 3.4 3.4 3.4 3.5 3.5 3.5 Avg: 3.3, Min: 2.9, Max: 3.5, Diff: 0.6]
   [Clear CT: 1.5 ms]
   [Other: 1833.3 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.8 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 3.6 ms]
   [Eden: 1384M(2784M)->0B(4912M) Survivors: 2128M->0B Heap: 23623M(24576M)->23623M(24576M)]
   [Times: user=33.81 sys=0.54, real=23.62 secs]
2012-11-30T01:43:40.357+0000: 4649.606: [GC concurrent-root-region-scan-start]
2012-11-30T01:43:40.357+0000: 4649.606: [GC concurrent-root-region-scan-end, 0.0000690]
2012-11-30T01:43:40.357+0000: 4649.606: [GC concurrent-mark-start]
2012-11-30T01:43:40.358+0000: 4649.607: [GC pause (young), 1.66905500 secs]
   [Parallel Time: 1667.5 ms]
      [GC Worker Start (ms): 4649607.0 4649607.0 4649607.1 4649607.2 4649607.2 4649607.3 4649607.3 4649607.3 4649607.4 4649607.8 4649607.4 4649607.4 4649607.5 4649607.5 4649607.6 4649607.6 4649607.6 4649607.7 Avg: 4649607.4, Min: 4649607.0, Max: 4649607.8, Diff: 0.8]
      [Ext Root Scanning (ms): 1.6 2.1 2.3 1.5 2.1 1.6 2.0 1.9 1.8 1.7 1.9 2.1 1.6 1.7 1.4 1.6 1.2 1.3 Avg: 1.7, Min: 1.2, Max: 2.3, Diff: 1.1]
      [SATB Filtering (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0]
      [Update RS (ms): 1662.9 1662.8 1662.5 1663.0 1662.6 1663.1 1662.6 1662.7 1662.9 1662.5 1662.6 1662.4 1662.9 1662.8 1663.1 1662.8 1663.2 1663.1 Avg: 1662.8, Min: 1662.4, Max: 1663.2, Diff: 0.7]
         [Processed Buffers : 10947 10965 17033 10676 10631 10692 10524 10706 16833 10703 10751 12889 10735 18332 18581 19972 16982 19856 Sum: 247808, Avg: 13767, Min: 10524, Max: 19972, Diff: 9448]
      [Scan RS (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0]
      [Object Copy (ms): 0.5 0.1 0.1 0.3 0.2 0.1 0.1 0.1 0.0 0.1 0.2 0.1 0.1 0.1 0.1 0.1 0.1 0.1 Avg: 0.1, Min: 0.0, Max: 0.5, Diff: 0.5]
      [Termination (ms): 0.0 0.0 0.1 0.1 0.0 0.1 0.1 0.1 0.0 0.0 0.1 0.1 0.1 0.1 0.1 0.0 0.1 0.0 Avg: 0.0, Min: 0.0, Max: 0.1, Diff: 0.1]
         [Termination Attempts : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Sum: 18, Avg: 1, Min: 1, Max: 1, Diff: 0]
      [GC Worker End (ms): 4651272.1 4651272.1 4651272.2 4651272.1 4651272.1 4651272.2 4651272.1 4651272.1 4651272.1 4651272.1 4651272.1 4651272.1 4651272.1 4651272.2 4651272.1 4651272.1 4651272.1 4651272.1 Avg: 4651272.1, Min: 4651272.1, Max: 4651272.2, Diff: 0.0]
      [GC Worker (ms): 1665.1 1665.1 1665.0 1665.0 1664.9 1664.9 1664.8 1664.8 1664.8 1664.4 1664.8 1664.7 1664.7 1664.6 1664.6 1664.5 1664.5 1664.5 Avg: 1664.8, Min: 1664.4, Max: 1665.1, Diff: 0.8]
      [GC Worker Other (ms): 2.4 2.4 2.5 2.6 2.6 2.6 2.7 2.7 2.7 3.2 2.8 2.8 2.8 2.9 3.0 3.0 3.0 3.1 Avg: 2.8, Min: 2.4, Max: 3.2, Diff: 0.8]
   [Complete CSet Marking: 0.0 ms]
   [Clear CT: 0.5 ms]
   [Other: 1.1 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.8 ms]
      [Ref Enq: 0.0 ms]
      [Free CSet: 0.2 ms]
   [Eden: 0B(4912M)->0B(4912M) Survivors: 0B->0B Heap: 23623M(24576M)->23623M(24576M)]

From john.cuthbertson at oracle.com  Fri Nov 30 10:21:52 2012
From: john.cuthbertson at oracle.com (John Cuthbertson)
Date: Fri, 30 Nov 2012 10:21:52 -0800
Subject: G1 to-space-overflow on one server (but not on another identical server under identical load)
In-Reply-To:
References:
Message-ID: <50B8F940.3040001@oracle.com>

Hi Ryan,

I would definitely be interested in both logs. I would be interested in plotting any differences in the promotion rates or in the start times and durations of the marking cycles. Can you also give me details of your command-line flags?

There are a couple of experiments to try (illustrated with example flags after this message):

* Lowering your IHOP (InitiatingHeapOccupancyPercent - the default value is 45) to start the marking cycle a little bit earlier
* Increasing the value of G1ReservePercent (the default value is 10)

Also, do you know approximately if your application creates large arrays? If so, then you could be running into issues with "holes" in the heap associated with these (in G1 parlance) humongous objects. This can be reduced by explicitly setting the value of G1HeapRegionSize. By my guess, your region size will be either 8m or 16m (based upon the size of your heap). Increasing the value can increase the packing density, since some humongous objects are no longer humongous. Reducing it can increase the number of humongous objects but reduce the size and number of "holes": the humongous objects may occupy a larger number of regions, but the wasted space at the end of these objects is smaller.

After looking at your logs I may have a few more suggestions.

BTW we have a couple of CRs (and ideas) to reduce the duration of pauses that experience a to-space overflow/exhaustion. I'm not sure they're public yet, but they are:

8003235 G1: Parallelize displaced header restoration during evacuation failures
8003237 G1: Reduce unnecessary (and failing) allocation attempts when handling an evacuation failure

The first one should address the high "other" time; the second should help to reduce the object copy time.

Thanks.

JohnC

On 11/29/12 19:29, Ryan Gardner wrote:

> [Ryan's original message and GC logs snipped; see above]
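To make those two experiments concrete as command-line flags (the values here are purely illustrative; the right numbers depend on what the logs show):

    -XX:InitiatingHeapOccupancyPercent=35 -XX:G1ReservePercent=15

and, for the humongous-object angle, an explicit region size can be set with, for example:

    -XX:G1HeapRegionSize=16m

As for the 8m-or-16m guess: if memory serves, G1 picks a default region size that yields on the order of 2048 regions, so a 24G heap gives 24576M / 2048 = 12M, which is then rounded to a power-of-two region size, hence 8m or 16m.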
From ryebrye at gmail.com  Fri Nov 30 16:03:48 2012
From: ryebrye at gmail.com (Ryan Gardner)
Date: Fri, 30 Nov 2012 19:03:48 -0500
Subject: G1 to-space-overflow on one server (but not on another identical server under identical load)
In-Reply-To: <50B8F940.3040001@oracle.com>
References: <50B8F940.3040001@oracle.com>
Message-ID:

On Fri, Nov 30, 2012 at 1:21 PM, John Cuthbertson wrote:

> Hi Ryan,
>
> I would definitely be interested in both logs. I would be interested in plotting any differences in the promotion rates or start times and durations of marking cycles. Can you also give me details of your command-line flags?

Yes, I can. I've created a zip with both GC logs - solr4-gc.log and solr5-gc.log - along with a README.txt explaining what's going on with them.
It's hosted here: http://www.filedropper.com/solrgclogs (I can put it somewhere else if that place won't work - it's about 2 megs when zipped up).

It's running in Tomcat with the following CATALINA_OPTS:

export CATALINA_OPTS="-Djava.awt.headless=true -Dfile.encoding=UTF-8 -server \
-Xms24G -Xmx24G \
-XX:+UseG1GC -XX:+AggressiveOpts -XX:MaxGCPauseMillis=150 \
-XX:+ExplicitGCInvokesConcurrent \
-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:$CATALINA_BASE/logs/gc.log \
-Dcom.sun.management.jmxremote \
-Dcom.sun.management.jmxremote.port=39650 \
-Dcom.sun.management.jmxremote.ssl=false \
-Dcom.sun.management.jmxremote.authenticate=true \
-Dcom.sun.management.jmxremote.password.file=$CATALINA_HOME/conf/jmxremote.password \
-Dcom.sun.management.jmxremote.access.file=$CATALINA_HOME/conf/jmxremote.access \
-Dsolr.data.dir=/db/solr -Dsolr.directoryFactory=solr.NIOFSDirectoryFactory -Dsolr.replication.enable.master=false -Dsolr.replication.enable.slave=false"

> There are a couple of experiments to try:
>
> * Lowering your IHOP (InitiatingHeapOccupancyPercent - the default value is 45) to start the marking cycle a little bit earlier

The percentage at which it has been starting seems to be a lot higher (it seems to move around based on when it starts the concurrent phase). Is there a G1 equivalent of the "UseCMSInitiatingOccupancyOnly" flag?

> * Increasing the value of G1ReservePercent (the default value is 10)
>
> Also, do you know approximately if your application creates large arrays? If so, then you could be running into issues with "holes" in the heap associated with these (in G1 parlance) humongous objects. [rest of the humongous-object discussion snipped]

The application is basically an out-of-the-box version of SOLR 4.0 with some configuration tweaks. It has a handful of relatively small caches that live for about 2 minutes before falling out of use, and while they are alive some of those caches churn through a large number of objects. I have plenty of heap overhead, so I can run some tests with both a larger and a smaller G1HeapRegionSize setting to see what influence it has on the GC performance of the app.

I've also noticed that my young gen collections are rather long. I was hoping that G1's adaptive sizing would realize that the young gen GCs are taking more time than my target pause time and size down the young gen area, but it doesn't seem to. Is setting Xmn or G1

One other thing I've noticed is that enabling parallel reference processing has had a big positive impact on the GC performance (the average "GC ref-proc" pause during remark with the default settings was 0.46 seconds; with it parallel, the average is 0.05 seconds). I've since enabled that, and I've started looking at enabling -XX:+UseLargePages and a couple of other options to get it to work better.
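For the record, the two switches mentioned in the last paragraph are spelled as follows on the command line (assuming what I've been loosely calling "parallel reference processing" maps to the standard HotSpot flag):

    -XX:+ParallelRefProcEnabled -XX:+UseLargePages

(On Linux, UseLargePages typically also requires huge pages to be configured at the OS level before the JVM can actually obtain them.)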
I've been hesitant about telling G1 an Xmn to use, since it seemed in earlier tests that it will not use any adaptive sizing for the young generation if it has an Xmn, and I was assuming that ergonomics would find a better long-term young gen size than I did. But the roughly 5GB eden seems to be yielding young gen pauses of around 700 to 900ms, which is a lot higher than our pause target for this app. (We have plenty of spare CPU available on the server and would prefer to push the objects into old gen and let the concurrent phases deal with them. If we can keep the overall pauses low we can even increase the heap size if we need to; the servers have 120GB of RAM in them currently, but we need to leave at least 64GB of RAM unallocated so the OS can cache the index files it uses.)

> After looking at your logs I may have a few more suggestions.
>
> BTW we have a couple of CRs (and ideas) to reduce the duration of pauses that experience a to-space overflow/exhaustion. I'm not sure they're public yet, but they are:
>
> 8003235 G1: Parallelize displaced header restoration during evacuation failures
> 8003237 G1: Reduce unnecessary (and failing) allocation attempts when handling an evacuation failure
>
> The first one should address the high "other" time; the second should help to reduce the object copy time.

Cool :) I can't find any details on those - hopefully they'll work their way into a version I can use soon :)

> [remainder of the quoted message and GC logs snipped]
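As a rough sanity check on the eden sizing (assuming, simplistically, that young pause time scales linearly with eden size, which is only approximate since the real driver is the amount of surviving data):

    5 GB eden  ->  ~700-900 ms young pauses
    150 ms target  =>  eden ~ 5 GB * 150/800 ≈ 0.9 GB

So if that scaling held, a 150ms pause target would imply an eden closer to 1GB than to the ~5GB G1 has been choosing.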