From ivan.mamontov at gmail.com Thu May 1 10:12:15 2014 From: ivan.mamontov at gmail.com (=?UTF-8?B?0JzQsNC80L7QvdGC0L7QsiDQmNCy0LDQvQ==?=) Date: Thu, 1 May 2014 14:12:15 +0400 Subject: is it correct Heap Structure on basis of given JVM Parameters ? In-Reply-To: <53605D04.2040208@oracle.com> References: <94424C8743C3F7479E3F25C7CC00BD4104B3A60142@FLDP1LUMXC7V81.us.one.verizon.com> <53605D04.2040208@oracle.com> Message-ID: I think it will be very useful to add the following flags: - -XX:+PrintCommandLineFlags Print flags specified on command line or set by ergonomics - -XX:+PrintVMOptions Print flags that appeared on the command line - -XX:+PrintFlagsFinal Print all VM flags after argument and ergonomic processing This will avoid problems with implicit VM options. java -XX:+PrintVMOptions -XX:+PrintCommandLineFlags -XX:+PrintFlagsFinal -version VM option '+PrintFlagsFinal' VM option '+PrintVMOptions' VM option '+PrintCommandLineFlags' -XX:InitialHeapSize=262900032 -XX:MaxHeapSize=4206400512 -XX:ParallelGCThreads=4 -XX:+PrintCommandLineFlags -XX:+PrintFlagsFinal -XX:+PrintVMOptions -XX:+UseCompressedOops -XX:+UseParallelGC [Global flags] uintx AdaptivePermSizeWeight = 20 {product} uintx AdaptiveSizeDecrementScaleFactor = 4 {product} uintx AdaptiveSizeMajorGCDecayTimeScale = 10 {product} uintx AdaptiveSizePausePolicy = 0 {product} ... java version "1.6.0_45" Java(TM) SE Runtime Environment (build 1.6.0_45-b06) Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode) It works on java from 1.6.0_30 to 1.7.0_45. 2014-04-30 6:16 GMT+04:00 Jon Masamitsu : > Vipin, > > If you want to see the heap after it has fully expanded, use -Xms2048m > (not -Xms32m) and > -XX:NewSize=64m (not -XX:NewSize=8m) and add -XX:PrintGCDetails -version > to your java > command line. You'll see something like this > > $JAVA_HOME/bin/java -XX:+PrintGCDetails -Xms2048m -XX:NewSize=64m > -XX:NewRatio=2 -XX:MaxNewSize=64m -XX:SurvivorRatio=25 -Xmx2048m -version > java version "1.7.0" > Java(TM) SE Runtime Environment (build 1.7.0-b147) > Java HotSpot(TM) Server VM (build 21.0-b17, mixed mode) > Heap > PSYoungGen total 63168K, used 2432K [0xf7c00000, 0xfbc00000, > 0xfbc00000) > eden space 60800K, 4% used [0xf7c00000,0xf7e600d0,0xfb760000) > from space 2368K, 0% used [0xfb9b0000,0xfb9b0000,0xfbc00000) > to space 2368K, 0% used [0xfb760000,0xfb760000,0xfb9b0000) > PSOldGen total 2031616K, used 0K [0x7bc00000, 0xf7c00000, > 0xf7c00000) > object space 2031616K, 0% used [0x7bc00000,0x7bc00000,0xf7c00000) > PSPermGen total 16384K, used 1356K [0x77c00000, 0x78c00000, > 0x7bc00000) > object space 16384K, 8% used [0x77c00000,0x77d532d0,0x78c00000) > > Jon > > > > On 4/29/2014 2:54 AM, Sharma, Vipin K wrote: > > HI All > > > > In my Java application ( JDK 7) JVM parameters are > > -Xms32m -Xmx2048m -Xss1m -XX:+UseParallelGC -XX:NewRatio=2 -XX:NewSize=8m > -XX:MaxNewSize=64m -XX:SurvivorRatio=25 -XX:+UseAdaptiveSizePolicy > > > > Xmx2048m : Maximum Heap memory is 2GB > > MaxNewSize=64m : Maximum Young Generation Size 64 MB > > NewRatio=2 : Old Generation Size will be double of > Young generation size so Maximum Old generation Size will be 64*2 > > > > As per my understanding below will be Heap structure when all parts of > heap are allocated maximum memory > > Young(64 M ) + Old ( 128 M) + Rest Memory only for Permanent > Generation Area > > > > I feel we are not using heap memory efficiently more than 1.8 GB is > allocated for Permanent Generation only. > > Is it correct understanding ? 
if not then what will be heap structure in > case we give max memory to all parts? > > > > > > Thanks, > > Vipin Sharma > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -- Thanks, Ivan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ryebrye at gmail.com Tue May 6 15:41:02 2014 From: ryebrye at gmail.com (Ryan Gardner) Date: Tue, 6 May 2014 11:41:02 -0400 Subject: Do -XX:+UseGCTaskAffinity -XX:+BindGCTaskThreadsToCPUs have any impact for G1 on linux? Message-ID: Anyone know off hand if -XX:+UseGCTaskAffinity or -XX:+BindGCTaskThreadsToCPUs have any impact on a 1.7u40+ x86_64 bit JVM running on linux? (I read a few places online saying they were NO OPs on linux - figured I'd ask here before I either spin up an experiment or dive into the hotspot code) Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From fancyerii at gmail.com Wed May 7 05:43:51 2014 From: fancyerii at gmail.com (Li Li) Date: Wed, 7 May 2014 13:43:51 +0800 Subject: how to get default vm parameter value or the value of a running process? Message-ID: is there any command to get the oracle/openjdk vm parameter's value? e.g. get -Xss tell me the default Xss or get -Xss pid tell me this running process's Xss value From yiyeguhu at gmail.com Wed May 7 05:48:47 2014 From: yiyeguhu at gmail.com (Tao Mao) Date: Tue, 6 May 2014 22:48:47 -0700 Subject: how to get default vm parameter value or the value of a running process? In-Reply-To: References: Message-ID: Try this? java -XX:+PrintFlagFinal -version Tao On Tue, May 6, 2014 at 10:43 PM, Li Li wrote: > is there any command to get the oracle/openjdk vm parameter's value? > e.g. get -Xss tell me the default Xss or get -Xss pid tell me this > running process's Xss value > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: From caoxudong818 at gmail.com Wed May 7 15:44:35 2014 From: caoxudong818 at gmail.com (=?UTF-8?B?5pu55pet5Lic?=) Date: Wed, 7 May 2014 23:44:35 +0800 Subject: how to get default vm parameter value or the value of a running process? In-Reply-To: References: Message-ID: fix typo -XX:+PrintFlagsFinal 2014-05-07 13:48 GMT+08:00 Tao Mao : > Try this? > java -XX:+PrintFlagFinal -version > > Tao > > > On Tue, May 6, 2014 at 10:43 PM, Li Li wrote: > >> is there any command to get the oracle/openjdk vm parameter's value? >> e.g. get -Xss tell me the default Xss or get -Xss pid tell me this >> running process's Xss value >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... 
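A compact illustration of the answer above, using the corrected spelling -XX:+PrintFlagsFinal; the flag ThreadStackSize (the internal name behind -Xss) and the pid 12345 are placeholders rather than values from the original mails:

java -XX:+PrintFlagsFinal -version | grep ThreadStackSize
jinfo -flag ThreadStackSize 12345

The first command prints the value the launcher settles on after ergonomics for the given options; the second queries a running process, along the lines of the jinfo/jcmd hints in the next message.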
URL: From bernd-2014 at eckenfels.net Wed May 7 16:15:54 2014 From: bernd-2014 at eckenfels.net (Bernd Eckenfels) Date: Wed, 7 May 2014 18:15:54 +0200 Subject: how to get default vm parameter value or the value of a running process? In-Reply-To: References: Message-ID: <20140507181554.00005144.bernd-2014@eckenfels.net> Hello, for a running process you can use jinfo to print the flags set on the command line. Those flags and the ones set by ergonomics will be printed with "jcmd PID VM.flags" and a full list of all flags in effect is printed by "jcmd PID VM.flags -all". (java8) Gruss Bernd Am Wed, 7 May 2014 23:44:35 +0800 schrieb Cao Xudong : > fix typo > > -XX:+PrintFlagsFinal > > > 2014-05-07 13:48 GMT+08:00 Tao Mao : > > > Try this? > > java -XX:+PrintFlagFinal -version > > > > Tao > > > > > > On Tue, May 6, 2014 at 10:43 PM, Li Li wrote: > > > >> is there any command to get the oracle/openjdk vm parameter's > >> value? e.g. get -Xss tell me the default Xss or get -Xss pid tell > >> me this running process's Xss value > >> _______________________________________________ > >> hotspot-gc-use mailing list > >> hotspot-gc-use at openjdk.java.net > >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > >> > > > > > > _______________________________________________ > > hotspot-gc-use mailing list > > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > From vitalyd at gmail.com Wed May 7 23:34:20 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 7 May 2014 19:34:20 -0400 Subject: ParNew - how does it decide if Full GC is needed Message-ID: Hi guys, I'd like to get some clarity on exactly how ParNew GC decides whether to follow-up a young GC with a full GC. Here's a snippet of a young GC followed up by full GC on jdk 7u51: 29524.949: [GC 11279905K->4112377K(15728640K), 1.6319030 secs] 29526.581: [Full GC 4112377K->4085613K(15728640K), 6.8704770 secs] The vm args are: -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 So it's expected that most objects in the young gen will die after a young collection, and that's the case here (~7gb collected). We have about 4gb survivors, which obviously overflows the two survivor spaces (they're 1gb each). Is the survivor space overflow the reason the full gc is initiated, and obviously doesn't clear much (again, as expected)? It also appears that both survivor spaces are completely empty after this full gc, whereas I'd expect some objects to stay there and only some overflow amount would be promoted to tenured size. The other possible theory behind why full gc occurs in this case is that ParNew cannot predict how much space a young gc will clear (i.e. it doesn't know that majority will be collected) and given that tenured size is pretty small compared to eden, it initiates a full gc. Alternatively, since all objects apparently got promoted to tenured after this collection (why, by the way? we don't reduce tenuring threshold and this was only the 2nd GC of the day) and the promoted size is something like 98% of old gen capacity, GC panics and does a full GC in hopes of leaving itself some breathing room in the tenured space. Or is there something else entirely? I'd greatly appreciate if someone could explain the above. Thanks -------------- next part -------------- An HTML attachment was scrubbed...
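Several of the replies that follow ask for more verbose GC logging before diagnosing this. As a sketch, a logging run along those lines could look like the following; the switch list is assembled from suggestions later in the thread, and gc.log and MainClass are placeholders:

java <existing heap options> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -Xloggc:gc.log MainClass

-XX:+PrintTenuringDistribution is most informative with ParNew/DefNew, as Ramki notes further down; with the parallel collector it typically shows only the desired survivor size and threshold, not a per-age table. Alongside the log, jstat -gccause <pid> 1000 reports the cause recorded for the most recent collection, which matches the "allocation failure" cause mentioned via jstat later in the thread.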
URL: From bernd-2014 at eckenfels.net Thu May 8 00:40:24 2014 From: bernd-2014 at eckenfels.net (Bernd Eckenfels) Date: Thu, 8 May 2014 02:40:24 +0200 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: Message-ID: <20140508024024.000054a6.bernd-2014@eckenfels.net> Am Wed, 7 May 2014 19:34:20 -0400 schrieb Vitaly Davidovich : > The vm args are: > > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. You will get 16384-12288=4gb old size, thats quite low. As you can see in your FullGC the steady state after FullGC has filled it nearly completely. Gruss Bernd From matthew.miller at forgerock.com Thu May 8 00:46:38 2014 From: matthew.miller at forgerock.com (Matt Miller) Date: Wed, 07 May 2014 20:46:38 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: Message-ID: <536AD3EE.4000409@forgerock.com> Are you sure you didn't run a jmap to get a histogram or a heap dump at this time? A full GC with the tenured space that wide open doesn't make much sense otherwise. When you take a histogram or heap dump, the GC log doesn't tell you. I think it should, but others have said that this is the expected behavior. I'd much rather it be like a System.gc() which prints (System) in the GC Log -Matt On 5/7/14, 7:34 PM, Vitaly Davidovich wrote: > Hi guys, > > I'd like to get some clarity on exactly how ParNew GC decides whether > to follow-up a young GC with a full GC. Here's a snippet of a young > GC followed up by full GC on jdk 7u51: > > 29524.949: [GC 11279905K->4112377K(15728640K), 1.6319030 secs] > 29526.581: [Full GC 4112377K->4085613K(15728640K), 6.8704770 secs] > > The vm args are: > > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 > > So it's expected that most objects in the young gen will die after a > young collection, and that's the case here (~7gb collected). We have > about 4gb survivors, which obviously overflows the two survivor spaces > (they're 1gb each). Is the survivor space overflow the reason the > full gc is initiated, and obviously doesn't clear much (again, as > expected)? It also appears that both survivor spaces are completely > empty after this full gc, whereas I'd expect some objects to stay > there and only some overflow amount would be promoted to tenured size. > > The other possible theory behind why full gc occurs in this case is > that ParNew cannot predict how much space a young gc will clear (i.e. > it doesn't know that majority will be collected) and given that > tenured size is pretty small compared to eden, it initiates a full gc. > Alternatively, since all objects apparently got promoted to tenured > after this collection (why, by the way? we don't reduce tenuring > threshold and this was only the 2nd GC of the day) and the promoted > size is something like 98% of old gen capacity, GC panics and does a > full GC in hopes of leaving itself some breathing room in the tenured > space. > > Or is there something else entirely? > > I'd greatly appreciate if someone could explain the above. > > Thanks > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
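Bernd's arithmetic above generalizes, and it also bears on the heap-structure question in the first thread of this digest: with HotSpot 7 and earlier the old generation is simply the heap (-Xmx) minus the young generation, and the permanent generation is sized separately via -XX:PermSize/-XX:MaxPermSize outside -Xmx. Here 16384m - 12288m leaves a 4096m old gen, and in Vipin's case 2048m - 64m leaves roughly 1984m of old gen, which matches the PSOldGen total of 2031616K in Jon's sample output there; the heap is not mostly permanent generation. A quick way to confirm the split the VM actually settled on, and which of the conflicting young-gen settings won, is PrintFlagsFinal again; the grep pattern is only an example and the exact rows vary by JDK build:

java -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 -XX:+PrintFlagsFinal -version | grep -E 'NewSize|OldSize|HeapSize'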
URL: From vitalyd at gmail.com Thu May 8 00:55:45 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 7 May 2014 20:55:45 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: <20140508024024.000054a6.bernd-2014@eckenfels.net> References: <20140508024024.000054a6.bernd-2014@eckenfels.net> Message-ID: Yes, I know :) This is some cruft that needs to be cleaned up. So my suspicion is that full gc is triggered precisely because old gen occupancy is almost 100%, but I'd appreciate confirmation on that. What's surprising is that even though old gen is almost full, young gen has lots of room now. In fact, this system is restarted daily so we never see another young gc before the restart. The other odd observation is that survivor spaces are completely empty after this full gc despite tenuring threshold not being adjusted. My intuitive thinking is that there was no real reason for the full gc to occur; whatever allocation failed in young could now succeed and whatever was tenured fit, albeit very tightly. Sent from my phone On May 7, 2014 8:40 PM, "Bernd Eckenfels" wrote: > Am Wed, 7 May 2014 19:34:20 -0400 > schrieb Vitaly Davidovich : > > > The vm args are: > > > > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m > > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 > > Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. > You will get 16384-12288=4gb old size, thats quite low. As you can see > in your FullGC the steady state after FullGC has filled it nearly > completely. > > Gruss > Bernd > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu May 8 01:01:52 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 7 May 2014 21:01:52 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: <536AD3EE.4000409@forgerock.com> References: <536AD3EE.4000409@forgerock.com> Message-ID: I'm sure jmap wasn't run at this time. jstat reports last gc reason as allocation failure as well, although that's secondary. Right, I'm just as puzzled as to why a full gc occurred given that eden opened up again after the young collection. Only thing I can think of is either (a) gc decides on whether to do a full collection before reclaiming eden and thus is paranoid given the imbalance in size between eden and tenured or (b) it panics that after promotion tenured is nearly 100% full and decides to proactively collect it. Would be nice to figure out exactly what the gc is thinking here; I can then figure out the proper tuning approach. Sent from my phone On May 7, 2014 8:46 PM, "Matt Miller" wrote: > Are you sure you didn't run a jmap to get a histogram or a heap dump at > this time? > A full GC with the tenured space that wide open doesn't make much sense > otherwise. > > When you take a histogram or heap dump, the GC log doesn't tell you. I > think it should, but others have said that this is the expected behavior. > I'd much rather it be like a System.gc() which prints (System) in the GC Log > > -Matt > > On 5/7/14, 7:34 PM, Vitaly Davidovich wrote: > > Hi guys, > > I'd like to get some clarity on exactly how ParNew GC decides whether to > follow-up a young GC with a full GC. 
Here's a snippet of a young GC > followed up by full GC on jdk 7u51: > > 29524.949: [GC 11279905K->4112377K(15728640K), 1.6319030 secs] > 29526.581: [Full GC 4112377K->4085613K(15728640K), 6.8704770 secs] > > The vm args are: > > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 > > So it's expected that most objects in the young gen will die after a > young collection, and that's the case here (~7gb collected). We have about > 4gb survivors, which obviously overflows the two survivor spaces (they're > 1gb each). Is the survivor space overflow the reason the full gc is > initiated, and obviously doesn't clear much (again, as expected)? It also > appears that both survivor spaces are completely empty after this full gc, > whereas I'd expect some objects to stay there and only some overflow amount > would be promoted to tenured size. > > The other possible theory behind why full gc occurs in this case is that > ParNew cannot predict how much space a young gc will clear (i.e. it doesn't > know that majority will be collected) and given that tenured size is pretty > small compared to eden, it initiates a full gc. Alternatively, since all > objects apparently got promoted to tenured after this collection (why, by > the way? we don't reduce tenuring threshold and this was only the 2nd GC of > the day) and the promoted size is something like 98% of old gen capacity, > GC panics and does a full GC in hopes of leaving itself some breathing room > in the tenured space. > > Or is there something else entirely? > > I'd greatly appreciate if someone could explain the above. > > Thanks > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu May 8 01:10:40 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Wed, 7 May 2014 21:10:40 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <536AD3EE.4000409@forgerock.com> Message-ID: Actually, one thing I just thought of is we force a couple of gc's at startup time (to reclaim garbage generated during init) and if that causes premature promotion to tenured space (again, not sure why that would be given we use default tenuring threshold of 7) then perhaps we hit a promotion failure during this real GC since old gen capacity is not large enough. However, I'd still appreciate it if someone could shed some light on any other possibilities here. I should also mention that I checked perm gen and direct mem usage (I.e. direct byte bufs) and neither one appears to be a suspect. Sent from my phone On May 7, 2014 9:01 PM, "Vitaly Davidovich" wrote: > I'm sure jmap wasn't run at this time. jstat reports last gc reason as > allocation failure as well, although that's secondary. > > Right, I'm just as puzzled as to why a full gc occurred given that eden > opened up again after the young collection. Only thing I can think of is > either (a) gc decides on whether to do a full collection before reclaiming > eden and thus is paranoid given the imbalance in size between eden and > tenured or (b) it panics that after promotion tenured is nearly 100% full > and decides to proactively collect it. > > Would be nice to figure out exactly what the gc is thinking here; I can > then figure out the proper tuning approach. 
> > Sent from my phone > On May 7, 2014 8:46 PM, "Matt Miller" > wrote: > >> Are you sure you didn't run a jmap to get a histogram or a heap dump at >> this time? >> A full GC with the tenured space that wide open doesn't make much sense >> otherwise. >> >> When you take a histogram or heap dump, the GC log doesn't tell you. I >> think it should, but others have said that this is the expected behavior. >> I'd much rather it be like a System.gc() which prints (System) in the GC Log >> >> -Matt >> >> On 5/7/14, 7:34 PM, Vitaly Davidovich wrote: >> >> Hi guys, >> >> I'd like to get some clarity on exactly how ParNew GC decides whether >> to follow-up a young GC with a full GC. Here's a snippet of a young GC >> followed up by full GC on jdk 7u51: >> >> 29524.949: [GC 11279905K->4112377K(15728640K), 1.6319030 secs] >> 29526.581: [Full GC 4112377K->4085613K(15728640K), 6.8704770 secs] >> >> The vm args are: >> >> -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >> -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >> >> So it's expected that most objects in the young gen will die after a >> young collection, and that's the case here (~7gb collected). We have about >> 4gb survivors, which obviously overflows the two survivor spaces (they're >> 1gb each). Is the survivor space overflow the reason the full gc is >> initiated, and obviously doesn't clear much (again, as expected)? It also >> appears that both survivor spaces are completely empty after this full gc, >> whereas I'd expect some objects to stay there and only some overflow amount >> would be promoted to tenured size. >> >> The other possible theory behind why full gc occurs in this case is >> that ParNew cannot predict how much space a young gc will clear (i.e. it >> doesn't know that majority will be collected) and given that tenured size >> is pretty small compared to eden, it initiates a full gc. Alternatively, >> since all objects apparently got promoted to tenured after this collection >> (why, by the way? we don't reduce tenuring threshold and this was only the >> 2nd GC of the day) and the promoted size is something like 98% of old gen >> capacity, GC panics and does a full GC in hopes of leaving itself some >> breathing room in the tenured space. >> >> Or is there something else entirely? >> >> I'd greatly appreciate if someone could explain the above. >> >> Thanks >> >> >> _______________________________________________ >> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bernd-2014 at eckenfels.net Thu May 8 05:39:24 2014 From: bernd-2014 at eckenfels.net (Bernd Eckenfels) Date: Thu, 8 May 2014 07:39:24 +0200 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <536AD3EE.4000409@forgerock.com> Message-ID: <20140508073924.00002489.bernd-2014@eckenfels.net> Am Wed, 7 May 2014 21:10:40 -0400 schrieb Vitaly Davidovich : > However, I'd still appreciate it if someone could shed some light on > any other possibilities here. I should also mention that I checked > perm gen and direct mem usage (I.e. direct byte bufs) and neither one > appears to be a suspect. It would be good if you turn on verbose garbage collection logging and post the logs somewhere, it is hard to tell anything otherwise. To me it looks like there is more promoted to old gen than you expect. 
Gruss Bernd From jwu at gmx.ch Thu May 8 17:16:49 2014 From: jwu at gmx.ch (=?ISO-8859-1?Q?J=F6rg_W=FCthrich?=) Date: Thu, 08 May 2014 19:16:49 +0200 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: <20140508073924.00002489.bernd-2014@eckenfels.net> References: <536AD3EE.4000409@forgerock.com> <20140508073924.00002489.bernd-2014@eckenfels.net> Message-ID: <2ab5cbce-6837-4d87-9aa4-5035f169685e@email.android.com> Another possibility is perm gen that runs full. To see that, you'll have to switch to a more verbose log (use -XX:+PrintGCDetails), as Bernd mentioned. Best regards, J?rg On May 8, 2014 7:39:24 AM CEST, Bernd Eckenfels wrote: >Am Wed, 7 May 2014 21:10:40 -0400 >schrieb Vitaly Davidovich : > >> However, I'd still appreciate it if someone could shed some light on >> any other possibilities here. I should also mention that I checked >> perm gen and direct mem usage (I.e. direct byte bufs) and neither one >> appears to be a suspect. > >It would be good if you turn on verbose garbage collection logging and >post the logs somewhere, it is hard to tell anything otherwise. To me >it looks like there is more promoted to old gen than you expect. > >Gruss >Bernd >_______________________________________________ >hotspot-gc-use mailing list >hotspot-gc-use at openjdk.java.net >http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Thu May 8 17:39:08 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 08 May 2014 10:39:08 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> Message-ID: <536BC13C.5020307@oracle.com> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: > > Yes, I know :) This is some cruft that needs to be cleaned up. > > So my suspicion is that full gc is triggered precisely because old gen > occupancy is almost 100%, but I'd appreciate confirmation on that. > What's surprising is that even though old gen is almost full, young > gen has lots of room now. In fact, this system is restarted daily so > we never see another young gc before the restart. > > The other odd observation is that survivor spaces are completely empty > after this full gc despite tenuring threshold not being adjusted. > The full gc algorithm used compacts everything (old gen and young gen) into the old gen unless it does not all fit. If the old gen overflows, the young gen is compacted into itself. Live in the young gen is compacted into eden first and then into the survivor spaces. > > My intuitive thinking is that there was no real reason for the full gc > to occur; whatever allocation failed in young could now succeed and > whatever was tenured fit, albeit very tightly. > Still puzzling about the full GC. Are you using CMS? If you have PrintGCDetails output, that might help. Jon > > Sent from my phone > > On May 7, 2014 8:40 PM, "Bernd Eckenfels" > wrote: > > Am Wed, 7 May 2014 19:34:20 -0400 > schrieb Vitaly Davidovich >: > > > The vm args are: > > > > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m > > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 > > Hmm... you have confliciting arguments here, MaxNewSize overwrites > Xmn. > You will get 16384-12288=4gb old size, thats quite low. As you can see > in your FullGC the steady state after FullGC has filled it nearly > completely. 
> > Gruss > Bernd > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu May 8 18:38:35 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 8 May 2014 14:38:35 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: <536BC13C.5020307@oracle.com> References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> Message-ID: Hi Jon, Nope, we're not using CMS here; this is the throughput/parallel collector setup. I was browsing some of the gc code in openjdk, and noticed a few places where each generation attempts to decide (upfront from what I can tell, i.e. before doing the collection) whether it thinks it's "safe" to perform the collection (and if it's not, it punts to the next generation) and also whether some amount of promoted bytes will fit. I didn't dig too much yet, but a cursory scan of that code leads me to think that perhaps the defNew generation is asking the next gen (i.e. tenured) whether it could handle some estimated promotion amount, and given the large imbalance between Young and Tenured size, tenured is reporting that things won't fit -- this then causes a full gc. Is that at all possible from what you know? On your first remark about compaction, just to make sure I understand, you're saying that a full GC prefers to move all live objects into tenured (this means taking objects out of survivor space and eden), irrespective of whether their tenuring threshold has been exceeded? If that compaction/migration of objects into tenured overflows tenured, then it attempts to compact the young gen, with overflow into survivor space from eden. So basically, this generation knows how to perform compaction and it's not just a copying collection? Is there a way to get the young gen to print an age table of objects in its survivor space? I couldn't find one, but perhaps I'm blind. Also, as a confirmation, System.gc() always invokes a full gc with the parallel collector, right? I believe so, but just wanted to double check while we're on the topic. Thanks On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu wrote: > > On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: > > Yes, I know :) This is some cruft that needs to be cleaned up. > > So my suspicion is that full gc is triggered precisely because old gen > occupancy is almost 100%, but I'd appreciate confirmation on that. What's > surprising is that even though old gen is almost full, young gen has lots > of room now. In fact, this system is restarted daily so we never see > another young gc before the restart. > > The other odd observation is that survivor spaces are completely empty > after this full gc despite tenuring threshold not being adjusted. > > > The full gc algorithm used compacts everything (old gen and young gen) into > the old gen unless it does not all fit. If the old gen overflows, the > young gen > is compacted into itself. Live in the young gen is compacted into eden > first and > then into the survivor spaces. 
> > My intuitive thinking is that there was no real reason for the full gc > to occur; whatever allocation failed in young could now succeed and > whatever was tenured fit, albeit very tightly. > > > Still puzzling about the full GC. Are you using CMS? If you have > PrintGCDetails output, > that might help. > > Jon > > Sent from my phone > On May 7, 2014 8:40 PM, "Bernd Eckenfels" > wrote: > >> Am Wed, 7 May 2014 19:34:20 -0400 >> schrieb Vitaly Davidovich : >> >> > The vm args are: >> > >> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >> >> Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. >> You will get 16384-12288=4gb old size, thats quite low. As you can see >> in your FullGC the steady state after FullGC has filled it nearly >> completely. >> >> Gruss >> Bernd >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysr1729 at gmail.com Thu May 8 19:34:11 2014 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 8 May 2014 12:34:11 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> Message-ID: Hi Vitaly -- On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich wrote: > Hi Jon, > > Nope, we're not using CMS here; this is the throughput/parallel collector > setup. > > I was browsing some of the gc code in openjdk, and noticed a few places > where each generation attempts to decide (upfront from what I can tell, > i.e. before doing the collection) whether it thinks it's "safe" to perform > the collection (and if it's not, it punts to the next generation) and also > whether some amount of promoted bytes will fit. > > I didn't dig too much yet, but a cursory scan of that code leads me to > think that perhaps the defNew generation is asking the next gen (i.e. > tenured) whether it could handle some estimated promotion amount, and given > the large imbalance between Young and Tenured size, tenured is reporting > that things won't fit -- this then causes a full gc. Is that at all > possible from what you know? > If that were to happen, you wouldn't see the minor gc that precedes the full gc in the log snippet you posted. The only situation I know where a minor GC is followed immediately by a major is when a minor gc didn't manage to fit an allocation request in the space available. But, thinking more about that, it can't be because one would expect that Eden knows the largest object it can allocate, so if the request is larger than will fit in young, the allocator would just go look for space in the older generation. If that didn't fit, the old gen would precipitate a gc which would collect the entire heap (all this should be taken with a dose of salt as I don't have the code in front of me as I type, and I haven't looked at the allocation policy code in ages). 
> > On your first remark about compaction, just to make sure I understand, > you're saying that a full GC prefers to move all live objects into tenured > (this means taking objects out of survivor space and eden), irrespective of > whether their tenuring threshold has been exceeded? If that > compaction/migration of objects into tenured overflows tenured, then it > attempts to compact the young gen, with overflow into survivor space from > eden. So basically, this generation knows how to perform compaction and > it's not just a copying collection? > That is correct. A full gc does in fact move all survivors from young gen into the old gen. This is a limitation (artificial nepotism can ensue because of "too young" objects that will soon die, getting artificially dragged into the old generation) that I had been lobbying to fix for a while now. I think there's even an old, perhaps still open, bug for this. > Is there a way to get the young gen to print an age table of objects in > its survivor space? I couldn't find one, but perhaps I'm blind. > +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) > > Also, as a confirmation, System.gc() always invokes a full gc with the > parallel collector, right? I believe so, but just wanted to double check > while we're on the topic. > Right. (Not sure what happens if JNI critical section is in force -- whether it's skipped or we wait for the JNI CS to exit/complete; hopefully others can fill in the blanks/inaccuracies in my comments above, since they are based on things that used to be a while ago in code I haven't looked at recently.) -- ramki > > Thanks > > > On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu wrote: > >> >> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >> >> Yes, I know :) This is some cruft that needs to be cleaned up. >> >> So my suspicion is that full gc is triggered precisely because old gen >> occupancy is almost 100%, but I'd appreciate confirmation on that. What's >> surprising is that even though old gen is almost full, young gen has lots >> of room now. In fact, this system is restarted daily so we never see >> another young gc before the restart. >> >> The other odd observation is that survivor spaces are completely empty >> after this full gc despite tenuring threshold not being adjusted. >> >> >> The full gc algorithm used compacts everything (old gen and young gen) >> into >> the old gen unless it does not all fit. If the old gen overflows, the >> young gen >> is compacted into itself. Live in the young gen is compacted into eden >> first and >> then into the survivor spaces. >> >> My intuitive thinking is that there was no real reason for the full gc >> to occur; whatever allocation failed in young could now succeed and >> whatever was tenured fit, albeit very tightly. >> >> >> Still puzzling about the full GC. Are you using CMS? If you have >> PrintGCDetails output, >> that might help. >> >> Jon >> >> Sent from my phone >> On May 7, 2014 8:40 PM, "Bernd Eckenfels" >> wrote: >> >>> Am Wed, 7 May 2014 19:34:20 -0400 >>> schrieb Vitaly Davidovich : >>> >>> > The vm args are: >>> > >>> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>> >>> Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. >>> You will get 16384-12288=4gb old size, thats quite low. As you can see >>> in your FullGC the steady state after FullGC has filled it nearly >>> completely. 
>>> >>> Gruss >>> Bernd >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >> >> >> _______________________________________________ >> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ysr1729 at gmail.com Thu May 8 19:36:47 2014 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 8 May 2014 12:36:47 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> Message-ID: By the way, as others have noted, -XX:+PrintGCDetails at max verbosity level would be your friend to get more visibility into this. Include -XX:+PrintHeapAtGC for even better visibility. For good measure, after the puzzling full gc happens (and hopefully before another GC happens) capture jstat data re the heap (old gen), for direct allocation visibility. -- ramki On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna wrote: > Hi Vitaly -- > > > On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich wrote: > >> Hi Jon, >> >> Nope, we're not using CMS here; this is the throughput/parallel collector >> setup. >> >> I was browsing some of the gc code in openjdk, and noticed a few places >> where each generation attempts to decide (upfront from what I can tell, >> i.e. before doing the collection) whether it thinks it's "safe" to perform >> the collection (and if it's not, it punts to the next generation) and also >> whether some amount of promoted bytes will fit. >> >> I didn't dig too much yet, but a cursory scan of that code leads me to >> think that perhaps the defNew generation is asking the next gen (i.e. >> tenured) whether it could handle some estimated promotion amount, and given >> the large imbalance between Young and Tenured size, tenured is reporting >> that things won't fit -- this then causes a full gc. Is that at all >> possible from what you know? >> > > If that were to happen, you wouldn't see the minor gc that precedes the > full gc in the log snippet you posted. > > The only situation I know where a minor GC is followed immediately by a > major is when a minor gc didn't manage to fit an allocation request in the > space available. But, thinking more about that, it can't be because one > would expect that Eden knows the largest object it can allocate, so if the > request is larger than will fit in young, the allocator would just go look > for space in the older generation. If that didn't fit, the old gen would > precipitate a gc which would collect the entire heap (all this should be > taken with a dose of salt as I don't have the code in front of me as I > type, and I haven't looked at the allocation policy code in ages). 
> > >> >> On your first remark about compaction, just to make sure I understand, >> you're saying that a full GC prefers to move all live objects into tenured >> (this means taking objects out of survivor space and eden), irrespective of >> whether their tenuring threshold has been exceeded? If that >> compaction/migration of objects into tenured overflows tenured, then it >> attempts to compact the young gen, with overflow into survivor space from >> eden. So basically, this generation knows how to perform compaction and >> it's not just a copying collection? >> > > That is correct. A full gc does in fact move all survivors from young gen > into the old gen. This is a limitation (artificial nepotism can ensue > because of "too young" objects that will soon die, getting artificially > dragged into the old generation) that I had been lobbying to fix for a > while now. I think there's even an old, perhaps still open, bug for this. > > >> Is there a way to get the young gen to print an age table of objects in >> its survivor space? I couldn't find one, but perhaps I'm blind. >> > > +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) > > >> >> Also, as a confirmation, System.gc() always invokes a full gc with the >> parallel collector, right? I believe so, but just wanted to double check >> while we're on the topic. >> > > Right. (Not sure what happens if JNI critical section is in force -- > whether it's skipped or we wait for the JNI CS to exit/complete; hopefully > others can fill in the blanks/inaccuracies in my comments above, since they > are based on things that used to be a while ago in code I haven't looked at > recently.) > > -- ramki > > >> >> Thanks >> >> >> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu wrote: >> >>> >>> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>> >>> Yes, I know :) This is some cruft that needs to be cleaned up. >>> >>> So my suspicion is that full gc is triggered precisely because old gen >>> occupancy is almost 100%, but I'd appreciate confirmation on that. What's >>> surprising is that even though old gen is almost full, young gen has lots >>> of room now. In fact, this system is restarted daily so we never see >>> another young gc before the restart. >>> >>> The other odd observation is that survivor spaces are completely empty >>> after this full gc despite tenuring threshold not being adjusted. >>> >>> >>> The full gc algorithm used compacts everything (old gen and young gen) >>> into >>> the old gen unless it does not all fit. If the old gen overflows, the >>> young gen >>> is compacted into itself. Live in the young gen is compacted into eden >>> first and >>> then into the survivor spaces. >>> >>> My intuitive thinking is that there was no real reason for the full gc >>> to occur; whatever allocation failed in young could now succeed and >>> whatever was tenured fit, albeit very tightly. >>> >>> >>> Still puzzling about the full GC. Are you using CMS? If you have >>> PrintGCDetails output, >>> that might help. >>> >>> Jon >>> >>> Sent from my phone >>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" >>> wrote: >>> >>>> Am Wed, 7 May 2014 19:34:20 -0400 >>>> schrieb Vitaly Davidovich : >>>> >>>> > The vm args are: >>>> > >>>> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >>>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>>> >>>> Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. >>>> You will get 16384-12288=4gb old size, thats quite low. 
As you can see >>>> in your FullGC the steady state after FullGC has filled it nearly >>>> completely. >>>> >>>> Gruss >>>> Bernd >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu May 8 19:52:31 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 8 May 2014 15:52:31 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> Message-ID: Thanks Ramki, very helpful info. If you don't mind, I have a few follow-up questions. The only situation I know where a minor GC is followed immediately by a > major is when a minor gc didn't manage to fit an allocation request in the > space available. But, thinking more about that, it can't be because one > would expect that Eden knows the largest object it can allocate, so if the > request is larger than will fit in young, the allocator would just go look > for space in the older generation. If that didn't fit, the old gen would > precipitate a gc which would collect the entire heap (all this should be > taken with a dose of salt as I don't have the code in front of me as I > type, and I haven't looked at the allocation policy code in ages). So in this case, you're referring to an allocation that doesn't fit even after young is collected, right? If so, in my case, I don't think that's an issue since we have gigabytes left after the young collection finishes. That is correct. A full gc does in fact move all survivors from young gen > into the old gen. This is a limitation (artificial nepotism can ensue > because of "too young" objects that will soon die, getting artificially > dragged into the old generation) that I had been lobbying to fix for a > while now. I think there's even an old, perhaps still open, bug for this. Indeed, this does sound less than optimal. I wonder if anyone has a link to the bug for this and/or knows its status. +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) I tried this before sending out the last email, but it didn't seem to print anything of interest (I induced a gc via System.gc(), but I doubt that makes any difference). Do I need to have any other option (e.g. verbosity) enabled before this spits anything out? I'll try it again though. Thanks On Thu, May 8, 2014 at 3:34 PM, Srinivas Ramakrishna wrote: > Hi Vitaly -- > > > On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich wrote: > >> Hi Jon, >> >> Nope, we're not using CMS here; this is the throughput/parallel collector >> setup. >> >> I was browsing some of the gc code in openjdk, and noticed a few places >> where each generation attempts to decide (upfront from what I can tell, >> i.e. 
before doing the collection) whether it thinks it's "safe" to perform >> the collection (and if it's not, it punts to the next generation) and also >> whether some amount of promoted bytes will fit. >> >> I didn't dig too much yet, but a cursory scan of that code leads me to >> think that perhaps the defNew generation is asking the next gen (i.e. >> tenured) whether it could handle some estimated promotion amount, and given >> the large imbalance between Young and Tenured size, tenured is reporting >> that things won't fit -- this then causes a full gc. Is that at all >> possible from what you know? >> > > If that were to happen, you wouldn't see the minor gc that precedes the > full gc in the log snippet you posted. > > The only situation I know where a minor GC is followed immediately by a > major is when a minor gc didn't manage to fit an allocation request in the > space available. But, thinking more about that, it can't be because one > would expect that Eden knows the largest object it can allocate, so if the > request is larger than will fit in young, the allocator would just go look > for space in the older generation. If that didn't fit, the old gen would > precipitate a gc which would collect the entire heap (all this should be > taken with a dose of salt as I don't have the code in front of me as I > type, and I haven't looked at the allocation policy code in ages). > > >> >> On your first remark about compaction, just to make sure I understand, >> you're saying that a full GC prefers to move all live objects into tenured >> (this means taking objects out of survivor space and eden), irrespective of >> whether their tenuring threshold has been exceeded? If that >> compaction/migration of objects into tenured overflows tenured, then it >> attempts to compact the young gen, with overflow into survivor space from >> eden. So basically, this generation knows how to perform compaction and >> it's not just a copying collection? >> > > That is correct. A full gc does in fact move all survivors from young gen > into the old gen. This is a limitation (artificial nepotism can ensue > because of "too young" objects that will soon die, getting artificially > dragged into the old generation) that I had been lobbying to fix for a > while now. I think there's even an old, perhaps still open, bug for this. > > >> Is there a way to get the young gen to print an age table of objects in >> its survivor space? I couldn't find one, but perhaps I'm blind. >> > > +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) > > >> >> Also, as a confirmation, System.gc() always invokes a full gc with the >> parallel collector, right? I believe so, but just wanted to double check >> while we're on the topic. >> > > Right. (Not sure what happens if JNI critical section is in force -- > whether it's skipped or we wait for the JNI CS to exit/complete; hopefully > others can fill in the blanks/inaccuracies in my comments above, since they > are based on things that used to be a while ago in code I haven't looked at > recently.) > > -- ramki > > >> >> Thanks >> >> >> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu wrote: >> >>> >>> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>> >>> Yes, I know :) This is some cruft that needs to be cleaned up. >>> >>> So my suspicion is that full gc is triggered precisely because old gen >>> occupancy is almost 100%, but I'd appreciate confirmation on that. What's >>> surprising is that even though old gen is almost full, young gen has lots >>> of room now. 
In fact, this system is restarted daily so we never see >>> another young gc before the restart. >>> >>> The other odd observation is that survivor spaces are completely empty >>> after this full gc despite tenuring threshold not being adjusted. >>> >>> >>> The full gc algorithm used compacts everything (old gen and young gen) >>> into >>> the old gen unless it does not all fit. If the old gen overflows, the >>> young gen >>> is compacted into itself. Live in the young gen is compacted into eden >>> first and >>> then into the survivor spaces. >>> >>> My intuitive thinking is that there was no real reason for the full gc >>> to occur; whatever allocation failed in young could now succeed and >>> whatever was tenured fit, albeit very tightly. >>> >>> >>> Still puzzling about the full GC. Are you using CMS? If you have >>> PrintGCDetails output, >>> that might help. >>> >>> Jon >>> >>> Sent from my phone >>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" >>> wrote: >>> >>>> Am Wed, 7 May 2014 19:34:20 -0400 >>>> schrieb Vitaly Davidovich : >>>> >>>> > The vm args are: >>>> > >>>> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >>>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>>> >>>> Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. >>>> You will get 16384-12288=4gb old size, thats quite low. As you can see >>>> in your FullGC the steady state after FullGC has filled it nearly >>>> completely. >>>> >>>> Gruss >>>> Bernd >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu May 8 19:55:49 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 8 May 2014 15:55:49 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> Message-ID: I captured some usage and capacity stats via jstat right after that full gc that started this email thread. It showed 0% usage of survivor spaces (which makes sense now that I know that a full gc empties that out irrespective of tenuring threshold and object age); eden usage went down to like 10%; tenured usage was very high, 98%. Last gc cause was recorded as "Allocation Failure". So it's true that the tenured doesn't have much breathing room here, but what prompted this email is I don't understand why that even matters considering young gen got cleaned up quite nicely. On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna wrote: > > By the way, as others have noted, -XX:+PrintGCDetails at max verbosity > level would be your friend to get more visibility into this. Include > -XX:+PrintHeapAtGC for even better visibility. 
For good measure, after the > puzzling full gc happens (and hopefully before another GC happens) capture > jstat data re the heap (old gen), for direct allocation visibility. > > -- ramki > > > On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna wrote: > >> Hi Vitaly -- >> >> >> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich wrote: >> >>> Hi Jon, >>> >>> Nope, we're not using CMS here; this is the throughput/parallel >>> collector setup. >>> >>> I was browsing some of the gc code in openjdk, and noticed a few places >>> where each generation attempts to decide (upfront from what I can tell, >>> i.e. before doing the collection) whether it thinks it's "safe" to perform >>> the collection (and if it's not, it punts to the next generation) and also >>> whether some amount of promoted bytes will fit. >>> >>> I didn't dig too much yet, but a cursory scan of that code leads me to >>> think that perhaps the defNew generation is asking the next gen (i.e. >>> tenured) whether it could handle some estimated promotion amount, and given >>> the large imbalance between Young and Tenured size, tenured is reporting >>> that things won't fit -- this then causes a full gc. Is that at all >>> possible from what you know? >>> >> >> If that were to happen, you wouldn't see the minor gc that precedes the >> full gc in the log snippet you posted. >> >> The only situation I know where a minor GC is followed immediately by a >> major is when a minor gc didn't manage to fit an allocation request in the >> space available. But, thinking more about that, it can't be because one >> would expect that Eden knows the largest object it can allocate, so if the >> request is larger than will fit in young, the allocator would just go look >> for space in the older generation. If that didn't fit, the old gen would >> precipitate a gc which would collect the entire heap (all this should be >> taken with a dose of salt as I don't have the code in front of me as I >> type, and I haven't looked at the allocation policy code in ages). >> >> >>> >>> On your first remark about compaction, just to make sure I understand, >>> you're saying that a full GC prefers to move all live objects into tenured >>> (this means taking objects out of survivor space and eden), irrespective of >>> whether their tenuring threshold has been exceeded? If that >>> compaction/migration of objects into tenured overflows tenured, then it >>> attempts to compact the young gen, with overflow into survivor space from >>> eden. So basically, this generation knows how to perform compaction and >>> it's not just a copying collection? >>> >> >> That is correct. A full gc does in fact move all survivors from young gen >> into the old gen. This is a limitation (artificial nepotism can ensue >> because of "too young" objects that will soon die, getting artificially >> dragged into the old generation) that I had been lobbying to fix for a >> while now. I think there's even an old, perhaps still open, bug for this. >> >> >>> Is there a way to get the young gen to print an age table of objects in >>> its survivor space? I couldn't find one, but perhaps I'm blind. >>> >> >> +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) >> >> >>> >>> Also, as a confirmation, System.gc() always invokes a full gc with the >>> parallel collector, right? I believe so, but just wanted to double check >>> while we're on the topic. >>> >> >> Right. 
(Not sure what happens if JNI critical section is in force -- >> whether it's skipped or we wait for the JNI CS to exit/complete; hopefully >> others can fill in the blanks/inaccuracies in my comments above, since they >> are based on things that used to be a while ago in code I haven't looked at >> recently.) >> >> -- ramki >> >> >>> >>> Thanks >>> >>> >>> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu wrote: >>> >>>> >>>> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>>> >>>> Yes, I know :) This is some cruft that needs to be cleaned up. >>>> >>>> So my suspicion is that full gc is triggered precisely because old gen >>>> occupancy is almost 100%, but I'd appreciate confirmation on that. What's >>>> surprising is that even though old gen is almost full, young gen has lots >>>> of room now. In fact, this system is restarted daily so we never see >>>> another young gc before the restart. >>>> >>>> The other odd observation is that survivor spaces are completely empty >>>> after this full gc despite tenuring threshold not being adjusted. >>>> >>>> >>>> The full gc algorithm used compacts everything (old gen and young gen) >>>> into >>>> the old gen unless it does not all fit. If the old gen overflows, the >>>> young gen >>>> is compacted into itself. Live in the young gen is compacted into eden >>>> first and >>>> then into the survivor spaces. >>>> >>>> My intuitive thinking is that there was no real reason for the full >>>> gc to occur; whatever allocation failed in young could now succeed and >>>> whatever was tenured fit, albeit very tightly. >>>> >>>> >>>> Still puzzling about the full GC. Are you using CMS? If you have >>>> PrintGCDetails output, >>>> that might help. >>>> >>>> Jon >>>> >>>> Sent from my phone >>>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" >>>> wrote: >>>> >>>>> Am Wed, 7 May 2014 19:34:20 -0400 >>>>> schrieb Vitaly Davidovich : >>>>> >>>>> > The vm args are: >>>>> > >>>>> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >>>>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>>>> >>>>> Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. >>>>> You will get 16384-12288=4gb old size, thats quite low. As you can see >>>>> in your FullGC the steady state after FullGC has filled it nearly >>>>> completely. >>>>> >>>>> Gruss >>>>> Bernd >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>> >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>>> >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
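For reference, the jstat readings quoted in this thread can be taken with invocations along these lines (<pid> stands for the target JVM's process id):

    jstat -gccause <pid>        utilization percentages per space, plus the cause of the last GC (the "Allocation Failure" reading)
    jstat -gc <pid> 1000        raw capacities and usage in KB, sampled every 1000 ms
    jstat -gccapacity <pid>     configured minimum/maximum/current capacities of each generation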
URL: From ysr1729 at gmail.com Thu May 8 20:24:36 2014 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 8 May 2014 13:24:36 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> Message-ID: The 98% old gen occupancy triggered one of my two neurons. I think there was gc policy code (don't know if it's still there) that would proactively precipitate a full gc when it realized (based on recent/historical promotion volume stats) that the next minor gc would not be able to promote its survivors into the head room remaining in old. (Don't ask me why it's better to do it now rather than the next time the young gen fills up and just rely on the same check). Again I am not looking at the code (as it takes some effort to get to the box where I keep a copy of the hotspot/openjdk code.) Hopefully Jon &co. will quickly confirm or shoot down the imaginations of my foggy memory! -- ramki On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich wrote: > I captured some usage and capacity stats via jstat right after that full > gc that started this email thread. It showed 0% usage of survivor spaces > (which makes sense now that I know that a full gc empties that out > irrespective of tenuring threshold and object age); eden usage went down to > like 10%; tenured usage was very high, 98%. Last gc cause was recorded as > "Allocation Failure". So it's true that the tenured doesn't have much > breathing room here, but what prompted this email is I don't understand why > that even matters considering young gen got cleaned up quite nicely. > > > On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna wrote: > >> >> By the way, as others have noted, -XX:+PrintGCDetails at max verbosity >> level would be your friend to get more visibility into this. Include >> -XX:+PrintHeapAtGC for even better visibility. For good measure, after the >> puzzling full gc happens (and hopefully before another GC happens) capture >> jstat data re the heap (old gen), for direct allocation visibility. >> >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna wrote: >> >>> Hi Vitaly -- >>> >>> >>> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich wrote: >>> >>>> Hi Jon, >>>> >>>> Nope, we're not using CMS here; this is the throughput/parallel >>>> collector setup. >>>> >>>> I was browsing some of the gc code in openjdk, and noticed a few places >>>> where each generation attempts to decide (upfront from what I can tell, >>>> i.e. before doing the collection) whether it thinks it's "safe" to perform >>>> the collection (and if it's not, it punts to the next generation) and also >>>> whether some amount of promoted bytes will fit. >>>> >>>> I didn't dig too much yet, but a cursory scan of that code leads me to >>>> think that perhaps the defNew generation is asking the next gen (i.e. >>>> tenured) whether it could handle some estimated promotion amount, and given >>>> the large imbalance between Young and Tenured size, tenured is reporting >>>> that things won't fit -- this then causes a full gc. Is that at all >>>> possible from what you know? >>>> >>> >>> If that were to happen, you wouldn't see the minor gc that precedes the >>> full gc in the log snippet you posted. >>> >>> The only situation I know where a minor GC is followed immediately by a >>> major is when a minor gc didn't manage to fit an allocation request in the >>> space available.
But, thinking more about that, it can't be because one >>> would expect that Eden knows the largest object it can allocate, so if the >>> request is larger than will fit in young, the allocator would just go look >>> for space in the older generation. If that didn't fit, the old gen would >>> precipitate a gc which would collect the entire heap (all this should be >>> taken with a dose of salt as I don't have the code in front of me as I >>> type, and I haven't looked at the allocation policy code in ages). >>> >>> >>>> >>>> On your first remark about compaction, just to make sure I understand, >>>> you're saying that a full GC prefers to move all live objects into tenured >>>> (this means taking objects out of survivor space and eden), irrespective of >>>> whether their tenuring threshold has been exceeded? If that >>>> compaction/migration of objects into tenured overflows tenured, then it >>>> attempts to compact the young gen, with overflow into survivor space from >>>> eden. So basically, this generation knows how to perform compaction and >>>> it's not just a copying collection? >>>> >>> >>> That is correct. A full gc does in fact move all survivors from young >>> gen into the old gen. This is a limitation (artificial nepotism can ensue >>> because of "too young" objects that will soon die, getting artificially >>> dragged into the old generation) that I had been lobbying to fix for a >>> while now. I think there's even an old, perhaps still open, bug for this. >>> >>> >>>> Is there a way to get the young gen to print an age table of objects in >>>> its survivor space? I couldn't find one, but perhaps I'm blind. >>>> >>> >>> +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) >>> >>> >>>> >>>> Also, as a confirmation, System.gc() always invokes a full gc with the >>>> parallel collector, right? I believe so, but just wanted to double check >>>> while we're on the topic. >>>> >>> >>> Right. (Not sure what happens if JNI critical section is in force -- >>> whether it's skipped or we wait for the JNI CS to exit/complete; hopefully >>> others can fill in the blanks/inaccuracies in my comments above, since they >>> are based on things that used to be a while ago in code I haven't looked at >>> recently.) >>> >>> -- ramki >>> >>> >>>> >>>> Thanks >>>> >>>> >>>> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu >>> > wrote: >>>> >>>>> >>>>> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>>>> >>>>> Yes, I know :) This is some cruft that needs to be cleaned up. >>>>> >>>>> So my suspicion is that full gc is triggered precisely because old gen >>>>> occupancy is almost 100%, but I'd appreciate confirmation on that. What's >>>>> surprising is that even though old gen is almost full, young gen has lots >>>>> of room now. In fact, this system is restarted daily so we never see >>>>> another young gc before the restart. >>>>> >>>>> The other odd observation is that survivor spaces are completely empty >>>>> after this full gc despite tenuring threshold not being adjusted. >>>>> >>>>> >>>>> The full gc algorithm used compacts everything (old gen and young gen) >>>>> into >>>>> the old gen unless it does not all fit. If the old gen overflows, >>>>> the young gen >>>>> is compacted into itself. Live in the young gen is compacted into >>>>> eden first and >>>>> then into the survivor spaces. 
>>>>> >>>>> My intuitive thinking is that there was no real reason for the full >>>>> gc to occur; whatever allocation failed in young could now succeed and >>>>> whatever was tenured fit, albeit very tightly. >>>>> >>>>> >>>>> Still puzzling about the full GC. Are you using CMS? If you have >>>>> PrintGCDetails output, >>>>> that might help. >>>>> >>>>> Jon >>>>> >>>>> Sent from my phone >>>>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" >>>>> wrote: >>>>> >>>>>> Am Wed, 7 May 2014 19:34:20 -0400 >>>>>> schrieb Vitaly Davidovich : >>>>>> >>>>>> > The vm args are: >>>>>> > >>>>>> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >>>>>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>>>>> >>>>>> Hmm... you have confliciting arguments here, MaxNewSize overwrites >>>>>> Xmn. >>>>>> You will get 16384-12288=4gb old size, thats quite low. As you can see >>>>>> in your FullGC the steady state after FullGC has filled it nearly >>>>>> completely. >>>>>> >>>>>> Gruss >>>>>> Bernd >>>>>> _______________________________________________ >>>>>> hotspot-gc-use mailing list >>>>>> hotspot-gc-use at openjdk.java.net >>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu May 8 20:51:42 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 8 May 2014 16:51:42 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> Message-ID: So this young + full gc was the first non-forced GC of this particular jvm's lifetime. There were a couple of forced ones (i.e. System.gc()) very early on, and then the next GC event was this young + full cycle. From what I've gathered, System.gc() induced collections do not feed into stats/heuristics used for subsequent gc events -- someone please correct me if that's wrong. If that is indeed wrong, and those two early forced GCs were somehow incorporated into heuristics, then I'd like to know what exactly would be incorporated. Given that full gc always moves objects out of young into tenured, I wonder if that skews historical promotion volume stats. So 98% usage of tenured was post the full gc event; prior to that, it would've been ~20% used. On Thu, May 8, 2014 at 4:24 PM, Srinivas Ramakrishna wrote: > The 98% old gen occupancy triggered one of my two neurons. > I think there was gc policy code (don't know if it;s still there) that > would proactiively precipitate a full gc when it realized (based on > recent/historical promotion volume stats) that the next minor gc would not > be able to promote its survivors into the head room remaining in old. > (Don't ask me why it;s better to do it now rather than the next time the > young gen fills up and just rely on the same check). 
Again I am not looking > at the code (as it takes some effort to get to the box where I keep a copy > of the hotspot/openjdk code.) > > Hopefully Jon &co. will quickly confirm or shoot down the imaginations o > my foggy memory! > -- ramki > > > On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich wrote: > >> I captured some usage and capacity stats via jstat right after that full >> gc that started this email thread. It showed 0% usage of survivor spaces >> (which makes sense now that I know that a full gc empties that out >> irrespective of tenuring threshold and object age); eden usage went down to >> like 10%; tenured usage was very high, 98%. Last gc cause was recorded as >> "Allocation Failure". So it's true that the tenured doesn't have much >> breathing room here, but what prompted this email is I don't understand why >> that even matters considering young gen got cleaned up quite nicely. >> >> >> On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna wrote: >> >>> >>> By the way, as others have noted, -XX:+PrintGCDetails at max verbosity >>> level would be your friend to get more visibility into this. Include >>> -XX:+PrintHeapAtGC for even better visibility. For good measure, after the >>> puzzling full gc happens (and hopefully before another GC happens) capture >>> jstat data re the heap (old gen), for direct allocation visibility. >>> >>> -- ramki >>> >>> >>> On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna >> > wrote: >>> >>>> Hi Vitaly -- >>>> >>>> >>>> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich wrote: >>>> >>>>> Hi Jon, >>>>> >>>>> Nope, we're not using CMS here; this is the throughput/parallel >>>>> collector setup. >>>>> >>>>> I was browsing some of the gc code in openjdk, and noticed a few >>>>> places where each generation attempts to decide (upfront from what I can >>>>> tell, i.e. before doing the collection) whether it thinks it's "safe" to >>>>> perform the collection (and if it's not, it punts to the next generation) >>>>> and also whether some amount of promoted bytes will fit. >>>>> >>>>> I didn't dig too much yet, but a cursory scan of that code leads me to >>>>> think that perhaps the defNew generation is asking the next gen (i.e. >>>>> tenured) whether it could handle some estimated promotion amount, and given >>>>> the large imbalance between Young and Tenured size, tenured is reporting >>>>> that things won't fit -- this then causes a full gc. Is that at all >>>>> possible from what you know? >>>>> >>>> >>>> If that were to happen, you wouldn't see the minor gc that precedes the >>>> full gc in the log snippet you posted. >>>> >>>> The only situation I know where a minor GC is followed immediately by a >>>> major is when a minor gc didn't manage to fit an allocation request in the >>>> space available. But, thinking more about that, it can't be because one >>>> would expect that Eden knows the largest object it can allocate, so if the >>>> request is larger than will fit in young, the allocator would just go look >>>> for space in the older generation. If that didn't fit, the old gen would >>>> precipitate a gc which would collect the entire heap (all this should be >>>> taken with a dose of salt as I don't have the code in front of me as I >>>> type, and I haven't looked at the allocation policy code in ages). 
>>>> >>>> >>>>> >>>>> On your first remark about compaction, just to make sure I understand, >>>>> you're saying that a full GC prefers to move all live objects into tenured >>>>> (this means taking objects out of survivor space and eden), irrespective of >>>>> whether their tenuring threshold has been exceeded? If that >>>>> compaction/migration of objects into tenured overflows tenured, then it >>>>> attempts to compact the young gen, with overflow into survivor space from >>>>> eden. So basically, this generation knows how to perform compaction and >>>>> it's not just a copying collection? >>>>> >>>> >>>> That is correct. A full gc does in fact move all survivors from young >>>> gen into the old gen. This is a limitation (artificial nepotism can ensue >>>> because of "too young" objects that will soon die, getting artificially >>>> dragged into the old generation) that I had been lobbying to fix for a >>>> while now. I think there's even an old, perhaps still open, bug for this. >>>> >>>> >>>>> Is there a way to get the young gen to print an age table of objects >>>>> in its survivor space? I couldn't find one, but perhaps I'm blind. >>>>> >>>> >>>> +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) >>>> >>>> >>>>> >>>>> Also, as a confirmation, System.gc() always invokes a full gc with the >>>>> parallel collector, right? I believe so, but just wanted to double check >>>>> while we're on the topic. >>>>> >>>> >>>> Right. (Not sure what happens if JNI critical section is in force -- >>>> whether it's skipped or we wait for the JNI CS to exit/complete; hopefully >>>> others can fill in the blanks/inaccuracies in my comments above, since they >>>> are based on things that used to be a while ago in code I haven't looked at >>>> recently.) >>>> >>>> -- ramki >>>> >>>> >>>>> >>>>> Thanks >>>>> >>>>> >>>>> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu < >>>>> jon.masamitsu at oracle.com> wrote: >>>>> >>>>>> >>>>>> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>>>>> >>>>>> Yes, I know :) This is some cruft that needs to be cleaned up. >>>>>> >>>>>> So my suspicion is that full gc is triggered precisely because old >>>>>> gen occupancy is almost 100%, but I'd appreciate confirmation on that. >>>>>> What's surprising is that even though old gen is almost full, young gen has >>>>>> lots of room now. In fact, this system is restarted daily so we never see >>>>>> another young gc before the restart. >>>>>> >>>>>> The other odd observation is that survivor spaces are completely >>>>>> empty after this full gc despite tenuring threshold not being adjusted. >>>>>> >>>>>> >>>>>> The full gc algorithm used compacts everything (old gen and young >>>>>> gen) into >>>>>> the old gen unless it does not all fit. If the old gen overflows, >>>>>> the young gen >>>>>> is compacted into itself. Live in the young gen is compacted into >>>>>> eden first and >>>>>> then into the survivor spaces. >>>>>> >>>>>> My intuitive thinking is that there was no real reason for the full >>>>>> gc to occur; whatever allocation failed in young could now succeed and >>>>>> whatever was tenured fit, albeit very tightly. >>>>>> >>>>>> >>>>>> Still puzzling about the full GC. Are you using CMS? If you have >>>>>> PrintGCDetails output, >>>>>> that might help. 
>>>>>> >>>>>> Jon >>>>>> >>>>>> Sent from my phone >>>>>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" >>>>>> wrote: >>>>>> >>>>>>> Am Wed, 7 May 2014 19:34:20 -0400 >>>>>>> schrieb Vitaly Davidovich : >>>>>>> >>>>>>> > The vm args are: >>>>>>> > >>>>>>> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >>>>>>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>>>>>> >>>>>>> Hmm... you have confliciting arguments here, MaxNewSize overwrites >>>>>>> Xmn. >>>>>>> You will get 16384-12288=4gb old size, thats quite low. As you can >>>>>>> see >>>>>>> in your FullGC the steady state after FullGC has filled it nearly >>>>>>> completely. >>>>>>> >>>>>>> Gruss >>>>>>> Bernd >>>>>>> _______________________________________________ >>>>>>> hotspot-gc-use mailing list >>>>>>> hotspot-gc-use at openjdk.java.net >>>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> hotspot-gc-use mailing list >>>>>> hotspot-gc-use at openjdk.java.net >>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>>> >>>>>> >>>>> >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.B.Kessler at Oracle.COM Thu May 8 21:16:29 2014 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Thu, 08 May 2014 14:16:29 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> Message-ID: <536BF42D.3050202@Oracle.COM> The "problem", if you want to call it that, is that when the young generation has filled up before the next collection it is probably too late. The scavenger is optimistic and thinks everything can be promoted. It just goes ahead and starts a young collection. It gets a promotion failure if it runs out of space in the old generation, painfully recovers from the promotion failure and then causes a full collection. Instead we use the promotion history at the end of each young generation collection to decide to do a full collection preemptively. That way we can sneak in that last scavenge (usually pretty fast, and usually emptying the whole eden) before we invoke a full collection, which doesn't handle massive amounts of garbage well (e.g., in the young generation). If we were pessimistic, given Vitaly's heap layout, we'd do nothing but full collections. I think all the policy code (for the parallel scavenger) is in PSScavenge::invoke(), e.g., http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/psScavenge.cpp starting at line 210. The policy decision is made in PSAdaptiveSizePolicy::should_full_GC http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/psAdaptiveSizePolicy.cpp starting at line 162. Look at all those lovely fish! It looks like setting -XX:+PrintGCDetails -XX:+Verbose (a "develop" flag) would tell you what choices are being made (and probably produce a lot of other output as well :-). 
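For readers who don't want to wade through the C++, here is a tiny illustrative sketch -- plain Java with made-up names, not the actual should_full_GC code -- of the idea described above: keep a running estimate of how much each scavenge promotes, and preemptively ask for a full collection when the old gen's remaining free space looks too small for the next one.

    final class PreemptiveFullGcSketch {
        // rolling estimate of bytes promoted into the old gen per scavenge
        private double avgPromotedBytes = 0.0;

        // called at the end of each young collection with the observed promotion volume
        void recordScavenge(long promotedBytes) {
            avgPromotedBytes = (avgPromotedBytes == 0.0)
                    ? promotedBytes                                  // first sample
                    : 0.7 * avgPromotedBytes + 0.3 * promotedBytes;  // decaying average
        }

        // decide whether to follow the scavenge with an immediate full collection
        boolean shouldFullGc(long oldGenFreeBytes, double padFactor) {
            // pad the estimate so an unusually large promotion next time doesn't overflow the old gen
            return avgPromotedBytes * padFactor > oldGenFreeBytes;
        }
    }

If the old gen really was within roughly a hundred megabytes of full right after that scavenge, as the jstat numbers suggest, a check of that shape would fire and would explain the immediate full collection.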
In a product build -XX:+PrintGCDetails -XX:+PrintHeapAtGC, as has been suggested by others, should get enough information to figure out what's going on. I've cited the code for the parallel scavenger, because Vitaly said "this is the throughput/parallel collector setup". The other collectors have similar policy code. ... peter On 05/08/14 13:24, Srinivas Ramakrishna wrote: > The 98% old gen occupancy triggered one of my two neurons. > I think there was gc policy code (don't know if it;s still there) that would proactiively precipitate a full gc when it realized (based on recent/historical promotion volume stats) that the next minor gc would not be able to promote its survivors into the head room remaining in old. (Don't ask me why it;s better to do it now rather than the next time the young gen fills up and just rely on the same check). Again I am not looking at the code (as it takes some effort to get to the box where I keep a copy of the hotspot/openjdk code.) > > Hopefully Jon &co. will quickly confirm or shoot down the imaginations o my foggy memory! > -- ramki > > > On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich > wrote: > > I captured some usage and capacity stats via jstat right after that full gc that started this email thread. It showed 0% usage of survivor spaces (which makes sense now that I know that a full gc empties that out irrespective of tenuring threshold and object age); eden usage went down to like 10%; tenured usage was very high, 98%. Last gc cause was recorded as "Allocation Failure". So it's true that the tenured doesn't have much breathing room here, but what prompted this email is I don't understand why that even matters considering young gen got cleaned up quite nicely. > > > On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna > wrote: > > > By the way, as others have noted, -XX:+PrintGCDetails at max verbosity level would be your friend to get more visibility into this. Include -XX:+PrintHeapAtGC for even better visibility. For good measure, after the puzzling full gc happens (and hopefully before another GC happens) capture jstat data re the heap (old gen), for direct allocation visibility. > > -- ramki > > > On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna > wrote: > > Hi Vitaly -- > > > On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich > wrote: > > Hi Jon, > > Nope, we're not using CMS here; this is the throughput/parallel collector setup. > > I was browsing some of the gc code in openjdk, and noticed a few places where each generation attempts to decide (upfront from what I can tell, i.e. before doing the collection) whether it thinks it's "safe" to perform the collection (and if it's not, it punts to the next generation) and also whether some amount of promoted bytes will fit. > > I didn't dig too much yet, but a cursory scan of that code leads me to think that perhaps the defNew generation is asking the next gen (i.e. tenured) whether it could handle some estimated promotion amount, and given the large imbalance between Young and Tenured size, tenured is reporting that things won't fit -- this then causes a full gc. Is that at all possible from what you know? > > > If that were to happen, you wouldn't see the minor gc that precedes the full gc in the log snippet you posted. > > The only situation I know where a minor GC is followed immediately by a major is when a minor gc didn't manage to fit an allocation request in the space available. 
But, thinking more about that, it can't be because one would expect that Eden knows the largest object it can allocate, so if the request is larger than will fit in young, the allocator would just go look for space in the older generation. If that didn't fit, the old gen would precipitate a gc which would collect the entire heap (all this should be taken with a dose of salt as I don't have the code in front of me as I type, and I haven't looked at the allocation policy code in ages). > > > On your first remark about compaction, just to make sure I understand, you're saying that a full GC prefers to move all live objects into tenured (this means taking objects out of survivor space and eden), irrespective of whether their tenuring threshold has been exceeded? If that compaction/migration of objects into tenured overflows tenured, then it attempts to compact the young gen, with overflow into survivor space from eden. So basically, this generation knows how to perform compaction and it's not just a copying collection? > > > That is correct. A full gc does in fact move all survivors from young gen into the old gen. This is a limitation (artificial nepotism can ensue because of "too young" objects that will soon die, getting artificially dragged into the old generation) that I had been lobbying to fix for a while now. I think there's even an old, perhaps still open, bug for this. > > > Is there a way to get the young gen to print an age table of objects in its survivor space? I couldn't find one, but perhaps I'm blind. > > > +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) > > > Also, as a confirmation, System.gc() always invokes a full gc with the parallel collector, right? I believe so, but just wanted to double check while we're on the topic. > > > Right. (Not sure what happens if JNI critical section is in force -- whether it's skipped or we wait for the JNI CS to exit/complete; hopefully others can fill in the blanks/inaccuracies in my comments above, since they are based on things that used to be a while ago in code I haven't looked at recently.) > > -- ramki > > > Thanks > > > On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu > wrote: > > > On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >> >> Yes, I know :) This is some cruft that needs to be cleaned up. >> >> So my suspicion is that full gc is triggered precisely because old gen occupancy is almost 100%, but I'd appreciate confirmation on that. What's surprising is that even though old gen is almost full, young gen has lots of room now. In fact, this system is restarted daily so we never see another young gc before the restart. >> >> The other odd observation is that survivor spaces are completely empty after this full gc despite tenuring threshold not being adjusted. >> > > The full gc algorithm used compacts everything (old gen and young gen) into > the old gen unless it does not all fit. If the old gen overflows, the young gen > is compacted into itself. Live in the young gen is compacted into eden first and > then into the survivor spaces. > >> My intuitive thinking is that there was no real reason for the full gc to occur; whatever allocation failed in young could now succeed and whatever was tenured fit, albeit very tightly. >> > > Still puzzling about the full GC. Are you using CMS? If you have PrintGCDetails output, > that might help. 
> > Jon > >> Sent from my phone >> >> On May 7, 2014 8:40 PM, "Bernd Eckenfels" > wrote: >> >> Am Wed, 7 May 2014 19:34:20 -0400 >> schrieb Vitaly Davidovich >: >> >> > The vm args are: >> > >> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >> >> Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. >> You will get 16384-12288=4gb old size, thats quite low. As you can see >> in your FullGC the steady state after FullGC has filled it nearly >> completely. >> >> Gruss >> Bernd >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From vitalyd at gmail.com Thu May 8 21:44:26 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 8 May 2014 17:44:26 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: <536BF42D.3050202@Oracle.COM> References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536BF42D.3050202@Oracle.COM> Message-ID: Hi Peter, Thanks for the insight. A few questions ... So we get an allocation failure in eden, scavenger starts a young collection. Eden is at ~10gb at this point, with 2 survivor spaces of 1gb each. At the time this young collection runs, tenured only has about 1gb of data (out of 4gb+ capacity). Looking at the total used heap size post young GC: 29524.949: [GC 11279905K->4112377K(15728640K), 1.6319030 secs] That remaining 4gb of live data does make the tenured generation reach 98% occupancy, but eden is now totally clean with lots of space (10gb). It's also unclear why it decided to overflow entirely into tenured space -- why not keep 1gb in the survivor space (that's the survivor space capacity in my setup) and only promote the remaining live objects to tenured? In our case, this would've been better because presumably a full GC would not have triggered and we could've finished the day without any more gc events due to a lot of headroom left in young. Instead, a full GC was triggered, took nearly 7 secs, and didn't really reclaim much -- it was a waste of time, and seems unnecessary. Also, when these young and old gc events occurred, the only prior gc events were two forced full gc's very early on in the JVM's lifetime. What historical promotion info is used by the subsequent GC event in this case? Is there effectively no promotion history (due to all prior GC events having been forced via System.gc()) or does the scavenger assume some worst case scenario there? Finally, what would be the recommended settings (other than raising max heap size and thus giving more room to tenured) for a setup such as this? 
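As a sanity check on that line -- assuming, as I believe these one-line summaries do, that the capacity in parentheses counts eden plus one survivor space plus the old gen:

    10240m eden + 1024m survivor + 4096m old = 15360m = 15728640K, the reported capacity
    11279905K used before the scavenge is ~11015m, consistent with a nearly full eden plus the ~1gb already in tenured
    4112377K live after the scavenge is ~4016m, which is just about what a ~98% full 4096m old gen holds

So the log line itself is consistent with essentially everything that survived the scavenge having been promoted.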
That is, a JVM that runs for about 8-9 hours before being restarted. The gc allocation rate is fairly low, but does creep up over the course of the day. The amount of truly long-lived objects is somewhat close to tenured capacity, but smaller. The young gen is sized aggressively large to prevent (or at least make an effort in preventing) young GCs from occurring at all. But, if they do occur, vast majority of objects there are garbage. Thanks On Thu, May 8, 2014 at 5:16 PM, Peter B. Kessler wrote: > The "problem", if you want to call it that, is that when the young > generation has filled up before the next collection it is probably too > late. The scavenger is optimistic and thinks everything can be promoted. > It just goes ahead and starts a young collection. It gets a promotion > failure if it runs out of space in the old generation, painfully recovers > from the promotion failure and then causes a full collection. Instead we > use the promotion history at the end of each young generation collection to > decide to do a full collection preemptively. That way we can sneak in that > last scavenge (usually pretty fast, and usually emptying the whole eden) > before we invoke a full collection, which doesn't handle massive amounts of > garbage well (e.g., in the young generation). If we were pessimistic, > given Vitaly's heap layout, we'd do nothing but full collections. > > I think all the policy code (for the parallel scavenger) is in > PSScavenge::invoke(), e.g., > > http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/ > 2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/psScavenge. > cpp > > starting at line 210. The policy decision is made in > PSAdaptiveSizePolicy::should_full_GC > > http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/ > 2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/ > psAdaptiveSizePolicy.cpp > > starting at line 162. Look at all those lovely fish! > > It looks like setting -XX:+PrintGCDetails -XX:+Verbose (a "develop" flag) > would tell you what choices are being made (and probably produce a lot of > other output as well :-). In a product build -XX:+PrintGCDetails > -XX:+PrintHeapAtGC, as has been suggested by others, should get enough > information to figure out what's going on. > > I've cited the code for the parallel scavenger, because Vitaly said "this > is the throughput/parallel collector setup". The other collectors have > similar policy code. > > ... peter > > > On 05/08/14 13:24, Srinivas Ramakrishna wrote: > >> The 98% old gen occupancy triggered one of my two neurons. >> I think there was gc policy code (don't know if it;s still there) that >> would proactiively precipitate a full gc when it realized (based on >> recent/historical promotion volume stats) that the next minor gc would not >> be able to promote its survivors into the head room remaining in old. >> (Don't ask me why it;s better to do it now rather than the next time the >> young gen fills up and just rely on the same check). Again I am not looking >> at the code (as it takes some effort to get to the box where I keep a copy >> of the hotspot/openjdk code.) >> >> Hopefully Jon &co. will quickly confirm or shoot down the imaginations o >> my foggy memory! >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich > vitalyd at gmail.com>> wrote: >> >> I captured some usage and capacity stats via jstat right after that >> full gc that started this email thread. 
It showed 0% usage of survivor >> spaces (which makes sense now that I know that a full gc empties that out >> irrespective of tenuring threshold and object age); eden usage went down to >> like 10%; tenured usage was very high, 98%. Last gc cause was recorded as >> "Allocation Failure". So it's true that the tenured doesn't have much >> breathing room here, but what prompted this email is I don't understand why >> that even matters considering young gen got cleaned up quite nicely. >> >> >> On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna < >> ysr1729 at gmail.com > wrote: >> >> >> By the way, as others have noted, -XX:+PrintGCDetails at max >> verbosity level would be your friend to get more visibility into this. >> Include -XX:+PrintHeapAtGC for even better visibility. For good measure, >> after the puzzling full gc happens (and hopefully before another GC >> happens) capture jstat data re the heap (old gen), for direct allocation >> visibility. >> >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna < >> ysr1729 at gmail.com > wrote: >> >> Hi Vitaly -- >> >> >> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich < >> vitalyd at gmail.com > wrote: >> >> Hi Jon, >> >> Nope, we're not using CMS here; this is the >> throughput/parallel collector setup. >> >> I was browsing some of the gc code in openjdk, and >> noticed a few places where each generation attempts to decide (upfront from >> what I can tell, i.e. before doing the collection) whether it thinks it's >> "safe" to perform the collection (and if it's not, it punts to the next >> generation) and also whether some amount of promoted bytes will fit. >> >> I didn't dig too much yet, but a cursory scan of that >> code leads me to think that perhaps the defNew generation is asking the >> next gen (i.e. tenured) whether it could handle some estimated promotion >> amount, and given the large imbalance between Young and Tenured size, >> tenured is reporting that things won't fit -- this then causes a full gc. >> Is that at all possible from what you know? >> >> >> If that were to happen, you wouldn't see the minor gc that >> precedes the full gc in the log snippet you posted. >> >> The only situation I know where a minor GC is followed >> immediately by a major is when a minor gc didn't manage to fit an >> allocation request in the space available. But, thinking more about that, >> it can't be because one would expect that Eden knows the largest object it >> can allocate, so if the request is larger than will fit in young, the >> allocator would just go look for space in the older generation. If that >> didn't fit, the old gen would precipitate a gc which would collect the >> entire heap (all this should be taken with a dose of salt as I don't have >> the code in front of me as I type, and I haven't looked at the allocation >> policy code in ages). >> >> >> On your first remark about compaction, just to make sure >> I understand, you're saying that a full GC prefers to move all live objects >> into tenured (this means taking objects out of survivor space and eden), >> irrespective of whether their tenuring threshold has been exceeded? If that >> compaction/migration of objects into tenured overflows tenured, then it >> attempts to compact the young gen, with overflow into survivor space from >> eden. So basically, this generation knows how to perform compaction and >> it's not just a copying collection? >> >> >> That is correct. A full gc does in fact move all survivors >> from young gen into the old gen. 
This is a limitation (artificial nepotism >> can ensue because of "too young" objects that will soon die, getting >> artificially dragged into the old generation) that I had been lobbying to >> fix for a while now. I think there's even an old, perhaps still open, bug >> for this. >> >> >> Is there a way to get the young gen to print an age table >> of objects in its survivor space? I couldn't find one, but perhaps I'm >> blind. >> >> >> +PrintTenuringDistribution (for ParNew/DefNew, perhaps also >> G1?) >> >> >> Also, as a confirmation, System.gc() always invokes a >> full gc with the parallel collector, right? I believe so, but just wanted >> to double check while we're on the topic. >> >> >> Right. (Not sure what happens if JNI critical section is in >> force -- whether it's skipped or we wait for the JNI CS to exit/complete; >> hopefully others can fill in the blanks/inaccuracies in my comments above, >> since they are based on things that used to be a while ago in code I >> haven't looked at recently.) >> >> -- ramki >> >> >> Thanks >> >> >> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu < >> jon.masamitsu at oracle.com > wrote: >> >> >> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >> >>> >>> Yes, I know :) This is some cruft that needs to be >>> cleaned up. >>> >>> So my suspicion is that full gc is triggered >>> precisely because old gen occupancy is almost 100%, but I'd appreciate >>> confirmation on that. What's surprising is that even though old gen is >>> almost full, young gen has lots of room now. In fact, this system is >>> restarted daily so we never see another young gc before the restart. >>> >>> The other odd observation is that survivor spaces >>> are completely empty after this full gc despite tenuring threshold not >>> being adjusted. >>> >>> >> The full gc algorithm used compacts everything (old >> gen and young gen) into >> the old gen unless it does not all fit. If the old >> gen overflows, the young gen >> is compacted into itself. Live in the young gen is >> compacted into eden first and >> then into the survivor spaces. >> >> My intuitive thinking is that there was no real >>> reason for the full gc to occur; whatever allocation failed in young could >>> now succeed and whatever was tenured fit, albeit very tightly. >>> >>> >> Still puzzling about the full GC. Are you using CMS? >> If you have PrintGCDetails output, >> that might help. >> >> Jon >> >> Sent from my phone >>> >>> >>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" < >>> bernd-2014 at eckenfels.net > wrote: >>> >>> Am Wed, 7 May 2014 19:34:20 -0400 >>> schrieb Vitaly Davidovich >> vitalyd at gmail.com>>: >>> >>> >>> > The vm args are: >>> > >>> > -Xms16384m -Xmx16384m -Xmn16384m >>> -XX:NewSize=12288m >>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>> >>> Hmm... you have confliciting arguments here, >>> MaxNewSize overwrites Xmn. >>> You will get 16384-12288=4gb old size, thats >>> quite low. As you can see >>> in your FullGC the steady state after FullGC has >>> filled it nearly >>> completely. 
>>> >>> Gruss >>> Bernd >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >> hotspot-gc-use at openjdk.java.net> >>> >>> http://mail.openjdk.java.net/ >>> mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >> hotspot-gc-use at openjdk.java.net> >>> http://mail.openjdk.java.net/ >>> mailman/listinfo/hotspot-gc-use >>> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net > hotspot-gc-use at openjdk.java.net> >> >> http://mail.openjdk.java.net/ >> mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net > openjdk.java.net> >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc- >> use >> >> >> >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu May 8 21:45:34 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 8 May 2014 17:45:34 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536BF42D.3050202@Oracle.COM> Message-ID: By the way, would the ParNew collector handle this type of setup better than PS? Or hard to say? On Thu, May 8, 2014 at 5:44 PM, Vitaly Davidovich wrote: > Hi Peter, > > Thanks for the insight. A few questions ... > > So we get an allocation failure in eden, scavenger starts a young > collection. Eden is at ~10gb at this point, with 2 survivor spaces of 1gb > each. At the time this young collection runs, tenured only has about 1gb > of data (out of 4gb+ capacity). Looking at the total used heap size post > young GC: > > 29524.949: [GC 11279905K->4112377K(15728640K), 1.6319030 secs] > > That remaining 4gb of live data does make the tenured generation reach 98% > occupancy, but eden is now totally clean with lots of space (10gb). It's > also unclear why it decided to overflow entirely into tenured space -- why > not keep 1gb in the survivor space (that's the survivor space capacity in > my setup) and only promote the remaining live objects to tenured? In our > case, this would've been better because presumably a full GC would not have > triggered and we could've finished the day without any more gc events due > to a lot of headroom left in young. Instead, a full GC was triggered, took > nearly 7 secs, and didn't really reclaim much -- it was a waste of time, > and seems unnecessary. > > Also, when these young and old gc events occurred, the only prior gc > events were two forced full gc's very early on in the JVM's lifetime. What > historical promotion info is used by the subsequent GC event in this case? > Is there effectively no promotion history (due to all prior GC events > having been forced via System.gc()) or does the scavenger assume some worst > case scenario there? > > Finally, what would be the recommended settings (other than raising max > heap size and thus giving more room to tenured) for a setup such as this? > That is, a JVM that runs for about 8-9 hours before being restarted. 
The > gc allocation rate is fairly low, but does creep up over the course of the > day. The amount of truly long-lived objects is somewhat close to tenured > capacity, but smaller. The young gen is sized aggressively large to > prevent (or at least make an effort in preventing) young GCs from occurring > at all. But, if they do occur, vast majority of objects there are garbage. > > Thanks > > > > > > > > On Thu, May 8, 2014 at 5:16 PM, Peter B. Kessler < > Peter.B.Kessler at oracle.com> wrote: > >> The "problem", if you want to call it that, is that when the young >> generation has filled up before the next collection it is probably too >> late. The scavenger is optimistic and thinks everything can be promoted. >> It just goes ahead and starts a young collection. It gets a promotion >> failure if it runs out of space in the old generation, painfully recovers >> from the promotion failure and then causes a full collection. Instead we >> use the promotion history at the end of each young generation collection to >> decide to do a full collection preemptively. That way we can sneak in that >> last scavenge (usually pretty fast, and usually emptying the whole eden) >> before we invoke a full collection, which doesn't handle massive amounts of >> garbage well (e.g., in the young generation). If we were pessimistic, >> given Vitaly's heap layout, we'd do nothing but full collections. >> >> I think all the policy code (for the parallel scavenger) is in >> PSScavenge::invoke(), e.g., >> >> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/ >> 2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/psScavenge. >> cpp >> >> starting at line 210. The policy decision is made in >> PSAdaptiveSizePolicy::should_full_GC >> >> http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/ >> 2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/ >> psAdaptiveSizePolicy.cpp >> >> starting at line 162. Look at all those lovely fish! >> >> It looks like setting -XX:+PrintGCDetails -XX:+Verbose (a "develop" flag) >> would tell you what choices are being made (and probably produce a lot of >> other output as well :-). In a product build -XX:+PrintGCDetails >> -XX:+PrintHeapAtGC, as has been suggested by others, should get enough >> information to figure out what's going on. >> >> I've cited the code for the parallel scavenger, because Vitaly said "this >> is the throughput/parallel collector setup". The other collectors have >> similar policy code. >> >> ... peter >> >> >> On 05/08/14 13:24, Srinivas Ramakrishna wrote: >> >>> The 98% old gen occupancy triggered one of my two neurons. >>> I think there was gc policy code (don't know if it;s still there) that >>> would proactiively precipitate a full gc when it realized (based on >>> recent/historical promotion volume stats) that the next minor gc would not >>> be able to promote its survivors into the head room remaining in old. >>> (Don't ask me why it;s better to do it now rather than the next time the >>> young gen fills up and just rely on the same check). Again I am not looking >>> at the code (as it takes some effort to get to the box where I keep a copy >>> of the hotspot/openjdk code.) >>> >>> Hopefully Jon &co. will quickly confirm or shoot down the imaginations o >>> my foggy memory! >>> -- ramki >>> >>> >>> On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich >> vitalyd at gmail.com>> wrote: >>> >>> I captured some usage and capacity stats via jstat right after that >>> full gc that started this email thread. 
It showed 0% usage of survivor >>> spaces (which makes sense now that I know that a full gc empties that out >>> irrespective of tenuring threshold and object age); eden usage went down to >>> like 10%; tenured usage was very high, 98%. Last gc cause was recorded as >>> "Allocation Failure". So it's true that the tenured doesn't have much >>> breathing room here, but what prompted this email is I don't understand why >>> that even matters considering young gen got cleaned up quite nicely. >>> >>> >>> On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna < >>> ysr1729 at gmail.com > wrote: >>> >>> >>> By the way, as others have noted, -XX:+PrintGCDetails at max >>> verbosity level would be your friend to get more visibility into this. >>> Include -XX:+PrintHeapAtGC for even better visibility. For good measure, >>> after the puzzling full gc happens (and hopefully before another GC >>> happens) capture jstat data re the heap (old gen), for direct allocation >>> visibility. >>> >>> -- ramki >>> >>> >>> On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna < >>> ysr1729 at gmail.com > wrote: >>> >>> Hi Vitaly -- >>> >>> >>> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich < >>> vitalyd at gmail.com > wrote: >>> >>> Hi Jon, >>> >>> Nope, we're not using CMS here; this is the >>> throughput/parallel collector setup. >>> >>> I was browsing some of the gc code in openjdk, and >>> noticed a few places where each generation attempts to decide (upfront from >>> what I can tell, i.e. before doing the collection) whether it thinks it's >>> "safe" to perform the collection (and if it's not, it punts to the next >>> generation) and also whether some amount of promoted bytes will fit. >>> >>> I didn't dig too much yet, but a cursory scan of that >>> code leads me to think that perhaps the defNew generation is asking the >>> next gen (i.e. tenured) whether it could handle some estimated promotion >>> amount, and given the large imbalance between Young and Tenured size, >>> tenured is reporting that things won't fit -- this then causes a full gc. >>> Is that at all possible from what you know? >>> >>> >>> If that were to happen, you wouldn't see the minor gc that >>> precedes the full gc in the log snippet you posted. >>> >>> The only situation I know where a minor GC is followed >>> immediately by a major is when a minor gc didn't manage to fit an >>> allocation request in the space available. But, thinking more about that, >>> it can't be because one would expect that Eden knows the largest object it >>> can allocate, so if the request is larger than will fit in young, the >>> allocator would just go look for space in the older generation. If that >>> didn't fit, the old gen would precipitate a gc which would collect the >>> entire heap (all this should be taken with a dose of salt as I don't have >>> the code in front of me as I type, and I haven't looked at the allocation >>> policy code in ages). >>> >>> >>> On your first remark about compaction, just to make sure >>> I understand, you're saying that a full GC prefers to move all live objects >>> into tenured (this means taking objects out of survivor space and eden), >>> irrespective of whether their tenuring threshold has been exceeded? If that >>> compaction/migration of objects into tenured overflows tenured, then it >>> attempts to compact the young gen, with overflow into survivor space from >>> eden. So basically, this generation knows how to perform compaction and >>> it's not just a copying collection? >>> >>> >>> That is correct. 
A full gc does in fact move all survivors >>> from young gen into the old gen. This is a limitation (artificial nepotism >>> can ensue because of "too young" objects that will soon die, getting >>> artificially dragged into the old generation) that I had been lobbying to >>> fix for a while now. I think there's even an old, perhaps still open, bug >>> for this. >>> >>> >>> Is there a way to get the young gen to print an age >>> table of objects in its survivor space? I couldn't find one, but perhaps >>> I'm blind. >>> >>> >>> +PrintTenuringDistribution (for ParNew/DefNew, perhaps also >>> G1?) >>> >>> >>> Also, as a confirmation, System.gc() always invokes a >>> full gc with the parallel collector, right? I believe so, but just wanted >>> to double check while we're on the topic. >>> >>> >>> Right. (Not sure what happens if JNI critical section is in >>> force -- whether it's skipped or we wait for the JNI CS to exit/complete; >>> hopefully others can fill in the blanks/inaccuracies in my comments above, >>> since they are based on things that used to be a while ago in code I >>> haven't looked at recently.) >>> >>> -- ramki >>> >>> >>> Thanks >>> >>> >>> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu < >>> jon.masamitsu at oracle.com > wrote: >>> >>> >>> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>> >>>> >>>> Yes, I know :) This is some cruft that needs to be >>>> cleaned up. >>>> >>>> So my suspicion is that full gc is triggered >>>> precisely because old gen occupancy is almost 100%, but I'd appreciate >>>> confirmation on that. What's surprising is that even though old gen is >>>> almost full, young gen has lots of room now. In fact, this system is >>>> restarted daily so we never see another young gc before the restart. >>>> >>>> The other odd observation is that survivor spaces >>>> are completely empty after this full gc despite tenuring threshold not >>>> being adjusted. >>>> >>>> >>> The full gc algorithm used compacts everything (old >>> gen and young gen) into >>> the old gen unless it does not all fit. If the old >>> gen overflows, the young gen >>> is compacted into itself. Live in the young gen is >>> compacted into eden first and >>> then into the survivor spaces. >>> >>> My intuitive thinking is that there was no real >>>> reason for the full gc to occur; whatever allocation failed in young could >>>> now succeed and whatever was tenured fit, albeit very tightly. >>>> >>>> >>> Still puzzling about the full GC. Are you using >>> CMS? If you have PrintGCDetails output, >>> that might help. >>> >>> Jon >>> >>> Sent from my phone >>>> >>>> >>>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" < >>>> bernd-2014 at eckenfels.net > wrote: >>>> >>>> Am Wed, 7 May 2014 19:34:20 -0400 >>>> schrieb Vitaly Davidovich >>> vitalyd at gmail.com>>: >>>> >>>> >>>> > The vm args are: >>>> > >>>> > -Xms16384m -Xmx16384m -Xmn16384m >>>> -XX:NewSize=12288m >>>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>>> >>>> Hmm... you have confliciting arguments here, >>>> MaxNewSize overwrites Xmn. >>>> You will get 16384-12288=4gb old size, thats >>>> quite low. As you can see >>>> in your FullGC the steady state after FullGC >>>> has filled it nearly >>>> completely. 
>>>> >>>> Gruss >>>> Bernd >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>> hotspot-gc-use at openjdk.java.net> >>>> >>>> http://mail.openjdk.java.net/ >>>> mailman/listinfo/hotspot-gc-use >>>> >>>> >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>> hotspot-gc-use at openjdk.java.net> >>>> http://mail.openjdk.java.net/ >>>> mailman/listinfo/hotspot-gc-use >>>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >> hotspot-gc-use at openjdk.java.net> >>> >>> http://mail.openjdk.java.net/ >>> mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >> openjdk.java.net> >>> >>> http://mail.openjdk.java.net/ >>> mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.B.Kessler at Oracle.COM Thu May 8 22:40:18 2014 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Thu, 08 May 2014 15:40:18 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536BF42D.3050202@Oracle.COM> Message-ID: <536C07D2.6020407@Oracle.COM> On 05/08/14 14:44, Vitaly Davidovich wrote: > Hi Peter, > > Thanks for the insight. A few questions ... > > So we get an allocation failure in eden, scavenger starts a young collection. Eden is at ~10gb at this point, with 2 survivor spaces of 1gb each. At the time this young collection runs, tenured only has about 1gb of data (out of 4gb+ capacity). Looking at the total used heap size post young GC: > > 29524.949: [GC 11279905K->4112377K(15728640K), 1.6319030 secs] > > That remaining 4gb of live data does make the tenured generation reach 98% occupancy, but eden is now totally clean with lots of space (10gb). It's also unclear why it decided to overflow entirely into tenured space -- why not keep 1gb in the survivor space (that's the survivor space capacity in my setup) and only promote the remaining live objects to tenured? In our case, this would've been better because presumably a full GC would not have triggered and we could've finished the day without any more gc events due to a lot of headroom left in young. Instead, a full GC was triggered, took nearly 7 secs, and didn't really reclaim much -- it was a waste of time, and seems unnecessary. Without the output of -XX:+PrintHeapAtGC, I'm not going to try to guess the state of the eden and survivors before and after a young generation collection. E.g., how do you know that there isn't 1GB of space in a survivor at the end of the collection you cited? How full was the old generation before this collection? How much of the eden and survivor space survived this collection? (You say, below "eden usage went down to like 10%". Taken literally, that implies a promotion failure with 1GB of stuff not fitting in the old generation: that would be a promotion failure and probably disaster for performance. 
If I interpret "eden" to mean young space, and remember that your survivors are 10% of the size of your eden, then this might mean that eden is empty and one survivor is full: that's just as things should be. But I don't like speculating. > Also, when these young and old gc events occurred, the only prior gc events were two forced full gc's very early on in the JVM's lifetime. What historical promotion info is used by the subsequent GC event in this case? Is there effectively no promotion history (due to all prior GC events having been forced via System.gc()) or does the scavenger assume some worst case scenario there? You could look at the code (I pointed you at the tip of the iceberg, but maybe this one doesn't go down that far :-), or you could wait for someone else to provide the answer. That code has changed enough that I don't want to give you stale data. I know there are initial values and rolling averages, policies around the various kinds of collections, etc., but I can't recite the details. (Also, I apologize for pointing to an older OpenJDK changeset. If you really go digging, start from the changeset that corresponds to the JVM you are running.) > Finally, what would be the recommended settings (other than raising max heap size and thus giving more room to tenured) for a setup such as this? That is, a JVM that runs for about 8-9 hours before being restarted. The gc allocation rate is fairly low, but does creep up over the course of the day. The amount of truly long-lived objects is somewhat close to tenured capacity, but smaller. The young gen is sized aggressively large to prevent (or at least make an effort in preventing) young GCs from occurring at all. But, if they do occur, vast majority of objects there are garbage. If you are not expecting, or hope to prevent, any young collections, then why have survivor spaces at all? That's 2GB that isn't in the eden where it could be useful staving off young collections. The downside of smaller survivors is that when a young collection happens objects (presumably short-lived objects) may make it into the old generation, which might then fill up and cause a full collection. But if you don't have any young generation collections, you won't have any full collections either. (Maybe unless you have objects that are so large they get allocated directly in the old generation. There's policy code for that, too.) > Thanks > By the way, would the ParNew collector handle this type of setup better than PS? Or hard to say? The scavenger for ParNew is not that different from the one for PS. There might be policy differences around the edges, but if your eden is large enough that you don't do any young generation collections, the allocation parts are probably indistinguishable. It seems brittle to be depending on there being no collections. Usage patterns change over time. Code changes over time. Libraries change over time. JVM's change over time. Machines change over time. Mostly Java programmers don't think about allocating a few objects here and there, but it all adds up. You might sleep easier at night by doubling the size of the heap, even if that means buying more memory. But you'd still have to worry about that pending collection. You made it to 8+ hours before the collection you show: how much more time do you need? ... peter > On Thu, May 8, 2014 at 5:16 PM, Peter B. Kessler > wrote: > > The "problem", if you want to call it that, is that when the young generation has filled up before the next collection it is probably too late. 
The scavenger is optimistic and thinks everything can be promoted. It just goes ahead and starts a young collection. It gets a promotion failure if it runs out of space in the old generation, painfully recovers from the promotion failure and then causes a full collection. Instead we use the promotion history at the end of each young generation collection to decide to do a full collection preemptively. That way we can sneak in that last scavenge (usually pretty fast, and usually emptying the whole eden) before we invoke a full collection, which doesn't handle massive amounts of garbage well (e.g., in the young generation). If we were pessimistic, given Vitaly's heap layout, we'd do nothing but full collections. > > I think all the policy code (for the parallel scavenger) is in PSScavenge::invoke(), e.g., > > http://hg.openjdk.java.net/__jdk8/jdk8/hotspot/file/__2f6dc76eb8e5/src/share/vm/gc___implementation/__parallelScavenge/psScavenge.__cpp > > starting at line 210. The policy decision is made in PSAdaptiveSizePolicy::should___full_GC > > http://hg.openjdk.java.net/__jdk8/jdk8/hotspot/file/__2f6dc76eb8e5/src/share/vm/gc___implementation/__parallelScavenge/__psAdaptiveSizePolicy.cpp > > starting at line 162. Look at all those lovely fish! > > It looks like setting -XX:+PrintGCDetails -XX:+Verbose (a "develop" flag) would tell you what choices are being made (and probably produce a lot of other output as well :-). In a product build -XX:+PrintGCDetails -XX:+PrintHeapAtGC, as has been suggested by others, should get enough information to figure out what's going on. > > I've cited the code for the parallel scavenger, because Vitaly said "this is the throughput/parallel collector setup". The other collectors have similar policy code. > > ... peter > > > On 05/08/14 13:24, Srinivas Ramakrishna wrote: > > The 98% old gen occupancy triggered one of my two neurons. > I think there was gc policy code (don't know if it;s still there) that would proactiively precipitate a full gc when it realized (based on recent/historical promotion volume stats) that the next minor gc would not be able to promote its survivors into the head room remaining in old. (Don't ask me why it;s better to do it now rather than the next time the young gen fills up and just rely on the same check). Again I am not looking at the code (as it takes some effort to get to the box where I keep a copy of the hotspot/openjdk code.) > > Hopefully Jon &co. will quickly confirm or shoot down the imaginations o my foggy memory! > -- ramki > > > On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich >> wrote: > > I captured some usage and capacity stats via jstat right after that full gc that started this email thread. It showed 0% usage of survivor spaces (which makes sense now that I know that a full gc empties that out irrespective of tenuring threshold and object age); eden usage went down to like 10%; tenured usage was very high, 98%. Last gc cause was recorded as "Allocation Failure". So it's true that the tenured doesn't have much breathing room here, but what prompted this email is I don't understand why that even matters considering young gen got cleaned up quite nicely. > > > On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna >> wrote: > > > By the way, as others have noted, -XX:+PrintGCDetails at max verbosity level would be your friend to get more visibility into this. Include -XX:+PrintHeapAtGC for even better visibility. 
For good measure, after the puzzling full gc happens (and hopefully before another GC happens) capture jstat data re the heap (old gen), for direct allocation visibility. > > -- ramki > > > On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna >> wrote: > > Hi Vitaly -- > > > On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich >> wrote: > > Hi Jon, > > Nope, we're not using CMS here; this is the throughput/parallel collector setup. > > I was browsing some of the gc code in openjdk, and noticed a few places where each generation attempts to decide (upfront from what I can tell, i.e. before doing the collection) whether it thinks it's "safe" to perform the collection (and if it's not, it punts to the next generation) and also whether some amount of promoted bytes will fit. > > I didn't dig too much yet, but a cursory scan of that code leads me to think that perhaps the defNew generation is asking the next gen (i.e. tenured) whether it could handle some estimated promotion amount, and given the large imbalance between Young and Tenured size, tenured is reporting that things won't fit -- this then causes a full gc. Is that at all possible from what you know? > > > If that were to happen, you wouldn't see the minor gc that precedes the full gc in the log snippet you posted. > > The only situation I know where a minor GC is followed immediately by a major is when a minor gc didn't manage to fit an allocation request in the space available. But, thinking more about that, it can't be because one would expect that Eden knows the largest object it can allocate, so if the request is larger than will fit in young, the allocator would just go look for space in the older generation. If that didn't fit, the old gen would precipitate a gc which would collect the entire heap (all this should be taken with a dose of salt as I don't have the code in front of me as I type, and I haven't looked at the allocation policy code in ages). > > > On your first remark about compaction, just to make sure I understand, you're saying that a full GC prefers to move all live objects into tenured (this means taking objects out of survivor space and eden), irrespective of whether their tenuring threshold has been exceeded? If that compaction/migration of objects into tenured overflows tenured, then it attempts to compact the young gen, with overflow into survivor space from eden. So basically, this generation knows how to perform compaction and it's not just a copying collection? > > > That is correct. A full gc does in fact move all survivors from young gen into the old gen. This is a limitation (artificial nepotism can ensue because of "too young" objects that will soon die, getting artificially dragged into the old generation) that I had been lobbying to fix for a while now. I think there's even an old, perhaps still open, bug for this. > > > Is there a way to get the young gen to print an age table of objects in its survivor space? I couldn't find one, but perhaps I'm blind. > > > +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) > > > Also, as a confirmation, System.gc() always invokes a full gc with the parallel collector, right? I believe so, but just wanted to double check while we're on the topic. > > > Right. 
(Not sure what happens if JNI critical section is in force -- whether it's skipped or we wait for the JNI CS to exit/complete; hopefully others can fill in the blanks/inaccuracies in my comments above, since they are based on things that used to be a while ago in code I haven't looked at recently.) > > -- ramki > > > Thanks > > > On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu >> wrote: > > > On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: > > > Yes, I know :) This is some cruft that needs to be cleaned up. > > So my suspicion is that full gc is triggered precisely because old gen occupancy is almost 100%, but I'd appreciate confirmation on that. What's surprising is that even though old gen is almost full, young gen has lots of room now. In fact, this system is restarted daily so we never see another young gc before the restart. > > The other odd observation is that survivor spaces are completely empty after this full gc despite tenuring threshold not being adjusted. > > > The full gc algorithm used compacts everything (old gen and young gen) into > the old gen unless it does not all fit. If the old gen overflows, the young gen > is compacted into itself. Live in the young gen is compacted into eden first and > then into the survivor spaces. > > My intuitive thinking is that there was no real reason for the full gc to occur; whatever allocation failed in young could now succeed and whatever was tenured fit, albeit very tightly. > > > Still puzzling about the full GC. Are you using CMS? If you have PrintGCDetails output, > that might help. > > Jon > > Sent from my phone > > > On May 7, 2014 8:40 PM, "Bernd Eckenfels" >> wrote: > > Am Wed, 7 May 2014 19:34:20 -0400 > schrieb Vitaly Davidovich >>: > > > > The vm args are: > > > > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m > > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 > > Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. > You will get 16384-12288=4gb old size, thats quite low. As you can see > in your FullGC the steady state after FullGC has filled it nearly > completely. 
> > Gruss > Bernd > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > > > > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > From jon.masamitsu at oracle.com Thu May 8 22:40:59 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 08 May 2014 15:40:59 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> Message-ID: <536C07FB.8030402@oracle.com> On 05/08/2014 01:24 PM, Srinivas Ramakrishna wrote: > The 98% old gen occupancy triggered one of my two neurons. > I think there was gc policy code (don't know if it;s still there) that > would proactiively precipitate a full gc when it realized (based on > recent/historical promotion volume stats) that the next minor gc would > not be able to promote its survivors into the head room remaining in > old. (Don't ask me why it;s better to do it now rather than the next > time the young gen fills up and just rely on the same check). Again I > am not looking at the code (as it takes some effort to get to the box > where I keep a copy of the hotspot/openjdk code.) The UseParallelGC collector will do a full GC after a young GC if the UseParallelGC thinks the next young GC will not succeed (per Peter's explanation). I don't think the ParNew GC will do that. I looked for that code but did not find it. I looked in the do_collection() code and the ParNew::collect() code. The only case I could find where a full GC followed a young GC with ParNew was if the collection failed to free enough space for the allocation. Given the amount of free space in the young gen after the collection, that's unlikely. Or course, there could be a bug. Jon > Hopefully Jon &co. will quickly confirm or shoot down the imaginations > o my foggy memory! > -- ramki > > > On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich > wrote: > > I captured some usage and capacity stats via jstat right after > that full gc that started this email thread. It showed 0% usage > of survivor spaces (which makes sense now that I know that a full > gc empties that out irrespective of tenuring threshold and object > age); eden usage went down to like 10%; tenured usage was very > high, 98%. Last gc cause was recorded as "Allocation Failure". > So it's true that the tenured doesn't have much breathing room > here, but what prompted this email is I don't understand why that > even matters considering young gen got cleaned up quite nicely. 
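For reference, the kind of snapshot being described can be captured with jstat (these are standard jstat options on JDK 6/7; <pid> is a placeholder and 5000 is just a 5-second sampling interval):

    jstat -gcutil <pid> 5000      # S0/S1/E/O/P occupancy in percent, plus GC counts and times
    jstat -gccause <pid> 5000     # same columns plus the last and current GC cause, e.g. "Allocation Failure"
    jstat -gccapacity <pid>       # min/max and current capacities of each generation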
> > > On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna > > wrote: > > > By the way, as others have noted, -XX:+PrintGCDetails at max > verbosity level would be your friend to get more visibility > into this. Include -XX:+PrintHeapAtGC for even better > visibility. For good measure, after the puzzling full gc > happens (and hopefully before another GC happens) capture > jstat data re the heap (old gen), for direct allocation > visibility. > > -- ramki > > > On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna > > wrote: > > Hi Vitaly -- > > > On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich > > wrote: > > Hi Jon, > > Nope, we're not using CMS here; this is the > throughput/parallel collector setup. > > I was browsing some of the gc code in openjdk, and > noticed a few places where each generation attempts to > decide (upfront from what I can tell, i.e. before > doing the collection) whether it thinks it's "safe" to > perform the collection (and if it's not, it punts to > the next generation) and also whether some amount of > promoted bytes will fit. > > I didn't dig too much yet, but a cursory scan of that > code leads me to think that perhaps the defNew > generation is asking the next gen (i.e. tenured) > whether it could handle some estimated promotion > amount, and given the large imbalance between Young > and Tenured size, tenured is reporting that things > won't fit -- this then causes a full gc. Is that at > all possible from what you know? > > > If that were to happen, you wouldn't see the minor gc that > precedes the full gc in the log snippet you posted. > > The only situation I know where a minor GC is followed > immediately by a major is when a minor gc didn't manage to > fit an allocation request in the space available. But, > thinking more about that, it can't be because one would > expect that Eden knows the largest object it can allocate, > so if the request is larger than will fit in young, the > allocator would just go look for space in the older > generation. If that didn't fit, the old gen would > precipitate a gc which would collect the entire heap (all > this should be taken with a dose of salt as I don't have > the code in front of me as I type, and I haven't looked at > the allocation policy code in ages). > > > On your first remark about compaction, just to make > sure I understand, you're saying that a full GC > prefers to move all live objects into tenured (this > means taking objects out of survivor space and eden), > irrespective of whether their tenuring threshold has > been exceeded? If that compaction/migration of objects > into tenured overflows tenured, then it attempts to > compact the young gen, with overflow into survivor > space from eden. So basically, this generation knows > how to perform compaction and it's not just a copying > collection? > > > That is correct. A full gc does in fact move all survivors > from young gen into the old gen. This is a limitation > (artificial nepotism can ensue because of "too young" > objects that will soon die, getting artificially dragged > into the old generation) that I had been lobbying to fix > for a while now. I think there's even an old, perhaps > still open, bug for this. > > > Is there a way to get the young gen to print an age > table of objects in its survivor space? I couldn't > find one, but perhaps I'm blind. > > > +PrintTenuringDistribution (for ParNew/DefNew, perhaps > also G1?) > > > Also, as a confirmation, System.gc() always invokes a > full gc with the parallel collector, right? 
I believe > so, but just wanted to double check while we're on the > topic. > > > Right. (Not sure what happens if JNI critical section is > in force -- whether it's skipped or we wait for the JNI CS > to exit/complete; hopefully others can fill in the > blanks/inaccuracies in my comments above, since they are > based on things that used to be a while ago in code I > haven't looked at recently.) > > -- ramki > > > Thanks > > > On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu > > wrote: > > > On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >> >> Yes, I know :) This is some cruft that needs to >> be cleaned up. >> >> So my suspicion is that full gc is triggered >> precisely because old gen occupancy is almost >> 100%, but I'd appreciate confirmation on that. >> What's surprising is that even though old gen is >> almost full, young gen has lots of room now. In >> fact, this system is restarted daily so we never >> see another young gc before the restart. >> >> The other odd observation is that survivor spaces >> are completely empty after this full gc despite >> tenuring threshold not being adjusted. >> > > The full gc algorithm used compacts everything > (old gen and young gen) into > the old gen unless it does not all fit. If the old > gen overflows, the young gen > is compacted into itself. Live in the young gen is > compacted into eden first and > then into the survivor spaces. > >> My intuitive thinking is that there was no real >> reason for the full gc to occur; whatever >> allocation failed in young could now succeed and >> whatever was tenured fit, albeit very tightly. >> > > Still puzzling about the full GC. Are you using > CMS? If you have PrintGCDetails output, > that might help. > > Jon > >> Sent from my phone >> >> On May 7, 2014 8:40 PM, "Bernd Eckenfels" >> > > wrote: >> >> Am Wed, 7 May 2014 19:34:20 -0400 >> schrieb Vitaly Davidovich > >: >> >> > The vm args are: >> > >> > -Xms16384m -Xmx16384m -Xmn16384m >> -XX:NewSize=12288m >> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >> >> Hmm... you have confliciting arguments here, >> MaxNewSize overwrites Xmn. >> You will get 16384-12288=4gb old size, thats >> quite low. As you can see >> in your FullGC the steady state after FullGC >> has filled it nearly >> completely. >> >> Gruss >> Bernd >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Thu May 8 22:57:57 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 8 May 2014 18:57:57 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: <536C07FB.8030402@oracle.com> References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536C07FB.8030402@oracle.com> Message-ID: Jon, Thanks. 
So ParNew behavior of not triggering a full gc preemptively seems a better fit for my usecase. In fact, we will not have another young gc in our setup, allocation rate, and workload. What's the purpose of doing a preemptive full gc (with all the baggage it comes with) in parallel old? Why not just wait until the next young collection (if that even happens) and take the full gc hit then? I'm failing to see the advantage of taking that hit eagerly, even after reading Peter's description. Is it to avoid promotion failure that it thinks will happen next time? And if so, it thinks doing the preemptive full gc is faster than handling a promotion failure next time? Thanks guys Sent from my phone On 05/08/2014 01:24 PM, Srinivas Ramakrishna wrote: The 98% old gen occupancy triggered one of my two neurons. I think there was gc policy code (don't know if it;s still there) that would proactiively precipitate a full gc when it realized (based on recent/historical promotion volume stats) that the next minor gc would not be able to promote its survivors into the head room remaining in old. (Don't ask me why it;s better to do it now rather than the next time the young gen fills up and just rely on the same check). Again I am not looking at the code (as it takes some effort to get to the box where I keep a copy of the hotspot/openjdk code.) The UseParallelGC collector will do a full GC after a young GC if the UseParallelGC thinks the next young GC will not succeed (per Peter's explanation). I don't think the ParNew GC will do that. I looked for that code but did not find it. I looked in the do_collection() code and the ParNew::collect() code. The only case I could find where a full GC followed a young GC with ParNew was if the collection failed to free enough space for the allocation. Given the amount of free space in the young gen after the collection, that's unlikely. Or course, there could be a bug. Jon Hopefully Jon &co. will quickly confirm or shoot down the imaginations o my foggy memory! -- ramki On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich wrote: > I captured some usage and capacity stats via jstat right after that full > gc that started this email thread. It showed 0% usage of survivor spaces > (which makes sense now that I know that a full gc empties that out > irrespective of tenuring threshold and object age); eden usage went down to > like 10%; tenured usage was very high, 98%. Last gc cause was recorded as > "Allocation Failure". So it's true that the tenured doesn't have much > breathing room here, but what prompted this email is I don't understand why > that even matters considering young gen got cleaned up quite nicely. > > > On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna wrote: > >> >> By the way, as others have noted, -XX:+PrintGCDetails at max verbosity >> level would be your friend to get more visibility into this. Include >> -XX:+PrintHeapAtGC for even better visibility. For good measure, after the >> puzzling full gc happens (and hopefully before another GC happens) capture >> jstat data re the heap (old gen), for direct allocation visibility. >> >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna wrote: >> >>> Hi Vitaly -- >>> >>> >>> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich wrote: >>> >>>> Hi Jon, >>>> >>>> Nope, we're not using CMS here; this is the throughput/parallel >>>> collector setup. 
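Since several of these collector choices come from ergonomics rather than explicit command-line flags, one way to confirm which collector a given JVM actually ended up with (PrintFlagsFinal is a standard product flag; the grep pattern is only an illustration):

    java -XX:+PrintFlagsFinal -version | grep -E 'UseParallelGC|UseParallelOldGC|UseParNewGC|UseConcMarkSweepGC'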
>>>> >>>> I was browsing some of the gc code in openjdk, and noticed a few >>>> places where each generation attempts to decide (upfront from what I can >>>> tell, i.e. before doing the collection) whether it thinks it's "safe" to >>>> perform the collection (and if it's not, it punts to the next generation) >>>> and also whether some amount of promoted bytes will fit. >>>> >>>> I didn't dig too much yet, but a cursory scan of that code leads me >>>> to think that perhaps the defNew generation is asking the next gen (i.e. >>>> tenured) whether it could handle some estimated promotion amount, and given >>>> the large imbalance between Young and Tenured size, tenured is reporting >>>> that things won't fit -- this then causes a full gc. Is that at all >>>> possible from what you know? >>>> >>> >>> If that were to happen, you wouldn't see the minor gc that precedes >>> the full gc in the log snippet you posted. >>> >>> The only situation I know where a minor GC is followed immediately by >>> a major is when a minor gc didn't manage to fit an allocation request in >>> the space available. But, thinking more about that, it can't be because one >>> would expect that Eden knows the largest object it can allocate, so if the >>> request is larger than will fit in young, the allocator would just go look >>> for space in the older generation. If that didn't fit, the old gen would >>> precipitate a gc which would collect the entire heap (all this should be >>> taken with a dose of salt as I don't have the code in front of me as I >>> type, and I haven't looked at the allocation policy code in ages). >>> >>> >>>> >>>> On your first remark about compaction, just to make sure I >>>> understand, you're saying that a full GC prefers to move all live objects >>>> into tenured (this means taking objects out of survivor space and eden), >>>> irrespective of whether their tenuring threshold has been exceeded? If that >>>> compaction/migration of objects into tenured overflows tenured, then it >>>> attempts to compact the young gen, with overflow into survivor space from >>>> eden. So basically, this generation knows how to perform compaction and >>>> it's not just a copying collection? >>>> >>> >>> That is correct. A full gc does in fact move all survivors from young >>> gen into the old gen. This is a limitation (artificial nepotism can ensue >>> because of "too young" objects that will soon die, getting artificially >>> dragged into the old generation) that I had been lobbying to fix for a >>> while now. I think there's even an old, perhaps still open, bug for this. >>> >>> >>>> Is there a way to get the young gen to print an age table of objects >>>> in its survivor space? I couldn't find one, but perhaps I'm blind. >>>> >>> >>> +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) >>> >>> >>>> >>>> Also, as a confirmation, System.gc() always invokes a full gc with >>>> the parallel collector, right? I believe so, but just wanted to double >>>> check while we're on the topic. >>>> >>> >>> Right. (Not sure what happens if JNI critical section is in force -- >>> whether it's skipped or we wait for the JNI CS to exit/complete; hopefully >>> others can fill in the blanks/inaccuracies in my comments above, since they >>> are based on things that used to be a while ago in code I haven't looked at >>> recently.) 
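For the age-table question above, a minimal sketch of turning on the tenuring distribution with ParNew, where it is known to print the per-age table (the flag names are standard; the jar name and threshold value are placeholders):

    java -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
         -XX:+PrintTenuringDistribution -XX:MaxTenuringThreshold=15 \
         -jar app.jar

The reply above only vouches for ParNew/DefNew, hence the CMS/ParNew pairing in this example.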
>>> >>> -- ramki >>> >>> >>>> >>>> Thanks >>>> >>>> >>>> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu >>> > wrote: >>>> >>>>> >>>>> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>>>> >>>>> Yes, I know :) This is some cruft that needs to be cleaned up. >>>>> >>>>> So my suspicion is that full gc is triggered precisely because old gen >>>>> occupancy is almost 100%, but I'd appreciate confirmation on that. What's >>>>> surprising is that even though old gen is almost full, young gen has lots >>>>> of room now. In fact, this system is restarted daily so we never see >>>>> another young gc before the restart. >>>>> >>>>> The other odd observation is that survivor spaces are completely empty >>>>> after this full gc despite tenuring threshold not being adjusted. >>>>> >>>>> >>>>> The full gc algorithm used compacts everything (old gen and young >>>>> gen) into >>>>> the old gen unless it does not all fit. If the old gen overflows, >>>>> the young gen >>>>> is compacted into itself. Live in the young gen is compacted into >>>>> eden first and >>>>> then into the survivor spaces. >>>>> >>>>> My intuitive thinking is that there was no real reason for the full >>>>> gc to occur; whatever allocation failed in young could now succeed and >>>>> whatever was tenured fit, albeit very tightly. >>>>> >>>>> >>>>> Still puzzling about the full GC. Are you using CMS? If you have >>>>> PrintGCDetails output, >>>>> that might help. >>>>> >>>>> Jon >>>>> >>>>> Sent from my phone >>>>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" >>>>> wrote: >>>>> >>>>>> Am Wed, 7 May 2014 19:34:20 -0400 >>>>>> schrieb Vitaly Davidovich : >>>>>> >>>>>> > The vm args are: >>>>>> > >>>>>> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >>>>>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>>>>> >>>>>> Hmm... you have confliciting arguments here, MaxNewSize overwrites >>>>>> Xmn. >>>>>> You will get 16384-12288=4gb old size, thats quite low. As you can see >>>>>> in your FullGC the steady state after FullGC has filled it nearly >>>>>> completely. >>>>>> >>>>>> Gruss >>>>>> Bernd >>>>>> _______________________________________________ >>>>>> hotspot-gc-use mailing list >>>>>> hotspot-gc-use at openjdk.java.net >>>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> hotspot-gc-use mailing list >>>>> hotspot-gc-use at openjdk.java.net >>>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jon.masamitsu at oracle.com Thu May 8 23:04:38 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Thu, 08 May 2014 16:04:38 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536C07FB.8030402@oracle.com> Message-ID: <536C0D86.5040405@oracle.com> On 05/08/2014 03:57 PM, Vitaly Davidovich wrote: > > Jon, > > Thanks. So ParNew behavior of not triggering a full gc preemptively > seems a better fit for my usecase. 
In fact, we will not have another > young gc in our setup, allocation rate, and workload. What's the > purpose of doing a preemptive full gc (with all the baggage it comes > with) in parallel old? Why not just wait until the next young > collection (if that even happens) and take the full gc hit then? I'm > failing to see the advantage of taking that hit eagerly, even after > reading Peter's description. Is it to avoid promotion failure that it > thinks will happen next time? And if so, it thinks doing the > preemptive full gc is faster than handling a promotion failure next time? > We're already at a safepoint after the young GC so we save that cost. I inherited that policy so can't say for sure but that's what I've always thought. Jon > Thanks guys > > Sent from my phone > > > On 05/08/2014 01:24 PM, Srinivas Ramakrishna wrote: >> The 98% old gen occupancy triggered one of my two neurons. >> I think there was gc policy code (don't know if it;s still there) >> that would proactiively precipitate a full gc when it realized (based >> on recent/historical promotion volume stats) that the next minor gc >> would not be able to promote its survivors into the head room >> remaining in old. (Don't ask me why it;s better to do it now rather >> than the next time the young gen fills up and just rely on the same >> check). Again I am not looking at the code (as it takes some effort >> to get to the box where I keep a copy of the hotspot/openjdk code.) > > The UseParallelGC collector will do a full GC after a young GC if the > UseParallelGC > thinks the next young GC will not succeed (per Peter's explanation). > I don't think > the ParNew GC will do that. I looked for that code but did not find > it. I looked in > the do_collection() code and the ParNew::collect() code. > > The only case I could find where a full GC followed a young GC with > ParNew was > if the collection failed to free enough space for the allocation. > Given the amount > of free space in the young gen after the collection, that's unlikely. > Or course, there > could be a bug. > > Jon > >> Hopefully Jon &co. will quickly confirm or shoot down the >> imaginations o my foggy memory! >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich > > wrote: >> >> I captured some usage and capacity stats via jstat right after >> that full gc that started this email thread. It showed 0% usage >> of survivor spaces (which makes sense now that I know that a full >> gc empties that out irrespective of tenuring threshold and object >> age); eden usage went down to like 10%; tenured usage was very >> high, 98%. Last gc cause was recorded as "Allocation Failure". >> So it's true that the tenured doesn't have much breathing room >> here, but what prompted this email is I don't understand why that >> even matters considering young gen got cleaned up quite nicely. >> >> >> On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna >> > wrote: >> >> >> By the way, as others have noted, -XX:+PrintGCDetails at max >> verbosity level would be your friend to get more visibility >> into this. Include -XX:+PrintHeapAtGC for even better >> visibility. For good measure, after the puzzling full gc >> happens (and hopefully before another GC happens) capture >> jstat data re the heap (old gen), for direct allocation >> visibility. 
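Putting those suggestions together, a minimal diagnostic flag set (all standard product flags on JDK 6/7; the log file and jar names are placeholders):

    java -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps \
         -XX:+PrintHeapAtGC -Xloggc:gc.log \
         -jar app.jar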
>> >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna >> > wrote: >> >> Hi Vitaly -- >> >> >> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich >> > wrote: >> >> Hi Jon, >> >> Nope, we're not using CMS here; this is the >> throughput/parallel collector setup. >> >> I was browsing some of the gc code in openjdk, and >> noticed a few places where each generation attempts >> to decide (upfront from what I can tell, i.e. before >> doing the collection) whether it thinks it's "safe" >> to perform the collection (and if it's not, it punts >> to the next generation) and also whether some amount >> of promoted bytes will fit. >> >> I didn't dig too much yet, but a cursory scan of that >> code leads me to think that perhaps the defNew >> generation is asking the next gen (i.e. tenured) >> whether it could handle some estimated promotion >> amount, and given the large imbalance between Young >> and Tenured size, tenured is reporting that things >> won't fit -- this then causes a full gc. Is that at >> all possible from what you know? >> >> >> If that were to happen, you wouldn't see the minor gc >> that precedes the full gc in the log snippet you posted. >> >> The only situation I know where a minor GC is followed >> immediately by a major is when a minor gc didn't manage >> to fit an allocation request in the space available. But, >> thinking more about that, it can't be because one would >> expect that Eden knows the largest object it can >> allocate, so if the request is larger than will fit in >> young, the allocator would just go look for space in the >> older generation. If that didn't fit, the old gen would >> precipitate a gc which would collect the entire heap (all >> this should be taken with a dose of salt as I don't have >> the code in front of me as I type, and I haven't looked >> at the allocation policy code in ages). >> >> >> On your first remark about compaction, just to make >> sure I understand, you're saying that a full GC >> prefers to move all live objects into tenured (this >> means taking objects out of survivor space and eden), >> irrespective of whether their tenuring threshold has >> been exceeded? If that compaction/migration of >> objects into tenured overflows tenured, then it >> attempts to compact the young gen, with overflow into >> survivor space from eden. So basically, this >> generation knows how to perform compaction and it's >> not just a copying collection? >> >> >> That is correct. A full gc does in fact move all >> survivors from young gen into the old gen. This is a >> limitation (artificial nepotism can ensue because of "too >> young" objects that will soon die, getting artificially >> dragged into the old generation) that I had been lobbying >> to fix for a while now. I think there's even an old, >> perhaps still open, bug for this. >> >> >> Is there a way to get the young gen to print an age >> table of objects in its survivor space? I couldn't >> find one, but perhaps I'm blind. >> >> >> +PrintTenuringDistribution (for ParNew/DefNew, perhaps >> also G1?) >> >> >> Also, as a confirmation, System.gc() always invokes a >> full gc with the parallel collector, right? I believe >> so, but just wanted to double check while we're on >> the topic. >> >> >> Right. 
(Not sure what happens if JNI critical section is >> in force -- whether it's skipped or we wait for the JNI >> CS to exit/complete; hopefully others can fill in the >> blanks/inaccuracies in my comments above, since they are >> based on things that used to be a while ago in code I >> haven't looked at recently.) >> >> -- ramki >> >> >> Thanks >> >> >> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu >> > > wrote: >> >> >> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>> >>> Yes, I know :) This is some cruft that needs to >>> be cleaned up. >>> >>> So my suspicion is that full gc is triggered >>> precisely because old gen occupancy is almost >>> 100%, but I'd appreciate confirmation on that. >>> What's surprising is that even though old gen is >>> almost full, young gen has lots of room now. In >>> fact, this system is restarted daily so we never >>> see another young gc before the restart. >>> >>> The other odd observation is that survivor >>> spaces are completely empty after this full gc >>> despite tenuring threshold not being adjusted. >>> >> >> The full gc algorithm used compacts everything >> (old gen and young gen) into >> the old gen unless it does not all fit. If the >> old gen overflows, the young gen >> is compacted into itself. Live in the young gen >> is compacted into eden first and >> then into the survivor spaces. >> >>> My intuitive thinking is that there was no real >>> reason for the full gc to occur; whatever >>> allocation failed in young could now succeed and >>> whatever was tenured fit, albeit very tightly. >>> >> >> Still puzzling about the full GC. Are you using >> CMS? If you have PrintGCDetails output, >> that might help. >> >> Jon >> >>> Sent from my phone >>> >>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" >>> >> > wrote: >>> >>> Am Wed, 7 May 2014 19:34:20 -0400 >>> schrieb Vitaly Davidovich >> >: >>> >>> > The vm args are: >>> > >>> > -Xms16384m -Xmx16384m -Xmn16384m >>> -XX:NewSize=12288m >>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>> >>> Hmm... you have confliciting arguments here, >>> MaxNewSize overwrites Xmn. >>> You will get 16384-12288=4gb old size, thats >>> quite low. As you can see >>> in your FullGC the steady state after FullGC >>> has filled it nearly >>> completely. >>> >>> Gruss >>> Bernd >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.B.Kessler at Oracle.COM Thu May 8 23:11:01 2014 From: Peter.B.Kessler at Oracle.COM (Peter B. 
Kessler) Date: Thu, 08 May 2014 16:11:01 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536C07FB.8030402@oracle.com> Message-ID: <536C0F05.8010905@Oracle.COM> Recovering from promotion failure is slow. The advantage of scavenges is that you only touch the live objects, and there aren't many of those. When a scavenge finishes successfully, you can just reset the allocation pointer in the eden because everything is either unreachable, or has been copied somewhere else. When a promotion fails, you have an eden with some live object in it, but you don't know where they are. So (at least with techniques we know about) you have to pick up each young generation object and decide if it's still reachable or not, whether it has already been copied out, and compact the live objects into the space in the eden, and then run around updating all the pointers to the live objects that you moved. Touching each object in eden is painful (because there are lots of them) and not terribly satisfying (because most of them are reachable). Much better to do a successful scavenge that empties the young generation and a full collection on the old generation to create space for the *next* scavenge using a collector that's designed for the old generation. Your situation is unusual. You might have to do more work to get the behavior you want. ... peter On 05/08/14 15:57, Vitaly Davidovich wrote: > Jon, > > Thanks. So ParNew behavior of not triggering a full gc preemptively seems a better fit for my usecase. In fact, we will not have another young gc in our setup, allocation rate, and workload. What's the purpose of doing a preemptive full gc (with all the baggage it comes with) in parallel old? Why not just wait until the next young collection (if that even happens) and take the full gc hit then? I'm failing to see the advantage of taking that hit eagerly, even after reading Peter's description. Is it to avoid promotion failure that it thinks will happen next time? And if so, it thinks doing the preemptive full gc is faster than handling a promotion failure next time? > > Thanks guys > > Sent from my phone > > > On 05/08/2014 01:24 PM, Srinivas Ramakrishna wrote: >> The 98% old gen occupancy triggered one of my two neurons. >> I think there was gc policy code (don't know if it;s still there) that would proactiively precipitate a full gc when it realized (based on recent/historical promotion volume stats) that the next minor gc would not be able to promote its survivors into the head room remaining in old. (Don't ask me why it;s better to do it now rather than the next time the young gen fills up and just rely on the same check). Again I am not looking at the code (as it takes some effort to get to the box where I keep a copy of the hotspot/openjdk code.) > > The UseParallelGC collector will do a full GC after a young GC if the UseParallelGC > thinks the next young GC will not succeed (per Peter's explanation). I don't think > the ParNew GC will do that. I looked for that code but did not find it. I looked in > the do_collection() code and the ParNew::collect() code. > > The only case I could find where a full GC followed a young GC with ParNew was > if the collection failed to free enough space for the allocation. Given the amount > of free space in the young gen after the collection, that's unlikely. Or course, there > could be a bug. > > Jon > >> Hopefully Jon &co. 
will quickly confirm or shoot down the imaginations o my foggy memory! >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich > wrote: >> >> I captured some usage and capacity stats via jstat right after that full gc that started this email thread. It showed 0% usage of survivor spaces (which makes sense now that I know that a full gc empties that out irrespective of tenuring threshold and object age); eden usage went down to like 10%; tenured usage was very high, 98%. Last gc cause was recorded as "Allocation Failure". So it's true that the tenured doesn't have much breathing room here, but what prompted this email is I don't understand why that even matters considering young gen got cleaned up quite nicely. >> >> >> On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna > wrote: >> >> >> By the way, as others have noted, -XX:+PrintGCDetails at max verbosity level would be your friend to get more visibility into this. Include -XX:+PrintHeapAtGC for even better visibility. For good measure, after the puzzling full gc happens (and hopefully before another GC happens) capture jstat data re the heap (old gen), for direct allocation visibility. >> >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna > wrote: >> >> Hi Vitaly -- >> >> >> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich > wrote: >> >> Hi Jon, >> >> Nope, we're not using CMS here; this is the throughput/parallel collector setup. >> >> I was browsing some of the gc code in openjdk, and noticed a few places where each generation attempts to decide (upfront from what I can tell, i.e. before doing the collection) whether it thinks it's "safe" to perform the collection (and if it's not, it punts to the next generation) and also whether some amount of promoted bytes will fit. >> >> I didn't dig too much yet, but a cursory scan of that code leads me to think that perhaps the defNew generation is asking the next gen (i.e. tenured) whether it could handle some estimated promotion amount, and given the large imbalance between Young and Tenured size, tenured is reporting that things won't fit -- this then causes a full gc. Is that at all possible from what you know? >> >> >> If that were to happen, you wouldn't see the minor gc that precedes the full gc in the log snippet you posted. >> >> The only situation I know where a minor GC is followed immediately by a major is when a minor gc didn't manage to fit an allocation request in the space available. But, thinking more about that, it can't be because one would expect that Eden knows the largest object it can allocate, so if the request is larger than will fit in young, the allocator would just go look for space in the older generation. If that didn't fit, the old gen would precipitate a gc which would collect the entire heap (all this should be taken with a dose of salt as I don't have the code in front of me as I type, and I haven't looked at the allocation policy code in ages). >> >> >> On your first remark about compaction, just to make sure I understand, you're saying that a full GC prefers to move all live objects into tenured (this means taking objects out of survivor space and eden), irrespective of whether their tenuring threshold has been exceeded? If that compaction/migration of objects into tenured overflows tenured, then it attempts to compact the young gen, with overflow into survivor space from eden. So basically, this generation knows how to perform compaction and it's not just a copying collection? >> >> >> That is correct. 
A full gc does in fact move all survivors from young gen into the old gen. This is a limitation (artificial nepotism can ensue because of "too young" objects that will soon die, getting artificially dragged into the old generation) that I had been lobbying to fix for a while now. I think there's even an old, perhaps still open, bug for this. >> >> >> Is there a way to get the young gen to print an age table of objects in its survivor space? I couldn't find one, but perhaps I'm blind. >> >> >> +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) >> >> >> Also, as a confirmation, System.gc() always invokes a full gc with the parallel collector, right? I believe so, but just wanted to double check while we're on the topic. >> >> >> Right. (Not sure what happens if JNI critical section is in force -- whether it's skipped or we wait for the JNI CS to exit/complete; hopefully others can fill in the blanks/inaccuracies in my comments above, since they are based on things that used to be a while ago in code I haven't looked at recently.) >> >> -- ramki >> >> >> Thanks >> >> >> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu > wrote: >> >> >> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>> >>> Yes, I know :) This is some cruft that needs to be cleaned up. >>> >>> So my suspicion is that full gc is triggered precisely because old gen occupancy is almost 100%, but I'd appreciate confirmation on that. What's surprising is that even though old gen is almost full, young gen has lots of room now. In fact, this system is restarted daily so we never see another young gc before the restart. >>> >>> The other odd observation is that survivor spaces are completely empty after this full gc despite tenuring threshold not being adjusted. >>> >> >> The full gc algorithm used compacts everything (old gen and young gen) into >> the old gen unless it does not all fit. If the old gen overflows, the young gen >> is compacted into itself. Live in the young gen is compacted into eden first and >> then into the survivor spaces. >> >>> My intuitive thinking is that there was no real reason for the full gc to occur; whatever allocation failed in young could now succeed and whatever was tenured fit, albeit very tightly. >>> >> >> Still puzzling about the full GC. Are you using CMS? If you have PrintGCDetails output, >> that might help. >> >> Jon >> >>> Sent from my phone >>> >>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" > wrote: >>> >>> Am Wed, 7 May 2014 19:34:20 -0400 >>> schrieb Vitaly Davidovich >: >>> >>> > The vm args are: >>> > >>> > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m >>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>> >>> Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. >>> You will get 16384-12288=4gb old size, thats quite low. As you can see >>> in your FullGC the steady state after FullGC has filled it nearly >>> completely. 
>>> >>> Gruss >>> Bernd >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> >> >> >> > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > From ysr1729 at gmail.com Thu May 8 23:16:23 2014 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 8 May 2014 16:16:23 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: <536BF42D.3050202@Oracle.COM> References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536BF42D.3050202@Oracle.COM> Message-ID: Hi Peter -- Thanks! (and so nice to hear another familiar voice from the good ol' days again!) I didn't look at the code detail you provide (thanks!) but it seems as though the check for possible promotion failure that was done at the end of the previous scavenge could instead be done, with, as far as I can see, no loss of generality or any loss if information, instead before starting the current GC, and thus avoid the promotion failure that the full gc at the end of the previous gc was trying to avoid. All we then end up doing is allowing one whole allocation epoch before doing the imminent full gc. I am sure I am missing something here that I will find when I get time to read through the details of the actual code again, and use one both of my remaining neurons. :-) -- ramki On Thu, May 8, 2014 at 2:16 PM, Peter B. Kessler wrote: > The "problem", if you want to call it that, is that when the young > generation has filled up before the next collection it is probably too > late. The scavenger is optimistic and thinks everything can be promoted. > It just goes ahead and starts a young collection. It gets a promotion > failure if it runs out of space in the old generation, painfully recovers > from the promotion failure and then causes a full collection. Instead we > use the promotion history at the end of each young generation collection to > decide to do a full collection preemptively. That way we can sneak in that > last scavenge (usually pretty fast, and usually emptying the whole eden) > before we invoke a full collection, which doesn't handle massive amounts of > garbage well (e.g., in the young generation). If we were pessimistic, > given Vitaly's heap layout, we'd do nothing but full collections. > > I think all the policy code (for the parallel scavenger) is in > PSScavenge::invoke(), e.g., > > http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/ > 2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/psScavenge. > cpp > > starting at line 210. 
The policy decision is made in > PSAdaptiveSizePolicy::should_full_GC > > http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/ > 2f6dc76eb8e5/src/share/vm/gc_implementation/parallelScavenge/ > psAdaptiveSizePolicy.cpp > > starting at line 162. Look at all those lovely fish! > > It looks like setting -XX:+PrintGCDetails -XX:+Verbose (a "develop" flag) > would tell you what choices are being made (and probably produce a lot of > other output as well :-). In a product build -XX:+PrintGCDetails > -XX:+PrintHeapAtGC, as has been suggested by others, should get enough > information to figure out what's going on. > > I've cited the code for the parallel scavenger, because Vitaly said "this > is the throughput/parallel collector setup". The other collectors have > similar policy code. > > ... peter > > > On 05/08/14 13:24, Srinivas Ramakrishna wrote: > >> The 98% old gen occupancy triggered one of my two neurons. >> I think there was gc policy code (don't know if it;s still there) that >> would proactiively precipitate a full gc when it realized (based on >> recent/historical promotion volume stats) that the next minor gc would not >> be able to promote its survivors into the head room remaining in old. >> (Don't ask me why it;s better to do it now rather than the next time the >> young gen fills up and just rely on the same check). Again I am not looking >> at the code (as it takes some effort to get to the box where I keep a copy >> of the hotspot/openjdk code.) >> >> Hopefully Jon &co. will quickly confirm or shoot down the imaginations o >> my foggy memory! >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich > vitalyd at gmail.com>> wrote: >> >> I captured some usage and capacity stats via jstat right after that >> full gc that started this email thread. It showed 0% usage of survivor >> spaces (which makes sense now that I know that a full gc empties that out >> irrespective of tenuring threshold and object age); eden usage went down to >> like 10%; tenured usage was very high, 98%. Last gc cause was recorded as >> "Allocation Failure". So it's true that the tenured doesn't have much >> breathing room here, but what prompted this email is I don't understand why >> that even matters considering young gen got cleaned up quite nicely. >> >> >> On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna < >> ysr1729 at gmail.com > wrote: >> >> >> By the way, as others have noted, -XX:+PrintGCDetails at max >> verbosity level would be your friend to get more visibility into this. >> Include -XX:+PrintHeapAtGC for even better visibility. For good measure, >> after the puzzling full gc happens (and hopefully before another GC >> happens) capture jstat data re the heap (old gen), for direct allocation >> visibility. >> >> -- ramki >> >> >> On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna < >> ysr1729 at gmail.com > wrote: >> >> Hi Vitaly -- >> >> >> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich < >> vitalyd at gmail.com > wrote: >> >> Hi Jon, >> >> Nope, we're not using CMS here; this is the >> throughput/parallel collector setup. >> >> I was browsing some of the gc code in openjdk, and >> noticed a few places where each generation attempts to decide (upfront from >> what I can tell, i.e. before doing the collection) whether it thinks it's >> "safe" to perform the collection (and if it's not, it punts to the next >> generation) and also whether some amount of promoted bytes will fit. 
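As an illustration of the check being described here: the decision comes down to comparing an estimate of what the next scavenge would promote against the free space left in the old generation. The sketch below is only a Java rendering of that shape, not the actual HotSpot code; the field names, the averaging and the padding term are invented for the example.

    // Illustrative sketch only, not the HotSpot implementation.
    class PromotionEstimateSketch {
        private double avgPromotedBytes; // running average of bytes promoted per scavenge
        private double paddingBytes;     // safety margin added on top of the average

        // Called after each scavenge with the number of bytes actually promoted.
        void recordScavenge(long promotedBytes) {
            // Weight recent scavenges more heavily than older ones.
            avgPromotedBytes = 0.7 * avgPromotedBytes + 0.3 * promotedBytes;
        }

        // The policy question: would the next scavenge's promotions still fit?
        boolean shouldFullGC(long oldGenFreeBytes) {
            return avgPromotedBytes + paddingBytes > oldGenFreeBytes;
        }
    }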
>> >> I didn't dig too much yet, but a cursory scan of that >> code leads me to think that perhaps the defNew generation is asking the >> next gen (i.e. tenured) whether it could handle some estimated promotion >> amount, and given the large imbalance between Young and Tenured size, >> tenured is reporting that things won't fit -- this then causes a full gc. >> Is that at all possible from what you know? >> >> >> If that were to happen, you wouldn't see the minor gc that >> precedes the full gc in the log snippet you posted. >> >> The only situation I know where a minor GC is followed >> immediately by a major is when a minor gc didn't manage to fit an >> allocation request in the space available. But, thinking more about that, >> it can't be because one would expect that Eden knows the largest object it >> can allocate, so if the request is larger than will fit in young, the >> allocator would just go look for space in the older generation. If that >> didn't fit, the old gen would precipitate a gc which would collect the >> entire heap (all this should be taken with a dose of salt as I don't have >> the code in front of me as I type, and I haven't looked at the allocation >> policy code in ages). >> >> >> On your first remark about compaction, just to make sure >> I understand, you're saying that a full GC prefers to move all live objects >> into tenured (this means taking objects out of survivor space and eden), >> irrespective of whether their tenuring threshold has been exceeded? If that >> compaction/migration of objects into tenured overflows tenured, then it >> attempts to compact the young gen, with overflow into survivor space from >> eden. So basically, this generation knows how to perform compaction and >> it's not just a copying collection? >> >> >> That is correct. A full gc does in fact move all survivors >> from young gen into the old gen. This is a limitation (artificial nepotism >> can ensue because of "too young" objects that will soon die, getting >> artificially dragged into the old generation) that I had been lobbying to >> fix for a while now. I think there's even an old, perhaps still open, bug >> for this. >> >> >> Is there a way to get the young gen to print an age table >> of objects in its survivor space? I couldn't find one, but perhaps I'm >> blind. >> >> >> +PrintTenuringDistribution (for ParNew/DefNew, perhaps also >> G1?) >> >> >> Also, as a confirmation, System.gc() always invokes a >> full gc with the parallel collector, right? I believe so, but just wanted >> to double check while we're on the topic. >> >> >> Right. (Not sure what happens if JNI critical section is in >> force -- whether it's skipped or we wait for the JNI CS to exit/complete; >> hopefully others can fill in the blanks/inaccuracies in my comments above, >> since they are based on things that used to be a while ago in code I >> haven't looked at recently.) >> >> -- ramki >> >> >> Thanks >> >> >> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu < >> jon.masamitsu at oracle.com > wrote: >> >> >> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >> >>> >>> Yes, I know :) This is some cruft that needs to be >>> cleaned up. >>> >>> So my suspicion is that full gc is triggered >>> precisely because old gen occupancy is almost 100%, but I'd appreciate >>> confirmation on that. What's surprising is that even though old gen is >>> almost full, young gen has lots of room now. In fact, this system is >>> restarted daily so we never see another young gc before the restart. 
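One way to get that confirmation from inside the application, rather than from the logs, is that each collected pool exposes an "after the last collection" snapshot over the standard java.lang.management API. A minimal sketch; the pool names ("PS Old Gen" and so on) depend on the collector, so it simply prints whatever the VM reports:

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryPoolMXBean;
    import java.lang.management.MemoryUsage;

    public class PostGcOccupancy {
        public static void main(String[] args) {
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                MemoryUsage afterGc = pool.getCollectionUsage(); // null for pools the GC does not manage
                if (afterGc == null) continue;
                long max = afterGc.getMax();                     // -1 if undefined
                String pct = max > 0
                        ? String.format("%.1f%%", 100.0 * afterGc.getUsed() / max)
                        : "n/a";
                System.out.println(pool.getName() + ": " + pct + " of max used after the last GC");
            }
        }
    }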
>>> >>> The other odd observation is that survivor spaces >>> are completely empty after this full gc despite tenuring threshold not >>> being adjusted. >>> >>> >> The full gc algorithm used compacts everything (old >> gen and young gen) into >> the old gen unless it does not all fit. If the old >> gen overflows, the young gen >> is compacted into itself. Live in the young gen is >> compacted into eden first and >> then into the survivor spaces. >> >> My intuitive thinking is that there was no real >>> reason for the full gc to occur; whatever allocation failed in young could >>> now succeed and whatever was tenured fit, albeit very tightly. >>> >>> >> Still puzzling about the full GC. Are you using CMS? >> If you have PrintGCDetails output, >> that might help. >> >> Jon >> >> Sent from my phone >>> >>> >>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" < >>> bernd-2014 at eckenfels.net > wrote: >>> >>> Am Wed, 7 May 2014 19:34:20 -0400 >>> schrieb Vitaly Davidovich >> vitalyd at gmail.com>>: >>> >>> >>> > The vm args are: >>> > >>> > -Xms16384m -Xmx16384m -Xmn16384m >>> -XX:NewSize=12288m >>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>> >>> Hmm... you have confliciting arguments here, >>> MaxNewSize overwrites Xmn. >>> You will get 16384-12288=4gb old size, thats >>> quite low. As you can see >>> in your FullGC the steady state after FullGC has >>> filled it nearly >>> completely. >>> >>> Gruss >>> Bernd >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >> hotspot-gc-use at openjdk.java.net> >>> >>> http://mail.openjdk.java.net/ >>> mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >> hotspot-gc-use at openjdk.java.net> >>> http://mail.openjdk.java.net/ >>> mailman/listinfo/hotspot-gc-use >>> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net > hotspot-gc-use at openjdk.java.net> >> >> http://mail.openjdk.java.net/ >> mailman/listinfo/hotspot-gc-use >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net > openjdk.java.net> >> >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc- >> use >> >> >> >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Fri May 9 01:04:51 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Thu, 8 May 2014 21:04:51 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: <536C0F05.8010905@Oracle.COM> References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536C07FB.8030402@oracle.com> <536C0F05.8010905@Oracle.COM> Message-ID: Thanks Peter, I understand the mess that a promotion failure causes now. I'm interested in your opinion on Ramki's last point, which is to defer the full gc until the next scavenge (I.e. remember that you think you may have promotion failure on next scavenge, and then do a full gc right before that next scavenge). I think you'll find that there are many JVM deployments out there that either restart their JVM daily or force GC off peak hours. 
For those cases, you want to keep on running out of eden as much as possible since it's likely that there won't be a next scavenge, either because jvm is restarted or a forced gc is induced off hours, at which point you don't care how long it takes. It sounds like that's what ParNew does, so maybe that's worth a try. Also, in my example here, the induced GC took nearly 7 secs (as compared to 1+ sec for young with a larger space) on a fairly small tenured and reclaimed some very nominal amount - one could say it was a waste of time doing it, but I do appreciate that this setup is not the norm. Thanks, this has been a very educational discussion. Sent from my phone On May 8, 2014 7:11 PM, "Peter B. Kessler" wrote: > Recovering from promotion failure is slow. The advantage of scavenges is > that you only touch the live objects, and there aren't many of those. When > a scavenge finishes successfully, you can just reset the allocation pointer > in the eden because everything is either unreachable, or has been copied > somewhere else. When a promotion fails, you have an eden with some live > object in it, but you don't know where they are. So (at least with > techniques we know about) you have to pick up each young generation object > and decide if it's still reachable or not, whether it has already been > copied out, and compact the live objects into the space in the eden, and > then run around updating all the pointers to the live objects that you > moved. Touching each object in eden is painful (because there are lots of > them) and not terribly satisfying (because most of them are reachable). > > Much better to do a successful scavenge that empties the young generation > and a full collection on the old generation to create space for the *next* > scavenge using a collector that's designed for the old generation. > > Your situation is unusual. You might have to do more work to get the > behavior you want. > > ... peter > > On 05/08/14 15:57, Vitaly Davidovich wrote: > >> Jon, >> >> Thanks. So ParNew behavior of not triggering a full gc preemptively >> seems a better fit for my usecase. In fact, we will not have another young >> gc in our setup, allocation rate, and workload. What's the purpose of >> doing a preemptive full gc (with all the baggage it comes with) in parallel >> old? Why not just wait until the next young collection (if that even >> happens) and take the full gc hit then? I'm failing to see the advantage of >> taking that hit eagerly, even after reading Peter's description. Is it to >> avoid promotion failure that it thinks will happen next time? And if so, it >> thinks doing the preemptive full gc is faster than handling a promotion >> failure next time? >> >> Thanks guys >> >> Sent from my phone >> >> >> On 05/08/2014 01:24 PM, Srinivas Ramakrishna wrote: >> >>> The 98% old gen occupancy triggered one of my two neurons. >>> I think there was gc policy code (don't know if it;s still there) that >>> would proactiively precipitate a full gc when it realized (based on >>> recent/historical promotion volume stats) that the next minor gc would not >>> be able to promote its survivors into the head room remaining in old. >>> (Don't ask me why it;s better to do it now rather than the next time the >>> young gen fills up and just rely on the same check). Again I am not looking >>> at the code (as it takes some effort to get to the box where I keep a copy >>> of the hotspot/openjdk code.) 
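On the "force GC off peak hours" pattern mentioned above: nothing exotic is needed for the forced collection itself, a scheduled System.gc() is enough as long as -XX:+DisableExplicitGC is not set. A minimal sketch; the schedule is a placeholder invented for the example, and with the parallel collector the call is a full, stop-the-world collection:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class OffPeakGc {
        public static void start(long hoursUntilOffPeak) {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(new Runnable() {
                public void run() {
                    System.gc(); // full collection; reclaims the old gen before the next business day
                }
            }, hoursUntilOffPeak, 24, TimeUnit.HOURS); // once a day thereafter
        }
    }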
>>> >> >> The UseParallelGC collector will do a full GC after a young GC if the >> UseParallelGC >> thinks the next young GC will not succeed (per Peter's explanation). I >> don't think >> the ParNew GC will do that. I looked for that code but did not find it. >> I looked in >> the do_collection() code and the ParNew::collect() code. >> >> The only case I could find where a full GC followed a young GC with >> ParNew was >> if the collection failed to free enough space for the allocation. Given >> the amount >> of free space in the young gen after the collection, that's unlikely. Or >> course, there >> could be a bug. >> >> Jon >> >> Hopefully Jon &co. will quickly confirm or shoot down the imaginations o >>> my foggy memory! >>> -- ramki >>> >>> >>> On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich >> vitalyd at gmail.com>> wrote: >>> >>> I captured some usage and capacity stats via jstat right after that >>> full gc that started this email thread. It showed 0% usage of survivor >>> spaces (which makes sense now that I know that a full gc empties that out >>> irrespective of tenuring threshold and object age); eden usage went down to >>> like 10%; tenured usage was very high, 98%. Last gc cause was recorded as >>> "Allocation Failure". So it's true that the tenured doesn't have much >>> breathing room here, but what prompted this email is I don't understand why >>> that even matters considering young gen got cleaned up quite nicely. >>> >>> >>> On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna < >>> ysr1729 at gmail.com > wrote: >>> >>> >>> By the way, as others have noted, -XX:+PrintGCDetails at max >>> verbosity level would be your friend to get more visibility into this. >>> Include -XX:+PrintHeapAtGC for even better visibility. For good measure, >>> after the puzzling full gc happens (and hopefully before another GC >>> happens) capture jstat data re the heap (old gen), for direct allocation >>> visibility. >>> >>> -- ramki >>> >>> >>> On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna < >>> ysr1729 at gmail.com > wrote: >>> >>> Hi Vitaly -- >>> >>> >>> On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich < >>> vitalyd at gmail.com > wrote: >>> >>> Hi Jon, >>> >>> Nope, we're not using CMS here; this is the >>> throughput/parallel collector setup. >>> >>> I was browsing some of the gc code in openjdk, and >>> noticed a few places where each generation attempts to decide (upfront from >>> what I can tell, i.e. before doing the collection) whether it thinks it's >>> "safe" to perform the collection (and if it's not, it punts to the next >>> generation) and also whether some amount of promoted bytes will fit. >>> >>> I didn't dig too much yet, but a cursory scan of that >>> code leads me to think that perhaps the defNew generation is asking the >>> next gen (i.e. tenured) whether it could handle some estimated promotion >>> amount, and given the large imbalance between Young and Tenured size, >>> tenured is reporting that things won't fit -- this then causes a full gc. >>> Is that at all possible from what you know? >>> >>> >>> If that were to happen, you wouldn't see the minor gc that >>> precedes the full gc in the log snippet you posted. >>> >>> The only situation I know where a minor GC is followed >>> immediately by a major is when a minor gc didn't manage to fit an >>> allocation request in the space available. 
But, thinking more about that, >>> it can't be because one would expect that Eden knows the largest object it >>> can allocate, so if the request is larger than will fit in young, the >>> allocator would just go look for space in the older generation. If that >>> didn't fit, the old gen would precipitate a gc which would collect the >>> entire heap (all this should be taken with a dose of salt as I don't have >>> the code in front of me as I type, and I haven't looked at the allocation >>> policy code in ages). >>> >>> >>> On your first remark about compaction, just to make sure >>> I understand, you're saying that a full GC prefers to move all live objects >>> into tenured (this means taking objects out of survivor space and eden), >>> irrespective of whether their tenuring threshold has been exceeded? If that >>> compaction/migration of objects into tenured overflows tenured, then it >>> attempts to compact the young gen, with overflow into survivor space from >>> eden. So basically, this generation knows how to perform compaction and >>> it's not just a copying collection? >>> >>> >>> That is correct. A full gc does in fact move all survivors >>> from young gen into the old gen. This is a limitation (artificial nepotism >>> can ensue because of "too young" objects that will soon die, getting >>> artificially dragged into the old generation) that I had been lobbying to >>> fix for a while now. I think there's even an old, perhaps still open, bug >>> for this. >>> >>> >>> Is there a way to get the young gen to print an age >>> table of objects in its survivor space? I couldn't find one, but perhaps >>> I'm blind. >>> >>> >>> +PrintTenuringDistribution (for ParNew/DefNew, perhaps also >>> G1?) >>> >>> >>> Also, as a confirmation, System.gc() always invokes a >>> full gc with the parallel collector, right? I believe so, but just wanted >>> to double check while we're on the topic. >>> >>> >>> Right. (Not sure what happens if JNI critical section is in >>> force -- whether it's skipped or we wait for the JNI CS to exit/complete; >>> hopefully others can fill in the blanks/inaccuracies in my comments above, >>> since they are based on things that used to be a while ago in code I >>> haven't looked at recently.) >>> >>> -- ramki >>> >>> >>> Thanks >>> >>> >>> On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu < >>> jon.masamitsu at oracle.com > wrote: >>> >>> >>> On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: >>> >>>> >>>> Yes, I know :) This is some cruft that needs to be >>>> cleaned up. >>>> >>>> So my suspicion is that full gc is triggered >>>> precisely because old gen occupancy is almost 100%, but I'd appreciate >>>> confirmation on that. What's surprising is that even though old gen is >>>> almost full, young gen has lots of room now. In fact, this system is >>>> restarted daily so we never see another young gc before the restart. >>>> >>>> The other odd observation is that survivor spaces >>>> are completely empty after this full gc despite tenuring threshold not >>>> being adjusted. >>>> >>>> >>> The full gc algorithm used compacts everything (old >>> gen and young gen) into >>> the old gen unless it does not all fit. If the old >>> gen overflows, the young gen >>> is compacted into itself. Live in the young gen is >>> compacted into eden first and >>> then into the survivor spaces. >>> >>> My intuitive thinking is that there was no real >>>> reason for the full gc to occur; whatever allocation failed in young could >>>> now succeed and whatever was tenured fit, albeit very tightly. 
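If the process cannot be restarted with extra flags, the usual GC logging flags are "manageable" in HotSpot and can be switched on in a live JVM, either with jinfo -flag +PrintGCDetails <pid> or from code. A sketch using the HotSpot-specific diagnostic bean; it assumes a HotSpot JVM (JDK 7 or later for this exact API) and that the named flags are manageable in the JDK in use:

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.lang.management.ManagementFactory;

    public class EnableGcLogging {
        public static void main(String[] args) {
            HotSpotDiagnosticMXBean hotspot =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            hotspot.setVMOption("PrintGCDetails", "true");    // takes effect immediately
            hotspot.setVMOption("PrintGCTimeStamps", "true"); // timestamps make the log easier to correlate
        }
    }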
>>>> >>>> >>> Still puzzling about the full GC. Are you using >>> CMS? If you have PrintGCDetails output, >>> that might help. >>> >>> Jon >>> >>> Sent from my phone >>>> >>>> On May 7, 2014 8:40 PM, "Bernd Eckenfels" < >>>> bernd-2014 at eckenfels.net > wrote: >>>> >>>> Am Wed, 7 May 2014 19:34:20 -0400 >>>> schrieb Vitaly Davidovich >>> vitalyd at gmail.com>>: >>>> >>>> > The vm args are: >>>> > >>>> > -Xms16384m -Xmx16384m -Xmn16384m >>>> -XX:NewSize=12288m >>>> > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 >>>> >>>> Hmm... you have confliciting arguments here, >>>> MaxNewSize overwrites Xmn. >>>> You will get 16384-12288=4gb old size, thats >>>> quite low. As you can see >>>> in your FullGC the steady state after FullGC >>>> has filled it nearly >>>> completely. >>>> >>>> Gruss >>>> Bernd >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>> hotspot-gc-use at openjdk.java.net> >>>> http://mail.openjdk.java.net/ >>>> mailman/listinfo/hotspot-gc-use >>>> >>>> >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>> hotspot-gc-use at openjdk.java.net> >>>> http://mail.openjdk.java.net/ >>>> mailman/listinfo/hotspot-gc-use >>>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >> hotspot-gc-use at openjdk.java.net> >>> http://mail.openjdk.java.net/ >>> mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >> openjdk.java.net> >>> http://mail.openjdk.java.net/ >>> mailman/listinfo/hotspot-gc-use >>> >>> >>> >>> >>> >>> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.B.Kessler at Oracle.COM Fri May 9 18:01:28 2014 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Fri, 09 May 2014 11:01:28 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536C07FB.8030402@oracle.com> <536C0F05.8010905@Oracle.COM> Message-ID: <536D17F8.3000504@Oracle.COM> On 05/08/14 18:04, Vitaly Davidovich wrote: > Thanks Peter, I understand the mess that a promotion failure causes now. I'm interested in your opinion on Ramki's last point, which is to defer the full gc until the next scavenge (I.e. remember that you think you may have promotion failure on next scavenge, and then do a full gc right before that next scavenge). The algorithm used for a full collection is not well-suited for a heap in which there's a lot of garbage. It involves (at least) two passes: an object-graph-order marking pass to identify live objects, and then an address-order pass that looks at every object and moves it if it is live (for the compacting collectiors), or puts it on a free-list if it isn't (for the non-moving collectors). In contrast, scavenging is a single object-graph-order pass that examines only the live objects. That's why it is such a win for edens where we expect the garbage ratio to be high. Time a young generation collection on a typical 10GB eden, and one on a similarly-populated 10GB old generation. For science! 
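For a rough version of that experiment without parsing the logs, the per-collector counters exposed over java.lang.management can be sampled around a forced collection or a busy interval. A minimal sketch; with the throughput collector the beans are typically named "PS Scavenge" and "PS MarkSweep", but the code just prints whatever is there:

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcTimes {
        public static void main(String[] args) {
            snapshot("before");
            System.gc(); // with the throughput collector this is a full, compacting collection
            snapshot("after");
        }

        static void snapshot(String label) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.println(label + ": " + gc.getName()
                        + " count=" + gc.getCollectionCount()
                        + " totalTimeMs=" + gc.getCollectionTime());
            }
        }
    }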
If we wait until the eden is full again, when we know the old generation is also full, then we can't scavenge the young generation. Maybe that wouldn't bother you because you are hoping there is no next collection. You've chosen to use the throughput collector, where the focus is on getting the collections done in the most efficient manner. Ramki is suggesting the low-pause collector, where the focus is on doing most of the collection work concurrent with application work. If there are cycles to spare (CPU and memory) that might complete a full collection without interfering with the application, so maybe Ramki is not as concerned about cost of a failed promotion and full collection. One size does not fit all. You haven't said why you made the choice you did. > I think you'll find that there are many JVM deployments out there that either restart their JVM daily or force GC off peak hours. For those cases, you want to keep on running out of eden as much as possible since it's likely that there won't be a next scavenge, either because jvm is restarted or a forced gc is induced off hours, at which point you don't care how long it takes. It sounds like that's what ParNew does, so maybe that's worth a try. Certainly there are applications that only run during banking hours. Or run with hot spares and fail over rather than take a GC pause. Just as certainly there are applications that run for months and shape their heaps to provide the levels of service they need, with the help of the right collector. When benchmarking application performance I've been known to run with -Xms512g -Xmx512g -Xmn384g just to see how things go without any interference from the collector. (I'm in Oracle Labs, but I probably could arrange a sales representative to call you if you want to buy a big machine. :-) I still haven't seen the output of -XX:+PrintHeapAtGC from your application. It may be that we can squeeze enough space into the eden to put off the collection for long enough. Though, it seems brittle. What was the result of running with smaller survivor spaces to give more space to the eden? What was the result of running a larger heap with more space in the young generation? What happens if you run with an extravagant -Xms64g -Xmx64g -Xmn32g? This might seem farcical on a machine, for example, with only 16GB of RAM, but if you really don't care about the duration of the forced collections before the day begins, and really think you don't allocate more than 16GB during the day, then your operating system might well swap out the parts of the heap that you aren't using any more and keep in memory the parts of the heap that you are using. If you ever collect it will be a disaster. If you ever need more live data than you have memory, you will page yourself to death. (Object-graph-order traversals of your swap space! I'm curious how that works out off an SSD.) Brittle isn't the word for this; maybe "pre-stressed"? I have a hard time even suggesting such a setup, but you seem determined. If it works, write it up. ... peter > Also, in my example here, the induced GC took nearly 7 secs (as compared to 1+ sec for young with a larger space) on a fairly small tenured and reclaimed some very nominal amount - one could say it was a waste of time doing it, but I do appreciate that this setup is not the norm. > > Thanks, this has been a very educational discussion. > > Sent from my phone > > On May 8, 2014 7:11 PM, "Peter B. Kessler" > wrote: > > Recovering from promotion failure is slow.
The advantage of scavenges is that you only touch the live objects, and there aren't many of those. When a scavenge finishes successfully, you can just reset the allocation pointer in the eden because everything is either unreachable, or has been copied somewhere else. When a promotion fails, you have an eden with some live object in it, but you don't know where they are. So (at least with techniques we know about) you have to pick up each young generation object and decide if it's still reachable or not, whether it has already been copied out, and compact the live objects into the space in the eden, and then run around updating all the pointers to the live objects that you moved. Touching each object in eden is painful (because there are lots of them) and not terribly satisfying (because most of them are reachable). > > Much better to do a successful scavenge that empties the young generation and a full collection on the old generation to create space for the *next* scavenge using a collector that's designed for the old generation. > > Your situation is unusual. You might have to do more work to get the behavior you want. > > ... peter > > On 05/08/14 15:57, Vitaly Davidovich wrote: > > Jon, > > Thanks. So ParNew behavior of not triggering a full gc preemptively seems a better fit for my usecase. In fact, we will not have another young gc in our setup, allocation rate, and workload. What's the purpose of doing a preemptive full gc (with all the baggage it comes with) in parallel old? Why not just wait until the next young collection (if that even happens) and take the full gc hit then? I'm failing to see the advantage of taking that hit eagerly, even after reading Peter's description. Is it to avoid promotion failure that it thinks will happen next time? And if so, it thinks doing the preemptive full gc is faster than handling a promotion failure next time? > > Thanks guys > > Sent from my phone > > > On 05/08/2014 01:24 PM, Srinivas Ramakrishna wrote: > > The 98% old gen occupancy triggered one of my two neurons. > I think there was gc policy code (don't know if it;s still there) that would proactiively precipitate a full gc when it realized (based on recent/historical promotion volume stats) that the next minor gc would not be able to promote its survivors into the head room remaining in old. (Don't ask me why it;s better to do it now rather than the next time the young gen fills up and just rely on the same check). Again I am not looking at the code (as it takes some effort to get to the box where I keep a copy of the hotspot/openjdk code.) > > > The UseParallelGC collector will do a full GC after a young GC if the UseParallelGC > thinks the next young GC will not succeed (per Peter's explanation). I don't think > the ParNew GC will do that. I looked for that code but did not find it. I looked in > the do_collection() code and the ParNew::collect() code. > > The only case I could find where a full GC followed a young GC with ParNew was > if the collection failed to free enough space for the allocation. Given the amount > of free space in the young gen after the collection, that's unlikely. Or course, there > could be a bug. > > Jon > > Hopefully Jon &co. will quickly confirm or shoot down the imaginations o my foggy memory! > -- ramki > > > On Thu, May 8, 2014 at 12:55 PM, Vitaly Davidovich >> wrote: > > I captured some usage and capacity stats via jstat right after that full gc that started this email thread. 
It showed 0% usage of survivor spaces (which makes sense now that I know that a full gc empties that out irrespective of tenuring threshold and object age); eden usage went down to like 10%; tenured usage was very high, 98%. Last gc cause was recorded as "Allocation Failure". So it's true that the tenured doesn't have much breathing room here, but what prompted this email is I don't understand why that even matters considering young gen got cleaned up quite nicely. > > > On Thu, May 8, 2014 at 3:36 PM, Srinivas Ramakrishna >> wrote: > > > By the way, as others have noted, -XX:+PrintGCDetails at max verbosity level would be your friend to get more visibility into this. Include -XX:+PrintHeapAtGC for even better visibility. For good measure, after the puzzling full gc happens (and hopefully before another GC happens) capture jstat data re the heap (old gen), for direct allocation visibility. > > -- ramki > > > On Thu, May 8, 2014 at 12:34 PM, Srinivas Ramakrishna >> wrote: > > Hi Vitaly -- > > > On Thu, May 8, 2014 at 11:38 AM, Vitaly Davidovich >> wrote: > > Hi Jon, > > Nope, we're not using CMS here; this is the throughput/parallel collector setup. > > I was browsing some of the gc code in openjdk, and noticed a few places where each generation attempts to decide (upfront from what I can tell, i.e. before doing the collection) whether it thinks it's "safe" to perform the collection (and if it's not, it punts to the next generation) and also whether some amount of promoted bytes will fit. > > I didn't dig too much yet, but a cursory scan of that code leads me to think that perhaps the defNew generation is asking the next gen (i.e. tenured) whether it could handle some estimated promotion amount, and given the large imbalance between Young and Tenured size, tenured is reporting that things won't fit -- this then causes a full gc. Is that at all possible from what you know? > > > If that were to happen, you wouldn't see the minor gc that precedes the full gc in the log snippet you posted. > > The only situation I know where a minor GC is followed immediately by a major is when a minor gc didn't manage to fit an allocation request in the space available. But, thinking more about that, it can't be because one would expect that Eden knows the largest object it can allocate, so if the request is larger than will fit in young, the allocator would just go look for space in the older generation. If that didn't fit, the old gen would precipitate a gc which would collect the entire heap (all this should be taken with a dose of salt as I don't have the code in front of me as I type, and I haven't looked at the allocation policy code in ages). > > > On your first remark about compaction, just to make sure I understand, you're saying that a full GC prefers to move all live objects into tenured (this means taking objects out of survivor space and eden), irrespective of whether their tenuring threshold has been exceeded? If that compaction/migration of objects into tenured overflows tenured, then it attempts to compact the young gen, with overflow into survivor space from eden. So basically, this generation knows how to perform compaction and it's not just a copying collection? > > > That is correct. A full gc does in fact move all survivors from young gen into the old gen. This is a limitation (artificial nepotism can ensue because of "too young" objects that will soon die, getting artificially dragged into the old generation) that I had been lobbying to fix for a while now. 
I think there's even an old, perhaps still open, bug for this. > > > Is there a way to get the young gen to print an age table of objects in its survivor space? I couldn't find one, but perhaps I'm blind. > > > +PrintTenuringDistribution (for ParNew/DefNew, perhaps also G1?) > > > Also, as a confirmation, System.gc() always invokes a full gc with the parallel collector, right? I believe so, but just wanted to double check while we're on the topic. > > > Right. (Not sure what happens if JNI critical section is in force -- whether it's skipped or we wait for the JNI CS to exit/complete; hopefully others can fill in the blanks/inaccuracies in my comments above, since they are based on things that used to be a while ago in code I haven't looked at recently.) > > -- ramki > > > Thanks > > > On Thu, May 8, 2014 at 1:39 PM, Jon Masamitsu >> wrote: > > > On 05/07/2014 05:55 PM, Vitaly Davidovich wrote: > > > Yes, I know :) This is some cruft that needs to be cleaned up. > > So my suspicion is that full gc is triggered precisely because old gen occupancy is almost 100%, but I'd appreciate confirmation on that. What's surprising is that even though old gen is almost full, young gen has lots of room now. In fact, this system is restarted daily so we never see another young gc before the restart. > > The other odd observation is that survivor spaces are completely empty after this full gc despite tenuring threshold not being adjusted. > > > The full gc algorithm used compacts everything (old gen and young gen) into > the old gen unless it does not all fit. If the old gen overflows, the young gen > is compacted into itself. Live in the young gen is compacted into eden first and > then into the survivor spaces. > > My intuitive thinking is that there was no real reason for the full gc to occur; whatever allocation failed in young could now succeed and whatever was tenured fit, albeit very tightly. > > > Still puzzling about the full GC. Are you using CMS? If you have PrintGCDetails output, > that might help. > > Jon > > Sent from my phone > > On May 7, 2014 8:40 PM, "Bernd Eckenfels" >> wrote: > > Am Wed, 7 May 2014 19:34:20 -0400 > schrieb Vitaly Davidovich >>: > > > The vm args are: > > > > -Xms16384m -Xmx16384m -Xmn16384m -XX:NewSize=12288m > > -XX:MaxNewSize=12288m -XX:SurvivorRatio=10 > > Hmm... you have confliciting arguments here, MaxNewSize overwrites Xmn. > You will get 16384-12288=4gb old size, thats quite low. As you can see > in your FullGC the steady state after FullGC has filled it nearly > completely. 
> > Gruss > Bernd > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > > > > > > > > > _________________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.__net > http://mail.openjdk.java.net/__mailman/listinfo/hotspot-gc-__use > From ysr1729 at gmail.com Fri May 9 20:39:25 2014 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Fri, 9 May 2014 13:39:25 -0700 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: <536D17F8.3000504@Oracle.COM> References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536C07FB.8030402@oracle.com> <536C0F05.8010905@Oracle.COM> <536D17F8.3000504@Oracle.COM> Message-ID: Hi Peter -- On Fri, May 9, 2014 at 11:01 AM, Peter B. Kessler < Peter.B.Kessler at oracle.com> wrote: > On 05/08/14 18:04, Vitaly Davidovich wrote: > >> Thanks Peter, I understand the mess that a promotion failure causes now. >> I'm interested in your opinion on Ramki's last point, which is to defer >> the full gc until the next scavenge (I.e. remember that you think you may >> have promotion failure on next scavenge, and then do a full gc right before >> that next scavenge). >> > > The algorithm used for a full collection is not well-suited for a heap in > which there's a lot of garbage. It involves (at least) two passes: an > object-graph-order marking pass to identify live objects, and then an > address-order pass that looks at every object and moves it if it is live > (for the compacting collectiors), or puts it on a free-list if it isn't > (for the non-moving collectors). In contrast, scavenging is a single > object-graph-order pass that examines only the live objects. That's why it > is such a win for edens where we expect the garbage ratio to be high. > > Time a young generation collection on a typical 10GB eden, and one on a > similarly-populated 10GB old generation. For science! > Thanks for that crucial reminder. Indeed in the case of the ParNew collector that Vitaly appears to be using (why? if you don't use CMS in the old gen?), doing a successful parallel scavenge versus a slow serial compaction that includes those passes serially over an Eden that is full of garbage is even more stark than in the case of the parallel old collector where at least portions of those passes are done multi-threaded. -- ramki > If we wait until the eden is full again, when we know the old generation > is also full, then we can't scavenge the young generation. Maybe that > wouldn't bother you because you are hoping there is no next collection. > You've chosen to use the throughput collector, where the focus is on > getting the collections done in the most efficient manner. 
Ramki is > suggesting the low-pause collector, where the focus is on doing most of the > collection work concurrent with application work. If there are cycles to > spare (CPU and memory) that might complete a full collection without > interfering with the application, so maybe Ramki is not as concerned about > cost of a failed promotion and full collection. One size does not fit all. > You haven't said why you made the choice you did. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vitalyd at gmail.com Fri May 9 20:58:11 2014 From: vitalyd at gmail.com (Vitaly Davidovich) Date: Fri, 9 May 2014 16:58:11 -0400 Subject: ParNew - how does it decide if Full GC is needed In-Reply-To: References: <20140508024024.000054a6.bernd-2014@eckenfels.net> <536BC13C.5020307@oracle.com> <536C07FB.8030402@oracle.com> <536C0F05.8010905@Oracle.COM> <536D17F8.3000504@Oracle.COM> Message-ID: We're using parallel scavenge, not ParNew. I mentioned ParNew a few times in this thread only because it doesn't attempt to do a full gc preemptively (based on what Jon said) like PS, and thought maybe it's worth a shot for our usecase. But it's PS that started this thread ... Thanks Sent from my phone On May 9, 2014 4:39 PM, "Srinivas Ramakrishna" wrote: > > Hi Peter -- > > On Fri, May 9, 2014 at 11:01 AM, Peter B. Kessler < > Peter.B.Kessler at oracle.com> wrote: > >> On 05/08/14 18:04, Vitaly Davidovich wrote: >> >>> Thanks Peter, I understand the mess that a promotion failure causes now. >>> I'm interested in your opinion on Ramki's last point, which is to defer >>> the full gc until the next scavenge (I.e. remember that you think you may >>> have promotion failure on next scavenge, and then do a full gc right before >>> that next scavenge). >>> >> >> The algorithm used for a full collection is not well-suited for a heap in >> which there's a lot of garbage. It involves (at least) two passes: an >> object-graph-order marking pass to identify live objects, and then an >> address-order pass that looks at every object and moves it if it is live >> (for the compacting collectiors), or puts it on a free-list if it isn't >> (for the non-moving collectors). In contrast, scavenging is a single >> object-graph-order pass that examines only the live objects. That's why it >> is such a win for edens where we expect the garbage ratio to be high. >> >> Time a young generation collection on a typical 10GB eden, and one on a >> similarly-populated 10GB old generation. For science! >> > > Thanks for that crucial reminder. Indeed in the case of the ParNew > collector that Vitaly appears to be using (why? if you don't use CMS in the > old gen?), doing a successful parallel scavenge versus a slow serial > compaction that includes those passes serially over an Eden that is full of > garbage is even more stark than in the case of the parallel old collector > where at least portions of those passes are done multi-threaded. > > -- ramki > > >> If we wait until the eden is full again, when we know the old generation >> is also full, then we can't scavenge the young generation. Maybe that >> wouldn't bother you because you are hoping there is no next collection. >> You've chosen to use the throughput collector, where the focus is on >> getting the collections done in the most efficient manner. Ramki is >> suggesting the low-pause collector, where the focus is on doing most of the >> collection work concurrent with application work. 
If there are cycles to >> spare (CPU and memory) that might complete a full collection without >> interfering with the application, so maybe Ramki is not as concerned about >> cost of a failed promotion and full collection. One size does not fit all. >> You haven't said why you made the choice you did. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mbruce at gmail.com Thu May 29 13:45:43 2014 From: mbruce at gmail.com (Matt Bruce) Date: Thu, 29 May 2014 09:45:43 -0400 Subject: strange ergonomics GC's Message-ID: Hi - I was wondering if anyone else has come across random Ergonomics GC?s when their VM is specifically set up to never do Ergonomics GCs. Here are our settings for VM: 1. Set our min max heap with -Xmx, -Xms 2. Set our GC to be -XX:+UseParallelGC 3. Set perm size via -XX:PermSize= 4. Set -XX:MaxPermSize 5. Set -XX:ReservedCodeCache= 6. Set -XX:InitialCodeCache= 7. Set -XX:NewSize= 8. Set -XX:MaxNewSize= (same value as new size) 9. Set -XX:SurvivorRatio We then turn off a bunch of ergonomics via: 1. -XX:-UseAdaptiveSizePolicy 2. -XX:-UseAdaptiveGCBoundary 3. -XX:-UseAdaptiveGCBoundary The hope is that with all that, we would never see a GC for "Ergonomics". However, every once in a while, when we have a TON of free memory still (think Gigabytes of Eden space free, and Gigabytes of old gen free), the VM will suddenly kick into a really long GC (Ergonomics). I want to stop that from happening, but I don't know what option I'm missing so that it won't do that. There are probably still one or two pieces of memory that the VM thinks it can toy with, such that it bothers to do an Ergonomics GC. Or potentially it?s a bug with the VM. Many Thanks. Matt Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.B.Kessler at Oracle.COM Thu May 29 20:47:39 2014 From: Peter.B.Kessler at Oracle.COM (Peter B. Kessler) Date: Thu, 29 May 2014 13:47:39 -0700 Subject: Minor GC difference Java 7 vs Java 8 In-Reply-To: References: Message-ID: <53879CEB.2070803@Oracle.COM> Are the -XX:+PrintGCDetails "[Times: user=0.01 sys=0.00, real=0.03 secs]" reports for the long pauses different from the short pauses? I'm hoping for some anomalous sys time, or user/real ratio, that would indicate it was something happening on the machine that is interfering with the collector. But you'd think that would show up as occasional 15ms blips in your message processing latency outside of when the collector goes off. Does -XX:+PrintHeapAtGC show anything anomalous about the space occupancy after the long pauses? E.g., more objects getting copied to the survivor space, or promoted to the old generation? You could infer the numbers from -XX:+PrintGCDetails output if you didn't want to deal with the volume produced by -XX:+PrintHeapAtGC. You don't say how large or how stable your old generation size is. If you have to get new pages from the OS to expand the old generation, or give pages back to the OS because the old generation can shrink, that's extra work. You can infer this traffic from -XX:+PrintHeapAtGC output by looking at the "committed" values for the generations. E.g., in "ParOldGen total 43008K, used 226K [0xba400000, 0xbce00000, 0xe4e00000)" those three hex numbers are the start address for the generation, the end of the committed memory for that generation, and the end of the reserved memory for that generation. There's a similar report for the young generation. 
Running with -Xms equal to -Xmx should prevent pages from being acquired from or returned to the OS during the run. Are you running with -XX:+AlwaysPreTouch? Even if you've reserved and committed the address space, the first time you touch new pages the OS wants to zero them, which takes time. That flag forces all the zeroing at initialization. If you know your page size, you should be able to see the generations (mostly the old generation) crossing a page boundary for the first time in the -XX:+PrintHeapAtGC output. Or it could be some change in the collector between JDK-6 and JDK-7. Posting some log snippets might let sharper eyes see something. ... peter On 04/30/14 07:58, Chris Hurst wrote: > Hi, > > Has anyone seen anything similar to this ... > > On java 6 (range of versions 32bit Solaris) application , using parallel old gc, non adaptive. Using a very heavy test performance load we see minor GC's around the 5ms mark and some very rare say 3 or 4 ish instances in 12 hours say 20ms pauses the number of pauses is random (though always few compared with the total number of GC's) and large ~20ms (this value appears the same for all such points.) We have a large number of minor GC's in our runs, only a full GC at startup. These freak GC's can be bunched or spread out and we can run for many hours without one (though doing minor GC's). > > What's odd is that if I use Java 7 (range of versions 32bit) the result is very close but the spikes (1 or 2 arguably less) are now 30-40ms (depends on run arguably even rarer). Has anyone experienced anything similar why would Java 7 up to double a minor GC / The GC throughput is approximately the same arguably 7 is better throughput just but that freak minor GC makes it usable due to latency. > > In terms of the change in spike height (20 (J6)vs40(J7)) this is very reproducible though the number of points and when they occur varies slightly. The over all GC graph , throughput is similar otherwise as is the resultant memory dump at the end. The test should be constant load, multiple clients just doing the same thing over and over. > > Has anyone seen anything similar, I was hoping someone might have seen a change in defaults, thread timeout, default data structure size change that would account for this. I was hoping the marked increase might be a give away to someone as its way off our average minor GC time. > > We have looked at gclogs, heap dumps, processor activity, background processes, amount of disc access, safepoints etc etc. , we trace message rate into out of the application for variation, compare heap dumps at end etc. nothing stands out so far. > > Chris > > > > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >