From yaoshengzhe at gmail.com Mon Feb 10 11:39:52 2014 From: yaoshengzhe at gmail.com (yao) Date: Mon, 10 Feb 2014 11:39:52 -0800 Subject: G1 GC heap size is not bounded ? Message-ID: Hi All, We've enabled G1 GC on our cluster for about 1 month and recently we observed the heap size keeps growing (via RES column in top), though very slowly. My question is, is there a way to bound heap size for G1 GC ? We set heap size to 82G *-Xms83868m -Xmx83868m -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC * We found RES column is about 100G, (a few days ago it was about 93G) *$ top* PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 5757 hbase 20 0 104g *100g* 5240 S 271.3 79.6 177771:41 java >From previous discussion, Thomas Schatzl pointed out this might be due to large RSet. From below lines in gc log, we found RSet size is about 10.5G. So we get Xmx + RSet = 82G + 10.5G = 92.5G, here you can see there are still unexplained 7.5G data occupied our off-heap. RSet log: * Concurrent RS processed -1601420092 cards Of 651507426 completed buffers: 634241940 ( 97.3%) by conc RS threads. 17265486 ( 2.7%) by mutator threads. Conc RS threads times(s) 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 \ 0.00 0.00 0.00 0.00 0.00 0.00 Total heap region rem set sizes = 10980692K. Max = 16182K. Static structures = 563K, free_lists = 78882K. 197990656 occupied cards represented. Max size region = 165:(O)[0x00007f0ce0000000,0x00007f0ce2000000,0x00007f0ce2000000], size = 16183K, occupied = 3474K. Did 0 coarsenings.* Thanks -Shengzhe -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140210/f5b2f1ca/attachment.html From yu.zhang at oracle.com Mon Feb 10 12:58:07 2014 From: yu.zhang at oracle.com (YU ZHANG) Date: Mon, 10 Feb 2014 12:58:07 -0800 Subject: G1 GC heap size is not bounded ? In-Reply-To: References: Message-ID: <52F93D5F.3040204@oracle.com> Shengzhe, Can you get native memory statistics? It should at least give some idea which part of native memory is growing. You need -XX:+UnlockDiagnosticVMOptions to print native memory data. Thanks, Jenny On 2/10/2014 11:39 AM, yao wrote: > Hi All, > > We've enabled G1 GC on our cluster for about 1 month and recently we > observed the heap size keeps growing (via RES column in top), though > very slowly. My question is, is there a way to bound heap size for G1 GC ? > > We set heap size to 82G > > /-Xms83868m -Xmx83868m -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC > / > We found RES column is about 100G, (a few days ago it was about 93G) > > *$ top* > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 5757 hbase 20 0 104g *100g* 5240 S 271.3 79.6 177771:41 java > > From previous discussion, Thomas Schatzl pointed out this might be due > to large RSet. From below lines in gc log, we found RSet size is about > 10.5G. So we get Xmx + RSet = 82G + 10.5G = 92.5G, here you can see > there are still unexplained 7.5G data occupied our off-heap. > > RSet log: > > / Concurrent RS processed -1601420092 cards > Of 651507426 completed buffers: > 634241940 ( 97.3%) by conc RS threads. > 17265486 ( 2.7%) by mutator threads. > Conc RS threads times(s) > 0.00 0.00 0.00 0.00 0.00 0.00 0.00 > 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 > 0.00 0.00 0.00 \ > 0.00 0.00 0.00 0.00 0.00 0.00 > Total heap region rem set sizes = 10980692K. Max = 16182K. > Static structures = 563K, free_lists = 78882K. > 197990656 occupied cards represented. 
> Max size region = > 165:(O)[0x00007f0ce0000000,0x00007f0ce2000000,0x00007f0ce2000000], > size = 16183K, occupied = 3474K. > Did 0 coarsenings./ > > Thanks > -Shengzhe > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140210/1a6cacee/attachment.html From yaoshengzhe at gmail.com Mon Feb 10 13:30:18 2014 From: yaoshengzhe at gmail.com (yao) Date: Mon, 10 Feb 2014 13:30:18 -0800 Subject: G1 GC heap size is not bounded ? In-Reply-To: <52F93D5F.3040204@oracle.com> References: <52F93D5F.3040204@oracle.com> Message-ID: Hi Jenny, I've already enabled this option, do you know which parts in gc logs show the native memory stats info ? Thanks Shengzhe On Mon, Feb 10, 2014 at 12:58 PM, YU ZHANG wrote: > Shengzhe, > > Can you get native memory statistics? It should at least give some idea > which part of native memory is growing. You need > > -XX:+UnlockDiagnosticVMOptions > > to print native memory data. > > Thanks, > Jenny > > On 2/10/2014 11:39 AM, yao wrote: > > Hi All, > > We've enabled G1 GC on our cluster for about 1 month and recently we > observed the heap size keeps growing (via RES column in top), though very > slowly. My question is, is there a way to bound heap size for G1 GC ? > > We set heap size to 82G > > > *-Xms83868m -Xmx83868m -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC * > We found RES column is about 100G, (a few days ago it was about 93G) > > *$ top* > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 5757 hbase 20 0 104g *100g* 5240 S 271.3 79.6 177771:41 java > > From previous discussion, Thomas Schatzl pointed out this might be due to > large RSet. From below lines in gc log, we found RSet size is about 10.5G. > So we get Xmx + RSet = 82G + 10.5G = 92.5G, here you can see there are > still unexplained 7.5G data occupied our off-heap. > > RSet log: > > > > > > > > > > > > > * Concurrent RS processed -1601420092 cards Of 651507426 completed > buffers: 634241940 ( 97.3%) by conc RS threads. 17265486 ( 2.7%) > by mutator threads. Conc RS threads times(s) 0.00 0.00 > 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 > 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 \ > 0.00 0.00 0.00 0.00 0.00 0.00 Total heap region rem > set sizes = 10980692K. Max = 16182K. Static structures = 563K, > free_lists = 78882K. 197990656 occupied cards represented. Max size > region = 165:(O)[0x00007f0ce0000000,0x00007f0ce2000000,0x00007f0ce2000000], > size = 16183K, occupied = 3474K. Did 0 coarsenings.* > > Thanks > -Shengzhe > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140210/f8e0011f/attachment.html From ysr1729 at gmail.com Mon Feb 10 13:36:21 2014 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Mon, 10 Feb 2014 13:36:21 -0800 Subject: G1 GC heap size is not bounded ? In-Reply-To: References: Message-ID: Hi Shengzhe -- What's the version of JDK where you're running into this issue, and has the JVM had any STW full gc's because of mixed gc's not keeping up? 
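As background for the native-memory question above: a rough estimate of G1's bookkeeping outside the Java heap (beyond the remembered sets already discussed) is the card table at about 1 byte per 512 bytes of heap (~160 MB for an 84 GB heap) plus two concurrent-marking bitmaps at about 1 bit per 8 bytes of heap each (~1.3 GB apiece, ~2.6 GB together); thread stacks, code cache, PermGen and ordinary malloc arenas come on top of that. These figures are estimates from G1's data-structure layout, not taken from this log. To get actual per-category numbers, Native Memory Tracking has to be enabled at JVM startup and is then queried with jcmd, as shown later in the thread on JDK 8. A minimal sketch, assuming a JVM that supports NMT and reusing the PID 5757 from the top output above:

   $ java -XX:NativeMemoryTracking=detail ...    # enable collection at startup
   $ jcmd 5757 VM.native_memory summary          # per-category breakdown (Java Heap, Class, Thread, GC, Internal, ...)
   $ jcmd 5757 VM.native_memory baseline         # record a baseline now ...
   $ jcmd 5757 VM.native_memory summary.diff     # ... and diff against it later to see which category grows

Where NMT is not available, diffing the OS view over time (pmap -x 5757, or /proc/5757/smaps on Linux) at least shows which mappings are growing.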
-- ramki On Mon, Feb 10, 2014 at 11:39 AM, yao wrote: > Hi All, > > We've enabled G1 GC on our cluster for about 1 month and recently we > observed the heap size keeps growing (via RES column in top), though very > slowly. My question is, is there a way to bound heap size for G1 GC ? > > We set heap size to 82G > > > *-Xms83868m -Xmx83868m -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC * > We found RES column is about 100G, (a few days ago it was about 93G) > > *$ top* > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 5757 hbase 20 0 104g *100g* 5240 S 271.3 79.6 177771:41 java > > From previous discussion, Thomas Schatzl pointed out this might be due to > large RSet. From below lines in gc log, we found RSet size is about 10.5G. > So we get Xmx + RSet = 82G + 10.5G = 92.5G, here you can see there are > still unexplained 7.5G data occupied our off-heap. > > RSet log: > > > > > > > > > > > > > * Concurrent RS processed -1601420092 cards Of 651507426 completed > buffers: 634241940 ( 97.3%) by conc RS threads. 17265486 ( 2.7%) > by mutator threads. Conc RS threads times(s) 0.00 0.00 > 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 > 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 > \ 0.00 0.00 0.00 0.00 0.00 0.00 Total heap region rem > set sizes = 10980692K. Max = 16182K. Static structures = 563K, free_lists > = 78882K. 197990656 occupied cards represented. Max size region = > 165:(O)[0x00007f0ce0000000,0x00007f0ce2000000,0x00007f0ce2000000], size = > 16183K, occupied = 3474K. Did 0 coarsenings.* > > Thanks > -Shengzhe > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140210/ecf44e01/attachment.html From bernd-2014 at eckenfels.net Mon Feb 10 13:47:07 2014 From: bernd-2014 at eckenfels.net (Bernd Eckenfels) Date: Mon, 10 Feb 2014 22:47:07 +0100 Subject: G1 GC heap size is not bounded ? In-Reply-To: References: <52F93D5F.3040204@oracle.com> Message-ID: <20140210224707.00004e56.bernd-2014@eckenfels.net> Hello, there is also an "jcmd VM.native_memory" for JDK8 (however it crashes with the latest build on my Win7 x64 - so I cannot check if the output is of any use) Gruss Bernd Am Mon, 10 Feb 2014 13:30:18 -0800 schrieb yao : > Hi Jenny, > > I've already enabled this option, do you know which parts in gc logs > show the native memory stats info ? > > Thanks > Shengzhe > > > On Mon, Feb 10, 2014 at 12:58 PM, YU ZHANG > wrote: > > > Shengzhe, > > > > Can you get native memory statistics? It should at least give some > > idea which part of native memory is growing. You need > > > > -XX:+UnlockDiagnosticVMOptions > > > > to print native memory data. > > > > Thanks, > > Jenny > > > > On 2/10/2014 11:39 AM, yao wrote: > > > > Hi All, > > > > We've enabled G1 GC on our cluster for about 1 month and recently > > we observed the heap size keeps growing (via RES column in top), > > though very slowly. My question is, is there a way to bound heap > > size for G1 GC ? 
> > > > We set heap size to 82G > > > > > > *-Xms83868m -Xmx83868m -XX:+UnlockExperimentalVMOptions > > -XX:+UseG1GC * We found RES column is about 100G, (a few days ago > > it was about 93G) > > > > *$ top* > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > > 5757 hbase 20 0 104g *100g* 5240 S 271.3 79.6 177771:41 java > > > > From previous discussion, Thomas Schatzl pointed out this might be > > due to large RSet. From below lines in gc log, we found RSet size > > is about 10.5G. So we get Xmx + RSet = 82G + 10.5G = 92.5G, here > > you can see there are still unexplained 7.5G data occupied our > > off-heap. > > > > RSet log: > > > > > > > > > > > > > > > > > > > > > > > > > > * Concurrent RS processed -1601420092 cards Of 651507426 completed > > buffers: 634241940 ( 97.3%) by conc RS threads. 17265486 > > ( 2.7%) by mutator threads. Conc RS threads times(s) > > 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 > > 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 > > 0.00 0.00 \ 0.00 0.00 0.00 0.00 0.00 > > 0.00 Total heap region rem set sizes = 10980692K. Max = > > 16182K. Static structures = 563K, free_lists = 78882K. > > 197990656 occupied cards represented. Max size region = > > 165:(O)[0x00007f0ce0000000,0x00007f0ce2000000,0x00007f0ce2000000], > > size = 16183K, occupied = 3474K. Did 0 coarsenings.* > > > > Thanks > > -Shengzhe > > > > > > _______________________________________________ > > hotspot-gc-use mailing > > listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > > > From bernd-2014 at eckenfels.net Mon Feb 10 14:00:37 2014 From: bernd-2014 at eckenfels.net (Bernd Eckenfels) Date: Mon, 10 Feb 2014 23:00:37 +0100 Subject: G1 GC heap size is not bounded ? In-Reply-To: <20140210224707.00004e56.bernd-2014@eckenfels.net> References: <52F93D5F.3040204@oracle.com> <20140210224707.00004e56.bernd-2014@eckenfels.net> Message-ID: <20140210230037.0000653d.bernd-2014@eckenfels.net> Hello, I actually found the reason for my crashes (old JVM wath in path). If I remove it from the path the following works (I use jconsole as a sample Java application to monitor, if you use java.exe -jar instead the -J has to be removed: C:\>"c:\Program Files\Java\jdk1.8.0\bin\jconsole" -J-XX:NativeMemoryTracking=detail C:\>"c:\Program Files\Java\jdk1.8.0\bin\jcmd" -l 5408 sun.tools.jcmd.JCmd -l 5732 sun.tools.jconsole.JConsole C:\>"c:\Program Files\Java\jdk1.8.0\bin\jcmd" 5732 VM.native_memory 5732: Native Memory Tracking: Total: reserved=3538105KB, committed=175085KB - Java Heap (reserved=2080768KB, committed=17920KB) (mmap: reserved=2080768KB, committed=17920KB) - Class (reserved=1070434KB, committed=23266KB) (classes #2260) (malloc=9570KB, #1230) (mmap: reserved=1060864KB, committed=13696KB) etc. From yaoshengzhe at gmail.com Mon Feb 10 14:06:09 2014 From: yaoshengzhe at gmail.com (yao) Date: Mon, 10 Feb 2014 14:06:09 -0800 Subject: G1 GC heap size is not bounded ? In-Reply-To: References: Message-ID: Hi Ramki, JDK version is 1.7.0_40 -Shengzhe On Mon, Feb 10, 2014 at 1:36 PM, Srinivas Ramakrishna wrote: > Hi Shengzhe -- > > What's the version of JDK where you're running into this issue, and has > the JVM had any STW full gc's because of mixed gc's not keeping up? > > -- ramki > > > On Mon, Feb 10, 2014 at 11:39 AM, yao wrote: > >> Hi All, >> >> We've enabled G1 GC on our cluster for about 1 month and recently we >> observed the heap size keeps growing (via RES column in top), though very >> slowly. My question is, is there a way to bound heap size for G1 GC ? 
>> >> We set heap size to 82G >> >> >> *-Xms83868m -Xmx83868m -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC * >> We found RES column is about 100G, (a few days ago it was about 93G) >> >> *$ top* >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND >> 5757 hbase 20 0 104g *100g* 5240 S 271.3 79.6 177771:41 java >> >> From previous discussion, Thomas Schatzl pointed out this might be due to >> large RSet. From below lines in gc log, we found RSet size is about 10.5G. >> So we get Xmx + RSet = 82G + 10.5G = 92.5G, here you can see there are >> still unexplained 7.5G data occupied our off-heap. >> >> RSet log: >> >> >> >> >> >> >> >> >> >> >> >> >> * Concurrent RS processed -1601420092 cards Of 651507426 completed >> buffers: 634241940 ( 97.3%) by conc RS threads. 17265486 ( 2.7%) >> by mutator threads. Conc RS threads times(s) 0.00 0.00 >> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >> \ 0.00 0.00 0.00 0.00 0.00 0.00 Total heap region rem >> set sizes = 10980692K. Max = 16182K. Static structures = 563K, free_lists >> = 78882K. 197990656 occupied cards represented. Max size region = >> 165:(O)[0x00007f0ce0000000,0x00007f0ce2000000,0x00007f0ce2000000], size = >> 16183K, occupied = 3474K. Did 0 coarsenings.* >> >> Thanks >> -Shengzhe >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140210/a7c7fb26/attachment.html From yaoshengzhe at gmail.com Tue Feb 18 11:41:13 2014 From: yaoshengzhe at gmail.com (yao) Date: Tue, 18 Feb 2014 11:41:13 -0800 Subject: G1 GC heap size is not bounded ? In-Reply-To: References: Message-ID: Hi All, We've tracked our system memory usage for a week and found it seems stop increasing when reaching 100G. Besides RSet, it looks like G1 also allocates other data structure in off-heap manner. I have a follow up question, could we restrict the application memory usage equal to what we set in -Xmx when G1 is enabled ? Assume application itself doesn't allocate data in off-heap manner, that will make application memory usage more predictable. Thanks Shengzhe On Mon, Feb 10, 2014 at 2:06 PM, yao wrote: > Hi Ramki, > > JDK version is 1.7.0_40 > > -Shengzhe > > > On Mon, Feb 10, 2014 at 1:36 PM, Srinivas Ramakrishna wrote: > >> Hi Shengzhe -- >> >> What's the version of JDK where you're running into this issue, and has >> the JVM had any STW full gc's because of mixed gc's not keeping up? >> >> -- ramki >> >> >> On Mon, Feb 10, 2014 at 11:39 AM, yao wrote: >> >>> Hi All, >>> >>> We've enabled G1 GC on our cluster for about 1 month and recently we >>> observed the heap size keeps growing (via RES column in top), though very >>> slowly. My question is, is there a way to bound heap size for G1 GC ? >>> >>> We set heap size to 82G >>> >>> >>> *-Xms83868m -Xmx83868m -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC * >>> We found RES column is about 100G, (a few days ago it was about 93G) >>> >>> *$ top* >>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND >>> 5757 hbase 20 0 104g *100g* 5240 S 271.3 79.6 177771:41 java >>> >>> From previous discussion, Thomas Schatzl pointed out this might be due >>> to large RSet. From below lines in gc log, we found RSet size is about >>> 10.5G. 
So we get Xmx + RSet = 82G + 10.5G = 92.5G, here you can see there >>> are still unexplained 7.5G data occupied our off-heap. >>> >>> RSet log: >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> * Concurrent RS processed -1601420092 cards Of 651507426 completed >>> buffers: 634241940 ( 97.3%) by conc RS threads. 17265486 ( 2.7%) >>> by mutator threads. Conc RS threads times(s) 0.00 0.00 >>> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >>> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >>> \ 0.00 0.00 0.00 0.00 0.00 0.00 Total heap region rem >>> set sizes = 10980692K. Max = 16182K. Static structures = 563K, free_lists >>> = 78882K. 197990656 occupied cards represented. Max size region = >>> 165:(O)[0x00007f0ce0000000,0x00007f0ce2000000,0x00007f0ce2000000], size = >>> 16183K, occupied = 3474K. Did 0 coarsenings.* >>> >>> Thanks >>> -Shengzhe >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140218/1c4e5898/attachment.html From yu.zhang at oracle.com Tue Feb 18 22:45:23 2014 From: yu.zhang at oracle.com (YU ZHANG) Date: Tue, 18 Feb 2014 22:45:23 -0800 Subject: G1 GC heap size is not bounded ? In-Reply-To: References: Message-ID: <53045303.4010109@oracle.com> Shengzhe, All gcs have some memory overhead. G1 might have more compared to other gcs. Can you share your data about what have increased? Thanks, Jenny On 2/18/2014 11:41 AM, yao wrote: > Hi All, > > We've tracked our system memory usage for a week and found it seems > stop increasing when reaching 100G. Besides RSet, it looks like G1 > also allocates other data structure in off-heap manner. I have a > follow up question, could we restrict the application memory usage > equal to what we set in -Xmx when G1 is enabled ? Assume application > itself doesn't allocate data in off-heap manner, that will make > application memory usage more predictable. > > Thanks > Shengzhe > > > On Mon, Feb 10, 2014 at 2:06 PM, yao > wrote: > > Hi Ramki, > > JDK version is 1.7.0_40 > > -Shengzhe > > > On Mon, Feb 10, 2014 at 1:36 PM, Srinivas Ramakrishna > > wrote: > > Hi Shengzhe -- > > What's the version of JDK where you're running into this > issue, and has the JVM had any STW full gc's because of mixed > gc's not keeping up? > > -- ramki > > > On Mon, Feb 10, 2014 at 11:39 AM, yao > wrote: > > Hi All, > > We've enabled G1 GC on our cluster for about 1 month and > recently we observed the heap size keeps growing (via RES > column in top), though very slowly. My question is, is > there a way to bound heap size for G1 GC ? > > We set heap size to 82G > > /-Xms83868m -Xmx83868m -XX:+UnlockExperimentalVMOptions > -XX:+UseG1GC > / > We found RES column is about 100G, (a few days ago it was > about 93G) > > *$ top* > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ > COMMAND > 5757 hbase 20 0 104g *100g* 5240 S 271.3 79.6 > 177771:41 java > > From previous discussion, Thomas Schatzl pointed out this > might be due to large RSet. From below lines in gc log, we > found RSet size is about 10.5G. So we get Xmx + RSet = 82G > + 10.5G = 92.5G, here you can see there are still > unexplained 7.5G data occupied our off-heap. > > RSet log: > > / Concurrent RS processed -1601420092 cards > Of 651507426 completed buffers: > 634241940 ( 97.3%) by conc RS threads. 
> 17265486 ( 2.7%) by mutator threads. > Conc RS threads times(s) > 0.00 0.00 0.00 0.00 0.00 0.00 > 0.00 0.00 0.00 0.00 0.00 0.00 0.00 > 0.00 0.00 0.00 0.00 0.00 \ > 0.00 0.00 0.00 0.00 0.00 0.00 > Total heap region rem set sizes = 10980692K. Max = 16182K. > Static structures = 563K, free_lists = 78882K. > 197990656 occupied cards represented. > Max size region = > 165:(O)[0x00007f0ce0000000,0x00007f0ce2000000,0x00007f0ce2000000], > size = 16183K, occupied = 3474K. > Did 0 coarsenings./ > > Thanks > -Shengzhe > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140218/52b97c52/attachment.html From yaoshengzhe at gmail.com Wed Feb 19 13:04:37 2014 From: yaoshengzhe at gmail.com (yao) Date: Wed, 19 Feb 2014 13:04:37 -0800 Subject: G1 GC heap size is not bounded ? In-Reply-To: <53045303.4010109@oracle.com> References: <53045303.4010109@oracle.com> Message-ID: > > All gcs have some memory overhead. G1 might have more compared to other > gcs. > > Can you share your data about what have increased? In our case, we give 82G to JVM (via -Xmx) and the overall memory usage is around 100G with G1. In contrast, the overall memory usage with CMS in similar case is close to the number set via Xmx. On Tue, Feb 18, 2014 at 10:45 PM, YU ZHANG wrote: > Shengzhe, > > All gcs have some memory overhead. G1 might have more compared to other > gcs. > > Can you share your data about what have increased? > > Thanks, > Jenny > > On 2/18/2014 11:41 AM, yao wrote: > > Hi All, > > We've tracked our system memory usage for a week and found it seems stop > increasing when reaching 100G. Besides RSet, it looks like G1 also > allocates other data structure in off-heap manner. I have a follow up > question, could we restrict the application memory usage equal to what we > set in -Xmx when G1 is enabled ? Assume application itself doesn't allocate > data in off-heap manner, that will make application memory usage more > predictable. > > Thanks > Shengzhe > > > On Mon, Feb 10, 2014 at 2:06 PM, yao wrote: > >> Hi Ramki, >> >> JDK version is 1.7.0_40 >> >> -Shengzhe >> >> >> On Mon, Feb 10, 2014 at 1:36 PM, Srinivas Ramakrishna wrote: >> >>> Hi Shengzhe -- >>> >>> What's the version of JDK where you're running into this issue, and >>> has the JVM had any STW full gc's because of mixed gc's not keeping up? >>> >>> -- ramki >>> >>> >>> On Mon, Feb 10, 2014 at 11:39 AM, yao wrote: >>> >>>> Hi All, >>>> >>>> We've enabled G1 GC on our cluster for about 1 month and recently we >>>> observed the heap size keeps growing (via RES column in top), though very >>>> slowly. My question is, is there a way to bound heap size for G1 GC ? >>>> >>>> We set heap size to 82G >>>> >>>> >>>> *-Xms83868m -Xmx83868m -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC * >>>> We found RES column is about 100G, (a few days ago it was about 93G) >>>> >>>> *$ top* >>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND >>>> 5757 hbase 20 0 104g *100g* 5240 S 271.3 79.6 177771:41 java >>>> >>>> From previous discussion, Thomas Schatzl pointed out this might be due >>>> to large RSet. 
From below lines in gc log, we found RSet size is about >>>> 10.5G. So we get Xmx + RSet = 82G + 10.5G = 92.5G, here you can see there >>>> are still unexplained 7.5G data occupied our off-heap. >>>> >>>> RSet log: >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> * Concurrent RS processed -1601420092 cards Of 651507426 completed >>>> buffers: 634241940 ( 97.3%) by conc RS threads. 17265486 ( 2.7%) >>>> by mutator threads. Conc RS threads times(s) 0.00 0.00 >>>> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 >>>> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 \ >>>> 0.00 0.00 0.00 0.00 0.00 0.00 Total heap region rem >>>> set sizes = 10980692K. Max = 16182K. Static structures = 563K, >>>> free_lists = 78882K. 197990656 occupied cards represented. Max size >>>> region = 165:(O)[0x00007f0ce0000000,0x00007f0ce2000000,0x00007f0ce2000000], >>>> size = 16183K, occupied = 3474K. Did 0 coarsenings.* >>>> >>>> Thanks >>>> -Shengzhe >>>> >>>> _______________________________________________ >>>> hotspot-gc-use mailing list >>>> hotspot-gc-use at openjdk.java.net >>>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>>> >>>> >>> >> > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140219/fb5a7fc5/attachment.html From kirtiteja at gmail.com Thu Feb 20 01:24:33 2014 From: kirtiteja at gmail.com (Kirti Teja Rao) Date: Thu, 20 Feb 2014 01:24:33 -0800 Subject: G1 GC - pauses much larger than target Message-ID: Hi, I am trying out G1 collector for our application. Our application runs with 2GB heap and we expect relatively low latency. The pause time target is set to 25ms. There are much bigger pauses (and unexplained) in order of few 100s of ms. This is not a rare occurence and i can see this 15-20 times in 6-7 hours runs. We use deterministic GC in jrockit for 1.6 and want to upgrade to 1.7 or even 1.8 after the next months release. Explaining and tuning these unexplained large pauses is critical for us to upgrade. Can anyone please help in identifying where this time is spent or how to bring it down? Below is the log for one such occurrence and also the JVM parameters for this run - My observations - 1) real time is much larger than the user time. This server has 2 processors with 8 cores each and hyper-threading. So, for most of time the progress is blocked. 2) Start time is 14840.246, end time for worker is 14840270.2 and end time for pause is 14840.764. So, the time is spent after the parallel phase is completed and before the pause finishes. I can add more logs if required. I can also run it in same env with different parameters if there are suggestions. 
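To make observation 2 concrete, the timestamps in the log below line up as follows (GC Worker End values are reported in milliseconds, the other timestamps in seconds):

   pause start                    ~ 14840.247 s
   last GC worker finished        ~ 14840.270 s   (GC Worker End max: 14840270.2 ms)
   GC-reported pause                0.0248 s      (parallel time 23.2 ms)
   total stopped time               0.5178 s  =>  threads resume ~ 14840.764 s
   gap after the parallel phase   ~ 14840.764 - 14840.270 ~ 0.49 s

So roughly half a second of the safepoint elapses after the parallel GC work has already completed.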
2014-02-20T02:15:42.580+0000: 14840.246: Application time: 8.5619840 seconds 2014-02-20T02:15:42.581+0000: 14840.247: [GC pause (young) Desired survivor size 83886080 bytes, new threshold 15 (max 15) - age 1: 2511184 bytes, 2511184 total - age 2: 1672024 bytes, 4183208 total - age 3: 1733824 bytes, 5917032 total - age 4: 1663920 bytes, 7580952 total - age 5: 1719944 bytes, 9300896 total - age 6: 1641904 bytes, 10942800 total - age 7: 1796976 bytes, 12739776 total - age 8: 1706344 bytes, 14446120 total - age 9: 1722920 bytes, 16169040 total - age 10: 1729176 bytes, 17898216 total - age 11: 1500056 bytes, 19398272 total - age 12: 1486520 bytes, 20884792 total - age 13: 1618272 bytes, 22503064 total - age 14: 1492840 bytes, 23995904 total - age 15: 1486920 bytes, 25482824 total 14840.247: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 12196, predicted base time: 7.85 ms, remaining time: 17.15 ms, target pause time: 25.00 ms] 14840.247: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 146 regions, survivors: 7 regions, predicted young region time: 8.76 ms] 14840.247: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 146 regions, survivors: 7 regions, old: 0 regions, predicted pause time: 16.60 ms, target pause time: 25.00 ms] , 0.0247660 secs] [Parallel Time: 23.2 ms, GC Workers: 9] [GC Worker Start (ms): Min: 14840247.4, Avg: 14840247.6, Max: 14840247.8, Diff: 0.4] [Ext Root Scanning (ms): Min: 3.5, Avg: 4.1, Max: 5.4, Diff: 1.9, Sum: 37.2] [Update RS (ms): Min: 1.1, Avg: 2.2, Max: 2.8, Diff: 1.7, Sum: 19.8] [Processed Buffers: Min: 5, Avg: 9.4, Max: 15, Diff: 10, Sum: 85] [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.7] [Object Copy (ms): Min: 15.8, Avg: 16.0, Max: 16.2, Diff: 0.4, Sum: 144.2] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.8] [GC Worker Total (ms): Min: 22.3, Avg: 22.5, Max: 22.7, Diff: 0.4, Sum: 202.7] [GC Worker End (ms): Min: 14840270.1, Avg: 14840270.1, Max: 14840270.2, Diff: 0.2] [Code Root Fixup: 0.0 ms] [Clear CT: 0.4 ms] [Other: 1.1 ms] [Choose CSet: 0.1 ms] [Ref Proc: 0.4 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.3 ms] [Eden: 1168.0M(1168.0M)->0.0B(1160.0M) Survivors: 56.0M->64.0M Heap: 1718.4M(2048.0M)->563.4M(2048.0M)] [Times: user=0.21 sys=0.00, real=0.52 secs] 2014-02-20T02:15:43.098+0000: 14840.764: Total time for which application threads were stopped: 0.5178390 seconds JVM parameters - -server -Xmx2g -Xms2g -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseLargePages -XX:LargePageSizeInBytes=2m -XX:+UseG1GC -XX:MaxGCPauseMillis=25 -XX:ParallelGCThreads=9 -XX:ConcGCThreads=4 -XX:G1HeapRegionSize=8M -XX:+PrintTLAB -XX:+AggressiveOpts -XX:+PrintFlagsFinal -Xloggc:/integral/logs/gc.log -verbose:gc -XX:+PrintTenuringDistribution -XX:+PrintGCDateStamps -XX:+PrintAdaptiveSizePolicy -XX:+PrintGCDetails -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=3026 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140220/202afb4a/attachment.html From ysr1729 at gmail.com Thu Feb 20 02:04:36 2014 From: ysr1729 at gmail.com (Srinivas Ramakrishna) Date: Thu, 20 Feb 2014 02:04:36 -0800 Subject: G1 GC - pauses much larger than target In-Reply-To: References: Message-ID: Probably some post-GC clean up?... nmethod sweep, monitor list cleanup, and other housekeeping. There's a trace flag that displays this in more detail: bool TraceSafepointCleanupTime = false {product} Additionally, PrintSafepointStatistics might shed light . Since you have it already enabled, you could probably look at the data for this particular pause. -- ramki On Thu, Feb 20, 2014 at 1:24 AM, Kirti Teja Rao wrote: > Hi, > > I am trying out G1 collector for our application. Our application runs > with 2GB heap and we expect relatively low latency. The pause time target > is set to 25ms. There are much bigger pauses (and unexplained) in order of > few 100s of ms. This is not a rare occurence and i can see this 15-20 times > in 6-7 hours runs. We use deterministic GC in jrockit for 1.6 and want to > upgrade to 1.7 or even 1.8 after the next months release. Explaining and > tuning these unexplained large pauses is critical for us to upgrade. > Can anyone please help in identifying where this time is spent or how to > bring it down? > > Below is the log for one such occurrence and also the JVM parameters for > this run - > > My observations - > 1) real time is much larger than the user time. This server has 2 > processors with 8 cores each and hyper-threading. So, for most of time the > progress is blocked. > 2) Start time is 14840.246, end time for worker is 14840270.2 and end time > for pause is 14840.764. So, the time is spent after the parallel phase is > completed and before the pause finishes. > > I can add more logs if required. I can also run it in same env with > different parameters if there are suggestions. 
> > 2014-02-20T02:15:42.580+0000: 14840.246: Application time: 8.5619840 > seconds > 2014-02-20T02:15:42.581+0000: 14840.247: [GC pause (young) > Desired survivor size 83886080 bytes, new threshold 15 (max 15) > - age 1: 2511184 bytes, 2511184 total > - age 2: 1672024 bytes, 4183208 total > - age 3: 1733824 bytes, 5917032 total > - age 4: 1663920 bytes, 7580952 total > - age 5: 1719944 bytes, 9300896 total > - age 6: 1641904 bytes, 10942800 total > - age 7: 1796976 bytes, 12739776 total > - age 8: 1706344 bytes, 14446120 total > - age 9: 1722920 bytes, 16169040 total > - age 10: 1729176 bytes, 17898216 total > - age 11: 1500056 bytes, 19398272 total > - age 12: 1486520 bytes, 20884792 total > - age 13: 1618272 bytes, 22503064 total > - age 14: 1492840 bytes, 23995904 total > - age 15: 1486920 bytes, 25482824 total > 14840.247: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 12196, predicted base time: 7.85 ms, remaining time: 17.15 > ms, target pause time: 25.00 ms] > 14840.247: [G1Ergonomics (CSet Construction) add young regions to CSet, > eden: 146 regions, survivors: 7 regions, predicted young region time: 8.76 > ms] > 14840.247: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: > 146 regions, survivors: 7 regions, old: 0 regions, predicted pause time: > 16.60 ms, target pause time: 25.00 ms] > , 0.0247660 secs] > [Parallel Time: 23.2 ms, GC Workers: 9] > [GC Worker Start (ms): Min: 14840247.4, Avg: 14840247.6, Max: > 14840247.8, Diff: 0.4] > [Ext Root Scanning (ms): Min: 3.5, Avg: 4.1, Max: 5.4, Diff: 1.9, > Sum: 37.2] > [Update RS (ms): Min: 1.1, Avg: 2.2, Max: 2.8, Diff: 1.7, Sum: 19.8] > [Processed Buffers: Min: 5, Avg: 9.4, Max: 15, Diff: 10, Sum: 85] > [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.7] > [Object Copy (ms): Min: 15.8, Avg: 16.0, Max: 16.2, Diff: 0.4, Sum: > 144.2] > [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: > 0.8] > [GC Worker Total (ms): Min: 22.3, Avg: 22.5, Max: 22.7, Diff: 0.4, > Sum: 202.7] > [GC Worker End (ms): Min: 14840270.1, Avg: 14840270.1, Max: > 14840270.2, Diff: 0.2] > [Code Root Fixup: 0.0 ms] > [Clear CT: 0.4 ms] > [Other: 1.1 ms] > [Choose CSet: 0.1 ms] > [Ref Proc: 0.4 ms] > [Ref Enq: 0.0 ms] > [Free CSet: 0.3 ms] > [Eden: 1168.0M(1168.0M)->0.0B(1160.0M) Survivors: 56.0M->64.0M Heap: > 1718.4M(2048.0M)->563.4M(2048.0M)] > [Times: user=0.21 sys=0.00, real=0.52 secs] > 2014-02-20T02:15:43.098+0000: 14840.764: Total time for which application > threads were stopped: 0.5178390 seconds > > JVM parameters - > > -server -Xmx2g -Xms2g -XX:PermSize=128m -XX:MaxPermSize=128m > -XX:+UseLargePages -XX:LargePageSizeInBytes=2m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:ParallelGCThreads=9 -XX:ConcGCThreads=4 > -XX:G1HeapRegionSize=8M -XX:+PrintTLAB -XX:+AggressiveOpts > -XX:+PrintFlagsFinal -Xloggc:/integral/logs/gc.log -verbose:gc > -XX:+PrintTenuringDistribution -XX:+PrintGCDateStamps > -XX:+PrintAdaptiveSizePolicy -XX:+PrintGCDetails > -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime > -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 > -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=3026 > -Dcom.sun.management.jmxremote.local.only=false > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > 
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140220/d3050ce2/attachment.html From kirtiteja at gmail.com Thu Feb 20 12:12:47 2014 From: kirtiteja at gmail.com (Kirti Teja Rao) Date: Thu, 20 Feb 2014 12:12:47 -0800 Subject: G1 GC - pauses much larger than target In-Reply-To: References: Message-ID: Hi, I re-ran the test with TraceSafepointCleanupTime enabled. I did not find anything out of ordinary. Safepoint cleanup is showing only sub-milliseconds. Below are the logs for one such occurrence. I can also run the app with -XX:+PrintHeapAtGC -XX:+PrintHeapAtGCExtended if it helps. I did ran it once earlier could not find anything out of ordinary. safepoint trace - 1711.616: [deflating idle monitors, 0.0000510 secs] 1711.616: [updating inline caches, 0.0000010 secs] 1711.616: [compilation policy safepoint handler, 0.0001860 secs] 1711.616: [sweeping nmethods, 0.0000040 secs] vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count 1711.616: G1IncCollectionPause [ 94 0 0 ] [ 0 0 0 0 235 ] 0 gc log - 2014-02-20T19:49:45.588+0000: 1711.616: Application time: 4.5889130 seconds 2014-02-20T19:49:45.588+0000: 1711.616: [GC pause (young) Desired survivor size 83886080 bytes, new threshold 15 (max 15) - age 1: 1428296 bytes, 1428296 total - age 2: 918104 bytes, 2346400 total - age 3: 1126320 bytes, 3472720 total - age 4: 838696 bytes, 4311416 total - age 5: 975512 bytes, 5286928 total - age 6: 813872 bytes, 6100800 total - age 7: 975504 bytes, 7076304 total - age 8: 801600 bytes, 7877904 total - age 9: 966256 bytes, 8844160 total - age 10: 801536 bytes, 9645696 total - age 11: 964048 bytes, 10609744 total - age 12: 859568 bytes, 11469312 total - age 13: 931344 bytes, 12400656 total - age 14: 921024 bytes, 13321680 total - age 15: 891616 bytes, 14213296 total 1711.616: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 9059, predicted base time: 6.60 ms, remaining time: 18.40 ms, target pause time: 25.00 ms] 1711.616: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 150 regions, survivors: 3 regions, predicted young region time: 5.94 ms] 1711.616: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 150 regions, survivors: 3 regions, old: 0 regions, predicted pause time: 12.54 ms, target pause time: 25.00 ms] , 0.0180830 secs] [Parallel Time: 17.0 ms, GC Workers: 9] [GC Worker Start (ms): Min: 1711616.5, Avg: 1711616.6, Max: 1711616.8, Diff: 0.3] [Ext Root Scanning (ms): Min: 3.5, Avg: 4.8, Max: 6.0, Diff: 2.4, Sum: 43.1] [Update RS (ms): Min: 0.7, Avg: 1.7, Max: 2.7, Diff: 2.0, Sum: 15.2] [Processed Buffers: Min: 4, Avg: 8.1, Max: 17, Diff: 13, Sum: 73] [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 0.8] [Object Copy (ms): Min: 9.7, Avg: 9.8, Max: 10.0, Diff: 0.3, Sum: 88.1] [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.3, Sum: 1.1] [GC Worker Total (ms): Min: 16.3, Avg: 16.5, Max: 16.6, Diff: 0.4, Sum: 148.4] [GC Worker End (ms): Min: 1711633.0, Avg: 1711633.1, Max: 1711633.3, Diff: 0.3] [Code Root Fixup: 0.0 ms] [Clear CT: 0.4 ms] [Other: 0.8 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.3 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.2 ms] [Eden: 1200.0M(1200.0M)->0.0B(1200.0M) Survivors: 24.0M->24.0M Heap: 1566.3M(2048.0M)->370.6M(2048.0M)] [Times: user=0.15 
sys=0.00, real=0.24 secs] 2014-02-20T19:49:45.824+0000: 1711.852: Total time for which application threads were stopped: 0.2361710 seconds On Thu, Feb 20, 2014 at 2:04 AM, Srinivas Ramakrishna wrote: > Probably some post-GC clean up?... nmethod sweep, monitor list cleanup, > and other housekeeping. There's a trace flag that displays this in more > detail: > > bool TraceSafepointCleanupTime = false > {product} > > Additionally, PrintSafepointStatistics might shed light . Since you have > it already enabled, you could probably look at the data for this particular > pause. > -- ramki > > > On Thu, Feb 20, 2014 at 1:24 AM, Kirti Teja Rao wrote: > >> Hi, >> >> I am trying out G1 collector for our application. Our application runs >> with 2GB heap and we expect relatively low latency. The pause time target >> is set to 25ms. There are much bigger pauses (and unexplained) in order of >> few 100s of ms. This is not a rare occurence and i can see this 15-20 times >> in 6-7 hours runs. We use deterministic GC in jrockit for 1.6 and want to >> upgrade to 1.7 or even 1.8 after the next months release. Explaining and >> tuning these unexplained large pauses is critical for us to upgrade. >> Can anyone please help in identifying where this time is spent or how to >> bring it down? >> >> Below is the log for one such occurrence and also the JVM parameters for >> this run - >> >> My observations - >> 1) real time is much larger than the user time. This server has 2 >> processors with 8 cores each and hyper-threading. So, for most of time the >> progress is blocked. >> 2) Start time is 14840.246, end time for worker is 14840270.2 and end >> time for pause is 14840.764. So, the time is spent after the parallel phase >> is completed and before the pause finishes. >> >> I can add more logs if required. I can also run it in same env with >> different parameters if there are suggestions. 
>> >> 2014-02-20T02:15:42.580+0000: 14840.246: Application time: 8.5619840 >> seconds >> 2014-02-20T02:15:42.581+0000: 14840.247: [GC pause (young) >> Desired survivor size 83886080 bytes, new threshold 15 (max 15) >> - age 1: 2511184 bytes, 2511184 total >> - age 2: 1672024 bytes, 4183208 total >> - age 3: 1733824 bytes, 5917032 total >> - age 4: 1663920 bytes, 7580952 total >> - age 5: 1719944 bytes, 9300896 total >> - age 6: 1641904 bytes, 10942800 total >> - age 7: 1796976 bytes, 12739776 total >> - age 8: 1706344 bytes, 14446120 total >> - age 9: 1722920 bytes, 16169040 total >> - age 10: 1729176 bytes, 17898216 total >> - age 11: 1500056 bytes, 19398272 total >> - age 12: 1486520 bytes, 20884792 total >> - age 13: 1618272 bytes, 22503064 total >> - age 14: 1492840 bytes, 23995904 total >> - age 15: 1486920 bytes, 25482824 total >> 14840.247: [G1Ergonomics (CSet Construction) start choosing CSet, >> _pending_cards: 12196, predicted base time: 7.85 ms, remaining time: 17.15 >> ms, target pause time: 25.00 ms] >> 14840.247: [G1Ergonomics (CSet Construction) add young regions to CSet, >> eden: 146 regions, survivors: 7 regions, predicted young region time: 8.76 >> ms] >> 14840.247: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: >> 146 regions, survivors: 7 regions, old: 0 regions, predicted pause time: >> 16.60 ms, target pause time: 25.00 ms] >> , 0.0247660 secs] >> [Parallel Time: 23.2 ms, GC Workers: 9] >> [GC Worker Start (ms): Min: 14840247.4, Avg: 14840247.6, Max: >> 14840247.8, Diff: 0.4] >> [Ext Root Scanning (ms): Min: 3.5, Avg: 4.1, Max: 5.4, Diff: 1.9, >> Sum: 37.2] >> [Update RS (ms): Min: 1.1, Avg: 2.2, Max: 2.8, Diff: 1.7, Sum: 19.8] >> [Processed Buffers: Min: 5, Avg: 9.4, Max: 15, Diff: 10, Sum: 85] >> [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.7] >> [Object Copy (ms): Min: 15.8, Avg: 16.0, Max: 16.2, Diff: 0.4, Sum: >> 144.2] >> [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: >> 0.0] >> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, >> Sum: 0.8] >> [GC Worker Total (ms): Min: 22.3, Avg: 22.5, Max: 22.7, Diff: 0.4, >> Sum: 202.7] >> [GC Worker End (ms): Min: 14840270.1, Avg: 14840270.1, Max: >> 14840270.2, Diff: 0.2] >> [Code Root Fixup: 0.0 ms] >> [Clear CT: 0.4 ms] >> [Other: 1.1 ms] >> [Choose CSet: 0.1 ms] >> [Ref Proc: 0.4 ms] >> [Ref Enq: 0.0 ms] >> [Free CSet: 0.3 ms] >> [Eden: 1168.0M(1168.0M)->0.0B(1160.0M) Survivors: 56.0M->64.0M Heap: >> 1718.4M(2048.0M)->563.4M(2048.0M)] >> [Times: user=0.21 sys=0.00, real=0.52 secs] >> 2014-02-20T02:15:43.098+0000: 14840.764: Total time for which application >> threads were stopped: 0.5178390 seconds >> >> JVM parameters - >> >> -server -Xmx2g -Xms2g -XX:PermSize=128m -XX:MaxPermSize=128m >> -XX:+UseLargePages -XX:LargePageSizeInBytes=2m -XX:+UseG1GC >> -XX:MaxGCPauseMillis=25 -XX:ParallelGCThreads=9 -XX:ConcGCThreads=4 >> -XX:G1HeapRegionSize=8M -XX:+PrintTLAB -XX:+AggressiveOpts >> -XX:+PrintFlagsFinal -Xloggc:/integral/logs/gc.log -verbose:gc >> -XX:+PrintTenuringDistribution -XX:+PrintGCDateStamps >> -XX:+PrintAdaptiveSizePolicy -XX:+PrintGCDetails >> -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime >> -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 >> -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=3026 >> -Dcom.sun.management.jmxremote.local.only=false >> -Dcom.sun.management.jmxremote.authenticate=false >> -Dcom.sun.management.jmxremote.ssl=false >> >> 
_______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140220/295f0bbe/attachment.html From yu.zhang at oracle.com Thu Feb 20 12:42:03 2014 From: yu.zhang at oracle.com (YU ZHANG) Date: Thu, 20 Feb 2014 12:42:03 -0800 Subject: G1 GC - pauses much larger than target In-Reply-To: References: Message-ID: <5306689B.6060301@oracle.com> Kirti, If real time is longer than user time, it means the gc threads were waiting for cpu. How is your cpu utilization? Thanks, Jenny On 2/20/2014 12:12 PM, Kirti Teja Rao wrote: > Hi, > > I re-ran the test with TraceSafepointCleanupTime enabled. I did not > find anything out of ordinary. > Safepoint cleanup is showing only sub-milliseconds. Below are the logs > for one such occurrence. > > I can also run the app with -XX:+PrintHeapAtGC > -XX:+PrintHeapAtGCExtended if it helps. I did ran it once earlier > could not find anything out of ordinary. > > > > safepoint trace - > 1711.616: [deflating idle monitors, 0.0000510 secs] > 1711.616: [updating inline caches, 0.0000010 secs] > 1711.616: [compilation policy safepoint handler, 0.0001860 secs] > 1711.616: [sweeping nmethods, 0.0000040 secs] > vmop [threads: total initially_running > wait_to_block] [time: spin block sync cleanup vmop] page_trap_count > 1711.616: G1IncCollectionPause [ 94 0 > 0 ] [ 0 0 0 0 235 ] 0 > > gc log - > > 2014-02-20T19:49:45.588+0000: 1711.616: Application time: 4.5889130 > seconds > 2014-02-20T19:49:45.588+0000: 1711.616: [GC pause (young) > Desired survivor size 83886080 bytes, new threshold 15 (max 15) > - age 1: 1428296 bytes, 1428296 total > - age 2: 918104 bytes, 2346400 total > - age 3: 1126320 bytes, 3472720 total > - age 4: 838696 bytes, 4311416 total > - age 5: 975512 bytes, 5286928 total > - age 6: 813872 bytes, 6100800 total > - age 7: 975504 bytes, 7076304 total > - age 8: 801600 bytes, 7877904 total > - age 9: 966256 bytes, 8844160 total > - age 10: 801536 bytes, 9645696 total > - age 11: 964048 bytes, 10609744 total > - age 12: 859568 bytes, 11469312 total > - age 13: 931344 bytes, 12400656 total > - age 14: 921024 bytes, 13321680 total > - age 15: 891616 bytes, 14213296 total > 1711.616: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 9059, predicted base time: 6.60 ms, remaining time: > 18.40 ms, target pause time: 25.00 ms] > 1711.616: [G1Ergonomics (CSet Construction) add young regions to > CSet, eden: 150 regions, survivors: 3 regions, predicted young region > time: 5.94 ms] > 1711.616: [G1Ergonomics (CSet Construction) finish choosing CSet, > eden: 150 regions, survivors: 3 regions, old: 0 regions, predicted > pause time: 12.54 ms, target pause time: 25.00 ms] > , 0.0180830 secs] > [Parallel Time: 17.0 ms, GC Workers: 9] > [GC Worker Start (ms): Min: 1711616.5, Avg: 1711616.6, Max: > 1711616.8, Diff: 0.3] > [Ext Root Scanning (ms): Min: 3.5, Avg: 4.8, Max: 6.0, Diff: > 2.4, Sum: 43.1] > [Update RS (ms): Min: 0.7, Avg: 1.7, Max: 2.7, Diff: 2.0, Sum: 15.2] > [Processed Buffers: Min: 4, Avg: 8.1, Max: 17, Diff: 13, Sum: 73] > [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.1, Sum: 0.8] > [Object Copy (ms): Min: 9.7, Avg: 9.8, Max: 10.0, Diff: 0.3, > Sum: 88.1] > [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: > 0.0] > [GC Worker Other (ms): Min: 0.0, 
Avg: 0.1, Max: 0.3, Diff: 0.3, > Sum: 1.1] > [GC Worker Total (ms): Min: 16.3, Avg: 16.5, Max: 16.6, Diff: > 0.4, Sum: 148.4] > [GC Worker End (ms): Min: 1711633.0, Avg: 1711633.1, Max: > 1711633.3, Diff: 0.3] > [Code Root Fixup: 0.0 ms] > [Clear CT: 0.4 ms] > [Other: 0.8 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 0.3 ms] > [Ref Enq: 0.0 ms] > [Free CSet: 0.2 ms] > [Eden: 1200.0M(1200.0M)->0.0B(1200.0M) Survivors: 24.0M->24.0M > Heap: 1566.3M(2048.0M)->370.6M(2048.0M)] > [Times: user=0.15 sys=0.00, real=0.24 secs] > 2014-02-20T19:49:45.824+0000: 1711.852: Total time for which > application threads were stopped: 0.2361710 seconds > > > > > > > On Thu, Feb 20, 2014 at 2:04 AM, Srinivas Ramakrishna > > wrote: > > Probably some post-GC clean up?... nmethod sweep, monitor list > cleanup, and other housekeeping. There's a trace flag that > displays this in more detail: > > bool TraceSafepointCleanupTime = false {product} > > Additionally, PrintSafepointStatistics might shed light . Since > you have it already enabled, you could probably look at the data > for this particular pause. > -- ramki > > > On Thu, Feb 20, 2014 at 1:24 AM, Kirti Teja Rao > > wrote: > > Hi, > > I am trying out G1 collector for our application. Our > application runs with 2GB heap and we expect relatively low > latency. The pause time target is set to 25ms. There are much > bigger pauses (and unexplained) in order of few 100s of ms. > This is not a rare occurence and i can see this 15-20 times in > 6-7 hours runs. We use deterministic GC in jrockit for 1.6 and > want to upgrade to 1.7 or even 1.8 after the next months > release. Explaining and tuning these unexplained large pauses > is critical for us to upgrade. > Can anyone please help in identifying where this time is spent > or how to bring it down? > > Below is the log for one such occurrence and also the JVM > parameters for this run - > > My observations - > 1) real time is much larger than the user time. This server > has 2 processors with 8 cores each and hyper-threading. So, > for most of time the progress is blocked. > 2) Start time is 14840.246, end time for worker is 14840270.2 > and end time for pause is 14840.764. So, the time is spent > after the parallel phase is completed and before the pause > finishes. > > I can add more logs if required. I can also run it in same env > with different parameters if there are suggestions. 
> > 2014-02-20T02:15:42.580+0000: 14840.246: Application time: > 8.5619840 seconds > 2014-02-20T02:15:42.581+0000: 14840.247: [GC pause (young) > Desired survivor size 83886080 bytes, new threshold 15 (max 15) > - age 1: 2511184 bytes, 2511184 total > - age 2: 1672024 bytes, 4183208 total > - age 3: 1733824 bytes, 5917032 total > - age 4: 1663920 bytes, 7580952 total > - age 5: 1719944 bytes, 9300896 total > - age 6: 1641904 bytes, 10942800 total > - age 7: 1796976 bytes, 12739776 total > - age 8: 1706344 bytes, 14446120 total > - age 9: 1722920 bytes, 16169040 total > - age 10: 1729176 bytes, 17898216 total > - age 11: 1500056 bytes, 19398272 total > - age 12: 1486520 bytes, 20884792 total > - age 13: 1618272 bytes, 22503064 total > - age 14: 1492840 bytes, 23995904 total > - age 15: 1486920 bytes, 25482824 total > 14840.247: [G1Ergonomics (CSet Construction) start choosing > CSet, _pending_cards: 12196, predicted base time: 7.85 ms, > remaining time: 17.15 ms, target pause time: 25.00 ms] > 14840.247: [G1Ergonomics (CSet Construction) add young > regions to CSet, eden: 146 regions, survivors: 7 regions, > predicted young region time: 8.76 ms] > 14840.247: [G1Ergonomics (CSet Construction) finish choosing > CSet, eden: 146 regions, survivors: 7 regions, old: 0 regions, > predicted pause time: 16.60 ms, target pause time: 25.00 ms] > , 0.0247660 secs] > [Parallel Time: 23.2 ms, GC Workers: 9] > [GC Worker Start (ms): Min: 14840247.4, Avg: 14840247.6, > Max: 14840247.8, Diff: 0.4] > [Ext Root Scanning (ms): Min: 3.5, Avg: 4.1, Max: 5.4, > Diff: 1.9, Sum: 37.2] > [Update RS (ms): Min: 1.1, Avg: 2.2, Max: 2.8, Diff: > 1.7, Sum: 19.8] > [Processed Buffers: Min: 5, Avg: 9.4, Max: 15, Diff: > 10, Sum: 85] > [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, > Sum: 0.7] > [Object Copy (ms): Min: 15.8, Avg: 16.0, Max: 16.2, > Diff: 0.4, Sum: 144.2] > [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: > 0.0, Sum: 0.0] > [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, > Diff: 0.2, Sum: 0.8] > [GC Worker Total (ms): Min: 22.3, Avg: 22.5, Max: 22.7, > Diff: 0.4, Sum: 202.7] > [GC Worker End (ms): Min: 14840270.1, Avg: 14840270.1, > Max: 14840270.2, Diff: 0.2] > [Code Root Fixup: 0.0 ms] > [Clear CT: 0.4 ms] > [Other: 1.1 ms] > [Choose CSet: 0.1 ms] > [Ref Proc: 0.4 ms] > [Ref Enq: 0.0 ms] > [Free CSet: 0.3 ms] > [Eden: 1168.0M(1168.0M)->0.0B(1160.0M) Survivors: > 56.0M->64.0M Heap: 1718.4M(2048.0M)->563.4M(2048.0M)] > [Times: user=0.21 sys=0.00, real=0.52 secs] > 2014-02-20T02:15:43.098+0000: 14840.764: Total time for which > application threads were stopped: 0.5178390 seconds > > JVM parameters - > > -server -Xmx2g -Xms2g -XX:PermSize=128m -XX:MaxPermSize=128m > -XX:+UseLargePages -XX:LargePageSizeInBytes=2m -XX:+UseG1GC > -XX:MaxGCPauseMillis=25 -XX:ParallelGCThreads=9 > -XX:ConcGCThreads=4 -XX:G1HeapRegionSize=8M -XX:+PrintTLAB > -XX:+AggressiveOpts -XX:+PrintFlagsFinal > -Xloggc:/integral/logs/gc.log -verbose:gc > -XX:+PrintTenuringDistribution -XX:+PrintGCDateStamps > -XX:+PrintAdaptiveSizePolicy -XX:+PrintGCDetails > -XX:+PrintGCApplicationConcurrentTime > -XX:+PrintGCApplicationStoppedTime > -XX:+PrintSafepointStatistics > -XX:PrintSafepointStatisticsCount=1 > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.port=3026 > -Dcom.sun.management.jmxremote.local.only=false > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at 
openjdk.java.net > > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140220/71cad4a7/attachment-0001.html

From Andreas.Mueller at mgm-tp.com Fri Feb 21 00:44:49 2014
From: Andreas.Mueller at mgm-tp.com (Andreas Müller)
Date: Fri, 21 Feb 2014 08:44:49 +0000
Subject: AW: Re: G1 GC - pauses much larger than target
Message-ID: <46FF8393B58AD84D95E444264805D98FBDE16B1D@edata01.mgm-edv.de>

I resend the text of my reply. It was blocked from the mailing list because the graphic was a bit too large. Sorry for that, but I could not make it smaller.

From: Andreas Müller
Sent: Friday, 21 February 2014 09:39
To: 'kirtiteja at gmail.com'
Cc: 'hotspot-gc-use at openjdk.java.net'
Subject: Re: G1 GC - pauses much larger than target

Hi Kirti,

> I am trying out G1 collector for our application. Our application runs with 2GB heap and we expect relatively low latency.
> The pause time target is set to 25ms. There are much bigger pauses (and unexplained) in order of few 100s of ms.
> This is not a rare occurence and i can see this 15-20 times in 6-7 hours runs.

This conforms to what I have observed in extended tests: G1's control of GC pause duration is limited to a rather narrow range. Even in that range, only new gen pauses follow the pause time target well, while "mixed" pauses tend to overshoot with considerable probability. Find attached a graphic which shows what I mean:

- New gen pauses (red) do follow the target very well from 150-800 millis
- With a target below 150 the actual new gen pauses remain flat at 150-180 millis
- "Mixed" pauses (blue) do not follow the target well and some of them will always take 500-700 millis, whatever the target may be
- There are other pauses (remark etc., green) which are short but completely independent of the target value

The range with reasonable control depends on the heap size, the application and the hardware. I measured the graphic attached on a 6-core Xeon/2GHz server running Java 7u45 on CentOS/Linux with 64 GB RAM and a heap size of -Xms50g -Xmx50g. (For which the pause durations achieved are not bad at all!) The application was a synthetic benchmark described here:
http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/
With the same benchmark but only 10 GB of overall heap size on an Oracle T3 server running Java 7u45 on Solaris/SPARC I got a very similar kind of plot, but the range with reasonable pause time control was now 60-180 millis. Again the pause durations reached were by themselves not bad at all. But the idea of setting a pause time target and expecting it to be followed in a meaningful way is to some extent misleading.

These results on G1's pause time control will be published soon on the blog linked above.

Best regards
Andreas

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140221/d39c23d2/attachment.html From kirtiteja at gmail.com Fri Feb 21 11:27:38 2014 From: kirtiteja at gmail.com (Kirti Teja Rao) Date: Fri, 21 Feb 2014 11:27:38 -0800 Subject: G1 GC - pauses much larger than target In-Reply-To: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> References: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> Message-ID: Hi, @Jenny - CPU looks fine. Never over 40% and generally between 25-35%. Some of these pauses are as large as 1 second and these are always observed after the parallel phase, I assume this is the phase were G1 would need the most amount of CPU. @Andreas - Most of these pauses are in young collection and are not showing in the parallel/serial phases shown in GC log. The pauses i observe are unreasonable 1.5+ sec for a heap of 2 GB. @All - [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far greater than user time, I believe G1 is blocked on some resource. The application i run is not swapping and also there is more headroom in memory. CPU is less than 35%.There are other applications running on the machine which log quite a bit and can cause the iowait avg queue size to spike upto 20-30 occasionally. Does G1 logging happen during the pause time? Can a slow disk or high disk IO affect these timings? Is there anything else that we can try to uncover the cause for these pauses? 2014-02-21T06:18:13.592+0000: 12675.969: Application time: 10.7438770 seconds 2014-02-21T06:18:13.593+0000: 12675.970: [GC pause (young) Desired survivor size 81788928 bytes, new threshold 15 (max 15) - age 1: 564704 bytes, 564704 total - age 2: 18504 bytes, 583208 total - age 3: 18552 bytes, 601760 total - age 4: 18776 bytes, 620536 total - age 5: 197048 bytes, 817584 total - age 6: 18712 bytes, 836296 total - age 7: 18456 bytes, 854752 total - age 8: 18920 bytes, 873672 total - age 9: 18456 bytes, 892128 total - age 10: 18456 bytes, 910584 total - age 11: 18456 bytes, 929040 total - age 12: 18456 bytes, 947496 total - age 13: 18488 bytes, 965984 total - age 14: 18456 bytes, 984440 total - age 15: 18456 bytes, 1002896 total 12675.970: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 4408, predicted base time: 6.77 ms, remaining time: 23.23 ms, target pause time: 30.00 ms] 12675.970: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 306 regions, survivors: 1 regions, predicted young region time: 1.89 ms] 12675.970: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 306 regions, survivors: 1 regions, old: 0 regions, predicted pause time: 8.67 ms, target pause time: 30.00 ms] , 0.0079290 secs] [Parallel Time: 6.0 ms, GC Workers: 9] [GC Worker Start (ms): Min: 12675970.1, Avg: 12675970.3, Max: 12675970.8, Diff: 0.7] [Ext Root Scanning (ms): Min: 3.0, Avg: 4.0, Max: 5.0, Diff: 1.9, Sum: 36.3] [Update RS (ms): Min: 0.0, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 4.1] [Processed Buffers: Min: 0, Avg: 5.4, Max: 13, Diff: 13, Sum: 49] [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.9] [Object Copy (ms): Min: 0.3, Avg: 0.7, Max: 0.9, Diff: 0.6, Sum: 6.5] [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 2.0] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 5.1, Avg: 5.6, Max: 5.8, Diff: 0.8, Sum: 50.1] [GC Worker End (ms): Min: 12675975.8, Avg: 12675975.9, Max: 12675975.9, Diff: 0.1] [Code Root Fixup: 0.0 ms] [Clear CT: 0.5 ms] [Other: 1.4 ms] [Choose 
CSet: 0.0 ms] [Ref Proc: 0.5 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.7 ms] [Eden: 1224.0M(1224.0M)->0.0B(1224.0M) Survivors: 4096.0K->4096.0K Heap: 1342.2M(2048.0M)->118.1M(2048.0M)] [Times: user=0.06 sys=0.00, real=1.54 secs] 2014-02-21T06:18:15.135+0000: 12677.511: Total time for which application threads were stopped: 1.5421650 seconds On Fri, Feb 21, 2014 at 12:38 AM, Andreas M?ller wrote: > Hi Kirti, > > > > > I am trying out G1 collector for our application. Our application runs > with 2GB heap and we expect relatively low latency. > > > The pause time target is set to 25ms. There >are much bigger pauses (and > unexplained) in order of few 100s of ms. > > > This is not a rare occurence and i can see this 15-20 times in 6-7 hours > runs. > > > > This conforms to what I have observed in extended tests: > > G1's control of GC pause duration is limited to a rather narrow range. > > Even in that range, only new gen pauses do follow the pause time target > well while "mixed" pauses tend to overshoot with considerable probability. > > Find attached a graphic which shows what I mean: > > - New gen pauses (red) do follow the target very well from 150-800 > millis > > - With a target below 150 the actual new gen pauses remain flat at > 150-180 millis > > - "mixed" pauses (blue) do not follow the target well and some of > them will always take 500-700 millis, whatever the target be > > - There are other pauses (remark etc., green) which are short but > completely independent of the target value > > > > The range with reasonable control depends on the heap size, the > application and the hardware. > > I measured the graphic attached on a 6-core Xeon/2GHz server running Java > 7u45 on CentOS/Linux with 64 GB RAM and a heap size of -Xms50g -Xmx50g. > > (For which the pause durations achieved are not bad at all!) > > The application was a synthetic benchmark described here: > http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/ > > With the same benchmark but only 10 GB of overall heap size on a Oracle T3 > server running Java 7u45 on Solaris/SPARC I got a very similar kind of plot > but the range with reasonable pause time control was now 60-180 millis. > > Again the pause durations reached were by themselves not bad at all. But > the idea of setting a pause time target and expecting it to be followed in > a meaningful way is to some extent misleading. > > > > These results on G1's pause time control will be published soon on the > blog of the link above. > > > > Best regards > > Andreas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140221/e8809310/attachment.html From chkwok at digibites.nl Fri Feb 21 11:33:08 2014 From: chkwok at digibites.nl (Chi Ho Kwok) Date: Fri, 21 Feb 2014 20:33:08 +0100 Subject: G1 GC - pauses much larger than target In-Reply-To: References: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> Message-ID: Are you running on Linux? Rarely touched pages get swapped out aggressively, try setting vm.swappiness to 0 and check sar -B for disk page in/out stats. On 21 Feb 2014 20:28, "Kirti Teja Rao" wrote: > Hi, > > @Jenny - CPU looks fine. Never over 40% and generally between 25-35%. Some > of these pauses are as large as 1 second and these are always observed > after the parallel phase, I assume this is the phase were G1 would need the > most amount of CPU. 
> > @Andreas - Most of these pauses are in young collection and are not > showing in the parallel/serial phases shown in GC log. The pauses i observe > are unreasonable 1.5+ sec for a heap of 2 GB. > > @All - [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far > greater than user time, I believe G1 is blocked on some resource. The > application i run is not swapping and also there is more headroom in > memory. CPU is less than 35%.There are other applications running on the > machine which log quite a bit and can cause the iowait avg queue size to > spike upto 20-30 occasionally. Does G1 logging happen during the pause > time? Can a slow disk or high disk IO affect these timings? > > Is there anything else that we can try to uncover the cause for these > pauses? > > > 2014-02-21T06:18:13.592+0000: 12675.969: Application time: 10.7438770 > seconds > 2014-02-21T06:18:13.593+0000: 12675.970: [GC pause (young) > Desired survivor size 81788928 bytes, new threshold 15 (max 15) > - age 1: 564704 bytes, 564704 total > - age 2: 18504 bytes, 583208 total > - age 3: 18552 bytes, 601760 total > - age 4: 18776 bytes, 620536 total > - age 5: 197048 bytes, 817584 total > - age 6: 18712 bytes, 836296 total > - age 7: 18456 bytes, 854752 total > - age 8: 18920 bytes, 873672 total > - age 9: 18456 bytes, 892128 total > - age 10: 18456 bytes, 910584 total > - age 11: 18456 bytes, 929040 total > - age 12: 18456 bytes, 947496 total > - age 13: 18488 bytes, 965984 total > - age 14: 18456 bytes, 984440 total > - age 15: 18456 bytes, 1002896 total > 12675.970: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 4408, predicted base time: 6.77 ms, remaining time: 23.23 > ms, target pause time: 30.00 ms] > 12675.970: [G1Ergonomics (CSet Construction) add young regions to CSet, > eden: 306 regions, survivors: 1 regions, predicted young region time: 1.89 > ms] > 12675.970: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: > 306 regions, survivors: 1 regions, old: 0 regions, predicted pause time: > 8.67 ms, target pause time: 30.00 ms] > , 0.0079290 secs] > [Parallel Time: 6.0 ms, GC Workers: 9] > [GC Worker Start (ms): Min: 12675970.1, Avg: 12675970.3, Max: > 12675970.8, Diff: 0.7] > [Ext Root Scanning (ms): Min: 3.0, Avg: 4.0, Max: 5.0, Diff: 1.9, > Sum: 36.3] > [Update RS (ms): Min: 0.0, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 4.1] > [Processed Buffers: Min: 0, Avg: 5.4, Max: 13, Diff: 13, Sum: 49] > [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.9] > [Object Copy (ms): Min: 0.3, Avg: 0.7, Max: 0.9, Diff: 0.6, Sum: 6.5] > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 2.0] > [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: > 0.2] > [GC Worker Total (ms): Min: 5.1, Avg: 5.6, Max: 5.8, Diff: 0.8, Sum: > 50.1] > [GC Worker End (ms): Min: 12675975.8, Avg: 12675975.9, Max: > 12675975.9, Diff: 0.1] > [Code Root Fixup: 0.0 ms] > [Clear CT: 0.5 ms] > [Other: 1.4 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 0.5 ms] > [Ref Enq: 0.0 ms] > [Free CSet: 0.7 ms] > [Eden: 1224.0M(1224.0M)->0.0B(1224.0M) Survivors: 4096.0K->4096.0K > Heap: 1342.2M(2048.0M)->118.1M(2048.0M)] > [Times: user=0.06 sys=0.00, real=1.54 secs] > 2014-02-21T06:18:15.135+0000: 12677.511: Total time for which application > threads were stopped: 1.5421650 seconds > > > > On Fri, Feb 21, 2014 at 12:38 AM, Andreas M?ller < > Andreas.Mueller at mgm-tp.com> wrote: > >> Hi Kirti, >> >> >> >> > I am trying out G1 collector for our application. 
Our application runs >> with 2GB heap and we expect relatively low latency. >> >> > The pause time target is set to 25ms. There >are much bigger pauses >> (and unexplained) in order of few 100s of ms. >> >> > This is not a rare occurence and i can see this 15-20 times in 6-7 >> hours runs. >> >> >> >> This conforms to what I have observed in extended tests: >> >> G1's control of GC pause duration is limited to a rather narrow range. >> >> Even in that range, only new gen pauses do follow the pause time target >> well while "mixed" pauses tend to overshoot with considerable probability. >> >> Find attached a graphic which shows what I mean: >> >> - New gen pauses (red) do follow the target very well from >> 150-800 millis >> >> - With a target below 150 the actual new gen pauses remain flat >> at 150-180 millis >> >> - "mixed" pauses (blue) do not follow the target well and some of >> them will always take 500-700 millis, whatever the target be >> >> - There are other pauses (remark etc., green) which are short but >> completely independent of the target value >> >> >> >> The range with reasonable control depends on the heap size, the >> application and the hardware. >> >> I measured the graphic attached on a 6-core Xeon/2GHz server running Java >> 7u45 on CentOS/Linux with 64 GB RAM and a heap size of -Xms50g -Xmx50g. >> >> (For which the pause durations achieved are not bad at all!) >> >> The application was a synthetic benchmark described here: >> http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/ >> >> With the same benchmark but only 10 GB of overall heap size on a Oracle >> T3 server running Java 7u45 on Solaris/SPARC I got a very similar kind of >> plot but the range with reasonable pause time control was now 60-180 >> millis. >> >> Again the pause durations reached were by themselves not bad at all. But >> the idea of setting a pause time target and expecting it to be followed in >> a meaningful way is to some extent misleading. >> >> >> >> These results on G1's pause time control will be published soon on the >> blog of the link above. >> >> >> >> Best regards >> >> Andreas >> >> >> > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140221/a010f775/attachment-0001.html From gustav.r.akesson at gmail.com Fri Feb 21 11:47:25 2014 From: gustav.r.akesson at gmail.com (=?ISO-8859-1?Q?Gustav_=C5kesson?=) Date: Fri, 21 Feb 2014 20:47:25 +0100 Subject: G1 GC - pauses much larger than target In-Reply-To: References: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> Message-ID: Hi, At least for ParNew/CMS the GC logging is synchronous and part of the cycle, which means problems logging e.g. a ParNew event to disc increases the pause. However, when this happens we see an increase of sys time in the GC event IIRC. We solve this issue by logging to RAM disc instead. Best Regards, Gustav ?kesson Den 21 feb 2014 20:31 skrev "Kirti Teja Rao" : > Hi, > > @Jenny - CPU looks fine. Never over 40% and generally between 25-35%. Some > of these pauses are as large as 1 second and these are always observed > after the parallel phase, I assume this is the phase were G1 would need the > most amount of CPU. 
> > @Andreas - Most of these pauses are in young collection and are not > showing in the parallel/serial phases shown in GC log. The pauses i observe > are unreasonable 1.5+ sec for a heap of 2 GB. > > @All - [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far > greater than user time, I believe G1 is blocked on some resource. The > application i run is not swapping and also there is more headroom in > memory. CPU is less than 35%.There are other applications running on the > machine which log quite a bit and can cause the iowait avg queue size to > spike upto 20-30 occasionally. Does G1 logging happen during the pause > time? Can a slow disk or high disk IO affect these timings? > > Is there anything else that we can try to uncover the cause for these > pauses? > > > 2014-02-21T06:18:13.592+0000: 12675.969: Application time: 10.7438770 > seconds > 2014-02-21T06:18:13.593+0000: 12675.970: [GC pause (young) > Desired survivor size 81788928 bytes, new threshold 15 (max 15) > - age 1: 564704 bytes, 564704 total > - age 2: 18504 bytes, 583208 total > - age 3: 18552 bytes, 601760 total > - age 4: 18776 bytes, 620536 total > - age 5: 197048 bytes, 817584 total > - age 6: 18712 bytes, 836296 total > - age 7: 18456 bytes, 854752 total > - age 8: 18920 bytes, 873672 total > - age 9: 18456 bytes, 892128 total > - age 10: 18456 bytes, 910584 total > - age 11: 18456 bytes, 929040 total > - age 12: 18456 bytes, 947496 total > - age 13: 18488 bytes, 965984 total > - age 14: 18456 bytes, 984440 total > - age 15: 18456 bytes, 1002896 total > 12675.970: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 4408, predicted base time: 6.77 ms, remaining time: 23.23 > ms, target pause time: 30.00 ms] > 12675.970: [G1Ergonomics (CSet Construction) add young regions to CSet, > eden: 306 regions, survivors: 1 regions, predicted young region time: 1.89 > ms] > 12675.970: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: > 306 regions, survivors: 1 regions, old: 0 regions, predicted pause time: > 8.67 ms, target pause time: 30.00 ms] > , 0.0079290 secs] > [Parallel Time: 6.0 ms, GC Workers: 9] > [GC Worker Start (ms): Min: 12675970.1, Avg: 12675970.3, Max: > 12675970.8, Diff: 0.7] > [Ext Root Scanning (ms): Min: 3.0, Avg: 4.0, Max: 5.0, Diff: 1.9, > Sum: 36.3] > [Update RS (ms): Min: 0.0, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 4.1] > [Processed Buffers: Min: 0, Avg: 5.4, Max: 13, Diff: 13, Sum: 49] > [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.9] > [Object Copy (ms): Min: 0.3, Avg: 0.7, Max: 0.9, Diff: 0.6, Sum: 6.5] > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 2.0] > [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: > 0.2] > [GC Worker Total (ms): Min: 5.1, Avg: 5.6, Max: 5.8, Diff: 0.8, Sum: > 50.1] > [GC Worker End (ms): Min: 12675975.8, Avg: 12675975.9, Max: > 12675975.9, Diff: 0.1] > [Code Root Fixup: 0.0 ms] > [Clear CT: 0.5 ms] > [Other: 1.4 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 0.5 ms] > [Ref Enq: 0.0 ms] > [Free CSet: 0.7 ms] > [Eden: 1224.0M(1224.0M)->0.0B(1224.0M) Survivors: 4096.0K->4096.0K > Heap: 1342.2M(2048.0M)->118.1M(2048.0M)] > [Times: user=0.06 sys=0.00, real=1.54 secs] > 2014-02-21T06:18:15.135+0000: 12677.511: Total time for which application > threads were stopped: 1.5421650 seconds > > > > On Fri, Feb 21, 2014 at 12:38 AM, Andreas M?ller < > Andreas.Mueller at mgm-tp.com> wrote: > >> Hi Kirti, >> >> >> >> > I am trying out G1 collector for our application. 
Our application runs >> with 2GB heap and we expect relatively low latency. >> >> > The pause time target is set to 25ms. There >are much bigger pauses >> (and unexplained) in order of few 100s of ms. >> >> > This is not a rare occurence and i can see this 15-20 times in 6-7 >> hours runs. >> >> >> >> This conforms to what I have observed in extended tests: >> >> G1's control of GC pause duration is limited to a rather narrow range. >> >> Even in that range, only new gen pauses do follow the pause time target >> well while "mixed" pauses tend to overshoot with considerable probability. >> >> Find attached a graphic which shows what I mean: >> >> - New gen pauses (red) do follow the target very well from >> 150-800 millis >> >> - With a target below 150 the actual new gen pauses remain flat >> at 150-180 millis >> >> - "mixed" pauses (blue) do not follow the target well and some of >> them will always take 500-700 millis, whatever the target be >> >> - There are other pauses (remark etc., green) which are short but >> completely independent of the target value >> >> >> >> The range with reasonable control depends on the heap size, the >> application and the hardware. >> >> I measured the graphic attached on a 6-core Xeon/2GHz server running Java >> 7u45 on CentOS/Linux with 64 GB RAM and a heap size of -Xms50g -Xmx50g. >> >> (For which the pause durations achieved are not bad at all!) >> >> The application was a synthetic benchmark described here: >> http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/ >> >> With the same benchmark but only 10 GB of overall heap size on a Oracle >> T3 server running Java 7u45 on Solaris/SPARC I got a very similar kind of >> plot but the range with reasonable pause time control was now 60-180 >> millis. >> >> Again the pause durations reached were by themselves not bad at all. But >> the idea of setting a pause time target and expecting it to be followed in >> a meaningful way is to some extent misleading. >> >> >> >> These results on G1's pause time control will be published soon on the >> blog of the link above. >> >> >> >> Best regards >> >> Andreas >> >> >> > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140221/882a9b5e/attachment.html From kirtiteja at gmail.com Mon Feb 24 17:29:33 2014 From: kirtiteja at gmail.com (Kirti Teja Rao) Date: Mon, 24 Feb 2014 17:29:33 -0800 Subject: G1 GC - pauses much larger than target In-Reply-To: <46FF8393B58AD84D95E444264805D98FBDE16C3F@edata01.mgm-edv.de> References: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE16C3F@edata01.mgm-edv.de> Message-ID: Hi, I tried with swappiness set to 0 and turned off all the logging of the other application that is running on the same machine to cut down on the io on the machine. The results are much better and all the large outliers with over 100ms and upto 500-600 msec are gone now. I see pauses around 50ms-60ms for a pause target of 30ms which is ok to work with. Attached is the scatter chart comparison from excel with one of the earlier runs and this run. It is also interesting to see there are less number of gcs but slightly more pause interval per gc in this run. 
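For reference, the OS-side change amounts to roughly the following (a minimal sketch using stock sysctl/sysstat commands; the persistence file is the conventional /etc/sysctl.conf and may differ by distribution):

# lower the kernel's willingness to page out idle anonymous memory, effective immediately
sysctl -w vm.swappiness=0

# persist the setting across reboots
echo 'vm.swappiness = 0' >> /etc/sysctl.conf

# confirm the value the kernel is actually using
cat /proc/sys/vm/swappiness

# watch paging activity (pages paged in/out per second, major faults) while the test runs
sar -B 1
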
I am yet to find out if this improvement is because of setting swappiness to 0 or cutting down on the IO. Will do individual runs tomorrow or day after and will update the thread about the findings. On Sat, Feb 22, 2014 at 8:09 AM, Andreas M?ller wrote: > Hi Kirti, > > > > > [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far greater > than user time, > > That's extreme. > > But running several JVMs on the same hardware can produce such a > phenomenon on a smaller scale. I usually observe an increase of pause times > when two JVMs compete for the CPUs by a factor between sqrt(2) and 2 which > I understand from a statistical point of view. > > > > I have also got the impression that G1 is much more sensitive to CPU > contention (even well below 100% load) than other collectors. > > I have attached a GC log graphic where one JVM runs G1 on a 25 GB heap for > 30 minutes when after 15 minutes a second JVM with exactly the same > settings starts to compete for the CPUs (and other system resources). It is > clearly visible from the plot that the second JVM has a huge impact on the > operation of the first one.: > > - Long pauses jump from a max of 250 millis to 700 millis (almost > a factor of 3) > > - G1 sharply decreases the new generation size (probably to regain > control of pause times because the target is at default value, i.e. 200 > millis) > > The statistics on the right hand side are taken only from the time period > where both JVMs compete. > > My impression is that G1 does not like sharing CPUs with other processes. > It probably spoils its ability to predict pause durations properly. > > > > Your example with a 2 GB heap looks much more extreme, however. > > It looks like your GC threads are almost being starved by something. > > I would be very happy to learn about the reason if any can be found. > > > > Best regards > > Andreas > > > > *Von:* Kirti Teja Rao [mailto:kirtiteja at gmail.com] > *Gesendet:* Freitag, 21. Februar 2014 20:28 > *An:* Andreas M?ller > > *Cc:* hotspot-gc-use at openjdk.java.net > *Betreff:* Re: G1 GC - pauses much larger than target > > > > Hi, > > > > @Jenny - CPU looks fine. Never over 40% and generally between 25-35%. Some > of these pauses are as large as 1 second and these are always observed > after the parallel phase, I assume this is the phase were G1 would need the > most amount of CPU. > > > > @Andreas - Most of these pauses are in young collection and are not > showing in the parallel/serial phases shown in GC log. The pauses i observe > are unreasonable 1.5+ sec for a heap of 2 GB. > > > > @All - [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far > greater than user time, I believe G1 is blocked on some resource. The > application i run is not swapping and also there is more headroom in > memory. CPU is less than 35%.There are other applications running on the > machine which log quite a bit and can cause the iowait avg queue size to > spike upto 20-30 occasionally. Does G1 logging happen during the pause > time? Can a slow disk or high disk IO affect these timings? > > > > Is there anything else that we can try to uncover the cause for these > pauses? 
> > > > > > 2014-02-21T06:18:13.592+0000: 12675.969: Application time: 10.7438770 > seconds > > 2014-02-21T06:18:13.593+0000: 12675.970: [GC pause (young) > > Desired survivor size 81788928 bytes, new threshold 15 (max 15) > > - age 1: 564704 bytes, 564704 total > > - age 2: 18504 bytes, 583208 total > > - age 3: 18552 bytes, 601760 total > > - age 4: 18776 bytes, 620536 total > > - age 5: 197048 bytes, 817584 total > > - age 6: 18712 bytes, 836296 total > > - age 7: 18456 bytes, 854752 total > > - age 8: 18920 bytes, 873672 total > > - age 9: 18456 bytes, 892128 total > > - age 10: 18456 bytes, 910584 total > > - age 11: 18456 bytes, 929040 total > > - age 12: 18456 bytes, 947496 total > > - age 13: 18488 bytes, 965984 total > > - age 14: 18456 bytes, 984440 total > > - age 15: 18456 bytes, 1002896 total > > 12675.970: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 4408, predicted base time: 6.77 ms, remaining time: 23.23 > ms, target pause time: 30.00 ms] > > 12675.970: [G1Ergonomics (CSet Construction) add young regions to CSet, > eden: 306 regions, survivors: 1 regions, predicted young region time: 1.89 > ms] > > 12675.970: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: > 306 regions, survivors: 1 regions, old: 0 regions, predicted pause time: > 8.67 ms, target pause time: 30.00 ms] > > , 0.0079290 secs] > > [Parallel Time: 6.0 ms, GC Workers: 9] > > [GC Worker Start (ms): Min: 12675970.1, Avg: 12675970.3, Max: > 12675970.8, Diff: 0.7] > > [Ext Root Scanning (ms): Min: 3.0, Avg: 4.0, Max: 5.0, Diff: 1.9, > Sum: 36.3] > > [Update RS (ms): Min: 0.0, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 4.1] > > [Processed Buffers: Min: 0, Avg: 5.4, Max: 13, Diff: 13, Sum: 49] > > [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.9] > > [Object Copy (ms): Min: 0.3, Avg: 0.7, Max: 0.9, Diff: 0.6, Sum: 6.5] > > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 2.0] > > [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: > 0.2] > > [GC Worker Total (ms): Min: 5.1, Avg: 5.6, Max: 5.8, Diff: 0.8, Sum: > 50.1] > > [GC Worker End (ms): Min: 12675975.8, Avg: 12675975.9, Max: > 12675975.9, Diff: 0.1] > > [Code Root Fixup: 0.0 ms] > > [Clear CT: 0.5 ms] > > [Other: 1.4 ms] > > [Choose CSet: 0.0 ms] > > [Ref Proc: 0.5 ms] > > [Ref Enq: 0.0 ms] > > [Free CSet: 0.7 ms] > > [Eden: 1224.0M(1224.0M)->0.0B(1224.0M) Survivors: 4096.0K->4096.0K > Heap: 1342.2M(2048.0M)->118.1M(2048.0M)] > > [Times: user=0.06 sys=0.00, real=1.54 secs] > > 2014-02-21T06:18:15.135+0000: 12677.511: Total time for which application > threads were stopped: 1.5421650 seconds > > > > > > On Fri, Feb 21, 2014 at 12:38 AM, Andreas M?ller < > Andreas.Mueller at mgm-tp.com> wrote: > > Hi Kirti, > > > > > I am trying out G1 collector for our application. Our application runs > with 2GB heap and we expect relatively low latency. > > > The pause time target is set to 25ms. There >are much bigger pauses (and > unexplained) in order of few 100s of ms. > > > This is not a rare occurence and i can see this 15-20 times in 6-7 hours > runs. > > > > This conforms to what I have observed in extended tests: > > G1's control of GC pause duration is limited to a rather narrow range. > > Even in that range, only new gen pauses do follow the pause time target > well while "mixed" pauses tend to overshoot with considerable probability. 
> > Find attached a graphic which shows what I mean: > > - New gen pauses (red) do follow the target very well from 150-800 > millis > > - With a target below 150 the actual new gen pauses remain flat at > 150-180 millis > > - "mixed" pauses (blue) do not follow the target well and some of > them will always take 500-700 millis, whatever the target be > > - There are other pauses (remark etc., green) which are short but > completely independent of the target value > > > > The range with reasonable control depends on the heap size, the > application and the hardware. > > I measured the graphic attached on a 6-core Xeon/2GHz server running Java > 7u45 on CentOS/Linux with 64 GB RAM and a heap size of -Xms50g -Xmx50g. > > (For which the pause durations achieved are not bad at all!) > > The application was a synthetic benchmark described here: > http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/ > > With the same benchmark but only 10 GB of overall heap size on a Oracle T3 > server running Java 7u45 on Solaris/SPARC I got a very similar kind of plot > but the range with reasonable pause time control was now 60-180 millis. > > Again the pause durations reached were by themselves not bad at all. But > the idea of setting a pause time target and expecting it to be followed in > a meaningful way is to some extent misleading. > > > > These results on G1's pause time control will be published soon on the > blog of the link above. > > > > Best regards > > Andreas > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140224/c8e50aa5/attachment.html -------------- next part -------------- A non-text attachment was scrubbed... Name: NoLoggingComparision.png Type: image/png Size: 13893 bytes Desc: not available Url : http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140224/c8e50aa5/NoLoggingComparision.png From Andreas.Mueller at mgm-tp.com Tue Feb 25 01:53:59 2014 From: Andreas.Mueller at mgm-tp.com (=?iso-8859-1?Q?Andreas_M=FCller?=) Date: Tue, 25 Feb 2014 09:53:59 +0000 Subject: AW: G1 GC - pauses much larger than target In-Reply-To: References: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE16C3F@edata01.mgm-edv.de> Message-ID: <46FF8393B58AD84D95E444264805D98FBDE17F74@edata01.mgm-edv.de> Hi Kirti, thanks. > I tried with swappiness set to 0 Interesting. This means then, that the Linux kernel was swapping around with no real need? I am surprised to read this but it explains the long real time and the low usr time from your logs. I have sometimes seen GC logs from Linux which strangely looked like swapping without any shortage of physical RAM (which I can't remember seeing on Solaris). This difference is worth keeping in mind! > I am yet to find out if this improvement is because of setting swappiness to 0 or cutting down on the IO. I assume that swappiness is the prime factor here. Without swapping, why should a JVM care about the IO of another process? I would be rather shocked (and eye Linux with a lot more suspicion) if it were different. Can you tell whether other collectors (e.g. CMS) also suffered from idle GC pauses when you ran them with swappiness set to the default of 60? Best regards Andreas Von: Kirti Teja Rao [mailto:kirtiteja at gmail.com] Gesendet: Dienstag, 25. 
Februar 2014 02:30 An: Andreas M?ller Cc: hotspot-gc-use at openjdk.java.net Betreff: Re: G1 GC - pauses much larger than target Hi, I tried with swappiness set to 0 and turned off all the logging of the other application that is running on the same machine to cut down on the io on the machine. The results are much better and all the large outliers with over 100ms and upto 500-600 msec are gone now. I see pauses around 50ms-60ms for a pause target of 30ms which is ok to work with. Attached is the scatter chart comparison from excel with one of the earlier runs and this run. It is also interesting to see there are less number of gcs but slightly more pause interval per gc in this run. I am yet to find out if this improvement is because of setting swappiness to 0 or cutting down on the IO. Will do individual runs tomorrow or day after and will update the thread about the findings. On Sat, Feb 22, 2014 at 8:09 AM, Andreas M?ller > wrote: Hi Kirti, > [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far greater than user time, That's extreme. But running several JVMs on the same hardware can produce such a phenomenon on a smaller scale. I usually observe an increase of pause times when two JVMs compete for the CPUs by a factor between sqrt(2) and 2 which I understand from a statistical point of view. I have also got the impression that G1 is much more sensitive to CPU contention (even well below 100% load) than other collectors. I have attached a GC log graphic where one JVM runs G1 on a 25 GB heap for 30 minutes when after 15 minutes a second JVM with exactly the same settings starts to compete for the CPUs (and other system resources). It is clearly visible from the plot that the second JVM has a huge impact on the operation of the first one.: - Long pauses jump from a max of 250 millis to 700 millis (almost a factor of 3) - G1 sharply decreases the new generation size (probably to regain control of pause times because the target is at default value, i.e. 200 millis) The statistics on the right hand side are taken only from the time period where both JVMs compete. My impression is that G1 does not like sharing CPUs with other processes. It probably spoils its ability to predict pause durations properly. Your example with a 2 GB heap looks much more extreme, however. It looks like your GC threads are almost being starved by something. I would be very happy to learn about the reason if any can be found. Best regards Andreas Von: Kirti Teja Rao [mailto:kirtiteja at gmail.com] Gesendet: Freitag, 21. Februar 2014 20:28 An: Andreas M?ller Cc: hotspot-gc-use at openjdk.java.net Betreff: Re: G1 GC - pauses much larger than target Hi, @Jenny - CPU looks fine. Never over 40% and generally between 25-35%. Some of these pauses are as large as 1 second and these are always observed after the parallel phase, I assume this is the phase were G1 would need the most amount of CPU. @Andreas - Most of these pauses are in young collection and are not showing in the parallel/serial phases shown in GC log. The pauses i observe are unreasonable 1.5+ sec for a heap of 2 GB. @All - [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far greater than user time, I believe G1 is blocked on some resource. The application i run is not swapping and also there is more headroom in memory. CPU is less than 35%.There are other applications running on the machine which log quite a bit and can cause the iowait avg queue size to spike upto 20-30 occasionally. Does G1 logging happen during the pause time? 
Can a slow disk or high disk IO affect these timings? Is there anything else that we can try to uncover the cause for these pauses? 2014-02-21T06:18:13.592+0000: 12675.969: Application time: 10.7438770 seconds 2014-02-21T06:18:13.593+0000: 12675.970: [GC pause (young) Desired survivor size 81788928 bytes, new threshold 15 (max 15) - age 1: 564704 bytes, 564704 total - age 2: 18504 bytes, 583208 total - age 3: 18552 bytes, 601760 total - age 4: 18776 bytes, 620536 total - age 5: 197048 bytes, 817584 total - age 6: 18712 bytes, 836296 total - age 7: 18456 bytes, 854752 total - age 8: 18920 bytes, 873672 total - age 9: 18456 bytes, 892128 total - age 10: 18456 bytes, 910584 total - age 11: 18456 bytes, 929040 total - age 12: 18456 bytes, 947496 total - age 13: 18488 bytes, 965984 total - age 14: 18456 bytes, 984440 total - age 15: 18456 bytes, 1002896 total 12675.970: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 4408, predicted base time: 6.77 ms, remaining time: 23.23 ms, target pause time: 30.00 ms] 12675.970: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 306 regions, survivors: 1 regions, predicted young region time: 1.89 ms] 12675.970: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 306 regions, survivors: 1 regions, old: 0 regions, predicted pause time: 8.67 ms, target pause time: 30.00 ms] , 0.0079290 secs] [Parallel Time: 6.0 ms, GC Workers: 9] [GC Worker Start (ms): Min: 12675970.1, Avg: 12675970.3, Max: 12675970.8, Diff: 0.7] [Ext Root Scanning (ms): Min: 3.0, Avg: 4.0, Max: 5.0, Diff: 1.9, Sum: 36.3] [Update RS (ms): Min: 0.0, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 4.1] [Processed Buffers: Min: 0, Avg: 5.4, Max: 13, Diff: 13, Sum: 49] [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.9] [Object Copy (ms): Min: 0.3, Avg: 0.7, Max: 0.9, Diff: 0.6, Sum: 6.5] [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 2.0] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2] [GC Worker Total (ms): Min: 5.1, Avg: 5.6, Max: 5.8, Diff: 0.8, Sum: 50.1] [GC Worker End (ms): Min: 12675975.8, Avg: 12675975.9, Max: 12675975.9, Diff: 0.1] [Code Root Fixup: 0.0 ms] [Clear CT: 0.5 ms] [Other: 1.4 ms] [Choose CSet: 0.0 ms] [Ref Proc: 0.5 ms] [Ref Enq: 0.0 ms] [Free CSet: 0.7 ms] [Eden: 1224.0M(1224.0M)->0.0B(1224.0M) Survivors: 4096.0K->4096.0K Heap: 1342.2M(2048.0M)->118.1M(2048.0M)] [Times: user=0.06 sys=0.00, real=1.54 secs] 2014-02-21T06:18:15.135+0000: 12677.511: Total time for which application threads were stopped: 1.5421650 seconds On Fri, Feb 21, 2014 at 12:38 AM, Andreas M?ller > wrote: Hi Kirti, > I am trying out G1 collector for our application. Our application runs with 2GB heap and we expect relatively low latency. > The pause time target is set to 25ms. There >are much bigger pauses (and unexplained) in order of few 100s of ms. > This is not a rare occurence and i can see this 15-20 times in 6-7 hours runs. This conforms to what I have observed in extended tests: G1's control of GC pause duration is limited to a rather narrow range. Even in that range, only new gen pauses do follow the pause time target well while "mixed" pauses tend to overshoot with considerable probability. 
Find attached a graphic which shows what I mean: - New gen pauses (red) do follow the target very well from 150-800 millis - With a target below 150 the actual new gen pauses remain flat at 150-180 millis - "mixed" pauses (blue) do not follow the target well and some of them will always take 500-700 millis, whatever the target be - There are other pauses (remark etc., green) which are short but completely independent of the target value The range with reasonable control depends on the heap size, the application and the hardware. I measured the graphic attached on a 6-core Xeon/2GHz server running Java 7u45 on CentOS/Linux with 64 GB RAM and a heap size of -Xms50g -Xmx50g. (For which the pause durations achieved are not bad at all!) The application was a synthetic benchmark described here: http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/ With the same benchmark but only 10 GB of overall heap size on a Oracle T3 server running Java 7u45 on Solaris/SPARC I got a very similar kind of plot but the range with reasonable pause time control was now 60-180 millis. Again the pause durations reached were by themselves not bad at all. But the idea of setting a pause time target and expecting it to be followed in a meaningful way is to some extent misleading. These results on G1's pause time control will be published soon on the blog of the link above. Best regards Andreas -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140225/25bdbcdb/attachment-0001.html From holger.hoffstaette at googlemail.com Tue Feb 25 02:18:11 2014 From: holger.hoffstaette at googlemail.com (=?UTF-8?B?SG9sZ2VyIEhvZmZzdMOkdHRl?=) Date: Tue, 25 Feb 2014 11:18:11 +0100 Subject: G1 GC - pauses much larger than target In-Reply-To: <46FF8393B58AD84D95E444264805D98FBDE17F74@edata01.mgm-edv.de> References: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE16C3F@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE17F74@edata01.mgm-edv.de> Message-ID: <530C6DE3.8090600@googlemail.com> On 25.02.2014 10:53, Andreas M?ller wrote: >>I tried with swappiness set to 0 > > Interesting. This means then, that the Linux kernel was swapping around > with no real need? I am surprised to read this but it explains the long This has always been the case (until recently). To understand more of the history, please read my Quora answer here: http://qr.ae/tbyRA In other words talking about "Linux", paging and swappiness is *completely meaningless* without specifying exactly what version of the kernel you are talking about. -h From chkwok at digibites.nl Tue Feb 25 03:26:59 2014 From: chkwok at digibites.nl (Chi Ho Kwok) Date: Tue, 25 Feb 2014 12:26:59 +0100 Subject: AW: G1 GC - pauses much larger than target In-Reply-To: <46FF8393B58AD84D95E444264805D98FBDE17F74@edata01.mgm-edv.de> References: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE16C3F@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE17F74@edata01.mgm-edv.de> Message-ID: We're running CMS and with the default swappiness, Linux swaps out huge chunks of idle, not actively used reserve memory; from the OS point of view, it's malloc'ed memory that isn't used for hours, but oh dear when CMS does a full sweep and clean them up... Swapstorm ahoy. vm.swappiness is the first thing we change on an os install. 
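A rough way to check whether that is happening on a given box before the full sweep hits (a sketch with plain procps/sysstat commands, nothing application-specific; interpret the numbers in context):

# how much swap is in use overall
free -m

# swap-in / swap-out rates per second while the JVM is running (pswpin/s, pswpout/s)
sar -W 1

# same story from vmstat: non-zero si/so columns mean pages are moving to and from swap
vmstat 1
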
Kind regards, Chi Ho Kwok On 25 Feb 2014 10:55, "Andreas M?ller" wrote: > Hi Kirti, > > > > thanks. > > > I tried with swappiness set to 0 > > Interesting. This means then, that the Linux kernel was swapping around > with no real need? I am surprised to read this but it explains the long > real time and the low usr time from your logs. I have sometimes seen GC > logs from Linux which strangely looked like swapping without any shortage > of physical RAM (which I can't remember seeing on Solaris). This difference > is worth keeping in mind! > > > > > I am yet to find out if this improvement is because of setting > swappiness to 0 or cutting down on the IO. > > I assume that swappiness is the prime factor here. Without swapping, why > should a JVM care about the IO of another process? I would be rather > shocked (and eye Linux with a lot more suspicion) if it were different. > > > > Can you tell whether other collectors (e.g. CMS) also suffered from idle > GC pauses when you ran them with swappiness set to the default of 60? > > > > Best regards > > Andreas > > > > *Von:* Kirti Teja Rao [mailto:kirtiteja at gmail.com] > *Gesendet:* Dienstag, 25. Februar 2014 02:30 > *An:* Andreas M?ller > *Cc:* hotspot-gc-use at openjdk.java.net > *Betreff:* Re: G1 GC - pauses much larger than target > > > > Hi, > > > > I tried with swappiness set to 0 and turned off all the logging of the > other application that is running on the same machine to cut down on the io > on the machine. The results are much better and all the large outliers with > over 100ms and upto 500-600 msec are gone now. I see pauses around > 50ms-60ms for a pause target of 30ms which is ok to work with. Attached is > the scatter chart comparison from excel with one of the earlier runs and > this run. It is also interesting to see there are less number of gcs but > slightly more pause interval per gc in this run. > > > > I am yet to find out if this improvement is because of setting swappiness > to 0 or cutting down on the IO. Will do individual runs tomorrow or day > after and will update the thread about the findings. > > > > > > On Sat, Feb 22, 2014 at 8:09 AM, Andreas M?ller < > Andreas.Mueller at mgm-tp.com> wrote: > > Hi Kirti, > > > > > [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far greater > than user time, > > That's extreme. > > But running several JVMs on the same hardware can produce such a > phenomenon on a smaller scale. I usually observe an increase of pause times > when two JVMs compete for the CPUs by a factor between sqrt(2) and 2 which > I understand from a statistical point of view. > > > > I have also got the impression that G1 is much more sensitive to CPU > contention (even well below 100% load) than other collectors. > > I have attached a GC log graphic where one JVM runs G1 on a 25 GB heap for > 30 minutes when after 15 minutes a second JVM with exactly the same > settings starts to compete for the CPUs (and other system resources). It is > clearly visible from the plot that the second JVM has a huge impact on the > operation of the first one.: > > - Long pauses jump from a max of 250 millis to 700 millis (almost > a factor of 3) > > - G1 sharply decreases the new generation size (probably to regain > control of pause times because the target is at default value, i.e. 200 > millis) > > The statistics on the right hand side are taken only from the time period > where both JVMs compete. > > My impression is that G1 does not like sharing CPUs with other processes. 
> It probably spoils its ability to predict pause durations properly. > > > > Your example with a 2 GB heap looks much more extreme, however. > > It looks like your GC threads are almost being starved by something. > > I would be very happy to learn about the reason if any can be found. > > > > Best regards > > Andreas > > > > *Von:* Kirti Teja Rao [mailto:kirtiteja at gmail.com] > *Gesendet:* Freitag, 21. Februar 2014 20:28 > *An:* Andreas M?ller > > > *Cc:* hotspot-gc-use at openjdk.java.net > *Betreff:* Re: G1 GC - pauses much larger than target > > > > Hi, > > > > @Jenny - CPU looks fine. Never over 40% and generally between 25-35%. Some > of these pauses are as large as 1 second and these are always observed > after the parallel phase, I assume this is the phase were G1 would need the > most amount of CPU. > > > > @Andreas - Most of these pauses are in young collection and are not > showing in the parallel/serial phases shown in GC log. The pauses i observe > are unreasonable 1.5+ sec for a heap of 2 GB. > > > > @All - [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far > greater than user time, I believe G1 is blocked on some resource. The > application i run is not swapping and also there is more headroom in > memory. CPU is less than 35%.There are other applications running on the > machine which log quite a bit and can cause the iowait avg queue size to > spike upto 20-30 occasionally. Does G1 logging happen during the pause > time? Can a slow disk or high disk IO affect these timings? > > > > Is there anything else that we can try to uncover the cause for these > pauses? > > > > > > 2014-02-21T06:18:13.592+0000: 12675.969: Application time: 10.7438770 > seconds > > 2014-02-21T06:18:13.593+0000: 12675.970: [GC pause (young) > > Desired survivor size 81788928 bytes, new threshold 15 (max 15) > > - age 1: 564704 bytes, 564704 total > > - age 2: 18504 bytes, 583208 total > > - age 3: 18552 bytes, 601760 total > > - age 4: 18776 bytes, 620536 total > > - age 5: 197048 bytes, 817584 total > > - age 6: 18712 bytes, 836296 total > > - age 7: 18456 bytes, 854752 total > > - age 8: 18920 bytes, 873672 total > > - age 9: 18456 bytes, 892128 total > > - age 10: 18456 bytes, 910584 total > > - age 11: 18456 bytes, 929040 total > > - age 12: 18456 bytes, 947496 total > > - age 13: 18488 bytes, 965984 total > > - age 14: 18456 bytes, 984440 total > > - age 15: 18456 bytes, 1002896 total > > 12675.970: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 4408, predicted base time: 6.77 ms, remaining time: 23.23 > ms, target pause time: 30.00 ms] > > 12675.970: [G1Ergonomics (CSet Construction) add young regions to CSet, > eden: 306 regions, survivors: 1 regions, predicted young region time: 1.89 > ms] > > 12675.970: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: > 306 regions, survivors: 1 regions, old: 0 regions, predicted pause time: > 8.67 ms, target pause time: 30.00 ms] > > , 0.0079290 secs] > > [Parallel Time: 6.0 ms, GC Workers: 9] > > [GC Worker Start (ms): Min: 12675970.1, Avg: 12675970.3, Max: > 12675970.8, Diff: 0.7] > > [Ext Root Scanning (ms): Min: 3.0, Avg: 4.0, Max: 5.0, Diff: 1.9, > Sum: 36.3] > > [Update RS (ms): Min: 0.0, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 4.1] > > [Processed Buffers: Min: 0, Avg: 5.4, Max: 13, Diff: 13, Sum: 49] > > [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.9] > > [Object Copy (ms): Min: 0.3, Avg: 0.7, Max: 0.9, Diff: 0.6, Sum: 6.5] > > [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 
0.3, Sum: 2.0] > > [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: > 0.2] > > [GC Worker Total (ms): Min: 5.1, Avg: 5.6, Max: 5.8, Diff: 0.8, Sum: > 50.1] > > [GC Worker End (ms): Min: 12675975.8, Avg: 12675975.9, Max: > 12675975.9, Diff: 0.1] > > [Code Root Fixup: 0.0 ms] > > [Clear CT: 0.5 ms] > > [Other: 1.4 ms] > > [Choose CSet: 0.0 ms] > > [Ref Proc: 0.5 ms] > > [Ref Enq: 0.0 ms] > > [Free CSet: 0.7 ms] > > [Eden: 1224.0M(1224.0M)->0.0B(1224.0M) Survivors: 4096.0K->4096.0K > Heap: 1342.2M(2048.0M)->118.1M(2048.0M)] > > [Times: user=0.06 sys=0.00, real=1.54 secs] > > 2014-02-21T06:18:15.135+0000: 12677.511: Total time for which application > threads were stopped: 1.5421650 seconds > > > > > > On Fri, Feb 21, 2014 at 12:38 AM, Andreas M?ller < > Andreas.Mueller at mgm-tp.com> wrote: > > Hi Kirti, > > > > > I am trying out G1 collector for our application. Our application runs > with 2GB heap and we expect relatively low latency. > > > The pause time target is set to 25ms. There >are much bigger pauses (and > unexplained) in order of few 100s of ms. > > > This is not a rare occurence and i can see this 15-20 times in 6-7 hours > runs. > > > > This conforms to what I have observed in extended tests: > > G1's control of GC pause duration is limited to a rather narrow range. > > Even in that range, only new gen pauses do follow the pause time target > well while "mixed" pauses tend to overshoot with considerable probability. > > Find attached a graphic which shows what I mean: > > - New gen pauses (red) do follow the target very well from 150-800 > millis > > - With a target below 150 the actual new gen pauses remain flat at > 150-180 millis > > - "mixed" pauses (blue) do not follow the target well and some of > them will always take 500-700 millis, whatever the target be > > - There are other pauses (remark etc., green) which are short but > completely independent of the target value > > > > The range with reasonable control depends on the heap size, the > application and the hardware. > > I measured the graphic attached on a 6-core Xeon/2GHz server running Java > 7u45 on CentOS/Linux with 64 GB RAM and a heap size of -Xms50g -Xmx50g. > > (For which the pause durations achieved are not bad at all!) > > The application was a synthetic benchmark described here: > http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/ > > With the same benchmark but only 10 GB of overall heap size on a Oracle T3 > server running Java 7u45 on Solaris/SPARC I got a very similar kind of plot > but the range with reasonable pause time control was now 60-180 millis. > > Again the pause durations reached were by themselves not bad at all. But > the idea of setting a pause time target and expecting it to be followed in > a meaningful way is to some extent misleading. > > > > These results on G1's pause time control will be published soon on the > blog of the link above. > > > > Best regards > > Andreas > > > > > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140225/77af4d19/attachment.html From kirtiteja at gmail.com Tue Feb 25 17:24:55 2014 From: kirtiteja at gmail.com (Kirti Teja Rao) Date: Tue, 25 Feb 2014 17:24:55 -0800 Subject: AW: G1 GC - pauses much larger than target In-Reply-To: References: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE16C3F@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE17F74@edata01.mgm-edv.de> Message-ID: I am running the tests on Cent OS 5.5 with kernel 2.6.18-194.el5. The deviations are due vm swappiness and nothing to do with IO of the other process. Thanks for all members, for the help in resolving the issue. I have to continue to tune it for the application but for now the major problem is resolved. On Tue, Feb 25, 2014 at 3:26 AM, Chi Ho Kwok wrote: > We're running CMS and with the default swappiness, Linux swaps out huge > chunks of idle, not actively used reserve memory; from the OS point of > view, it's malloc'ed memory that isn't used for hours, but oh dear when CMS > does a full sweep and clean them up... Swapstorm ahoy. > > vm.swappiness is the first thing we change on an os install. > > > Kind regards, > > Chi Ho Kwok > On 25 Feb 2014 10:55, "Andreas M?ller" wrote: > >> Hi Kirti, >> >> >> >> thanks. >> >> > I tried with swappiness set to 0 >> >> Interesting. This means then, that the Linux kernel was swapping around >> with no real need? I am surprised to read this but it explains the long >> real time and the low usr time from your logs. I have sometimes seen GC >> logs from Linux which strangely looked like swapping without any shortage >> of physical RAM (which I can't remember seeing on Solaris). This difference >> is worth keeping in mind! >> >> >> >> > I am yet to find out if this improvement is because of setting >> swappiness to 0 or cutting down on the IO. >> >> I assume that swappiness is the prime factor here. Without swapping, why >> should a JVM care about the IO of another process? I would be rather >> shocked (and eye Linux with a lot more suspicion) if it were different. >> >> >> >> Can you tell whether other collectors (e.g. CMS) also suffered from idle >> GC pauses when you ran them with swappiness set to the default of 60? >> >> >> >> Best regards >> >> Andreas >> >> >> >> *Von:* Kirti Teja Rao [mailto:kirtiteja at gmail.com] >> *Gesendet:* Dienstag, 25. Februar 2014 02:30 >> *An:* Andreas M?ller >> *Cc:* hotspot-gc-use at openjdk.java.net >> *Betreff:* Re: G1 GC - pauses much larger than target >> >> >> >> Hi, >> >> >> >> I tried with swappiness set to 0 and turned off all the logging of the >> other application that is running on the same machine to cut down on the io >> on the machine. The results are much better and all the large outliers with >> over 100ms and upto 500-600 msec are gone now. I see pauses around >> 50ms-60ms for a pause target of 30ms which is ok to work with. Attached is >> the scatter chart comparison from excel with one of the earlier runs and >> this run. It is also interesting to see there are less number of gcs but >> slightly more pause interval per gc in this run. >> >> >> >> I am yet to find out if this improvement is because of setting swappiness >> to 0 or cutting down on the IO. Will do individual runs tomorrow or day >> after and will update the thread about the findings. 
>> >> >> >> >> >> On Sat, Feb 22, 2014 at 8:09 AM, Andreas M?ller < >> Andreas.Mueller at mgm-tp.com> wrote: >> >> Hi Kirti, >> >> >> >> > [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far greater >> than user time, >> >> That's extreme. >> >> But running several JVMs on the same hardware can produce such a >> phenomenon on a smaller scale. I usually observe an increase of pause times >> when two JVMs compete for the CPUs by a factor between sqrt(2) and 2 which >> I understand from a statistical point of view. >> >> >> >> I have also got the impression that G1 is much more sensitive to CPU >> contention (even well below 100% load) than other collectors. >> >> I have attached a GC log graphic where one JVM runs G1 on a 25 GB heap >> for 30 minutes when after 15 minutes a second JVM with exactly the same >> settings starts to compete for the CPUs (and other system resources). It is >> clearly visible from the plot that the second JVM has a huge impact on the >> operation of the first one.: >> >> - Long pauses jump from a max of 250 millis to 700 millis (almost >> a factor of 3) >> >> - G1 sharply decreases the new generation size (probably to >> regain control of pause times because the target is at default value, i.e. >> 200 millis) >> >> The statistics on the right hand side are taken only from the time period >> where both JVMs compete. >> >> My impression is that G1 does not like sharing CPUs with other processes. >> It probably spoils its ability to predict pause durations properly. >> >> >> >> Your example with a 2 GB heap looks much more extreme, however. >> >> It looks like your GC threads are almost being starved by something. >> >> I would be very happy to learn about the reason if any can be found. >> >> >> >> Best regards >> >> Andreas >> >> >> >> *Von:* Kirti Teja Rao [mailto:kirtiteja at gmail.com] >> *Gesendet:* Freitag, 21. Februar 2014 20:28 >> *An:* Andreas M?ller >> >> >> *Cc:* hotspot-gc-use at openjdk.java.net >> *Betreff:* Re: G1 GC - pauses much larger than target >> >> >> >> Hi, >> >> >> >> @Jenny - CPU looks fine. Never over 40% and generally between 25-35%. >> Some of these pauses are as large as 1 second and these are always observed >> after the parallel phase, I assume this is the phase were G1 would need the >> most amount of CPU. >> >> >> >> @Andreas - Most of these pauses are in young collection and are not >> showing in the parallel/serial phases shown in GC log. The pauses i observe >> are unreasonable 1.5+ sec for a heap of 2 GB. >> >> >> >> @All - [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far >> greater than user time, I believe G1 is blocked on some resource. The >> application i run is not swapping and also there is more headroom in >> memory. CPU is less than 35%.There are other applications running on the >> machine which log quite a bit and can cause the iowait avg queue size to >> spike upto 20-30 occasionally. Does G1 logging happen during the pause >> time? Can a slow disk or high disk IO affect these timings? >> >> >> >> Is there anything else that we can try to uncover the cause for these >> pauses? 
>> >> >> >> >> >> 2014-02-21T06:18:13.592+0000: 12675.969: Application time: 10.7438770 >> seconds >> >> 2014-02-21T06:18:13.593+0000: 12675.970: [GC pause (young) >> >> Desired survivor size 81788928 bytes, new threshold 15 (max 15) >> >> - age 1: 564704 bytes, 564704 total >> >> - age 2: 18504 bytes, 583208 total >> >> - age 3: 18552 bytes, 601760 total >> >> - age 4: 18776 bytes, 620536 total >> >> - age 5: 197048 bytes, 817584 total >> >> - age 6: 18712 bytes, 836296 total >> >> - age 7: 18456 bytes, 854752 total >> >> - age 8: 18920 bytes, 873672 total >> >> - age 9: 18456 bytes, 892128 total >> >> - age 10: 18456 bytes, 910584 total >> >> - age 11: 18456 bytes, 929040 total >> >> - age 12: 18456 bytes, 947496 total >> >> - age 13: 18488 bytes, 965984 total >> >> - age 14: 18456 bytes, 984440 total >> >> - age 15: 18456 bytes, 1002896 total >> >> 12675.970: [G1Ergonomics (CSet Construction) start choosing CSet, >> _pending_cards: 4408, predicted base time: 6.77 ms, remaining time: 23.23 >> ms, target pause time: 30.00 ms] >> >> 12675.970: [G1Ergonomics (CSet Construction) add young regions to CSet, >> eden: 306 regions, survivors: 1 regions, predicted young region time: 1.89 >> ms] >> >> 12675.970: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: >> 306 regions, survivors: 1 regions, old: 0 regions, predicted pause time: >> 8.67 ms, target pause time: 30.00 ms] >> >> , 0.0079290 secs] >> >> [Parallel Time: 6.0 ms, GC Workers: 9] >> >> [GC Worker Start (ms): Min: 12675970.1, Avg: 12675970.3, Max: >> 12675970.8, Diff: 0.7] >> >> [Ext Root Scanning (ms): Min: 3.0, Avg: 4.0, Max: 5.0, Diff: 1.9, >> Sum: 36.3] >> >> [Update RS (ms): Min: 0.0, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 4.1] >> >> [Processed Buffers: Min: 0, Avg: 5.4, Max: 13, Diff: 13, Sum: 49] >> >> [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.9] >> >> [Object Copy (ms): Min: 0.3, Avg: 0.7, Max: 0.9, Diff: 0.6, Sum: >> 6.5] >> >> [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: >> 2.0] >> >> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, >> Sum: 0.2] >> >> [GC Worker Total (ms): Min: 5.1, Avg: 5.6, Max: 5.8, Diff: 0.8, >> Sum: 50.1] >> >> [GC Worker End (ms): Min: 12675975.8, Avg: 12675975.9, Max: >> 12675975.9, Diff: 0.1] >> >> [Code Root Fixup: 0.0 ms] >> >> [Clear CT: 0.5 ms] >> >> [Other: 1.4 ms] >> >> [Choose CSet: 0.0 ms] >> >> [Ref Proc: 0.5 ms] >> >> [Ref Enq: 0.0 ms] >> >> [Free CSet: 0.7 ms] >> >> [Eden: 1224.0M(1224.0M)->0.0B(1224.0M) Survivors: 4096.0K->4096.0K >> Heap: 1342.2M(2048.0M)->118.1M(2048.0M)] >> >> [Times: user=0.06 sys=0.00, real=1.54 secs] >> >> 2014-02-21T06:18:15.135+0000: 12677.511: Total time for which application >> threads were stopped: 1.5421650 seconds >> >> >> >> >> >> On Fri, Feb 21, 2014 at 12:38 AM, Andreas M?ller < >> Andreas.Mueller at mgm-tp.com> wrote: >> >> Hi Kirti, >> >> >> >> > I am trying out G1 collector for our application. Our application runs >> with 2GB heap and we expect relatively low latency. >> >> > The pause time target is set to 25ms. There >are much bigger pauses >> (and unexplained) in order of few 100s of ms. >> >> > This is not a rare occurence and i can see this 15-20 times in 6-7 >> hours runs. >> >> >> >> This conforms to what I have observed in extended tests: >> >> G1's control of GC pause duration is limited to a rather narrow range. >> >> Even in that range, only new gen pauses do follow the pause time target >> well while "mixed" pauses tend to overshoot with considerable probability. 
>> >> Find attached a graphic which shows what I mean: >> >> - New gen pauses (red) do follow the target very well from >> 150-800 millis >> >> - With a target below 150 the actual new gen pauses remain flat >> at 150-180 millis >> >> - "mixed" pauses (blue) do not follow the target well and some of >> them will always take 500-700 millis, whatever the target be >> >> - There are other pauses (remark etc., green) which are short but >> completely independent of the target value >> >> >> >> The range with reasonable control depends on the heap size, the >> application and the hardware. >> >> I measured the graphic attached on a 6-core Xeon/2GHz server running Java >> 7u45 on CentOS/Linux with 64 GB RAM and a heap size of -Xms50g -Xmx50g. >> >> (For which the pause durations achieved are not bad at all!) >> >> The application was a synthetic benchmark described here: >> http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/ >> >> With the same benchmark but only 10 GB of overall heap size on a Oracle >> T3 server running Java 7u45 on Solaris/SPARC I got a very similar kind of >> plot but the range with reasonable pause time control was now 60-180 >> millis. >> >> Again the pause durations reached were by themselves not bad at all. But >> the idea of setting a pause time target and expecting it to be followed in >> a meaningful way is to some extent misleading. >> >> >> >> These results on G1's pause time control will be published soon on the >> blog of the link above. >> >> >> >> Best regards >> >> Andreas >> >> >> >> >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140225/f458ac83/attachment.html From kirtiteja at gmail.com Thu Feb 27 11:51:42 2014 From: kirtiteja at gmail.com (Kirti Teja Rao) Date: Thu, 27 Feb 2014 11:51:42 -0800 Subject: AW: G1 GC - pauses much larger than target In-Reply-To: References: <46FF8393B58AD84D95E444264805D98FBDE16A40@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE16C3F@edata01.mgm-edv.de> <46FF8393B58AD84D95E444264805D98FBDE17F74@edata01.mgm-edv.de> Message-ID: There is a follow up question i have. What is the GC phase that is affected due to paging and why is it not showing up in PrintGCDetails? Also, i see this is the time spent after the parallel phase is complete. Any ideas on this? On Tue, Feb 25, 2014 at 5:24 PM, Kirti Teja Rao wrote: > I am running the tests on Cent OS 5.5 with kernel 2.6.18-194.el5. The > deviations are due vm swappiness and nothing to do with IO of the other > process. Thanks for all members, for the help in resolving the issue. I > have to continue to tune it for the application but for now the major > problem is resolved. > > > On Tue, Feb 25, 2014 at 3:26 AM, Chi Ho Kwok wrote: > >> We're running CMS and with the default swappiness, Linux swaps out huge >> chunks of idle, not actively used reserve memory; from the OS point of >> view, it's malloc'ed memory that isn't used for hours, but oh dear when CMS >> does a full sweep and clean them up... Swapstorm ahoy. 
>> >> vm.swappiness is the first thing we change on an os install. >> >> >> Kind regards, >> >> Chi Ho Kwok >> On 25 Feb 2014 10:55, "Andreas M?ller" >> wrote: >> >>> Hi Kirti, >>> >>> >>> >>> thanks. >>> >>> > I tried with swappiness set to 0 >>> >>> Interesting. This means then, that the Linux kernel was swapping around >>> with no real need? I am surprised to read this but it explains the long >>> real time and the low usr time from your logs. I have sometimes seen GC >>> logs from Linux which strangely looked like swapping without any shortage >>> of physical RAM (which I can't remember seeing on Solaris). This difference >>> is worth keeping in mind! >>> >>> >>> >>> > I am yet to find out if this improvement is because of setting >>> swappiness to 0 or cutting down on the IO. >>> >>> I assume that swappiness is the prime factor here. Without swapping, why >>> should a JVM care about the IO of another process? I would be rather >>> shocked (and eye Linux with a lot more suspicion) if it were different. >>> >>> >>> >>> Can you tell whether other collectors (e.g. CMS) also suffered from idle >>> GC pauses when you ran them with swappiness set to the default of 60? >>> >>> >>> >>> Best regards >>> >>> Andreas >>> >>> >>> >>> *Von:* Kirti Teja Rao [mailto:kirtiteja at gmail.com] >>> *Gesendet:* Dienstag, 25. Februar 2014 02:30 >>> *An:* Andreas M?ller >>> *Cc:* hotspot-gc-use at openjdk.java.net >>> *Betreff:* Re: G1 GC - pauses much larger than target >>> >>> >>> >>> Hi, >>> >>> >>> >>> I tried with swappiness set to 0 and turned off all the logging of the >>> other application that is running on the same machine to cut down on the io >>> on the machine. The results are much better and all the large outliers with >>> over 100ms and upto 500-600 msec are gone now. I see pauses around >>> 50ms-60ms for a pause target of 30ms which is ok to work with. Attached is >>> the scatter chart comparison from excel with one of the earlier runs and >>> this run. It is also interesting to see there are less number of gcs but >>> slightly more pause interval per gc in this run. >>> >>> >>> >>> I am yet to find out if this improvement is because of setting >>> swappiness to 0 or cutting down on the IO. Will do individual runs tomorrow >>> or day after and will update the thread about the findings. >>> >>> >>> >>> >>> >>> On Sat, Feb 22, 2014 at 8:09 AM, Andreas M?ller < >>> Andreas.Mueller at mgm-tp.com> wrote: >>> >>> Hi Kirti, >>> >>> >>> >>> > [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far >>> greater than user time, >>> >>> That's extreme. >>> >>> But running several JVMs on the same hardware can produce such a >>> phenomenon on a smaller scale. I usually observe an increase of pause times >>> when two JVMs compete for the CPUs by a factor between sqrt(2) and 2 which >>> I understand from a statistical point of view. >>> >>> >>> >>> I have also got the impression that G1 is much more sensitive to CPU >>> contention (even well below 100% load) than other collectors. >>> >>> I have attached a GC log graphic where one JVM runs G1 on a 25 GB heap >>> for 30 minutes when after 15 minutes a second JVM with exactly the same >>> settings starts to compete for the CPUs (and other system resources). 
It is >>> clearly visible from the plot that the second JVM has a huge impact on the >>> operation of the first one.: >>> >>> - Long pauses jump from a max of 250 millis to 700 millis >>> (almost a factor of 3) >>> >>> - G1 sharply decreases the new generation size (probably to >>> regain control of pause times because the target is at default value, i.e. >>> 200 millis) >>> >>> The statistics on the right hand side are taken only from the time >>> period where both JVMs compete. >>> >>> My impression is that G1 does not like sharing CPUs with other >>> processes. It probably spoils its ability to predict pause durations >>> properly. >>> >>> >>> >>> Your example with a 2 GB heap looks much more extreme, however. >>> >>> It looks like your GC threads are almost being starved by something. >>> >>> I would be very happy to learn about the reason if any can be found. >>> >>> >>> >>> Best regards >>> >>> Andreas >>> >>> >>> >>> *Von:* Kirti Teja Rao [mailto:kirtiteja at gmail.com] >>> *Gesendet:* Freitag, 21. Februar 2014 20:28 >>> *An:* Andreas M?ller >>> >>> >>> *Cc:* hotspot-gc-use at openjdk.java.net >>> *Betreff:* Re: G1 GC - pauses much larger than target >>> >>> >>> >>> Hi, >>> >>> >>> >>> @Jenny - CPU looks fine. Never over 40% and generally between 25-35%. >>> Some of these pauses are as large as 1 second and these are always observed >>> after the parallel phase, I assume this is the phase were G1 would need the >>> most amount of CPU. >>> >>> >>> >>> @Andreas - Most of these pauses are in young collection and are not >>> showing in the parallel/serial phases shown in GC log. The pauses i observe >>> are unreasonable 1.5+ sec for a heap of 2 GB. >>> >>> >>> >>> @All - [Times: user=0.06 sys=0.00, real=1.54 secs] real time being far >>> greater than user time, I believe G1 is blocked on some resource. The >>> application i run is not swapping and also there is more headroom in >>> memory. CPU is less than 35%.There are other applications running on the >>> machine which log quite a bit and can cause the iowait avg queue size to >>> spike upto 20-30 occasionally. Does G1 logging happen during the pause >>> time? Can a slow disk or high disk IO affect these timings? >>> >>> >>> >>> Is there anything else that we can try to uncover the cause for these >>> pauses? 
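If paging or log I/O is suspected, two low-risk experiments (suggestions that go beyond what the thread itself confirms) are to pre-touch the heap at startup and to keep the GC log off the contended disk:

    -XX:+AlwaysPreTouch      # fault in every page of the -Xms heap during JVM start so it is resident before the run
    -Xloggc:/dev/shm/gc.log  # tmpfs-backed GC log, so a busy disk cannot stall log writes made around a pause

Neither option changes how G1 sizes its collections; they only remove two possible sources of time spent outside the GC work itself.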
>>> >>> >>> >>> >>> >>> 2014-02-21T06:18:13.592+0000: 12675.969: Application time: 10.7438770 >>> seconds >>> >>> 2014-02-21T06:18:13.593+0000: 12675.970: [GC pause (young) >>> >>> Desired survivor size 81788928 bytes, new threshold 15 (max 15) >>> >>> - age 1: 564704 bytes, 564704 total >>> >>> - age 2: 18504 bytes, 583208 total >>> >>> - age 3: 18552 bytes, 601760 total >>> >>> - age 4: 18776 bytes, 620536 total >>> >>> - age 5: 197048 bytes, 817584 total >>> >>> - age 6: 18712 bytes, 836296 total >>> >>> - age 7: 18456 bytes, 854752 total >>> >>> - age 8: 18920 bytes, 873672 total >>> >>> - age 9: 18456 bytes, 892128 total >>> >>> - age 10: 18456 bytes, 910584 total >>> >>> - age 11: 18456 bytes, 929040 total >>> >>> - age 12: 18456 bytes, 947496 total >>> >>> - age 13: 18488 bytes, 965984 total >>> >>> - age 14: 18456 bytes, 984440 total >>> >>> - age 15: 18456 bytes, 1002896 total >>> >>> 12675.970: [G1Ergonomics (CSet Construction) start choosing CSet, >>> _pending_cards: 4408, predicted base time: 6.77 ms, remaining time: 23.23 >>> ms, target pause time: 30.00 ms] >>> >>> 12675.970: [G1Ergonomics (CSet Construction) add young regions to CSet, >>> eden: 306 regions, survivors: 1 regions, predicted young region time: 1.89 >>> ms] >>> >>> 12675.970: [G1Ergonomics (CSet Construction) finish choosing CSet, >>> eden: 306 regions, survivors: 1 regions, old: 0 regions, predicted pause >>> time: 8.67 ms, target pause time: 30.00 ms] >>> >>> , 0.0079290 secs] >>> >>> [Parallel Time: 6.0 ms, GC Workers: 9] >>> >>> [GC Worker Start (ms): Min: 12675970.1, Avg: 12675970.3, Max: >>> 12675970.8, Diff: 0.7] >>> >>> [Ext Root Scanning (ms): Min: 3.0, Avg: 4.0, Max: 5.0, Diff: 1.9, >>> Sum: 36.3] >>> >>> [Update RS (ms): Min: 0.0, Avg: 0.5, Max: 0.9, Diff: 0.9, Sum: 4.1] >>> >>> [Processed Buffers: Min: 0, Avg: 5.4, Max: 13, Diff: 13, Sum: >>> 49] >>> >>> [Scan RS (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.9] >>> >>> [Object Copy (ms): Min: 0.3, Avg: 0.7, Max: 0.9, Diff: 0.6, Sum: >>> 6.5] >>> >>> [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: >>> 2.0] >>> >>> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, >>> Sum: 0.2] >>> >>> [GC Worker Total (ms): Min: 5.1, Avg: 5.6, Max: 5.8, Diff: 0.8, >>> Sum: 50.1] >>> >>> [GC Worker End (ms): Min: 12675975.8, Avg: 12675975.9, Max: >>> 12675975.9, Diff: 0.1] >>> >>> [Code Root Fixup: 0.0 ms] >>> >>> [Clear CT: 0.5 ms] >>> >>> [Other: 1.4 ms] >>> >>> [Choose CSet: 0.0 ms] >>> >>> [Ref Proc: 0.5 ms] >>> >>> [Ref Enq: 0.0 ms] >>> >>> [Free CSet: 0.7 ms] >>> >>> [Eden: 1224.0M(1224.0M)->0.0B(1224.0M) Survivors: 4096.0K->4096.0K >>> Heap: 1342.2M(2048.0M)->118.1M(2048.0M)] >>> >>> [Times: user=0.06 sys=0.00, real=1.54 secs] >>> >>> 2014-02-21T06:18:15.135+0000: 12677.511: Total time for which >>> application threads were stopped: 1.5421650 seconds >>> >>> >>> >>> >>> >>> On Fri, Feb 21, 2014 at 12:38 AM, Andreas M?ller < >>> Andreas.Mueller at mgm-tp.com> wrote: >>> >>> Hi Kirti, >>> >>> >>> >>> > I am trying out G1 collector for our application. Our application >>> runs with 2GB heap and we expect relatively low latency. >>> >>> > The pause time target is set to 25ms. There >are much bigger pauses >>> (and unexplained) in order of few 100s of ms. >>> >>> > This is not a rare occurence and i can see this 15-20 times in 6-7 >>> hours runs. >>> >>> >>> >>> This conforms to what I have observed in extended tests: >>> >>> G1's control of GC pause duration is limited to a rather narrow range. 
>>> >>> Even in that range, only new gen pauses do follow the pause time target >>> well while "mixed" pauses tend to overshoot with considerable probability. >>> >>> Find attached a graphic which shows what I mean: >>> >>> - New gen pauses (red) do follow the target very well from >>> 150-800 millis >>> >>> - With a target below 150 the actual new gen pauses remain flat >>> at 150-180 millis >>> >>> - "mixed" pauses (blue) do not follow the target well and some >>> of them will always take 500-700 millis, whatever the target be >>> >>> - There are other pauses (remark etc., green) which are short >>> but completely independent of the target value >>> >>> >>> >>> The range with reasonable control depends on the heap size, the >>> application and the hardware. >>> >>> I measured the graphic attached on a 6-core Xeon/2GHz server running >>> Java 7u45 on CentOS/Linux with 64 GB RAM and a heap size of -Xms50g >>> -Xmx50g. >>> >>> (For which the pause durations achieved are not bad at all!) >>> >>> The application was a synthetic benchmark described here: >>> http://blog.mgm-tp.com/2013/12/benchmarking-g1-and-other-java-7-garbage-collectors/ >>> >>> With the same benchmark but only 10 GB of overall heap size on a Oracle >>> T3 server running Java 7u45 on Solaris/SPARC I got a very similar kind of >>> plot but the range with reasonable pause time control was now 60-180 >>> millis. >>> >>> Again the pause durations reached were by themselves not bad at all. But >>> the idea of setting a pause time target and expecting it to be followed in >>> a meaningful way is to some extent misleading. >>> >>> >>> >>> These results on G1's pause time control will be published soon on the >>> blog of the link above. >>> >>> >>> >>> Best regards >>> >>> Andreas >>> >>> >>> >>> >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >>> >>> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140227/0d377d48/attachment.html From gustav.r.akesson at gmail.com Thu Feb 27 23:11:00 2014 From: gustav.r.akesson at gmail.com (=?ISO-8859-1?Q?Gustav_=C5kesson?=) Date: Fri, 28 Feb 2014 08:11:00 +0100 Subject: CMS olg gen growth Message-ID: Hi, When setting Xms != Xmx (yes, I know this is a bad idea...) I've seen a peculiar behavior with a constant increase of old generation capacity prior to CMS but without a FullGC event. These are the settings which the application is running on: -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintTenuringDistribution -Xloggc:/tmp/gc_logs.txt -XX:+AggressiveOpts -XX:+UseBiasedLocking -XX:CompileThreshold=5000 -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseParNewGC -Xss2048k -Xmx1024M -Xms512M -Xmn160M Below we have two examples of initiating CMS cycles. Prior to CMS running the heap is in total 507904K (first example) but suddenly after the initial mark the capacity increased by 192K. This pattern is pretty consistent throughout the execution. 
2014-02-27T17:54:33.554+0100: 3209.911: [GC2014-02-27T17:54:33.554+0100: 3209.912: [ParNew
Desired survivor size 8388608 bytes, new threshold 6 (max 6)
- age 1: 1531520 bytes, 1531520 total
- age 3: 160 bytes, 1531680 total
: 132721K->2333K(147456K), 0.0123744 secs] 289498K->159110K(*507904K*), 0.0129755 secs] [Times: user=0.13 sys=0.01, real=0.01 secs]
2014-02-27T17:54:33.567+0100: 3209.925: Total time for which application threads were stopped: 0.0272471 seconds
2014-02-27T17:54:33.706+0100: 3210.064: Total time for which application threads were stopped: 0.0133765 seconds
2014-02-27T17:54:33.726+0100: 3210.084: Total time for which application threads were stopped: 0.0197027 seconds
2014-02-27T17:54:33.739+0100: 3210.097: [GC [1 CMS-initial-mark: 156777K(*360640K*)] 290183K(*508096K*), 1.8524057 secs] [Times: user=1.85 sys=0.00, real=1.85 secs]

2014-02-27T19:07:07.828+0100: 7564.088: [GC2014-02-27T19:07:07.828+0100: 7564.089: [ParNew
Desired survivor size 8388608 bytes, new threshold 6 (max 6)
- age 1: 1705520 bytes, 1705520 total
- age 4: 32 bytes, 1705552 total
- age 5: 64 bytes, 1705616 total
- age 6: 32 bytes, 1705648 total
: 132729K->2201K(147456K), 0.0154973 secs] 289657K->159130K(*508096K*), 0.0161130 secs] [Times: user=0.14 sys=0.00, real=0.02 secs]
2014-02-27T19:07:07.845+0100: 7564.105: Total time for which application threads were stopped: 0.0318814 seconds
2014-02-27T19:07:08.005+0100: 7564.265: Total time for which application threads were stopped: 0.0153855 seconds
2014-02-27T19:07:08.027+0100: 7564.287: Total time for which application threads were stopped: 0.0217859 seconds
2014-02-27T19:07:08.049+0100: 7564.309: Total time for which application threads were stopped: 0.0218527 seconds
2014-02-27T19:07:08.063+0100: 7564.324: [GC [1 CMS-initial-mark: 156929K(*361024K*)] 290203K(*508480K*), 1.8475537 secs] [Times: user=1.85 sys=0.00, real=1.85 secs]


The question is why does the heap grow like this? I was under the impression
that CMS only increased the capacity using a FullGC event, and, when it did,
increased it by more than a few hundred kilobytes. What I've also experienced
is that when the heap is NOT increased, the pause is considerably lower (as
shown below). Is it possible that this minor heap growth is adding to the
pause of the initial mark?


2014-02-28T07:32:21.878+0100: 52277.150: [GC2014-02-28T07:32:21.878+0100: 52277.151: [ParNew
Desired survivor size 8388608 bytes, new threshold 6 (max 6)
- age 1: 1021256 bytes, 1021256 total
- age 2: 32 bytes, 1021288 total
- age 5: 32 bytes, 1021320 total
: 132007K->1234K(147456K), 0.0123908 secs] 284921K->154148K(*510400K*), 0.0129916 secs] [Times: user=0.13 sys=0.01, real=0.01 secs]
2014-02-28T07:32:21.891+0100: 52277.164: Total time for which application threads were stopped: 0.0279730 seconds
2014-02-28T07:32:21.906+0100: 52277.179: [GC [1 CMS-initial-mark: 152913K(362944K)] 155041K(*510400K*), 0.0365786 secs] [Times: user=0.04 sys=0.00, real=0.04 secs]


Best Regards,

Gustav Åkesson
-------------- next part --------------
An HTML attachment was scrubbed...
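The growth visible in the logs above is small but steady: in the first example the committed heap goes from 507904K at the young collection to 508096K at the initial mark (192K), in the second from 508096K to 508480K (384K), and the CMS generation itself grows from 360640K to 361024K between the two cycles. If the goal is simply to take expansion out of the picture, one option (which the original post already alludes to with its Xms != Xmx remark) is to pin the heap by making -Xms equal to -Xmx, so the old gen is committed at its full size from the start:

    -Xmx1024M -Xms1024M -Xmn160M    # same maximum as before, but the committed capacity can no longer creep up

This trades the gradual growth for a larger resident footprint up front, which may or may not be acceptable on the host in question.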
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140228/b35f5860/attachment.html
From ysr1729 at gmail.com Fri Feb 28 12:17:48 2014
From: ysr1729 at gmail.com (Srinivas Ramakrishna)
Date: Fri, 28 Feb 2014 12:17:48 -0800
Subject: CMS olg gen growth
In-Reply-To: 
References: 
Message-ID: 
I know there were issues of creeping heap growth because of a mismatch of
heuristics for starting a CMS cycle vs the heuristics for deciding how much
free space there should be in the old gen following the end of a CMS cycle.
I can't recall the details, but this has been an issue with CMS from day
one, and we never really fixed it. There's a comment in the resize code or
an open bug to fix this behaviour.

Yes, CMS does resize the heap (all the way up to max) in response to what
it believes the heap size should be based on how much space is free
following a CMS concurrent collection, without the need for a full gc.

I am not sure though about what is causing the heap resize to happen before
the CMS initial mark. (I would expect it to happen either at a minor gc or
at an allocation event or at the end of a CMS sweep phase.) My guess is
that a minor gc or an allocation event needed contiguous space for a larger
object which could not be found, so it got a new larger chunk from the
unallocated portion of the old gen (but this is just a guess). You might
want to run with -XX:+PrintHeapAtGC which should tell you when the
expansion occurred. There may be other tracing flags that might provide
more clues, but I am a bit rusty at the moment.

-- ramki


On Thu, Feb 27, 2014 at 11:11 PM, Gustav Åkesson wrote:

> Hi,
>
> When setting Xms != Xmx (yes, I know this is a bad idea...) I've seen a
> peculiar behavior with a constant increase of old generation capacity prior
> to CMS but without a FullGC event. These are the settings which the
> application is running on:
>
> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime
> -XX:+PrintTenuringDistribution -Xloggc:/tmp/gc_logs.txt -XX:+AggressiveOpts
> -XX:+UseBiasedLocking -XX:CompileThreshold=5000 -XX:+UseConcMarkSweepGC
> -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseParNewGC -Xss2048k -Xmx1024M
> -Xms512M -Xmn160M
>
>
> Below we have two examples of initiating CMS cycles. Prior to CMS running
> the heap is in total 507904K (first example) but suddenly after the initial
> mark the capacity increased by 192K. This pattern is pretty consistent
> throughout the execution.
> > > 2014-02-27T17:54:33.554+0100: 3209.911: [GC2014-02-27T17:54:33.554+0100: > 3209.912: [ParNew > Desired survivor size 8388608 bytes, new threshold 6 (max 6) > - age 1: 1531520 bytes, 1531520 total > - age 3: 160 bytes, 1531680 total > : 132721K->2333K(147456K), 0.0123744 secs] 289498K->159110K(*507904K*), > 0.0129755 secs] [Times: user=0.13 sys=0.01, real=0.01 secs] > 2014-02-27T17:54:33.567+0100: 3209.925: Total time for which application > threads were stopped: 0.0272471 seconds > 2014-02-27T17:54:33.706+0100: 3210.064: Total time for which application > threads were stopped: 0.0133765 seconds > 2014-02-27T17:54:33.726+0100: 3210.084: Total time for which application > threads were stopped: 0.0197027 seconds > 2014-02-27T17:54:33.739+0100: 3210.097: [GC [1 CMS-initial-mark: 156777K( > *360640K*)] 290183K(*508096K*), 1.8524057 secs] [Times: user=1.85 > sys=0.00, real=1.85 secs] > > 2014-02-27T19:07:07.828+0100: 7564.088: [GC2014-02-27T19:07:07.828+0100: > 7564.089: [ParNew > Desired survivor size 8388608 bytes, new threshold 6 (max 6) > - age 1: 1705520 bytes, 1705520 total > - age 4: 32 bytes, 1705552 total > - age 5: 64 bytes, 1705616 total > - age 6: 32 bytes, 1705648 total > : 132729K->2201K(147456K), 0.0154973 secs] 289657K->159130K(*508096K*), > 0.0161130 secs] [Times: user=0.14 sys=0.00, real=0.02 secs] > 2014-02-27T19:07:07.845+0100: 7564.105: Total time for which application > threads were stopped: 0.0318814 seconds > 2014-02-27T19:07:08.005+0100: 7564.265: Total time for which application > threads were stopped: 0.0153855 seconds > 2014-02-27T19:07:08.027+0100: 7564.287: Total time for which application > threads were stopped: 0.0217859 seconds > 2014-02-27T19:07:08.049+0100: 7564.309: Total time for which application > threads were stopped: 0.0218527 seconds > 2014-02-27T19:07:08.063+0100: 7564.324: [GC [1 CMS-initial-mark: 156929K( > *361024K*)] 290203K(*508480K*), 1.8475537 secs] [Times: user=1.85 > sys=0.00, real=1.85 secs] > > > The question is why does the heap grow like this? I was under the > impression that CMS only increased the capacity using a FullGC event, and > by then increased more than a few hundred kilobytes. What I've also > experienced is that the when the heap is NOT increased, then the pause is > considerably lower (as shown below). Is it possible that this minor heap > growth is adding adding to the pause of the initial mark? > > > > 2014-02-28T07:32:21.878+0100: 52277.150: [GC2014-02-28T07:32:21.878+0100: > 52277.151: [ParNew > Desired survivor size 8388608 bytes, new threshold 6 (max 6) > - age 1: 1021256 bytes, 1021256 total > - age 2: 32 bytes, 1021288 total > - age 5: 32 bytes, 1021320 total > : 132007K->1234K(147456K), 0.0123908 secs] 284921K->154148K(*510400K*), > 0.0129916 secs] [Times: user=0.13 sys=0.01, real=0.01 secs] > 2014-02-28T07:32:21.891+0100: 52277.164: Total time for which application > threads were stopped: 0.0279730 seconds > 2014-02-28T07:32:21.906+0100: 52277.179: [GC [1 CMS-initial-mark: > 152913K(362944K)] 155041K(*510400K*), 0.0365786 secs] [Times: user=0.04 > sys=0.00, real=0.04 secs] > > > > Best Regards, > > Gustav ?kesson > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... 
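A concrete way to act on the suggestion above is to add the heap-printing flag to the option set already in use, so every collection records the committed size of each generation before and after the pause and the exact moment of expansion becomes visible. This is just the flag Ramki names combined with the existing options, nothing more:

    -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps
    -XX:+PrintHeapAtGC -Xloggc:/tmp/gc_logs.txt

The extra output is verbose, so it is usually something to enable temporarily while reproducing the growth.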
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140228/b35f5860/attachment.html From jon.masamitsu at oracle.com Fri Feb 28 12:21:19 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 28 Feb 2014 12:21:19 -0800 Subject: CMS olg gen growth In-Reply-To: References: Message-ID: <5310EFBF.6070208@oracle.com> Gustav, During a young GC if CMS cannot get enough space for the promotion of an object and there is sufficient room to expand the old gen, it will expand it. The alternative is a promotion failure and CMS will expand the old gen in preference to a promotion failure. 196k could be right if it is a large object that needs promotion. The expansion will be at least 128k (somewhat larger on a 64bit system). This type of expansion says that there isn't sufficient space available in the old gen for the promotion. Since the CMS gen allocation is a freelist allocator, the slowness of the GC could be due to more searching in the freelists. That's just a guess. You can turn on -XX:PrintFLSCensus=1 or maybe -XX:PrintFLSStatistics=1 or 2 (2 gives more output). You might be able to tell if the freelist dictionary is full of chunks (and lots of searching is being done). Jon On 2/27/2014 11:11 PM, Gustav ?kesson wrote: > Hi, > When setting Xms != Xmx (yes, I know this is a bad idea...) I've seen > a peculiar behavior with a constant increase of old generation > capacity prior to CMS but without a FullGC event. These are the > settings which the application is running on: > -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps > -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime > -XX:+PrintTenuringDistribution > -Xloggc:/tmp/gc_logs.txt -XX:+AggressiveOpts -XX:+UseBiasedLocking > -XX:CompileThreshold=5000 -XX:+UseConcMarkSweepGC > -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseParNewGC -Xss2048k > -Xmx1024M -Xms512M -Xmn160M > Below we have two examples of initiating CMS cycles. Prior to CMS > running the heap is in total 507904K (first example) but suddenly > after the initial mark the capacity increased by 192K. This pattern is > pretty consistent throughout the execution. 
> 2014-02-27T17:54:33.554+0100: 3209.911: > [GC2014-02-27T17:54:33.554+0100: 3209.912: [ParNew > Desired survivor size 8388608 bytes, new threshold 6 (max 6) > - age 1: 1531520 bytes, 1531520 total > - age 3: 160 bytes, 1531680 total > : 132721K->2333K(147456K), 0.0123744 secs] > 289498K->159110K(*507904K*), 0.0129755 secs] [Times: user=0.13 > sys=0.01, real=0.01 secs] > 2014-02-27T17:54:33.567+0100: 3209.925: Total time for which > application threads were stopped: 0.0272471 seconds > 2014-02-27T17:54:33.706+0100: 3210.064: Total time for which > application threads were stopped: 0.0133765 seconds > 2014-02-27T17:54:33.726+0100: 3210.084: Total time for which > application threads were stopped: 0.0197027 seconds > 2014-02-27T17:54:33.739+0100: 3210.097: [GC [1 CMS-initial-mark: > 156777K(*_360640K_*)] 290183K(*508096K*), 1.8524057 secs] [Times: > user=1.85 sys=0.00, real=1.85 secs] > 2014-02-27T19:07:07.828+0100: 7564.088: > [GC2014-02-27T19:07:07.828+0100: 7564.089: [ParNew > Desired survivor size 8388608 bytes, new threshold 6 (max 6) > - age 1: 1705520 bytes, 1705520 total > - age 4: 32 bytes, 1705552 total > - age 5: 64 bytes, 1705616 total > - age 6: 32 bytes, 1705648 total > : 132729K->2201K(147456K), 0.0154973 secs] > 289657K->159130K(*508096K*), 0.0161130 secs] [Times: user=0.14 > sys=0.00, real=0.02 secs] > 2014-02-27T19:07:07.845+0100: 7564.105: Total time for which > application threads were stopped: 0.0318814 seconds > 2014-02-27T19:07:08.005+0100: 7564.265: Total time for which > application threads were stopped: 0.0153855 seconds > 2014-02-27T19:07:08.027+0100: 7564.287: Total time for which > application threads were stopped: 0.0217859 seconds > 2014-02-27T19:07:08.049+0100: 7564.309: Total time for which > application threads were stopped: 0.0218527 seconds > 2014-02-27T19:07:08.063+0100: 7564.324: [GC [1 CMS-initial-mark: > 156929K(*_361024K_*)] 290203K(*508480K*), 1.8475537 secs] [Times: > user=1.85 sys=0.00, real=1.85 secs] > The question is why does the heap grow like this? I was under the > impression that CMS only increased the capacity using a FullGC event, > and by then increased more than a few hundred kilobytes. What I've > also experienced is that the when the heap is NOT increased, then the > pause is considerably lower (as shown below). Is it possible that this > minor heap growth is adding adding to the pause of the initial mark? > > 2014-02-28T07:32:21.878+0100: 52277.150: > [GC2014-02-28T07:32:21.878+0100: 52277.151: [ParNew > Desired survivor size 8388608 bytes, new threshold 6 (max 6) > - age 1: 1021256 bytes, 1021256 total > - age 2: 32 bytes, 1021288 total > - age 5: 32 bytes, 1021320 total > : 132007K->1234K(147456K), 0.0123908 secs] > 284921K->154148K(*510400K*), 0.0129916 secs] [Times: user=0.13 > sys=0.01, real=0.01 secs] > 2014-02-28T07:32:21.891+0100: 52277.164: Total time for which > application threads were stopped: 0.0279730 seconds > 2014-02-28T07:32:21.906+0100: 52277.179: [GC [1 CMS-initial-mark: > 152913K(362944K)] 155041K(*510400K*), 0.0365786 secs] [Times: > user=0.04 sys=0.00, real=0.04 secs] > > Best Regards, > Gustav ?kesson > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
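For the free-list angle described above, the corresponding output can be requested with the flags Jon names; a sketch of how they might be added to the existing options:

    -XX:PrintFLSStatistics=1    # CMS free-list space statistics around collections (2 prints more detail)
    -XX:PrintFLSCensus=1        # census of the CMS free lists by chunk size

As Jon notes, the point is to see whether the free-list dictionary holds many chunks, implying a lot of searching, at the times the slow initial marks occur.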
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140228/479649ce/attachment.html From jon.masamitsu at oracle.com Fri Feb 28 14:59:00 2014 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Fri, 28 Feb 2014 14:59:00 -0800 Subject: CMS olg gen growth In-Reply-To: <5310EFBF.6070208@oracle.com> References: <5310EFBF.6070208@oracle.com> Message-ID: <531114B4.60108@oracle.com> Ramki reminded me that the increase was happening after the minor collect completed so my comments don't apply. Jon On 2/28/2014 12:21 PM, Jon Masamitsu wrote: > Gustav, > > During a young GC if CMS cannot get enough space for the > promotion of an object and there is sufficient room to expand > the old gen, it will expand it. The alternative is a promotion > failure and CMS will expand the old gen in preference to a > promotion failure. 196k could be right if it is a large object > that needs promotion. The expansion will be at least 128k > (somewhat larger on a 64bit system). > > This type of expansion says that there isn't sufficient space > available in the old gen for the promotion. Since the CMS > gen allocation is a freelist allocator, the slowness of the > GC could be due to more searching in the freelists. That's > just a guess. > > You can turn on -XX:PrintFLSCensus=1 or maybe > -XX:PrintFLSStatistics=1 or 2 (2 gives more output). > You might be able to tell if the freelist dictionary is > full of chunks (and lots of searching is being done). > > Jon > > > On 2/27/2014 11:11 PM, Gustav ?kesson wrote: >> Hi, >> When setting Xms != Xmx (yes, I know this is a bad idea...) I've seen >> a peculiar behavior with a constant increase of old generation >> capacity prior to CMS but without a FullGC event. These are the >> settings which the application is running on: >> -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps >> -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime >> -XX:+PrintTenuringDistribution >> -Xloggc:/tmp/gc_logs.txt -XX:+AggressiveOpts -XX:+UseBiasedLocking >> -XX:CompileThreshold=5000 -XX:+UseConcMarkSweepGC >> -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseParNewGC -Xss2048k >> -Xmx1024M -Xms512M -Xmn160M >> Below we have two examples of initiating CMS cycles. Prior to CMS >> running the heap is in total 507904K (first example) but suddenly >> after the initial mark the capacity increased by 192K. This pattern >> is pretty consistent throughout the execution. 
>> 2014-02-27T17:54:33.554+0100: 3209.911: >> [GC2014-02-27T17:54:33.554+0100: 3209.912: [ParNew >> Desired survivor size 8388608 bytes, new threshold 6 (max 6) >> - age 1: 1531520 bytes, 1531520 total >> - age 3: 160 bytes, 1531680 total >> : 132721K->2333K(147456K), 0.0123744 secs] >> 289498K->159110K(*507904K*), 0.0129755 secs] [Times: user=0.13 >> sys=0.01, real=0.01 secs] >> 2014-02-27T17:54:33.567+0100: 3209.925: Total time for which >> application threads were stopped: 0.0272471 seconds >> 2014-02-27T17:54:33.706+0100: 3210.064: Total time for which >> application threads were stopped: 0.0133765 seconds >> 2014-02-27T17:54:33.726+0100: 3210.084: Total time for which >> application threads were stopped: 0.0197027 seconds >> 2014-02-27T17:54:33.739+0100: 3210.097: [GC [1 CMS-initial-mark: >> 156777K(*_360640K_*)] 290183K(*508096K*), 1.8524057 secs] [Times: >> user=1.85 sys=0.00, real=1.85 secs] >> 2014-02-27T19:07:07.828+0100: 7564.088: >> [GC2014-02-27T19:07:07.828+0100: 7564.089: [ParNew >> Desired survivor size 8388608 bytes, new threshold 6 (max 6) >> - age 1: 1705520 bytes, 1705520 total >> - age 4: 32 bytes, 1705552 total >> - age 5: 64 bytes, 1705616 total >> - age 6: 32 bytes, 1705648 total >> : 132729K->2201K(147456K), 0.0154973 secs] >> 289657K->159130K(*508096K*), 0.0161130 secs] [Times: user=0.14 >> sys=0.00, real=0.02 secs] >> 2014-02-27T19:07:07.845+0100: 7564.105: Total time for which >> application threads were stopped: 0.0318814 seconds >> 2014-02-27T19:07:08.005+0100: 7564.265: Total time for which >> application threads were stopped: 0.0153855 seconds >> 2014-02-27T19:07:08.027+0100: 7564.287: Total time for which >> application threads were stopped: 0.0217859 seconds >> 2014-02-27T19:07:08.049+0100: 7564.309: Total time for which >> application threads were stopped: 0.0218527 seconds >> 2014-02-27T19:07:08.063+0100: 7564.324: [GC [1 CMS-initial-mark: >> 156929K(*_361024K_*)] 290203K(*508480K*), 1.8475537 secs] [Times: >> user=1.85 sys=0.00, real=1.85 secs] >> The question is why does the heap grow like this? I was under the >> impression that CMS only increased the capacity using a FullGC event, >> and by then increased more than a few hundred kilobytes. What I've >> also experienced is that the when the heap is NOT increased, then the >> pause is considerably lower (as shown below). Is it possible that >> this minor heap growth is adding adding to the pause of the initial mark? 
>> >> 2014-02-28T07:32:21.878+0100: 52277.150: >> [GC2014-02-28T07:32:21.878+0100: 52277.151: [ParNew >> Desired survivor size 8388608 bytes, new threshold 6 (max 6) >> - age 1: 1021256 bytes, 1021256 total >> - age 2: 32 bytes, 1021288 total >> - age 5: 32 bytes, 1021320 total >> : 132007K->1234K(147456K), 0.0123908 secs] >> 284921K->154148K(*510400K*), 0.0129916 secs] [Times: user=0.13 >> sys=0.01, real=0.01 secs] >> 2014-02-28T07:32:21.891+0100: 52277.164: Total time for which >> application threads were stopped: 0.0279730 seconds >> 2014-02-28T07:32:21.906+0100: 52277.179: [GC [1 CMS-initial-mark: >> 152913K(362944K)] 155041K(*510400K*), 0.0365786 secs] [Times: >> user=0.04 sys=0.00, real=0.04 secs] >> >> Best Regards, >> Gustav ?kesson >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140228/f8ef6560/attachment.html