From tequilaron at gmail.com Mon Dec 4 23:09:52 2017
From: tequilaron at gmail.com (Ron Reynolds)
Date: Mon, 4 Dec 2017 15:09:52 -0800
Subject: young-collections are too infrequent?
Message-ID:

increasing the max-pause-time from 100ms to 200ms has certainly made our young-collections less frequent (before they were about every 5-10 seconds; now they're 30-60+ seconds apart). unfortunately our Full-GC woes continue.

one thing i noticed is that once the JVM is warmed up only humongous-allocations seem to trigger concurrent-mark/mixed-GC cycles (not counting the first failed young-collection that heralds the soon-to-follow full-GC).

this is the last healthy young-collection:

2017-12-01T15:08:07.619+0000: [GC pause (G1 Evacuation Pause) (young)
 302437.685: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 153647, predicted base time: 60.27 ms, remaining time: 139.73 ms, target pause time: 200.00 ms]
 302437.685: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 2293 regions, survivors: 6 regions, predicted young region time: 108.96 ms]
 302437.685: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 2293 regions, survivors: 6 regions, old: 0 regions, predicted pause time: 169.23 ms, target pause time: 200.00 ms]
 *54G->17G(59G)*, 0.1609184 secs]

since it left 17G of the 59G heap (28.8%) allocated, which would seem to indicate an IHOP of 25 should trigger mixed-collections more frequently; however, given these next 2 less-than-healthy young-collections, i'm not sure mixed-collections will prevent what we're experiencing. in other words, if the above young-collection had been followed by mixed-collections it would have only collected perhaps another 10GB of old.

this is the next YC, which is the first failed collection:

2017-12-01T15:08:40.241+0000: [GC pause (G1 Evacuation Pause) (young)
 302470.307: [G1Ergonomics (CSet Construction) start choosing CSet, *_pending_cards: 133537*, predicted base time: 86.00 ms, remaining time: 114.00 ms, target pause time: 200.00 ms]
 302470.307: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 2283 regions, survivors: 16 regions, predicted young region time: 85.78 ms]
 302470.307: [G1Ergonomics (CSet Construction) finish choosing CSet, *eden: 2283 regions, survivors: 16 regions*, old: 0 regions, predicted pause time: 171.77 ms, target pause time: 200.00 ms]
 302472.204: [G1Ergonomics (Heap Sizing) *attempt heap expansion, reason: region allocation request failed*, allocation request: 57512 bytes]
 302472.204: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 57512 bytes, attempted expansion amount: 16777216 bytes]
 302472.204: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: *heap already fully expanded*]
 302480.689: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 55851352064 bytes, allocation request: 0 bytes, threshold: 28938181005 bytes (45.00 %), source: end of GC]
 *-- 53G->56G(59G), 10.3817795 secs*]

the above Young-Collection CSet contained 2283 eden and 16 survivor regions but then it tried to allocate more heap? due to lack of space for promotion? due to heap fragmentation by humongous objects? (the JVM is using 16M regions so it should have ~3833 regions, which means it's 60% young-generation at this point).

and why would it try to expand the heap-size at all when -Xmx and -Xms are fixed to the same value and thus the attempt is guaranteed to fail?

and since the YC is a STW event how did it go from 2283 eden + 16 survivor to 1 eden + 288 survivor and still not recover any heap (based on before/after sizes of 53G->56G)? also there's a 2 second gap between finishing the CSet and the failed attempted heap-increase, and the whole thing takes 10 seconds. i've asked for more verbose GC logging from ops to hopefully clarify where all those seconds got wasted.
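for reference, the extra logging i've asked them to turn on is roughly the following (this exact flag set is my own guess at what's useful, not necessarily what ops will enable, and the log path is just a placeholder):

java ... -Xloggc:/path/to/gc.log \
     -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps \
     -XX:+PrintAdaptiveSizePolicy -XX:+PrintReferenceGC \
     -XX:+PrintGCApplicationStoppedTime ...

hopefully PrintGCDetails will break those 10-second pauses down into their per-phase times, and PrintGCApplicationStoppedTime will show whether the time is even being spent inside the GC pause at all.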
the next Young-Collection found over 6.5M pending cards and only 1 eden and 288 survivors (presumably created by the previous young-collection):

 302480.710: [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: concurrent cycle initiation requested]
2017-12-01T15:08:50.644+0000: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
 302480.711: [G1Ergonomics (CSet Construction) start choosing CSet, *_pending_cards: 6533165*, predicted base time: 1123.81 ms, remaining time: 0.00 ms, target pause time: 200.00 ms]
 302480.711: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1 regions, survivors: 288 regions, predicted young region time: 160.18 ms]
 302480.711: [G1Ergonomics (CSet Construction) finish choosing CSet, *eden: 1 regions, survivors: 288 regions*, old: 0 regions, predicted pause time: 1283.99 ms, target pause time: 200.00 ms]
 302482.600: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: region allocation request failed, allocation request: 8388600 bytes]
 302482.600: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 8388600 bytes, attempted expansion amount: 16777216 bytes]
 302482.600: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
 -- 56G->56G(59G), 10.2031695 secs]
2017-12-01T15:09:00.848+0000: [GC concurrent-root-region-scan-start]

so, does it seem that decreasing IHOP and increasing the concurrent-mark heap and thread-count will address this, since it didn't even try to start a concurrent-mark cycle until it was basically too late?

all subsequent young-collections failed with 0 eden, 0 survivor and even 0 pending-cards, and then the Full-GC occurs at 15:09:08 (28 seconds later) and recovers 59G->6766M.

this is all from an 8-core VM running "Java HotSpot(TM) 64-Bit Server VM (25.131-b11)" (i couldn't find any mention of GC improvements between build 131 and the current build, 151, so getting Ops to move to a newer JVM build is going to be a difficult sell, and moving to Java 9 will, of course, require a lot more preparation). note, one thing about these JVMs is that they do not have exclusive use of these cores but share them with at least 1 Apache server (which is fronting the Tomcat that generated these GC logs).

sorry if this is a variation on a previous question but sometimes this mailing list feels like the only place to get answers. thanks.

....................ron.
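p.s. to make the question concrete, the change i have in mind looks something like the following (the 25 comes from the occupancy numbers above; the ConcGCThreads value is just a guess on my part, not something we've tested):

java ... -XX:+UseG1GC -XX:MaxGCPauseMillis=200 \
     -XX:InitiatingHeapOccupancyPercent=25 \
     -XX:ConcGCThreads=4 ...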
From tequilaron at gmail.com Mon Dec 11 19:24:27 2017
From: tequilaron at gmail.com (Ron Reynolds)
Date: Mon, 11 Dec 2017 11:24:27 -0800
Subject: Shenandoah GC
Message-ID:

someone at work just mentioned this new GC; i believe it's available only in the OpenJDK JVM and only if explicitly included. (?)

i've just started doing research on it (the only source of which seems to be https://wiki.openjdk.java.net/display/shenandoah/Main) but it sounds like a replacement for G1.

has anyone had actual experience with this GC and how it compares with G1, and any idea why it's been kept somewhat quiet? also, are there any plans to add it to the Oracle JVM?

From pasthelod at gmail.com Mon Dec 11 19:42:36 2017
From: pasthelod at gmail.com (Pas)
Date: Mon, 11 Dec 2017 20:42:36 +0100
Subject: Shenandoah GC
In-Reply-To:
References:
Message-ID:

Hello,

Check these:

https://shipilev.net/jvm-anatomy-park/3-gc-design-and-pauses/
https://www.youtube.com/watch?v=AAiB3fDwyRM

It'll be available in RedHat's JDK builds: Shenandoah is shipping in Fedora starting with Fedora 24 and as a tech preview in RHEL 7.4. Running standard OpenJDK testing with -XX:+UseShenandoahGC should be sufficient. (You can see that it's included in the changelog: https://koji.fedoraproject.org/koji/buildinfo?buildID=1008667)

I'm not affiliated with RedHat, nor with Oracle, and haven't tested it, but I'm planning to evaluate it next year on a new project. (But I'd need an Ubuntu/Debian based build anyhow.)

Regards,
Pas

On Mon, Dec 11, 2017 at 8:24 PM, Ron Reynolds wrote:
> someone at work just mentioned this new GC; i believe it's available
> only in the OpenJDK JVM and only if explicitly included. (?)
>
> i've just started doing research on it (the only source of which seems
> to be https://wiki.openjdk.java.net/display/shenandoah/Main) but it
> sounds like a replacement for G1.
>
> has anyone had actual experience with this GC and how it compares with
> G1, and any idea why it's been kept somewhat quiet? also, are there any
> plans to add it to the Oracle JVM?

From thomas.schatzl at oracle.com Thu Dec 14 10:19:13 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 14 Dec 2017 11:19:13 +0100
Subject: young-collections are too infrequent?
In-Reply-To:
References:
Message-ID: <1513246753.2509.28.camel@oracle.com>

Hi Ron,

  sorry for the really late reply. I started answering this and then... things happened...

On Mon, 2017-12-04 at 15:09 -0800, Ron Reynolds wrote:
> increasing the max-pause-time from 100ms to 200ms has certainly made
> our young-collections less frequent (before they were about every
> 5-10 seconds; now they're 30-60+ seconds apart). unfortunately our
> Full-GC woes continue.

They are not connected in any way as previously suggested. At best the situation will get worse, because in G1 the marking can only start in conjunction with a young gc. So if you reduce the frequency, you potentially lose time for the marking.

> one thing i noticed is that once the JVM is warmed up only
> humongous-allocations seem to trigger concurrent-mark/mixed-GC cycles
> (not counting the first failed young-collection that heralds the
> soon-to-follow full-GC).
> this is the last healthy young-collection:
>
> 2017-12-01T15:08:07.619+0000: [GC pause (G1 Evacuation Pause) (young)
>  302437.685: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 153647, predicted base time: 60.27 ms, remaining time: 139.73 ms, target pause time: 200.00 ms]
>  302437.685: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 2293 regions, survivors: 6 regions, predicted young region time: 108.96 ms]
>  302437.685: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 2293 regions, survivors: 6 regions, old: 0 regions, predicted pause time: 169.23 ms, target pause time: 200.00 ms]
>  54G->17G(59G), 0.1609184 secs]
>
> since it left 17G of the 59G heap (28.8%) allocated, which would seem
> to indicate an IHOP of 25 should trigger mixed-collections more
> frequently; however, given these next 2 less-than-healthy
> young-collections, i'm not sure mixed-collections will prevent what
> we're experiencing. in other words, if the above young-collection had
> been followed by mixed-collections it would have only collected
> perhaps another 10GB of old.
>
> this is the next YC, which is the first failed collection:
>
> 2017-12-01T15:08:40.241+0000: [GC pause (G1 Evacuation Pause) (young)
>  302470.307: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 133537, predicted base time: 86.00 ms, remaining time: 114.00 ms, target pause time: 200.00 ms]
>  302470.307: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 2283 regions, survivors: 16 regions, predicted young region time: 85.78 ms]
>  302470.307: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 2283 regions, survivors: 16 regions, old: 0 regions, predicted pause time: 171.77 ms, target pause time: 200.00 ms]
>  302472.204: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: region allocation request failed, allocation request: 57512 bytes]
>  302472.204: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 57512 bytes, attempted expansion amount: 16777216 bytes]
>  302472.204: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
>  302480.689: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 55851352064 bytes, allocation request: 0 bytes, threshold: 28938181005 bytes (45.00 %), source: end of GC]
>  -- 53G->56G(59G), 10.3817795 secs]
>
> the above Young-Collection CSet contained 2283 eden and 16 survivor
> regions but then it tried to allocate more heap? due to lack of space
> for promotion? due to heap fragmentation by humongous objects? (the
> JVM is using 16M regions so it should have ~3833 regions, which means
> it's 60% young-generation at this point).

Which explains the problems with the large pause time. 60% is the maximum young gen size G1 expands to by default.
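(If you want to double-check what those bounds are on your exact build, something along these lines works; the grep pattern is only an illustration:)

java -XX:+UseG1GC -XX:+UnlockExperimentalVMOptions -XX:+PrintFlagsFinal -version | grep NewSizePercent

That should show G1NewSizePercent and G1MaxNewSizePercent, i.e. the minimum and maximum young gen size as a percentage of the heap.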
also there's a 2 In this case, for some reason, between two collections, the death rate of the objects in your young gen significantly changed. G1 bases young gen sizing on old behavior. If application's behavior changes significantly between two young gc's (e.g. samplings), G1 needs to bite the bullet because there is no other way. So if there is a sudden spike in promotions during GC, *only then* G1 notices that and significantly reduces young gen anticipating larger live data for the next GC. Another problem could be that there has been an "evacuation failure", i.e. G1 can't find space for the objects that are live (no space to copy to), and needs to do some recovery work. That a) is *very* slow in JDK8, basically single-threaded (in JDK9 this has basically the same performance as regular GCs), and b) leaves G1 in the situation of being mostly out of memory (I remember the max heap has been 56g in your case). The reason is that currently, all young regions that contain at least one object that could not be evacuated are immediately converted to old. That may be why after that GC there are so few regions left. > second gap between finishing the CSet and the failed attempted heap- > increase and the whole thing take 10 seconds. i've asked for more > verbose GC logging from ops to hopefully clarify where all those > seconds got wasted. > > the next Young-Collection found over 6.5M pending cards and only 1 > eden and 288 survivors (presumably created by the previous young- > collection). This is another side effect of promoting a lot of objects: so you know that G1 needs to track references between (old gen) regions to evacuate them. When copying objects to the old gen, all the references from these objects need to be considered to be added to the remembered sets. G1 collection cheats a little here: instead of adding those to the remembered sets immediately, it uses the refinement mechanisms, i.e. concurrent threads, to do this work. This saves some work (and decreases pause time) during the collection. > 302480.710: [G1Ergonomics (Concurrent Cycles) initiate concurrent > cycle, reason: concurrent cycle initiation requested] > > 2017-12-01T15:08:50.644+0000: [GC pause (G1 Evacuation Pause) (young) > (initial-mark) > 302480.711: [G1Ergonomics (CSet Construction) start choosing CSet, > _pending_cards: 6533165, predicted base time: 1123.81 ms, remaining > time: 0.00 ms, target pause time: 200.00 ms] > 302480.711: [G1Ergonomics (CSet Construction) add young regions to > CSet, eden: 1 regions, survivors: 288 regions, predicted young region > time: 160.18 ms] > 302480.711: [G1Ergonomics (CSet Construction) finish choosing CSet, > eden: 1 regions, survivors: 288 regions, old: 0 regions, predicted > pause time: 1283.99 ms, target pause time: 200.00 ms] > 302482.600: [G1Ergonomics (Heap Sizing) attempt heap expansion, > reason: region allocation request failed, allocation request: 8388600 > bytes] > 302482.600: [G1Ergonomics (Heap Sizing) expand the heap, requested > expansion amount: 8388600 bytes, attempted expansion amount: 16777216 > bytes] > 302482.600: [G1Ergonomics (Heap Sizing) did not expand the heap, > reason: heap already fully expanded] > -- 56G->56G(59G), 10.2031695 secs] > 2017-12-01T15:09:00.848+0000: [GC concurrent-root-region-scan-start] > so, does it seem that decreasing IHOP and increasing the concurrent- > mark heap and thread-count will address this since it didn't even try > to start a concurrent-mark cycle until it was basically too late? 
> 302480.710: [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: concurrent cycle initiation requested]
>
> 2017-12-01T15:08:50.644+0000: [GC pause (G1 Evacuation Pause) (young) (initial-mark)
>  302480.711: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 6533165, predicted base time: 1123.81 ms, remaining time: 0.00 ms, target pause time: 200.00 ms]
>  302480.711: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 1 regions, survivors: 288 regions, predicted young region time: 160.18 ms]
>  302480.711: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 1 regions, survivors: 288 regions, old: 0 regions, predicted pause time: 1283.99 ms, target pause time: 200.00 ms]
>  302482.600: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: region allocation request failed, allocation request: 8388600 bytes]
>  302482.600: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 8388600 bytes, attempted expansion amount: 16777216 bytes]
>  302482.600: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
>  -- 56G->56G(59G), 10.2031695 secs]
> 2017-12-01T15:09:00.848+0000: [GC concurrent-root-region-scan-start]
>
> so, does it seem that decreasing IHOP and increasing the
> concurrent-mark heap and thread-count will address this, since it
> didn't even try to start a concurrent-mark cycle until it was
> basically too late?

This may happen because, as mentioned above, the huge young gen might have prevented a timely start of the marking.

> all subsequent young-collections failed with 0 eden, 0 survivor and
> even 0 pending-cards, and then the Full-GC occurs at 15:09:08 (28
> seconds later) and recovers 59G->6766M.

These 0 eden, 0 survivor GCs will be fixed in JDK 10 (see JDK-8137099: "G1 needs to 'upgrade' GC within the safepoint if it can't allocate during that safepoint to avoid OoME", and the current corresponding review thread on hotspot-gc-dev). Note that these superfluous young GCs are dwarfed by the full gc, so they only "add insult to injury". :/

> this is all from an 8-core VM running "Java HotSpot(TM) 64-Bit Server
> VM (25.131-b11)" (i couldn't find any mention of GC improvements
> between build 131 and the current build, 151, so getting Ops to move
> to a newer JVM build is going to be a difficult sell, and moving to
> Java 9 will, of course, require a lot more preparation). note, one
> thing about these JVMs is that they do not have exclusive use of
> these cores but share them with at least 1 Apache server (which is
> fronting the Tomcat that generated these GC logs).
>
> sorry if this is a variation on a previous question but sometimes
> this mailing list feels like the only place to get answers.

One option here is to artificially limit the maximum eden size, e.g. by using -XX:G1MaxNewSizePercent (in percent; default 60), or by specifying minimum and maximum young gen size using -XX:NewSize _and_ -XX:MaxNewSize (which take the young gen size in bytes) if your company policy for some reason forbids the use of (I think) experimental options; make sure you specify *both*.
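For example (the percentage and sizes below are only placeholders to show the shape of the options, not recommendations):

# experimental route, as a percentage of the whole heap:
java <your current flags> -XX:+UnlockExperimentalVMOptions -XX:G1MaxNewSizePercent=30 ...

# plain route, absolute sizes; giving both defines a range instead of fixing the young gen:
java <your current flags> -XX:NewSize=1g -XX:MaxNewSize=12g ...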
Thanks,
  Thomas

From csxulijie at gmail.com Wed Dec 20 07:47:58 2017
From: csxulijie at gmail.com (Lijie Xu)
Date: Wed, 20 Dec 2017 15:47:58 +0800
Subject: The heap size difference among Parallel, CMS, and G1
Message-ID:

Hi All,

I'm testing the performance of Parallel, CMS, and G1. I found that their available heap sizes are different under the same -Xmx configuration. For example, I set -Xmx to 6.5GB (6656m) and use Runtime.getRuntime().maxMemory() to obtain the real heap size. The output is as follows.

Running with: [-Xmx6656m, -XX:+UseParallelGC, -XX:+UseParallelOldGC]
Runtime.getRuntime().maxMemory(): 6,058,496K, 5916.50MB, 5.78GB.
memoryStore: 3.29GB.

Running with: [-Xmx6656m, -XX:+UseConcMarkSweepGC]
Runtime.getRuntime().maxMemory(): 6,747,648K, 6589.50MB, 6.44GB.
memoryStore: 3.69GB.

Running with: [-Xmx6656m, -XX:+UseG1GC]
Runtime.getRuntime().maxMemory(): 6,815,744K, 6656.00MB, 6.50GB.
memoryStore: 3.72GB.

We can see that the available heap sizes are [Parallel: 5.78GB < CMS: 6.44GB < G1: 6.50GB].

My JDK version is

java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

Any suggestions will be appreciated.

The test script is as follows.

#!/bin/bash
XMX=6656m
java -Xmx$XMX -XX:+UseParallelGC -XX:+UseParallelOldGC HeapSizeDifferences
java -Xmx$XMX -XX:+UseConcMarkSweepGC HeapSizeDifferences
java -Xmx$XMX -XX:+UseG1GC HeapSizeDifferences

The testing code is as follows.

import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class HeapSizeDifferences {

    static Collection<int[]> objects = new ArrayList<>();
    static long lastMaxMemory = 0;

    public static void main(String[] args) {
        try {
            List<String> inputArguments = ManagementFactory.getRuntimeMXBean().getInputArguments();
            System.out.println("Running with: " + inputArguments);
            while (true) {
                printMaxMemory();
                consumeSpace();   // keep allocating until the heap is exhausted
            }
        } catch (OutOfMemoryError e) {
            freeSpace();
            printMaxMemory();
        }
    }

    static void printMaxMemory() {
        long currentMaxMemory = Runtime.getRuntime().maxMemory();
        if (currentMaxMemory != lastMaxMemory) {
            lastMaxMemory = currentMaxMemory;
            System.out.format("Runtime.getRuntime().maxMemory(): %,dK, %.2fMB, %.2fGB.%n",
                    currentMaxMemory / 1024,
                    (float) currentMaxMemory / 1024 / 1024,
                    (float) currentMaxMemory / 1024 / 1024 / 1024);
            double memoryStore = (currentMaxMemory - 300 * 1024 * 1024) * 0.6;
            System.out.format("memoryStore: %.2fGB.%n", memoryStore / 1024 / 1024 / 1024);
        }
    }

    static void consumeSpace() {
        objects.add(new int[1_000_000]);
    }

    static void freeSpace() {
        objects.clear();
    }
}

From thomas.schatzl at oracle.com Wed Dec 20 09:27:51 2017
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 20 Dec 2017 10:27:51 +0100
Subject: The heap size difference among Parallel, CMS, and G1
In-Reply-To:
References:
Message-ID: <1513762071.3316.31.camel@oracle.com>

Hi,

On Wed, 2017-12-20 at 15:47 +0800, Lijie Xu wrote:
> Hi All,
>
> I'm testing the performance of Parallel, CMS, and G1. I found that
> their available heap sizes are different under the same -Xmx
> configuration.

this is normal. The different algorithms use different heap sizing heuristics. This is both due to what the GCs allow (e.g. G1 allows very dynamic sizing of generations while it is fixed for the others after startup) and how they work (e.g. CMS may just need more space as it cannot compact the old gen), their target applications, and other reasons.

Some of the differences might just be because of timing in the application, and there is a relatively large variation between runs. Or nobody bothered to update these heuristics against a set of currently relevant applications. Since we usually do not change these heuristics in update releases, they may reflect ideas from jdk8 or earlier (like March 2014). I *think* maybe G1 got some re-tuning in the 8u40/60 (2015) timeframe. But I am almost sure the heuristics for the others were not updated for a very long time (10 years, probably minimum).

Depending on your application they may or may not select the optimal choices. Unless you can show on a large range of representative applications that one or the other choice is better or worse, this is not a bug :)

Since heap sizing has a huge impact on performance, you might want to run performance tests "out-of-the-box" (no settings at all) vs. (at least) setting min and max heap to the same value, to e.g. rule out heap-sizing heuristics. Be aware that some (popular) settings that make sense in some collectors may not in others, as they directly disable some heuristics or just give completely non-representative results (like e.g. fixing the young gen size in G1).

Thanks,
  Thomas