From thomas.schatzl at oracle.com Wed Feb 1 13:10:28 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 01 Feb 2017 14:10:28 +0100 Subject: CMS large objects vs G1 humongous allocations In-Reply-To: References: <1485859866.3425.7.camel@oracle.com> Message-ID: <1485954628.3415.12.camel@oracle.com>

Hi Amit,

On Tue, 2017-01-31 at 22:04 +0530, Amit Balode wrote: > File is a bit huge:) linked it here https://raw.githubusercontent.com > /amitbalode/uploads/master/amit.txt >

some observations: - some humongous objects; not sure if increasing heap region size helps - as for the evacuation failures, I think they can be avoided by capping the maximum young gen size. Every time this happens, the young gen is really large, however it seems that according to heap size calculations the surviving objects should actually have enough space. It does not because the humongous objects may take up too much space, and at least the printing does not take that into account.

I remember discussing this or similar issues in the past, not sure if it has been fixed in one way or another in the meantime.

Anyway, capping the young gen should be able to avoid this issue at least sometimes. Try setting -XX:G1MaxNewSizePercent to something lower than the default 60%.

Thanks, Thomas

From amit.balode at gmail.com Wed Feb 1 13:48:07 2017 From: amit.balode at gmail.com (Amit Balode) Date: Wed, 1 Feb 2017 19:18:07 +0530 Subject: CMS large objects vs G1 humongous allocations In-Reply-To: <1485954628.3415.12.camel@oracle.com> References: <1485859866.3425.7.camel@oracle.com> <1485954628.3415.12.camel@oracle.com> Message-ID:

Hi Thomas, thanks for the input.

For "Every time this happens, the young gen is really large, however it seems that according to heap size calculations the surviving objects should actually have enough space." - Could you paste the snippet from the log which you are referring to?

"I remember discussing this or similar issues in the past, not sure if it has been fixed in one way or another in the meantime." It would really be great if you could help dig up whether it has been fixed, and in which release, so we could try upgrading to it.

Good point regarding G1MaxNewSizePercent. In general, I have been trying to avoid too many customizations with G1 and let the heuristics decide for themselves, but if there is no other option, I will try this setting and experiment.

On Wed, Feb 1, 2017 at 6:40 PM, Thomas Schatzl wrote: > Hi Amit, > > On Tue, 2017-01-31 at 22:04 +0530, Amit Balode wrote: > > File is a bit huge:) linked it here https://raw.githubusercontent.com > > /amitbalode/uploads/master/amit.txt > > > > some observations: > > - some humongous objects; not sure if increasing heap region size helps > - as for the evacuation failures, I think they can be avoided by > capping the maximum young gen size. Every time this happens, the young > gen is really large, however it seems that according to heap size > calculations the surviving objects should actually have enough space. > It does not because the humongous objects may take up too much space, > and at least the printing does not take that into account. > > I remember discussing this or similar issues in the past, not sure if > it has been fixed in one way or another in the meantime. > > Anyway, capping young gen should be able to avoid this issue at least > some times. Try setting -XX:G1MaxNewSizePercent to something lower than > the default 60%. > > Thanks, > Thomas > > -- Thanks & Regards, Amit.Balode -------------- next part -------------- An HTML attachment was scrubbed...
URL:

From amit.balode at gmail.com Wed Feb 1 13:54:50 2017 From: amit.balode at gmail.com (Amit Balode) Date: Wed, 1 Feb 2017 19:24:50 +0530 Subject: Deciding between 2MB or 32MB region size in G1 Message-ID:

Hello, We have multiple applications running in production where predicting the size of runtime objects is rather tough and random. It could vary from 1KB to 25MB for different applications. To not have too many lingering configs for different applications, I am trying to come up with a standard set of configs which could be applicable to all applications. Some applications do not exceed a 10KB object size, so I could definitely keep 2MB as the region size for them. But I am wondering what the disadvantage would be of setting all applications to a 32MB region size regardless of how small the objects are?

Is it that fragmentation issues will happen more if you have fewer regions? If so, will the fragmentation issue happen only during humongous allocations? In terms of performance, will the selection of region size change anything?

-- Thanks & Regards, Amit.Balode -------------- next part -------------- An HTML attachment was scrubbed... URL:

From gustav.r.akesson at gmail.com Wed Feb 1 14:44:08 2017 From: gustav.r.akesson at gmail.com (=?UTF-8?Q?Gustav_=C3=85kesson?=) Date: Wed, 1 Feb 2017 15:44:08 +0100 Subject: Long Parnew pause Message-ID:

Hi,

In our application I've observed an occasional and peculiar ParNew GC which takes several seconds. From what I've been able to gather, it is associated with an occasional 35mb allocation of data. Those objects are allocated and tenured (see the bold aging in the logs below), and once they are promoted to the old generation, that ParNew GC takes around 7 seconds. This surprises me a bit since it should not take so long to move 35mb of data to another heap region.

Looking at the logs, this issue is not related to I/O (zero systime) nor TTSP (it takes a few millis to stop the application threads). The GC threads seem to simply spend their time working on the CPU chip. What in ParNew/CMS could possibly make these 35mb take so long to promote? Some flags that can shed some light, or any suspicion such as free-list balancing? I appreciate any input on the matter.

JVM settings and platform information at the end of this mail.
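For reference, the allocation pattern I suspect is roughly the following. This is only an illustrative sketch, not our actual code; the class name, the exact size and the pacing are made up, but it shows the kind of buffer that ages through the survivor spaces (we run with MaxTenuringThreshold=6, see the flags below) and is then promoted to the CMS old generation:

    import java.util.concurrent.TimeUnit;

    // Illustrative only: a ~35mb buffer is allocated every few seconds and kept
    // referenced until the next one replaces it, which in our setup is long
    // enough for it to survive several ParNew collections and get promoted.
    public class LargeBufferChurn {
        private static volatile byte[] current; // keeps the latest buffer alive

        public static void main(String[] args) throws InterruptedException {
            while (true) {
                current = new byte[35 * 1024 * 1024]; // ~35mb, ages in the survivor spaces
                TimeUnit.SECONDS.sleep(5);            // replaced only after a few young GCs
            }
        }
    }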
{Heap before GC invocations=1723 (full 0): par new generation total 1887488K, used 1768096K [0x00007fccf1600000, 0x00007fcd71600000, 0x00007fcd71600000) eden space 1677824K, 100% used [0x00007fccf1600000, 0x00007fcd57c80000, 0x00007fcd57c80000) from space 209664K, 43% used [0x00007fcd64940000, 0x00007fcd6a1680e8, 0x00007fcd71600000) to space 209664K, 0% used [0x00007fcd57c80000, 0x00007fcd57c80000, 0x00007fcd64940000) concurrent mark-sweep generation total 34973696K, used 7319028K [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) Metaspace used 121299K, capacity 134343K, committed 134400K, reserved 135168K 2017-01-26T12:50:11.476+0100: 14135.489: [GC (Allocation Failure) 2017-01-26T12:50:11.476+0100: 14135.489: [ParNew Desired survivor size 107347968 bytes, new threshold 6 (max 6) - age 1: 12439600 bytes, 12439600 total - age 2: 5233256 bytes, 17672856 total - age 3: 5083408 bytes, 22756264 total *- age 4: 37639936 bytes, 60396200 total* - age 5: 4869520 bytes, 65265720 total - age 6: 4746784 bytes, 70012504 total : 1768096K->91122K(1887488K), 0.1117876 secs] 9087124K->7412981K(36861184K), 0.1120711 secs] [Times: user=0.85 sys=0.00, real=0.11 secs] Heap after GC invocations=1724 (full 0): par new generation total 1887488K, used 91122K [0x00007fccf1600000, 0x00007fcd71600000, 0x00007fcd71600000) eden space 1677824K, 0% used [0x00007fccf1600000, 0x00007fccf1600000, 0x00007fcd57c80000) from space 209664K, 43% used [0x00007fcd57c80000, 0x00007fcd5d57ca70, 0x00007fcd64940000) to space 209664K, 0% used [0x00007fcd64940000, 0x00007fcd64940000, 0x00007fcd71600000) concurrent mark-sweep generation total 34973696K, used 7321858K [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) Metaspace used 121299K, capacity 134343K, committed 134400K, reserved 135168K } 2017-01-26T12:50:11.589+0100: 14135.601: Total time for which application threads were stopped: 0.1174674 seconds, Stopping threads took: 0.0042340 seconds 2017-01-26T12:50:12.168+0100: 14136.181: Application time: 0.5798363 seconds {Heap before GC invocations=1724 (full 0): par new generation total 1887488K, used 1768946K [0x00007fccf1600000, 0x00007fcd71600000, 0x00007fcd71600000) eden space 1677824K, 100% used [0x00007fccf1600000, 0x00007fcd57c80000, 0x00007fcd57c80000) from space 209664K, 43% used [0x00007fcd57c80000, 0x00007fcd5d57ca70, 0x00007fcd64940000) to space 209664K, 0% used [0x00007fcd64940000, 0x00007fcd64940000, 0x00007fcd71600000) concurrent mark-sweep generation total 34973696K, used 7321858K [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) Metaspace used 121299K, capacity 134343K, committed 134400K, reserved 135168K 2017-01-26T12:50:12.170+0100: 14136.182: [GC (Allocation Failure) 2017-01-26T12:50:12.170+0100: 14136.182: [ParNew Desired survivor size 107347968 bytes, new threshold 6 (max 6) - age 1: 10383048 bytes, 10383048 total - age 2: 5102856 bytes, 15485904 total - age 3: 5154816 bytes, 20640720 total - age 4: 5080000 bytes, 25720720 total *- age 5: 37637680 bytes, 63358400 total* - age 6: 4658912 bytes, 68017312 total : 1768946K->86544K(1887488K), 0.0929344 secs] 9090805K->7411133K(36861184K), 0.0932244 secs] [Times: user=0.70 sys=0.00, real=0.09 secs] Heap after GC invocations=1725 (full 0): par new generation total 1887488K, used 86544K [0x00007fccf1600000, 0x00007fcd71600000, 0x00007fcd71600000) eden space 1677824K, 0% used [0x00007fccf1600000, 0x00007fccf1600000, 0x00007fcd57c80000) from space 209664K, 41% used [0x00007fcd64940000, 0x00007fcd69dc41d0, 0x00007fcd71600000) to space 
209664K, 0% used [0x00007fcd57c80000, 0x00007fcd57c80000, 0x00007fcd64940000) concurrent mark-sweep generation total 34973696K, used 7324589K [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) Metaspace used 121299K, capacity 134343K, committed 134400K, reserved 135168K } 2017-01-26T12:50:12.263+0100: 14136.276: Total time for which application threads were stopped: 0.0945634 seconds, Stopping threads took: 0.0001968 seconds 2017-01-26T12:50:12.960+0100: 14136.972: Application time: 0.6966358 seconds {Heap before GC invocations=1725 (full 0): par new generation total 1887488K, used 1764368K [0x00007fccf1600000, 0x00007fcd71600000, 0x00007fcd71600000) eden space 1677824K, 100% used [0x00007fccf1600000, 0x00007fcd57c80000, 0x00007fcd57c80000) from space 209664K, 41% used [0x00007fcd64940000, 0x00007fcd69dc41d0, 0x00007fcd71600000) to space 209664K, 0% used [0x00007fcd57c80000, 0x00007fcd57c80000, 0x00007fcd64940000) concurrent mark-sweep generation total 34973696K, used 7324589K [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) Metaspace used 121324K, capacity 134471K, committed 134656K, reserved 135168K 2017-01-26T12:50:12.961+0100: 14136.973: [GC (Allocation Failure) 2017-01-26T12:50:12.961+0100: 14136.973: [ParNew Desired survivor size 107347968 bytes, new threshold 6 (max 6) - age 1: 8033264 bytes, 8033264 total - age 2: 5686168 bytes, 13719432 total - age 3: 5019640 bytes, 18739072 total - age 4: 5150920 bytes, 23889992 total - age 5: 5076720 bytes, 28966712 total *- age 6: 37481736 bytes, 66448448 total* : 1764368K->79984K(1887488K), 0.0955902 secs] 9088957K->7407366K(36861184K), 0.0958643 secs] [Times: user=0.69 sys=0.00, real=0.10 secs] Heap after GC invocations=1726 (full 0): par new generation total 1887488K, used 79984K [0x00007fccf1600000, 0x00007fcd71600000, 0x00007fcd71600000) eden space 1677824K, 0% used [0x00007fccf1600000, 0x00007fccf1600000, 0x00007fcd57c80000) from space 209664K, 38% used [0x00007fcd57c80000, 0x00007fcd5ca9c148, 0x00007fcd64940000) to space 209664K, 0% used [0x00007fcd64940000, 0x00007fcd64940000, 0x00007fcd71600000) concurrent mark-sweep generation total 34973696K, used 7327382K [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) Metaspace used 121324K, capacity 134471K, committed 134656K, reserved 135168K } 2017-01-26T12:50:13.057+0100: 14137.069: Total time for which application threads were stopped: 0.0972200 seconds, Stopping threads took: 0.0001917 seconds 2017-01-26T12:50:13.683+0100: 14137.695: Application time: 0.6259722 seconds {Heap before GC invocations=1726 (full 0): par new generation total 1887488K, used 1757808K [0x00007fccf1600000, 0x00007fcd71600000, 0x00007fcd71600000) eden space 1677824K, 100% used [0x00007fccf1600000, 0x00007fcd57c80000, 0x00007fcd57c80000) from space 209664K, 38% used [0x00007fcd57c80000, 0x00007fcd5ca9c148, 0x00007fcd64940000) to space 209664K, 0% used [0x00007fcd64940000, 0x00007fcd64940000, 0x00007fcd71600000) concurrent mark-sweep generation total 34973696K, used 7327382K [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) Metaspace used 121324K, capacity 134471K, committed 134656K, reserved 135168K *2017-01-26T12:50:13.684+0100: 14137.697: [GC (Allocation Failure) 2017-01-26T12:50:13.684+0100: 14137.697: [ParNew* *Desired survivor size 107347968 bytes, new threshold 6 (max 6)* *- age 1: 10784424 bytes, 10784424 total* *- age 2: 5148032 bytes, 15932456 total* *- age 3: 5607232 bytes, 21539688 total* *- age 4: 5013024 bytes, 26552712 total* *- age 5: 5148840 bytes, 31701552 
total* *- age 6: 4839808 bytes, 36541360 total* *: 1757808K->58357K(1887488K), 7.4626505 secs] 9085190K->7420330K(36861184K), 7.4629090 secs] [Times: user=58.63 sys=0.00, real=7.47 secs] * Heap after GC invocations=1727 (full 0): par new generation total 1887488K, used 58357K [0x00007fccf1600000, 0x00007fcd71600000, 0x00007fcd71600000) eden space 1677824K, 0% used [0x00007fccf1600000, 0x00007fccf1600000, 0x00007fcd57c80000) from space 209664K, 27% used [0x00007fcd64940000, 0x00007fcd6823d650, 0x00007fcd71600000) to space 209664K, 0% used [0x00007fcd57c80000, 0x00007fcd57c80000, 0x00007fcd64940000) concurrent mark-sweep generation total 34973696K, used 7361973K [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) Metaspace used 121324K, capacity 134471K, committed 134656K, reserved 135168K } *2017-01-26T12:50:21.147+0100: 14145.160: Total time for which application threads were stopped: 7.4642882 seconds, Stopping threads took: 0.0002572 seconds* Java HotSpot(TM) 64-Bit Server VM (25.112-b15) for linux-amd64 JRE (1.8.0_112-b15), built on Sep 22 2016 21:10:53 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8) Memory: 4k page, physical 49427048k(42773024k free), swap 4194300k(4194300k free) -XX:+AlwaysPreTouch -XX:+CMSEdenChunksRecordAlways -XX:CMSInitiatingOccupancyFraction=80 -XX:+CMSParallelInitialMarkEnabled -XX:+CMSScavengeBeforeRemark -XX:CMSWaitDuration=60000 -XX:+DisableExplicitGC -XX:GCLogFileSize=31457280 -XX:InitialHeapSize=37959499776 -XX:MaxHeapSize=37959499776 -XX:MaxMetaspaceSize=268435456 -XX:MaxNewSize=2147483648 -XX:MaxTenuringThreshold=6 -XX:MetaspaceSize=268435456 -XX:NewSize=2147483648 -XX:+UseBiasedLocking -XX:+UseCMSInitiatingOccupancyOnly -XX:-UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseGCLogFileRotation -XX:+UseLargePages -XX:+UseParNewGC Best Regards, Gustav ?kesson -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Thu Feb 2 11:07:33 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 02 Feb 2017 12:07:33 +0100 Subject: CMS large objects vs G1 humongous allocations In-Reply-To: References: <1485859866.3425.7.camel@oracle.com> <1485954628.3415.12.camel@oracle.com> Message-ID: <1486033653.8016.23.camel@oracle.com> Hi, On Wed, 2017-02-01 at 19:18 +0530, Amit Balode wrote: > Hi Thomas, thanks for input. > > For "Every time this happens, the young?gen is really?large, however > it seems that according to heap size?calculations the > surviving?objects?should actually have enough space." - Could you > paste the snippet from log which you referring to?? ? ?[Eden: 8960.0M(8960.0M)->0.0B(288.0M) Survivors: 864.0M->512.0M Heap: 13.6G(16.0G)->2112.0M(16.0G)] ? ?[Eden: 8832.0M(8960.0M)->0.0B(800.0M) Survivors: 864.0M->0.0B Heap: 13.9G(16.0G)->11.6G(16.0G)] ? ?[Eden: 8960.0M(8960.0M)->0.0B(8544.0M) Survivors: 320.0M->512.0M Heap: 13.3G(16.0G)->2624.0M(16.0G)] ? ?[Eden: 8416.0M(9600.0M)->0.0B(9440.0M) Survivors: 224.0M->384.0M Heap: 13.1G(16.0G)->2392.0M(16.0G)] For the GCs that had evacuation failure. According to these lines the heap occupancy for those is e.g. 13.1G, i.e. quite a bit lower than 16G, which should in theory be enough to cover the promotion (looking at previous gcs, it is at most in the few 100MBs). (Caveat: there are a lot of assumptions in application behavior here)? So the 13.1G (which means 2.9G free) may be somewhat misleading. It shows free memory, but not memory that can be allocated into. I could guess this is from humongous objects. 
So we are probably closer to full heap than we think we are. > "I remember discussing this or similar issues in the past, not sure > if?it has been fixed in one way or another in the meantime." It would > really be great if you could help dig whether it has been fixed and > which release so we could try upgrading to it. One of the issues I remember is that garbage collection itself wasted quite a bit of heap with PLAB sizing (gc threads don't allocate object by object, but get memory to copy to in largish chunks, the PLABs, for various reasons); the existing young gen calculation mostly assumes that there is mostly no memory overhead because of this (but there are some "heuristics" in there of course). In memory tight situations this may cause that problem. This sometimes excessive java heap consumption during gc has been improved a lot with jdk9; further evacuation failures are very fast with that. One other option for any older release is the mentioned G1MaxNewSizePercent which basically limits the amount of data copied during gc (so that the other heuristics are good). Others are fixing PLAB size (potentially impacting gc performance), or increasing G1ReservePercent (the "heuristics" mentioned above). > good point regarding?G1MaxNewSizePercent. In general, ?I have been > trying to avoid too many?customization with G1 and let heuristics > decide for itself but if no option, I will try to put this setting > and experiment. We recommend to at least try without options with G1. Very very often they are quite successful in achieving their goals. Thanks, ? Thomas From yu.zhang at oracle.com Thu Feb 2 16:20:03 2017 From: yu.zhang at oracle.com (yu.zhang at oracle.com) Date: Thu, 2 Feb 2017 08:20:03 -0800 Subject: Deciding between 2MB or 32MB region size in G1 In-Reply-To: References: Message-ID: Hi, Amit, IMO, there is no one size fits all. Some considerations about the bigger region size: Reduce the humongous objects. The humongous objects are allocated in old gen. If they can not be collected during young gc, they can fill up the old gen quickly without marking or full gc. Less remember set to keep track of. Bigger TLAB. This could be good or bad. With bigger tlab, threads need less refill trip, but may waste more tlab space. It depends on the objects size. Possible bigger waste due to humongous objects (depends on the size of the objects) Possible end of region waste for allocation. Maybe others have more comments. Thanks Jenny On 02/01/2017 05:54 AM, Amit Balode wrote: > Hello, We have multiple applications running in production where > predicting size of the runtime object is kinda tough and random. It > could vary from 1KB to 25MB for different applications. To not have > too many lingering configs for different applications, I am trying to > come up with standard set of configs which could be applicable to all > applications. Some applications do not exceed 10KB object size, so I > could definitely keep 2MB as region size for them. But I am wondering > what would be disadvantage of setting all applications to 32MB region > size regardless of how small the object is? > > Is it that fragmentation issues will happen more if you have less > regions? If so, will the fragmentation issue happen only during > humongous allocations? > In term of performance, will selection of size change anything? 
> > -- > Thanks & Regards, > Amit.Balode > > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL:

From thomas.schatzl at oracle.com Fri Feb 3 13:01:53 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 03 Feb 2017 14:01:53 +0100 Subject: Need help on G1 GC young gen Update RS and Scan RS pause reduction In-Reply-To: References: <1484852604.6579.27.camel@oracle.com> <1485244966.2883.8.camel@oracle.com> <1485338858.3625.42.camel@oracle.com> Message-ID: <1486126913.2892.31.camel@oracle.com>

Hi Amit,

On Fri, 2017-02-03 at 11:09 +0000, Amit Mishra wrote: > Hi Thomas/team, > > > I have put all parameters as per your suggestion but somehow the > minor gc pauses are still haunting. > > Attaching GC logs. > > > bash-3.2$ grep -i young gcstats.log.10636|cut -d, -f2|awk -F" " > '{print $1}'|awk '$1 > 1' > 1.1273134 > 1.1683221 > 3.5504848 > 5.2693987

Looking at these log entries, there seems to be something going on that is outside of VM control:

- from one gc to another, just for these four gcs, sys time is relatively high.

- for the last two occurrences, at least one thread is hanging in "Ext Root Scanning" for almost all of the gc time for no obvious reason.

- there do not seem to be an unusually large amount of changes in the amount of work done in the particular phases that would raise immediate concerns to me.

Please try to find out the source of the high sys time and maybe even what causes it. I can't help a lot in that area, but dtrace seems a good starting point as suggested earlier.

I think we went through most of the obvious tunings now, but maybe somebody else has more ideas. I don't at this time.

The JDK (7u45) you are using is also very old, so even if we find that there is something wrong with G1 in particular, I kind of doubt there are many more useful knobs to turn with that version (or even appropriate logging to find out about the actual issue). Since the 7u45 release, there have been hundreds of changes that in particular improve G1 performance, so please consider upgrading to something more recent (at least the latest 8u; preferably, in my opinion, also some test runs with 9ea). Upgrading alone might already help.

Thanks, Thomas

From Milan.Mimica at infobip.com Fri Feb 3 16:22:56 2017 From: Milan.Mimica at infobip.com (Milan Mimica) Date: Fri, 3 Feb 2017 16:22:56 +0000 Subject: G1 native memory consumption In-Reply-To: <1485168079.2811.21.camel@oracle.com> References: <1484943874550.90103@infobip.com>, <1485168079.2811.21.camel@oracle.com> Message-ID: <1486138975652.57172@infobip.com>

Hi Thomas

Thanks for your input. It took me a while to have a stable system again to repeat the measurements. I have tried setting G1HeapRegionSize to 16M on one instance (8M is default) and I notice lower GC memory usage:

GC (reserved=1117MB -18MB, committed=1117MB -18MB) vs GC (reserved=1604MB +313MB, committed=1604MB +313MB)

It seems more stable too.
However, "Internal" is still relatively high for a 25G heap, and there is no much difference between instances: Internal (reserved=2132MB -7MB, committed=2132MB -7MB) Milan Mimica, Senior Software Engineer / Division Lead ________________________________________ From: Thomas Schatzl Sent: Monday, January 23, 2017 11:41 To: Milan Mimica; hotspot-gc-use at openjdk.java.net Subject: Re: G1 native memory consumption Hi Milan, On Fri, 2017-01-20 at 20:24 +0000, Milan Mimica wrote: > Hi > > I'm inspecting memory consumption issues of a service running on > java-8u102, linux. The service is running for a few days now, and in > a few days more it would consume all of 32GB physical memory > available, and get killed by OOM Killer. > Questions: > - If the code is not allocating any significant off-heap memory, > neither by Unsafe.allocateMemory or by external library, isn't 7GB > native memory overhead supposed to be enough for a 25GB heap? > - Why so much memory spent on "Internal" category, apparently from G1 > thread? G1 remembered sets typically consume approximately 10% of java heap, depending on application, heap and your remembered set configuration. This remembered set contains information that is necessary to be able to do incremental and partial old generation compaction. So the 7GB should be sufficient. You can decrease remembered set overhead by e.g. increasing heap region size. Given the heap size you mentioned it seems it would be worth a try to go to 16M regions. > Find the attached jemalloc heap profile, showing "live" > allocations that happened in about 30 hour timespan, and a > NMT profile of approximately same period. > Profiling was done after some warm-up time time, and with a > manually triggered Full GC in between just to give the JVM a chance > to clean up everything. Remembered set memory consumption should level out after some time, where "some" may be quite a bit of time. I.e. it may take much longer than for other collectors. It also depends on the memory allocator. Do you have any new measurements after this weekend? It should have levelled out by this time. JDK9 contains some improvements on memory usage in exactly this area. There will likely be further improvements in this area going forward. Thanks, Thomas From amit.balode at gmail.com Fri Feb 3 16:42:33 2017 From: amit.balode at gmail.com (Amit Balode) Date: Fri, 3 Feb 2017 22:12:33 +0530 Subject: Deciding between 2MB or 32MB region size in G1 In-Reply-To: References: Message-ID: Yeah, humongous allocation savings is a bigger advantage to have as compared to some amount of fragmentation which will come with larger 32MB. Would love to hear more comments. On Thu, Feb 2, 2017 at 9:50 PM, yu.zhang at oracle.com wrote: > Hi, Amit, > > IMO, there is no one size fits all. > > Some considerations about the bigger region size: > > Reduce the humongous objects. The humongous objects are allocated in old > gen. If they can not be collected during young gc, they can fill up the old > gen quickly without marking or full gc. > > Less remember set to keep track of. > > Bigger TLAB. This could be good or bad. With bigger tlab, threads need > less refill trip, but may waste more tlab space. It depends on the objects > size. > > Possible bigger waste due to humongous objects (depends on the size of the > objects) > > Possible end of region waste for allocation. > > Maybe others have more comments. 
> > Thanks > > Jenny > > On 02/01/2017 05:54 AM, Amit Balode wrote: > > Hello, We have multiple applications running in production where > predicting size of the runtime object is kinda tough and random. It could > vary from 1KB to 25MB for different applications. To not have too many > lingering configs for different applications, I am trying to come up with > standard set of configs which could be applicable to all applications. Some > applications do not exceed 10KB object size, so I could definitely keep 2MB > as region size for them. But I am wondering what would be disadvantage of > setting all applications to 32MB region size regardless of how small the > object is? > > Is it that fragmentation issues will happen more if you have less regions? > If so, will the fragmentation issue happen only during humongous > allocations? > In term of performance, will selection of size change anything? > > -- > Thanks & Regards, > Amit.Balode > > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -- Thanks & Regards, Amit.Balode -------------- next part -------------- An HTML attachment was scrubbed... URL: From amit.balode at gmail.com Fri Feb 3 16:46:26 2017 From: amit.balode at gmail.com (Amit Balode) Date: Fri, 3 Feb 2017 22:16:26 +0530 Subject: CMS large objects vs G1 humongous allocations In-Reply-To: <1486033653.8016.23.camel@oracle.com> References: <1485859866.3425.7.camel@oracle.com> <1485954628.3415.12.camel@oracle.com> <1486033653.8016.23.camel@oracle.com> Message-ID: Thomas, thanks a lot of inputs. I will try out those options as you and Vitaly mentioned. On Thu, Feb 2, 2017 at 4:37 PM, Thomas Schatzl wrote: > Hi, > > On Wed, 2017-02-01 at 19:18 +0530, Amit Balode wrote: > > Hi Thomas, thanks for input. > > > > For "Every time this happens, the young gen is really large, however > > it seems that according to heap size calculations the > > surviving objects should actually have enough space." - Could you > > paste the snippet from log which you referring to? > > [Eden: 8960.0M(8960.0M)->0.0B(288.0M) Survivors: 864.0M->512.0M > Heap: 13.6G(16.0G)->2112.0M(16.0G)] > > [Eden: 8832.0M(8960.0M)->0.0B(800.0M) Survivors: 864.0M->0.0B Heap: > 13.9G(16.0G)->11.6G(16.0G)] > > [Eden: 8960.0M(8960.0M)->0.0B(8544.0M) Survivors: 320.0M->512.0M > Heap: 13.3G(16.0G)->2624.0M(16.0G)] > > [Eden: 8416.0M(9600.0M)->0.0B(9440.0M) Survivors: 224.0M->384.0M > Heap: 13.1G(16.0G)->2392.0M(16.0G)] > > For the GCs that had evacuation failure. > > According to these lines the heap occupancy for those is e.g. 13.1G, > i.e. quite a bit lower than 16G, which should in theory be enough to > cover the promotion (looking at previous gcs, it is at most in the few > 100MBs). > > (Caveat: there are a lot of assumptions in application behavior here) > > So the 13.1G (which means 2.9G free) may be somewhat misleading. It > shows free memory, but not memory that can be allocated into. I could > guess this is from humongous objects. > > So we are probably closer to full heap than we think we are. > > > "I remember discussing this or similar issues in the past, not sure > > if it has been fixed in one way or another in the meantime." It would > > really be great if you could help dig whether it has been fixed and > > which release so we could try upgrading to it. 
> > One of the issues I remember is that garbage collection itself wasted > quite a bit of heap with PLAB sizing (gc threads don't allocate object > by object, but get memory to copy to in largish chunks, the PLABs, for > various reasons); the existing young gen calculation mostly assumes > that there is mostly no memory overhead because of this (but there are > some "heuristics" in there of course). > > In memory tight situations this may cause that problem. > > This sometimes excessive java heap consumption during gc has been > improved a lot with jdk9; further evacuation failures are very fast > with that. > > One other option for any older release is the mentioned > G1MaxNewSizePercent which basically limits the amount of data copied > during gc (so that the other heuristics are good). Others are fixing > PLAB size (potentially impacting gc performance), or increasing > G1ReservePercent (the "heuristics" mentioned above). > > > good point regarding G1MaxNewSizePercent. In general, I have been > > trying to avoid too many customization with G1 and let heuristics > > decide for itself but if no option, I will try to put this setting > > and experiment. > > We recommend to at least try without options with G1. Very very often > they are quite successful in achieving their goals. > > Thanks, > Thomas > > -- Thanks & Regards, Amit.Balode -------------- next part -------------- An HTML attachment was scrubbed... URL: From amit.mishra at redknee.com Fri Feb 3 11:09:48 2017 From: amit.mishra at redknee.com (Amit Mishra) Date: Fri, 3 Feb 2017 11:09:48 +0000 Subject: Need help on G1 GC young gen Update RS and Scan RS pause reduction References: <1484852604.6579.27.camel@oracle.com> <1485244966.2883.8.camel@oracle.com> <1485338858.3625.42.camel@oracle.com> Message-ID: Hi Thomas/team, I have put all parameters as per your suggestion but somehow the minor gc pauses are still haunting. Attaching GC logs. bash-3.2$ grep -i young gcstats.log.10636|cut -d, -f2|awk -F" " '{print $1}'|awk '$1 > 1' 1.1273134 1.1683221 3.5504848 5.2693987 Kindly suggest me next action. 
GC parameters argv[0]: /usr/java1.7/bin/amd64/java argv[11]: -Xmx48g argv[12]: -Xms48g argv[13]: -XX:-EliminateLocks argv[14]: -Dorg.omg.CORBA.ORBSingletonClass=org.jacorb.orb.ORBSingleton argv[15]: -Dorg.omg.CORBA.ORBClass=org.jacorb.orb.ORB argv[18]: -XX:-ReduceInitialCardMarks argv[19]: -server argv[21]: -classpath argv[25]: -Xss1m argv[26]: -Xoss1m argv[27]: -XX:NewSize=1024m argv[28]: -XX:MaxNewSize=3072m argv[29]: -XX:PermSize=512m argv[30]: -XX:MaxPermSize=512m argv[31]: -XX:ReservedCodeCacheSize=128m argv[32]: -XX:+HeapDumpOnOutOfMemoryError argv[33]: -XX:+AggressiveOpts argv[34]: -Dnetworkaddress.cache.ttl=3600 argv[35]: -Dcom.sun.management.jmxremote.port=11883 argv[36]: -Dcom.sun.management.jmxremote.ssl=false argv[37]: -Dcom.sun.management.jmxremote.authenticate=false argv[38]: -XX:+UseG1GC argv[39]: -XX:MaxGCPauseMillis=500 argv[40]: -XX:+PrintFlagsFinal argv[41]: -XX:G1RSetUpdatingPauseTimePercent=5 argv[42]: -XX:+PrintGCTimeStamps argv[43]: -XX:+PrintGCDetails argv[46]: -XX:+UseLargePages argv[47]: -XX:+MaxFDLimit argv[51]: -XX:+ParallelRefProcEnabled argv[52]: -XX:+DisableExplicitGC argv[53]: -XX:+UnlockDiagnosticVMOptions argv[54]: -XX:+G1SummarizeRSetStats argv[55]: -XX:G1SummarizeRSetStatsPeriod=1 argv[56]: -XX:+PerfDisableSharedMem argv[57]: -XX:+AlwaysPreTouch argv[58]: -XX:G1HeapRegionSize=32M argv[59]: -XX:G1RSetRegionEntries=2048 argv[60]: -XX:+UnlockDiagnosticVMOptions Thanks, Amit Mishra -----Original Message----- From: Amit Mishra Sent: Wednesday, January 25, 2017 15:47 To: 'Thomas Schatzl' ; hotspot-gc-use at openjdk.java.net Subject: RE: Need help on G1 GC young gen Update RS and Scan RS pause reduction Thank you very much Thomas but based on our CMS experience we do also set NewGen Size same as Max New Gen to avoid shrinking and expansion of new gen in real time which sometimes results in unexplainable pauses, can we do the same thing here as well to set min/max size as 1 G and see if it improves overall situation , next thing I am going to do it to increase InitiatingHeapOccupancyPercent from 40 to 60% which we generally set for CMS.(CMSInitiatingOccupancyFraction) I am doing these changes and will let you know once again. Regards, Amit Mishra -----Original Message----- From: Thomas Schatzl [mailto:thomas.schatzl at oracle.com] Sent: Wednesday, January 25, 2017 15:38 To: Amit Mishra ; hotspot-gc-use at openjdk.java.net Subject: Re: Need help on G1 GC young gen Update RS and Scan RS pause reduction Hi Amit, On Tue, 2017-01-24 at 10:41 +0000, Amit Mishra wrote: > Hello Thomas/team, > > I have put parameters as per your suggestion and now update RS time is > manageable but Scan RS and Object copy time are high causing pause > time to go beyond 1 second while we do expect max pause time to not to > be greater than 500 ms. From the log it seems that the application, at least during startup time, has quite a bit of variance in the amount of objects that are held live from one garbage collection to the other. It is quite common for applications to behave differently in this area during startup actually. Since G1 sizes the young gen based on previous measurements, if there is a long stretch of garbage collections that do not need lots of work, it will increase the size of the young gen. However, at some point the application's behavior changes (i.e. the number of objects that need to be preserved during collection), and then these long pauses occur. 
If these long garbage collections are really not desired, the only way I see is to decrease the maximum young generation size, not the minimum one as I wrongly suggested. From that log I can see that the issue only occurs at the beginning of the run; i.e. the second occurrence is at around 310s, all other ~4100s are fine, using a pretty large young gen (~25G). To avoid these long pauses you would need to set maximum young gen at (I would guess) around 2G. Since this is a global setting for the entire run, this will decrease throughput a lot. It's up to you to determine whether this is okay in your case. To set minimum (and maximum) young generation size there are two sets of options: Set ? -XX:G1NewSizePercent (back) to 1 and ? -XX:G1MaxNewSizePercent to something like 4-5, maybe 3. As you might have noticed, there is not much wiggle room with percentages in your case any more, but you can also set absolute min/max young gen sizes via ? -XX:NewSize=X , probably something around or above 1G seems good. ? -XX:MaxNewSize=Y , in your case something around 2-3G should work. [Please make sure that you set both NewSize/MaxNewSize, otherwise you might experience somewhat unexpected behavior] I also looked through your current settings below. > Value of pause are as below(attaching complete gc file for you) grep > -i young gcstats.log.3319|cut -d, -f2|awk -F" " '{print $1}'|awk > '$1 > 1' > 1.4668911 > 1.2109846 > > Note : I am observing pauses just after 4-5 minutes after Application > restart post new GC parameters implementation. > > GC parameters are as: > > > argv[11]: -Xmx48g > argv[12]: -Xms48g > argv[13]: -XX:-EliminateLocks > argv[14]: > -Dorg.omg.CORBA.ORBSingletonClass=org.jacorb.orb.ORBSingleton > argv[15]: -Dorg.omg.CORBA.ORBClass=org.jacorb.orb.ORB > argv[18]: -XX:-ReduceInitialCardMarks > argv[19]: -server > argv[21]: -classpath > argv[24]: -Djava.io.tmpdir=/tmp > argv[25]: -Xss1m > argv[26]: -Xoss1m > argv[27]: -XX:PermSize=512m > argv[28]: -XX:MaxPermSize=512m > argv[29]: -XX:ReservedCodeCacheSize=128m > argv[30]: -XX:+HeapDumpOnOutOfMemoryError > argv[31]: -XX:+AggressiveOpts > argv[32]: -Dnetworkaddress.cache.ttl=3600 > argv[33]: -Dcom.sun.management.jmxremote.port=11883 > argv[34]: -Dcom.sun.management.jmxremote.ssl=false > argv[35]: -Dcom.sun.management.jmxremote.authenticate=false > argv[36]: -XX:+UseG1GC > argv[37]: -XX:MaxGCPauseMillis=500 > argv[38]: -XX:+PrintFlagsFinal > argv[39]: -XX:G1RSetUpdatingPauseTimePercent=5 > argv[40]: -XX:+PrintGCTimeStamps > argv[41]: -XX:+PrintGCDetails > argv[43]: -verbose:gc > argv[44]: -XX:+UseLargePages > argv[45]: -XX:+MaxFDLimit > argv[49]: -XX:+UnlockExperimentalVMOptions > argv[50]: -XX:G1NewSizePercent=2 ^--- as suggested above, use either G1NewSizePercent/G1MaxNewSizePercent or NewSize/MaxNewSize with the suggested values. NewSize/MaxNewSize don't need the UnlockExperimentalVMOptions btw. > argv[51]: -XX:+ParallelRefProcEnabled > argv[52]: -XX:+DisableExplicitGC > argv[53]: -XX:ParallelGCThreads=70 ^--- again, the recommendation is to not set ParallelGCThreads to anything above the number of virtual cpus you have. You could try removing this again.? > argv[54]: -XX:InitiatingHeapOccupancyPercent=40 I think this is a bit too conservative, but I don't have a good value, and maybe it is required later in the run. Independent on whether you cap maximum new size or not, it might be useful for throughput to increase this a lot. Some plug for JDK9: G1 automatically determines rather good values for that with that version. 
:) > argv[55]: -XX:+UnlockDiagnosticVMOptions > argv[56]: -XX:+G1SummarizeRSetStats > argv[57]: -XX:G1SummarizeRSetStatsPeriod=1 ^--- again, don't use these two production after your testing completes. If you remove them, I think you can also remove UnlockDiagnosticVMOptions. > argv[58]: -XX:+PerfDisableSharedMem > argv[59]: -XX:+AlwaysPreTouch > argv[60]: -XX:G1HeapRegionSize=32M > argv[61]: -XX:G1RSetRegionEntries=2048 > argv[62]: -XX:+UnlockDiagnosticVMOptions ^--- no need to repeat that at the end. Hth, ? Thomas -------------- next part -------------- A non-text attachment was scrubbed... Name: gcstats.log.10636_cps_3rdfeb.gz Type: application/x-gzip Size: 1877106 bytes Desc: gcstats.log.10636_cps_3rdfeb.gz URL: From charlie.hunt at oracle.com Fri Feb 3 18:41:31 2017 From: charlie.hunt at oracle.com (charlie hunt) Date: Fri, 3 Feb 2017 12:41:31 -0600 Subject: Need help on G1 GC young gen Update RS and Scan RS pause reduction In-Reply-To: <1486126913.2892.31.camel@oracle.com> References: <1484852604.6579.27.camel@oracle.com> <1485244966.2883.8.camel@oracle.com> <1485338858.3625.42.camel@oracle.com> <1486126913.2892.31.camel@oracle.com> Message-ID: <4712011A-5C3D-471B-A238-DF1A08267DDC@oracle.com> > ?- from one gc to another, just for these four gcs, sys time is relatively high.? Assuming this is on Linux ? perhaps double check that THP (transparent huge pages) is disabled. charlie > On Feb 3, 2017, at 7:01 AM, Thomas Schatzl wrote: > > Hi Amit, > > On Fri, 2017-02-03 at 11:09 +0000, Amit Mishra wrote: >> Hi Thomas/team, >> >> >> I have put all parameters as per your suggestion but somehow the >> minor gc pauses are still haunting. >> >> Attaching GC logs. >> >> >> bash-3.2$ grep -i young gcstats.log.10636|cut -d, -f2|awk -F" " >> '{print $1}'|awk '$1 > 1' >> 1.1273134 >> 1.1683221 >> 3.5504848 >> 5.2693987 > > looking at these log entries, there seems to be something going on > that seems outside of VM control: > > - from one gc to another, just for these four gcs, sys time is > relatively high. > > - for the last two occurrences, at least one thread is hanging in "Ext > Root Scanning" for almost all of the gc time for no obvious reason. > > - there do not seem to be an unusually large amount of changes in the > amount of work done in the particular phases that would raise immediate > concerns to me. > > Please try to find out the source of the high sys time and maybe even > what causes it. I can't help a lot in that area, but dtrace seems a > good starting point as suggested earlier. > > I think we went through most obvious tunings now, but maybe somebody else has more ideas. I don't at this time. > > The jdk (7u45) you are using is also very old, so even if we find that > there is something wrong with g1 in particular, I kind of doubt there > are many more useful knobs to turn with that version (or even > appropriate logging to find out about the actual issue). Since 7u45 > release, there have been hundreds of changes that in particular improve > G1 performance, so please consider upgrading to something more recent > (at least latest 8u, preferably to me some test runs with 9ea). > Upgrading alone might already help. > > Thanks, > Thomas > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexey.ragozin at gmail.com Sun Feb 5 19:06:39 2017 From: alexey.ragozin at gmail.com (Alexey Ragozin) Date: Sun, 5 Feb 2017 22:06:39 +0300 Subject: Long Parnew pause In-Reply-To: References: Message-ID: Hi, Are you running on physical box or virtual one? In virtualized environments hypervisor may kick guest OS from certain cores for prolonged period (up to few dozen of seconds in my expirience). From guest OS prospective, task holding a core is accounted for all that time (so you can see unrealistic CPU usage). I'm not aware of reliable means to monitor such codition, though usually guest OS would have a spike of CPU job queue. Regards, Alexey Ragozin On Wed, Feb 1, 2017 at 5:44 PM, Gustav ?kesson wrote: > Hi, > > In our application I've observed an occasional and peculiar Parnew GC > which takes several seconds. From what I've been able to gather, it is > associated with an occasional 35mb allocation of data. Those objects are > allocated and tenured (see the bold aging in below logs) and once being > promoted to old generation, that Parnew GC takes around 7 seconds. This > surprises me a bit since it should not take so long time to move 35mb of > data to another heap region. > > Looking at the logs, this issue is not related to I/O (zero systime) nor > TTSP (takes few millis to stop the application threads). GC threads seems > to simply spend their time working on the CPU chip. What in Parnew/CMS > could possibly make these 35mb take so long to promote? Some flags that can > shed som light, or any suspicion such as free-list balancing? > > I appreciate any input on the matter. > > JVM settings and platform information at the end of this mail. > > {Heap before GC invocations=1723 (full 0): > par new generation total 1887488K, used 1768096K [0x00007fccf1600000, > 0x00007fcd71600000, 0x00007fcd71600000) > eden space 1677824K, 100% used [0x00007fccf1600000, 0x00007fcd57c80000, > 0x00007fcd57c80000) > from space 209664K, 43% used [0x00007fcd64940000, 0x00007fcd6a1680e8, > 0x00007fcd71600000) > to space 209664K, 0% used [0x00007fcd57c80000, 0x00007fcd57c80000, > 0x00007fcd64940000) > concurrent mark-sweep generation total 34973696K, used 7319028K > [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) > Metaspace used 121299K, capacity 134343K, committed 134400K, > reserved 135168K > 2017-01-26T12:50:11.476+0100: 14135.489: [GC (Allocation Failure) > 2017-01-26T12:50:11.476+0100: 14135.489: [ParNew > Desired survivor size 107347968 bytes, new threshold 6 (max 6) > - age 1: 12439600 bytes, 12439600 total > - age 2: 5233256 bytes, 17672856 total > - age 3: 5083408 bytes, 22756264 total > *- age 4: 37639936 bytes, 60396200 total* > - age 5: 4869520 bytes, 65265720 total > - age 6: 4746784 bytes, 70012504 total > : 1768096K->91122K(1887488K), 0.1117876 secs] > 9087124K->7412981K(36861184K), 0.1120711 secs] [Times: user=0.85 sys=0.00, > real=0.11 secs] > Heap after GC invocations=1724 (full 0): > par new generation total 1887488K, used 91122K [0x00007fccf1600000, > 0x00007fcd71600000, 0x00007fcd71600000) > eden space 1677824K, 0% used [0x00007fccf1600000, 0x00007fccf1600000, > 0x00007fcd57c80000) > from space 209664K, 43% used [0x00007fcd57c80000, 0x00007fcd5d57ca70, > 0x00007fcd64940000) > to space 209664K, 0% used [0x00007fcd64940000, 0x00007fcd64940000, > 0x00007fcd71600000) > concurrent mark-sweep generation total 34973696K, used 7321858K > [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) > Metaspace used 121299K, capacity 134343K, committed 134400K, > reserved 
135168K > } > 2017-01-26T12:50:11.589+0100: 14135.601: Total time for which application > threads were stopped: 0.1174674 seconds, Stopping threads took: 0.0042340 > seconds > 2017-01-26T12:50:12.168+0100: 14136.181: Application time: 0.5798363 > seconds > {Heap before GC invocations=1724 (full 0): > par new generation total 1887488K, used 1768946K [0x00007fccf1600000, > 0x00007fcd71600000, 0x00007fcd71600000) > eden space 1677824K, 100% used [0x00007fccf1600000, 0x00007fcd57c80000, > 0x00007fcd57c80000) > from space 209664K, 43% used [0x00007fcd57c80000, 0x00007fcd5d57ca70, > 0x00007fcd64940000) > to space 209664K, 0% used [0x00007fcd64940000, 0x00007fcd64940000, > 0x00007fcd71600000) > concurrent mark-sweep generation total 34973696K, used 7321858K > [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) > Metaspace used 121299K, capacity 134343K, committed 134400K, > reserved 135168K > 2017-01-26T12:50:12.170+0100: 14136.182: [GC (Allocation Failure) > 2017-01-26T12:50:12.170+0100: 14136.182: [ParNew > Desired survivor size 107347968 bytes, new threshold 6 (max 6) > - age 1: 10383048 bytes, 10383048 total > - age 2: 5102856 bytes, 15485904 total > - age 3: 5154816 bytes, 20640720 total > - age 4: 5080000 bytes, 25720720 total > *- age 5: 37637680 bytes, 63358400 total* > - age 6: 4658912 bytes, 68017312 total > : 1768946K->86544K(1887488K), 0.0929344 secs] > 9090805K->7411133K(36861184K), 0.0932244 secs] [Times: user=0.70 sys=0.00, > real=0.09 secs] > Heap after GC invocations=1725 (full 0): > par new generation total 1887488K, used 86544K [0x00007fccf1600000, > 0x00007fcd71600000, 0x00007fcd71600000) > eden space 1677824K, 0% used [0x00007fccf1600000, 0x00007fccf1600000, > 0x00007fcd57c80000) > from space 209664K, 41% used [0x00007fcd64940000, 0x00007fcd69dc41d0, > 0x00007fcd71600000) > to space 209664K, 0% used [0x00007fcd57c80000, 0x00007fcd57c80000, > 0x00007fcd64940000) > concurrent mark-sweep generation total 34973696K, used 7324589K > [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) > Metaspace used 121299K, capacity 134343K, committed 134400K, > reserved 135168K > } > 2017-01-26T12:50:12.263+0100: 14136.276: Total time for which application > threads were stopped: 0.0945634 seconds, Stopping threads took: 0.0001968 > seconds > 2017-01-26T12:50:12.960+0100: 14136.972: Application time: 0.6966358 > seconds > {Heap before GC invocations=1725 (full 0): > par new generation total 1887488K, used 1764368K [0x00007fccf1600000, > 0x00007fcd71600000, 0x00007fcd71600000) > eden space 1677824K, 100% used [0x00007fccf1600000, 0x00007fcd57c80000, > 0x00007fcd57c80000) > from space 209664K, 41% used [0x00007fcd64940000, 0x00007fcd69dc41d0, > 0x00007fcd71600000) > to space 209664K, 0% used [0x00007fcd57c80000, 0x00007fcd57c80000, > 0x00007fcd64940000) > concurrent mark-sweep generation total 34973696K, used 7324589K > [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) > Metaspace used 121324K, capacity 134471K, committed 134656K, > reserved 135168K > 2017-01-26T12:50:12.961+0100: 14136.973: [GC (Allocation Failure) > 2017-01-26T12:50:12.961+0100: 14136.973: [ParNew > Desired survivor size 107347968 bytes, new threshold 6 (max 6) > - age 1: 8033264 bytes, 8033264 total > - age 2: 5686168 bytes, 13719432 total > - age 3: 5019640 bytes, 18739072 total > - age 4: 5150920 bytes, 23889992 total > - age 5: 5076720 bytes, 28966712 total > *- age 6: 37481736 bytes, 66448448 total* > : 1764368K->79984K(1887488K), 0.0955902 secs] > 9088957K->7407366K(36861184K), 0.0958643 secs] 
[Times: user=0.69 sys=0.00, > real=0.10 secs] > Heap after GC invocations=1726 (full 0): > par new generation total 1887488K, used 79984K [0x00007fccf1600000, > 0x00007fcd71600000, 0x00007fcd71600000) > eden space 1677824K, 0% used [0x00007fccf1600000, 0x00007fccf1600000, > 0x00007fcd57c80000) > from space 209664K, 38% used [0x00007fcd57c80000, 0x00007fcd5ca9c148, > 0x00007fcd64940000) > to space 209664K, 0% used [0x00007fcd64940000, 0x00007fcd64940000, > 0x00007fcd71600000) > concurrent mark-sweep generation total 34973696K, used 7327382K > [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) > Metaspace used 121324K, capacity 134471K, committed 134656K, > reserved 135168K > } > 2017-01-26T12:50:13.057+0100: 14137.069: Total time for which application > threads were stopped: 0.0972200 seconds, Stopping threads took: 0.0001917 > seconds > 2017-01-26T12:50:13.683+0100: 14137.695: Application time: 0.6259722 > seconds > {Heap before GC invocations=1726 (full 0): > par new generation total 1887488K, used 1757808K [0x00007fccf1600000, > 0x00007fcd71600000, 0x00007fcd71600000) > eden space 1677824K, 100% used [0x00007fccf1600000, 0x00007fcd57c80000, > 0x00007fcd57c80000) > from space 209664K, 38% used [0x00007fcd57c80000, 0x00007fcd5ca9c148, > 0x00007fcd64940000) > to space 209664K, 0% used [0x00007fcd64940000, 0x00007fcd64940000, > 0x00007fcd71600000) > concurrent mark-sweep generation total 34973696K, used 7327382K > [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) > Metaspace used 121324K, capacity 134471K, committed 134656K, > reserved 135168K > *2017-01-26T12:50:13.684+0100: 14137.697: [GC (Allocation Failure) > 2017-01-26T12:50:13.684+0100: 14137.697: [ParNew* > *Desired survivor size 107347968 bytes, new threshold 6 (max 6)* > *- age 1: 10784424 bytes, 10784424 total* > *- age 2: 5148032 bytes, 15932456 total* > *- age 3: 5607232 bytes, 21539688 total* > *- age 4: 5013024 bytes, 26552712 total* > *- age 5: 5148840 bytes, 31701552 total* > *- age 6: 4839808 bytes, 36541360 total* > *: 1757808K->58357K(1887488K), 7.4626505 secs] > 9085190K->7420330K(36861184K), 7.4629090 secs] [Times: user=58.63 sys=0.00, > real=7.47 secs] * > Heap after GC invocations=1727 (full 0): > par new generation total 1887488K, used 58357K [0x00007fccf1600000, > 0x00007fcd71600000, 0x00007fcd71600000) > eden space 1677824K, 0% used [0x00007fccf1600000, 0x00007fccf1600000, > 0x00007fcd57c80000) > from space 209664K, 27% used [0x00007fcd64940000, 0x00007fcd6823d650, > 0x00007fcd71600000) > to space 209664K, 0% used [0x00007fcd57c80000, 0x00007fcd57c80000, > 0x00007fcd64940000) > concurrent mark-sweep generation total 34973696K, used 7361973K > [0x00007fcd71600000, 0x00007fd5c8000000, 0x00007fd5c8000000) > Metaspace used 121324K, capacity 134471K, committed 134656K, > reserved 135168K > } > *2017-01-26T12:50:21.147+0100: 14145.160: Total time for which application > threads were stopped: 7.4642882 seconds, Stopping threads took: 0.0002572 > seconds* > > > Java HotSpot(TM) 64-Bit Server VM (25.112-b15) for linux-amd64 JRE > (1.8.0_112-b15), built on Sep 22 2016 21:10:53 by "java_re" with gcc 4.3.0 > 20080428 (Red Hat 4.3.0-8) > Memory: 4k page, physical 49427048k(42773024k free), swap > 4194300k(4194300k free) > -XX:+AlwaysPreTouch > -XX:+CMSEdenChunksRecordAlways > -XX:CMSInitiatingOccupancyFraction=80 > -XX:+CMSParallelInitialMarkEnabled > -XX:+CMSScavengeBeforeRemark > -XX:CMSWaitDuration=60000 > -XX:+DisableExplicitGC > -XX:GCLogFileSize=31457280 > -XX:InitialHeapSize=37959499776 > 
-XX:MaxHeapSize=37959499776 > -XX:MaxMetaspaceSize=268435456 > -XX:MaxNewSize=2147483648 > -XX:MaxTenuringThreshold=6 > -XX:MetaspaceSize=268435456 > -XX:NewSize=2147483648 > -XX:+UseBiasedLocking > -XX:+UseCMSInitiatingOccupancyOnly > -XX:-UseCompressedOops > -XX:+UseConcMarkSweepGC > -XX:+UseGCLogFileRotation > -XX:+UseLargePages > -XX:+UseParNewGC > > > > Best Regards, > Gustav ?kesson > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amit.mishra at redknee.com Mon Feb 6 09:17:52 2017 From: amit.mishra at redknee.com (Amit Mishra) Date: Mon, 6 Feb 2017 09:17:52 +0000 Subject: Need help on G1 GC young gen Update RS and Scan RS pause reduction In-Reply-To: <4712011A-5C3D-471B-A238-DF1A08267DDC@oracle.com> References: <1484852604.6579.27.camel@oracle.com> <1485244966.2883.8.camel@oracle.com> <1485338858.3625.42.camel@oracle.com> <1486126913.2892.31.camel@oracle.com> <4712011A-5C3D-471B-A238-DF1A08267DDC@oracle.com> Message-ID: Hi Charlie, This is solaris OS and there is flag argv[46]: -XX:+UseLargePages which is enabled, do you recommend to disable it. Also per Thomas suggestion I am trying to figure out Java 1.8 compatible with our Apps and will use that in further testing. Thanks, Amit Mishra From: charlie hunt [mailto:charlie.hunt at oracle.com] Sent: Saturday, February 4, 2017 00:12 To: Thomas Schatzl Cc: Amit Mishra ; hotspot-gc-use at openjdk.java.net Subject: Re: Need help on G1 GC young gen Update RS and Scan RS pause reduction > ?- from one gc to another, just for these four gcs, sys time is relatively high.? Assuming this is on Linux ? perhaps double check that THP (transparent huge pages) is disabled. charlie On Feb 3, 2017, at 7:01 AM, Thomas Schatzl > wrote: Hi Amit, On Fri, 2017-02-03 at 11:09 +0000, Amit Mishra wrote: Hi Thomas/team, I have put all parameters as per your suggestion but somehow the minor gc pauses are still haunting. Attaching GC logs. bash-3.2$ grep -i young gcstats.log.10636|cut -d, -f2|awk -F" " '{print $1}'|awk '$1 > 1' 1.1273134 1.1683221 3.5504848 5.2693987 looking at these log entries, there seems to be something going on that seems outside of VM control: - from one gc to another, just for these four gcs, sys time is relatively high. - for the last two occurrences, at least one thread is hanging in "Ext Root Scanning" for almost all of the gc time for no obvious reason. - there do not seem to be an unusually large amount of changes in the amount of work done in the particular phases that would raise immediate concerns to me. Please try to find out the source of the high sys time and maybe even what causes it. I can't help a lot in that area, but dtrace seems a good starting point as suggested earlier. I think we went through most obvious tunings now, but maybe somebody else has more ideas. I don't at this time. The jdk (7u45) you are using is also very old, so even if we find that there is something wrong with g1 in particular, I kind of doubt there are many more useful knobs to turn with that version (or even appropriate logging to find out about the actual issue). Since 7u45 release, there have been hundreds of changes that in particular improve G1 performance, so please consider upgrading to something more recent (at least latest 8u, preferably to me some test runs with 9ea). Upgrading alone might already help. 
Thanks, Thomas _______________________________________________ hotspot-gc-use mailing list hotspot-gc-use at openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlie.hunt at oracle.com Tue Feb 7 01:51:01 2017 From: charlie.hunt at oracle.com (charlie hunt) Date: Mon, 6 Feb 2017 19:51:01 -0600 Subject: Need help on G1 GC young gen Update RS and Scan RS pause reduction In-Reply-To: References: <1484852604.6579.27.camel@oracle.com> <1485244966.2883.8.camel@oracle.com> <1485338858.3625.42.camel@oracle.com> <1486126913.2892.31.camel@oracle.com> <4712011A-5C3D-471B-A238-DF1A08267DDC@oracle.com> Message-ID: No, if this is Solaris, continue to use large pages. There are no known issues with using large pages on Solaris (SPARC or x86/x64). Charlie > On Feb 6, 2017, at 3:17 AM, Amit Mishra wrote: > > Hi Charlie, > > This is solaris OS and there is flag argv[46]: -XX:+UseLargePages which is enabled, do you recommend to disable it. > > Also per Thomas suggestion I am trying to figure out Java 1.8 compatible with our Apps and will use that in further testing. > > Thanks, > Amit Mishra > > From: charlie hunt [mailto:charlie.hunt at oracle.com] > Sent: Saturday, February 4, 2017 00:12 > To: Thomas Schatzl > Cc: Amit Mishra ; hotspot-gc-use at openjdk.java.net > Subject: Re: Need help on G1 GC young gen Update RS and Scan RS pause reduction > > > ?- from one gc to another, just for these four gcs, sys time is relatively high.? > > Assuming this is on Linux ? perhaps double check that THP (transparent huge pages) is disabled. > > charlie > > On Feb 3, 2017, at 7:01 AM, Thomas Schatzl wrote: > > Hi Amit, > > On Fri, 2017-02-03 at 11:09 +0000, Amit Mishra wrote: > > Hi Thomas/team, > > > I have put all parameters as per your suggestion but somehow the > minor gc pauses are still haunting. > > Attaching GC logs. > > > bash-3.2$ grep -i young gcstats.log.10636|cut -d, -f2|awk -F" " > '{print $1}'|awk '$1 > 1' > 1.1273134 > 1.1683221 > 3.5504848 > 5.2693987 > > looking at these log entries, there seems to be something going on > that seems outside of VM control: > > - from one gc to another, just for these four gcs, sys time is > relatively high. > > - for the last two occurrences, at least one thread is hanging in "Ext > Root Scanning" for almost all of the gc time for no obvious reason. > > - there do not seem to be an unusually large amount of changes in the > amount of work done in the particular phases that would raise immediate > concerns to me. > > Please try to find out the source of the high sys time and maybe even > what causes it. I can't help a lot in that area, but dtrace seems a > good starting point as suggested earlier. > > I think we went through most obvious tunings now, but maybe somebody else has more ideas. I don't at this time. > > The jdk (7u45) you are using is also very old, so even if we find that > there is something wrong with g1 in particular, I kind of doubt there > are many more useful knobs to turn with that version (or even > appropriate logging to find out about the actual issue). Since 7u45 > release, there have been hundreds of changes that in particular improve > G1 performance, so please consider upgrading to something more recent > (at least latest 8u, preferably to me some test runs with 9ea). > Upgrading alone might already help. 
> > Thanks, > Thomas > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amit.balode at gmail.com Tue Feb 7 13:36:59 2017 From: amit.balode at gmail.com (Amit Balode) Date: Tue, 7 Feb 2017 19:06:59 +0530 Subject: How to find fragmented space in G1 regions Message-ID: Any thoughts on how we could find how much space is fragmented in G1 regions? -- Thanks & Regards, Amit.Balode -------------- next part -------------- An HTML attachment was scrubbed... URL: From prasanna.gopal at blackrock.com Wed Feb 8 12:25:49 2017 From: prasanna.gopal at blackrock.com (Gopal, Prasanna CWK) Date: Wed, 8 Feb 2017 12:25:49 +0000 Subject: G1 Region size info Message-ID: <46dc58914a7c40f991bcdbd3f023a85e@UKPMSEXD202N02.na.blkint.com> Hi All I am trying to understand the region size info provided in of our application?s GC log file. We are running an application with the following configuration -Xmx7G -Xms7G -XX:+UseG1GC -XX:+UnlockExperimentalVMOptions -XX:InitiatingHeapOccupancyPercent=60 -XX:G1ReservePercent=20 -XX:G1ReservePercent=20 -XX:G1HeapRegionSize=32M -XX:G1MixedGCLiveThresholdPercent=85 -XX:MaxGCPauseMillis=500 -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy -XX:+PrintHeapAtGC -XX:+PrintReferenceGC -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime JDK : jdk_7u40_x64 ( Yes we need to move to latest JDK) At the end of the region size info, we have following summary ### SUMMARY capacity: 352.00 MB used: 328.94 MB / 93.45 % prev-live: 138.65 MB / 39.39 % next-live: 0.00 MB / 0.00 % 4283M->4121M(6144M), 0.0066530 secs] [Times: user=0.04 sys=0.00, real=0.01 secs] 2017-02-08T00:00:59.239-0500: 223238.232: Total time for which application threads were stopped: 0.0075650 seconds 2017-02-08T00:00:59.239-0500: 223238.232: [GC concurrent-cleanup-start] 2017-02-08T00:00:59.239-0500: 223238.232: [GC concurrent-cleanup-end, 0.0000950 secs] 2017-02-08T00:01:00.003-0500: 223238.996: Application time: 0.7639680 seconds 2017-02-08T00:01:00.005-0500: 223238.999: Total time for which application threads were stopped: 0.0025640 seconds 2017-02-08T00:01:05.024-0500: 223244.017: Application time: 5.0186280 seconds Does this mean, our heap occupancy is only 352 MB after Post-Sorting phase ?. It doesn?t co-relate with the information provided at thd end of GC clean up phase (4283M->4121M(6144M), 0.0066530 secs) , which say the heap size is 4121M. And subsequent Young GC shows the following heap composition [Eden: 992.0M(992.0M)->0.0B(1088.0M) Survivors: 64.0M->32.0M Heap: 4538.0M(6144.0M)->3516.9M(6144.0M)] Could you please help me in understanding the Summary information provided in Post sorting phase ? . Please let me know ,if you need any more information. Regards Prasanna Region size info ============= region size 32768K, 2 young (65536K), 2 survivors (65536K) compacting perm gen total 98304K, used 85713K [0x00000007c0000000, 0x00000007c6000000, 0x0000000800000000) the space 98304K, 87% used [0x00000007c0000000, 0x00000007c53b4400, 0x00000007c53b4400, 0x00000007c6000000) No shared spaces configured. 
} [Times: user=0.09 sys=0.00, real=0.02 secs] 2017-02-08T00:00:55.746-0500: 223234.739: [GC concurrent-root-region-scan-start] 2017-02-08T00:00:55.746-0500: 223234.739: Total time for which application threads were stopped: 0.0145960 seconds 2017-02-08T00:00:55.748-0500: 223234.742: [GC concurrent-root-region-scan-end, 0.0025760 secs] 2017-02-08T00:00:55.748-0500: 223234.742: [GC concurrent-mark-start] 2017-02-08T00:00:59.211-0500: 223238.204: [GC concurrent-mark-end, 3.4626220 secs] 2017-02-08T00:00:59.211-0500: 223238.204: Application time: 3.4652490 seconds 2017-02-08T00:00:59.212-0500: 223238.205: [GC remark 2017-02-08T00:00:59.213-0500: 223238.206: [GC ref-proc2017-02-08T00:00:59.213-0500: 223238.206: [SoftReference, 986 refs, 0.0007480 secs]2017-02-08T00:00:59.213-0500: 223238.207: [WeakReference, 547 refs, 0.0004050 secs]2017-02-08T00:00:59.214-0500: 223238.207: [FinalReference, 1 refs, 0.0002350 secs]2017-02-08T00:00:59.214-0500: 223238.207: [PhantomReference, 6 refs, 0.0002730 secs]2017-02-08T00:00:59.214-0500: 223238.208: [JNI Weak Reference, 0.0000510 secs], 0.0018440 secs], 0.0189520 secs] [Times: user=0.05 sys=0.00, real=0.02 secs] 2017-02-08T00:00:59.231-0500: 223238.225: Total time for which application threads were stopped: 0.0202670 seconds 2017-02-08T00:00:59.231-0500: 223238.225: Application time: 0.0001010 seconds 2017-02-08T00:00:59.232-0500: 223238.226: [GC cleanup ### PHASE Post-Marking @ 223238.226 ### HEAP committed: 0x0000000640000000-0x00000007c0000000 reserved: 0x0000000640000000-0x00000007c0000000 region-size: 33554432 ### ### type address-range used prev-live next-live gc-eff ### (bytes) (bytes) (bytes) (bytes/ms) ### OLD 0x0000000640000000-0x0000000642000000 32368376 32368376 32366752 223961.6 ### OLD 0x0000000642000000-0x0000000644000000 33554408 33554408 33554408 21493.5 ### OLD 0x0000000644000000-0x0000000646000000 33554424 33554424 33554424 29029.2 ### OLD 0x0000000646000000-0x0000000648000000 33554424 33554424 33554424 13470.1 ### OLD 0x0000000648000000-0x000000064a000000 33554432 33554432 33552752 293134.5 ### OLD 0x000000064a000000-0x000000064c000000 33357104 33357104 33357104 179951.0 ### OLD 0x000000064c000000-0x000000064e000000 33554408 33554408 33554408 525436.5 ### OLD 0x000000064e000000-0x0000000650000000 33554416 33554416 33554416 273501.1 ### OLD 0x0000000650000000-0x0000000652000000 33554432 33554432 33554432 407703.2 ### OLD 0x0000000652000000-0x0000000654000000 33554432 33554432 33554432 70645.8 ### OLD 0x0000000654000000-0x0000000656000000 33554432 33554432 33554432 222200.0 ### OLD 0x0000000656000000-0x0000000658000000 33554392 33554392 33554392 16195314.6 ### OLD 0x0000000658000000-0x000000065a000000 33303600 33303600 33303600 428407.8 ### OLD 0x000000065a000000-0x000000065c000000 33554424 33554424 33554424 153899.4 ### OLD 0x000000065c000000-0x000000065e000000 33554384 33554384 33554384 307229.6 ### OLD 0x000000065e000000-0x0000000660000000 33554416 33554416 33554416 10610494.9 ### OLD 0x0000000660000000-0x0000000662000000 33554416 33554416 33554416 469683.0 ### OLD 0x0000000662000000-0x0000000664000000 33554320 33554320 33554320 1241801.8 ### OLD 0x0000000664000000-0x0000000666000000 33554416 33554416 33554416 371576.7 ### OLD 0x0000000666000000-0x0000000668000000 33554432 33554408 33554408 18876.1 ### OLD 0x0000000668000000-0x000000066a000000 33553800 33553800 33553800 197254.3 ### OLD 0x000000066a000000-0x000000066c000000 32920592 32920592 32920592 261906.8 ### OLD 0x000000066c000000-0x000000066e000000 33554272 33554272 33554272 
263415.0 ### OLD 0x000000066e000000-0x0000000670000000 33466704 33466704 33466704 217406.7 ### OLD 0x0000000670000000-0x0000000672000000 33554432 33554432 33554432 558273.8 ### OLD 0x0000000672000000-0x0000000674000000 33554424 33554424 33521640 471863.4 ### OLD 0x0000000674000000-0x0000000676000000 33554416 33554416 33554416 24962.1 ### OLD 0x0000000676000000-0x0000000678000000 32866480 32866480 32866480 469198.9 ### OLD 0x0000000678000000-0x000000067a000000 33554424 33554424 33554424 1911056.3 ### OLD 0x000000067a000000-0x000000067c000000 33420736 33420736 33420736 405214.5 ### OLD 0x000000067c000000-0x000000067e000000 33554432 33554432 33554432 413762.1 ### OLD 0x000000067e000000-0x0000000680000000 33554144 33554144 33554144 658142.1 ### OLD 0x0000000680000000-0x0000000682000000 33554432 33554432 33554432 561025.9 ### OLD 0x0000000682000000-0x0000000684000000 33554248 33554248 33543904 702743.4 ### OLD 0x0000000684000000-0x0000000686000000 33554416 33554416 33554416 4785040.9 ### OLD 0x0000000686000000-0x0000000688000000 33554360 33554360 33554360 307205.2 ### OLD 0x0000000688000000-0x000000068a000000 33554416 33554416 33554416 791487.3 ### OLD 0x000000068a000000-0x000000068c000000 33554376 33554376 33554376 6024807.6 ?? Remove some for brevity ### HUMS 0x000000079e000000-0x00000007a0000000 33554432 33554432 33554432 501904008.8 ### HUMC 0x00000007a0000000-0x00000007a2000000 2367536 2367536 2367536 2526909.9 ### FREE 0x00000007a2000000-0x00000007a4000000 0 0 0 1018023826.2 ### FREE 0x00000007a4000000-0x00000007a6000000 0 0 0 3621347.9 ### FREE 0x00000007a6000000-0x00000007a8000000 0 0 0 6975354.2 ### FREE 0x00000007a8000000-0x00000007aa000000 0 0 0 6228744.7 ### FREE 0x00000007aa000000-0x00000007ac000000 0 0 0 825510.3 ### FREE 0x00000007ac000000-0x00000007ae000000 0 0 0 3304768.5 ### FREE 0x00000007ae000000-0x00000007b0000000 0 0 0 2701625.1 ### FREE 0x00000007b0000000-0x00000007b2000000 0 0 0 2623840.5 ### FREE 0x00000007b2000000-0x00000007b4000000 0 0 0 6313095.1 ### FREE 0x00000007b4000000-0x00000007b6000000 0 0 0 4588159.0 ### FREE 0x00000007b6000000-0x00000007b8000000 0 0 0 2184971.5 ### FREE 0x00000007b8000000-0x00000007ba000000 0 0 0 592114.6 ### FREE 0x00000007ba000000-0x00000007bc000000 0 0 0 24307488.7 ### FREE 0x00000007bc000000-0x00000007be000000 0 0 0 2283212.1 ### FREE 0x00000007be000000-0x00000007c0000000 0 0 0 627031.8 ### ### SUMMARY capacity: 6144.00 MB used: 4283.18 MB / 69.71 % prev-live: 4057.87 MB / 66.05 % next-live: 3896.51 MB / 63.42 % ### PHASE Post-Sorting @ 223238.232 ### HEAP committed: 0x0000000640000000-0x00000007c0000000 reserved: 0x0000000640000000-0x00000007c0000000 region-size: 33554432 ### ### type address-range used prev-live next-live gc-eff ### (bytes) (bytes) (bytes) (bytes/ms) ### OLD 0x0000000736000000-0x0000000738000000 33554432 2166312 0 87829788.1 ### OLD 0x000000073c000000-0x000000073e000000 33554432 6048552 0 26417453.7 ### OLD 0x000000073a000000-0x000000073c000000 33554432 13552368 0 8930752.6 ### OLD 0x00000006fe000000-0x0000000700000000 32982728 15474224 0 6200283.7 ### OLD 0x0000000738000000-0x000000073a000000 33554432 17669672 0 5731058.5 ### OLD 0x00000006f0000000-0x00000006f2000000 33232096 5230584 0 5595398.5 ### OLD 0x00000006f2000000-0x00000006f4000000 33554384 8938848 0 5331392.0 ### OLD 0x0000000706000000-0x0000000708000000 10273584 10273432 0 3376131.8 ### OLD 0x00000006ee000000-0x00000006f0000000 33554392 18481472 0 1231822.2 ### OLD 0x00000006f6000000-0x00000006f8000000 33554432 22116128 0 774360.6 ### OLD 
0x0000000702000000-0x0000000704000000 33554424 25431376 0 520416.9 ### ### SUMMARY capacity: 352.00 MB used: 328.94 MB / 93.45 % prev-live: 138.65 MB / 39.39 % next-live: 0.00 MB / 0.00 % 4283M->4121M(6144M), 0.0066530 secs] [Times: user=0.04 sys=0.00, real=0.01 secs] 2017-02-08T00:00:59.239-0500: 223238.232: Total time for which application threads were stopped: 0.0075650 seconds 2017-02-08T00:00:59.239-0500: 223238.232: [GC concurrent-cleanup-start] 2017-02-08T00:00:59.239-0500: 223238.232: [GC concurrent-cleanup-end, 0.0000950 secs] 2017-02-08T00:01:00.003-0500: 223238.996: Application time: 0.7639680 seconds 2017-02-08T00:01:00.005-0500: 223238.999: Total time for which application threads were stopped: 0.0025640 seconds 2017-02-08T00:01:05.024-0500: 223244.017: Application time: 5.0186280 seconds This message may contain information that is confidential or privileged. If you are not the intended recipient, please advise the sender immediately and delete this message. See http://www.blackrock.com/corporate/en-us/compliance/email-disclaimers for further information. Please refer to http://www.blackrock.com/corporate/en-us/compliance/privacy-policy for more information about BlackRock?s Privacy Policy. BlackRock Advisors (UK) Limited and BlackRock Investment Management (UK) Limited are authorised and regulated by the Financial Conduct Authority. Registered in England No. 796793 and No. 2020394 respectively. BlackRock Life Limited is authorised by the Prudential Regulation Authority and regulated by the Financial Conduct Authority and the Prudential Regulation Authority. Registered in England No. 2223202. Registered Offices: 12 Throgmorton Avenue, London EC2N 2DL. BlackRock International Limited is authorised and regulated by the Financial Conduct Authority and is a registered investment adviser with the Securities and Exchange Commission (SEC). Registered in Scotland No. SC160821. Registered Office: Exchange Place One, 1 Semple Street, Edinburgh EH3 8BL. For a list of BlackRock's office addresses worldwide, see http://www.blackrock.com/corporate/en-us/about-us/contacts-locations. ? 2017 BlackRock, Inc. All rights reserved. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Wed Feb 8 15:31:58 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 08 Feb 2017 16:31:58 +0100 Subject: How to find fragmented space in G1 regions In-Reply-To: References: Message-ID: <1486567918.3510.30.camel@oracle.com> Hi, On Tue, 2017-02-07 at 19:06 +0530, Amit Balode wrote: > Any thoughts on how we could find how much space is fragmented in G1 > regions? ? -XX:+G1PrintRegionLivenessInfo prints region information containing its type after every marking (use gc+liveness=trace for jdk9) twice. The first "post-marking" output is what you want. The "post-sorting" one is only interesting for getting details about the collection set (and this one does not contain free regions). The first column contains information about the type of the region: FREE -> free region SURV -> survivor region EDEN -> eden region HUMS -> humongous (start) HUMC -> humongous (continuation) OLD -> old region ARC -> archive regions (jdk9 only I think, not sure, maybe also jdk8) >From that you can deduce the size and how many contiguous free regions are available at the moment. There is also -XX:+PrintHeapAtGC and -XX:+PrintHeapAtGCExtended which print the region layout at every GC. Thanks, ? 
Thomas From amit.balode at gmail.com Wed Feb 8 15:45:59 2017 From: amit.balode at gmail.com (Amit Balode) Date: Wed, 8 Feb 2017 21:15:59 +0530 Subject: How to find fragmented space in G1 regions In-Reply-To: <1486567918.3510.30.camel@oracle.com> References: <1486567918.3510.30.camel@oracle.com> Message-ID: Thanks Thomas, do you know if there is overhead of those flags? I think trace would be expensive but what about others? Will they add anything to pause time? On Wed, Feb 8, 2017 at 9:01 PM, Thomas Schatzl wrote: > Hi, > > On Tue, 2017-02-07 at 19:06 +0530, Amit Balode wrote: > > Any thoughts on how we could find how much space is fragmented in G1 > > regions? > > -XX:+G1PrintRegionLivenessInfo prints region information containing > its type after every marking (use gc+liveness=trace for jdk9) twice. > The first "post-marking" output is what you want. The "post-sorting" > one is only interesting for getting details about the collection set > (and this one does not contain free regions). > > The first column contains information about the type of the region: > > FREE -> free region > SURV -> survivor region > EDEN -> eden region > HUMS -> humongous (start) > HUMC -> humongous (continuation) > OLD -> old region > ARC -> archive regions (jdk9 only I think, not sure, maybe also jdk8) > > From that you can deduce the size and how many contiguous free regions > are available at the moment. > > There is also -XX:+PrintHeapAtGC and -XX:+PrintHeapAtGCExtended which > print the region layout at every GC. > > Thanks, > Thomas > > -- Thanks & Regards, Amit.Balode -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.schatzl at oracle.com Wed Feb 8 15:53:32 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 08 Feb 2017 16:53:32 +0100 Subject: How to find fragmented space in G1 regions In-Reply-To: References: <1486567918.3510.30.camel@oracle.com> Message-ID: <1486569212.3510.43.camel@oracle.com> Hi, On Wed, 2017-02-08 at 21:15 +0530, Amit Balode wrote: > Thanks Thomas, do you know if there is overhead of those flags? I > think trace would be expensive but what about others? Will they add > anything to pause time? ? the bottleneck is 99% writing out the data. The internal per-line calculation overhead is negligible. However the amount of information printed may cause I/O issues. I think since at least?G1PrintRegionLivenessInfo is a diagnostic option, you can turn it completely on and off at runtime. (In JDK9, everything is completely based on the unified logging, you can do that as well). Thanks, ? Thomas From thomas.schatzl at oracle.com Wed Feb 8 16:06:48 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 08 Feb 2017 17:06:48 +0100 Subject: G1 native memory consumption In-Reply-To: <1486138975652.57172@infobip.com> References: <1484943874550.90103@infobip.com> , <1485168079.2811.21.camel@oracle.com> <1486138975652.57172@infobip.com> Message-ID: <1486570008.3510.50.camel@oracle.com> Hi Milan, On Fri, 2017-02-03 at 16:22 +0000, Milan Mimica wrote: > Hi Thomas > > Thanks for your input. I took me a while to have a stable system > again to repeat measurements. > > I have tried setting G1HeapRegionSize to 16M on one instance (8M is > default) and I notice lower GC memory usage: > GC (reserved=1117MB -18MB, committed=1117MB -18MB) > vs > GC (reserved=1604MB +313MB, committed=1604MB +313MB) > > It seems more stable too. 
However, "Internal" is still relatively > high for a 25G heap, and there is no much difference between > instances: > Internal (reserved=2132MB -7MB, committed=2132MB -7MB) I am not sure why there is no difference, it would be nice to have a breakdown on this like in the previous case to rule out other components or not enough warmup. Everything that is allocated via the OtherRegionsTable::add_reference() -> BitMap::resize() path in the figure from the other email is remembered sets, and they _should_ have gone down. You can try to move memory from that path to the CHeapObj operator new one. This results in g1 storing remembered sets in a much more dense but potentially slower to access representation. The switch to turn here is G1RSetSparseRegionEntries. It gives maximum number of cards (small areas, 512 bytes) per region to store in that representation. If it overflows, pretty large bitmaps that might be really sparsely populated are used (that take lots of time). By default it is somewhat like? 4 * (log2(region-size-in-MB + 1) E.g. with 32M region only 24 cards are stored there max. I think you can easily increase this to something like 64 or 128 or even larger. I think (and I am unsure about this, in jdk9 we halved its memory usage) memory usage should be around equal with the bitmaps with 2k entries on 32M regions, so I would stop at something in that area at most. This size need not be a power of two btw. You can try increasing this value significantly and see if it helps with memory consumption without impacting performance too much. Thanks, ? Thomas From thomas.schatzl at oracle.com Wed Feb 8 16:13:22 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 08 Feb 2017 17:13:22 +0100 Subject: G1 Region size info In-Reply-To: <46dc58914a7c40f991bcdbd3f023a85e@UKPMSEXD202N02.na.blkint.com> References: <46dc58914a7c40f991bcdbd3f023a85e@UKPMSEXD202N02.na.blkint.com> Message-ID: <1486570402.3510.55.camel@oracle.com> Hi, On Wed, 2017-02-08 at 12:25 +0000, Gopal, Prasanna CWK wrote: > Hi All > ? > I am trying to understand the region size info provided in of our > application?s GC log file. > ? > We are running an application with the following configuration > ? > -Xmx7G [...] > ? > Does this mean, ?our heap occupancy is only 352 MB after Post-Sorting > phase ?. It doesn?t co-relate with the information provided at thd > end of GC clean up phase (4283M->4121M(6144M), 0.0066530 secs) , > which say the heap size is 4121M. Post-sorting only considers regions that are collection set candidates - i.e. regions that G1 will clean out in the next reclamation phase. I.e. contain lots of garbage. If you think that is too little (g1 not cleaning out enough in mixed gcs), you might want to make evacuation more aggressive. >From the post-marking snippet it seems though that there are not many regions with lots of garbage there anyway though. Post-marking statistics is what you want to look at and compare with. Thanks, ? Thomas From prasanna.gopal at blackrock.com Wed Feb 8 16:28:41 2017 From: prasanna.gopal at blackrock.com (Gopal, Prasanna CWK) Date: Wed, 8 Feb 2017 16:28:41 +0000 Subject: G1 Region size info In-Reply-To: <1486570402.3510.55.camel@oracle.com> References: <46dc58914a7c40f991bcdbd3f023a85e@UKPMSEXD202N02.na.blkint.com> <1486570402.3510.55.camel@oracle.com> Message-ID: ?Hi Thomas Thanks for your reply. 
We have following G1 configuration at the moment -Xmx7G -Xms7G -XX:+UseG1GC -XX:+UnlockExperimentalVMOptions -XX:InitiatingHeapOccupancyPercent=60 -XX:G1ReservePercent=20 -XX:G1ReservePercent=20 -XX:G1HeapRegionSize=32M -XX:G1MixedGCLiveThresholdPercent=85 -XX:MaxGCPauseMillis=500 -XX:+ParallelRefProcEnabled -XX:+PrintAdaptiveSizePolicy -XX:+PrintHeapAtGC -XX:+PrintReferenceGC -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime Apart from these parameters, can we try another parameter to make the evacuation more aggressive? -XX:G1MixedGCLiveThresholdPercent=85 -XX:InitiatingHeapOccupancyPercent=60 ==> we have experimented with less values , it is just making the concurrent cycle without claiming any significant. -XX:MaxGCPauseMillis=500 Our object allocation rate is very high, before increasing the memory can we try any other parameter which can make the evacuation more aggressive? Appreciate your help. Please do let me know, if you need any more information. Regards Prasanna -----Original Message----- From: Thomas Schatzl [mailto:thomas.schatzl at oracle.com] Sent: 08 February 2017 16:13 To: Gopal, Prasanna CWK ; hotspot-gc-use at openjdk.java.net Subject: Re: G1 Region size info Hi, On Wed, 2017-02-08 at 12:25 +0000, Gopal, Prasanna CWK wrote: > Hi All > ? > I am trying to understand the region size info provided in of our > application?s GC log file. > ? > We are running an application with the following configuration > ? > -Xmx7G [...] > ? > Does this mean, ?our heap occupancy is only 352 MB after Post-Sorting > phase ?. It doesn?t co-relate with the information provided at thd end > of GC clean up phase (4283M->4121M(6144M), 0.0066530 secs) , which say > the heap size is 4121M. Post-sorting only considers regions that are collection set candidates - i.e. regions that G1 will clean out in the next reclamation phase. I.e. contain lots of garbage. If you think that is too little (g1 not cleaning out enough in mixed gcs), you might want to make evacuation more aggressive. From the post-marking snippet it seems though that there are not many regions with lots of garbage there anyway though. Post-marking statistics is what you want to look at and compare with. Thanks, ? Thomas This message may contain information that is confidential or privileged. If you are not the intended recipient, please advise the sender immediately and delete this message. See http://www.blackrock.com/corporate/en-us/compliance/email-disclaimers for further information. Please refer to http://www.blackrock.com/corporate/en-us/compliance/privacy-policy for more information about BlackRock?s Privacy Policy. BlackRock Advisors (UK) Limited and BlackRock Investment Management (UK) Limited are authorised and regulated by the Financial Conduct Authority. Registered in England No. 796793 and No. 2020394 respectively. BlackRock Life Limited is authorised by the Prudential Regulation Authority and regulated by the Financial Conduct Authority and the Prudential Regulation Authority. Registered in England No. 2223202. Registered Offices: 12 Throgmorton Avenue, London EC2N 2DL. BlackRock International Limited is authorised and regulated by the Financial Conduct Authority and is a registered investment adviser with the Securities and Exchange Commission (SEC). Registered in Scotland No. SC160821. Registered Office: Exchange Place One, 1 Semple Street, Edinburgh EH3 8BL. 
For a list of BlackRock's office addresses worldwide, see http://www.blackrock.com/corporate/en-us/about-us/contacts-locations. ? 2017 BlackRock, Inc. All rights reserved. From Milan.Mimica at infobip.com Wed Feb 8 16:56:36 2017 From: Milan.Mimica at infobip.com (Milan Mimica) Date: Wed, 8 Feb 2017 16:56:36 +0000 Subject: G1 native memory consumption In-Reply-To: <1486570008.3510.50.camel@oracle.com> References: <1484943874550.90103@infobip.com> ,<1485168079.2811.21.camel@oracle.com> <1486138975652.57172@infobip.com>, <1486570008.3510.50.camel@oracle.com> Message-ID: <1486572996601.68328@infobip.com> Hi Thomas > I am not sure why there is no difference, it would be nice to have a > breakdown on this like in the previous case to rule out other > components or not enough warmup. At least native memory is stable now. That's what I was aiming for. Attached are two graphs. This is after 5 days uptime, high load. With G1HeapRegionSize 16M. Note that the graph is showing alive memory allocations for which allocation happened in last: izd2.svg -- 36 hours izd3.svg -- 10 hours As you can see, not much going on in last 10 hours. That's great! It's stable. Still, native memory usage is relatively high, but that's not a big problem for me. Java Heap (reserved=25600MB, committed=25600MB) Internal (reserved=2144MB, committed=2144MB) GC (reserved=1166MB, committed=1166MB) I'll look at the rest you wrote some later day. Milan Mimica, Senior Software Engineer / Division Lead ________________________________________ From: Thomas Schatzl Sent: Wednesday, February 8, 2017 17:06 To: Milan Mimica; hotspot-gc-use at openjdk.java.net Subject: Re: G1 native memory consumption Hi Milan, On Fri, 2017-02-03 at 16:22 +0000, Milan Mimica wrote: > Hi Thomas > > Thanks for your input. I took me a while to have a stable system > again to repeat measurements. > > I have tried setting G1HeapRegionSize to 16M on one instance (8M is > default) and I notice lower GC memory usage: > GC (reserved=1117MB -18MB, committed=1117MB -18MB) > vs > GC (reserved=1604MB +313MB, committed=1604MB +313MB) > > It seems more stable too. However, "Internal" is still relatively > high for a 25G heap, and there is no much difference between > instances: > Internal (reserved=2132MB -7MB, committed=2132MB -7MB) I am not sure why there is no difference, it would be nice to have a breakdown on this like in the previous case to rule out other components or not enough warmup. Everything that is allocated via the OtherRegionsTable::add_reference() -> BitMap::resize() path in the figure from the other email is remembered sets, and they _should_ have gone down. You can try to move memory from that path to the CHeapObj operator new one. This results in g1 storing remembered sets in a much more dense but potentially slower to access representation. The switch to turn here is G1RSetSparseRegionEntries. It gives maximum number of cards (small areas, 512 bytes) per region to store in that representation. If it overflows, pretty large bitmaps that might be really sparsely populated are used (that take lots of time). By default it is somewhat like 4 * (log2(region-size-in-MB + 1) E.g. with 32M region only 24 cards are stored there max. I think you can easily increase this to something like 64 or 128 or even larger. I think (and I am unsure about this, in jdk9 we halved its memory usage) memory usage should be around equal with the bitmaps with 2k entries on 32M regions, so I would stop at something in that area at most. This size need not be a power of two btw. 
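As a worked example (reading the default above as 4 * (log2(region size in MB) + 1), which is consistent with the 24 cards quoted for 32M regions):

 8M regions: 4 * (3 + 1) = 16 sparse entries per region
16M regions: 4 * (4 + 1) = 20 sparse entries per region
32M regions: 4 * (5 + 1) = 24 sparse entries per region

Each entry covers one 512-byte card, so raising the flag to 64 or 128 as suggested still stays well below the roughly 2k-entry point where the sparse table would cost about as much as the bitmap.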
You can try increasing this value significantly and see if it helps with memory consumption without impacting performance too much. Thanks, Thomas -------------- next part -------------- A non-text attachment was scrubbed... Name: izd2.svg Type: image/svg+xml Size: 47136 bytes Desc: izd2.svg URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: izd3.svg Type: image/svg+xml Size: 92803 bytes Desc: izd3.svg URL: From thomas.schatzl at oracle.com Wed Feb 8 19:28:33 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 08 Feb 2017 20:28:33 +0100 Subject: G1 Region size info In-Reply-To: References: <46dc58914a7c40f991bcdbd3f023a85e@UKPMSEXD202N02.na.blkint.com> <1486570402.3510.55.camel@oracle.com> Message-ID: <1486582113.3853.24.camel@oracle.com> Hi Gopal, On Wed, 2017-02-08 at 16:28 +0000, Gopal, Prasanna CWK wrote: > ?Hi Thomas > > Thanks for your reply. We have following G1 configuration at the > moment? > > [...] > Apart from these parameters, can we try another parameter to make the > evacuation more aggressive?? > It premature to try to make the collection more aggressive if e.g. there is not anything worth to reclaim anyway. > > -XX:G1MixedGCLiveThresholdPercent=85??? You could increase that. Look at your "post-marking" output and see if there would be a significant additional amount of space to be reclaimed. Be aware that evacuating e.g. 90% full regions might be slow (and you will only ever get 10% back). Another option would be decreasing G1HeapWastePercent (not sure what the default is, but it is pretty low already iirc), which would more thoroughly clean out the collection set. Also being more aggressively evacuating may not help e.g. for problems with humongous objects/region fragmentation. If there is a lot of unusable memory at the end of humongous objects (check the "post-marking" output) actually decreasing region size might help. Eg. ###???HUMS 0x000000079e000000- 0x00000007a0000000???33554432???33554432???33554432?????501904008.8 ###???HUMC 0x00000007a0000000- 0x00000007a2000000????2367536????2367536????2367536???????2526909.9 indicates that that humongous object basically wastes 31M out of 64M, which is really bad if there are more of those hanging around. I do not see any good solution with g1 on 7u other than increasing the heap if that large a region size is necessary. If these humongous objects are short-lived (and do not have j.l.O. elements), then upgrading to 8u/9 may help (i.e. if eager reclaim can clean out large objects regularly and asap). Btw, the log also indicates?4121M out of 6144M of live data (around 3800M after hypothetically cleaning out all of old gen). This amount of live data may already beyond the comfort zone of most collectors. Only Jdk9 improves a bit in these situations, but not sure if the changes apply here. Not sure if decreasing heap region size will help a lot either as the heap is already relatively full. > -XX:InitiatingHeapOccupancyPercent=60 ==> we have experimented with > less values , it is just making the concurrent cycle without claiming > any significant.? Actually even 60% seems to much. If your average live set size is at 61% already like in the log, G1 already runs marking all the time. > -XX:MaxGCPauseMillis=500 > > Our object allocation rate is very high, before increasing the memory > can we try any other parameter which can make the evacuation more > aggressive? Appreciate your help. Please do let me know, if you need > any more information.? Thanks, ? 
Thomas From willb at eero.com Tue Feb 21 18:52:36 2017 From: willb at eero.com (Will Bertelsen) Date: Tue, 21 Feb 2017 10:52:36 -0800 Subject: Native memory leak in StringTable::intern using G1 Message-ID: Hi All, I've been experimenting with G1 in production and have noticed a large native memory leak that eventually exhausts all memory on the system. I ran it overnight with NMT enabled and this was the biggest offender: [0x00007f86c31cf205] Hashtable::new_entry(unsigned int, oopDesc*)+0x165 [0x00007f86c35dd263] StringTable::basic_add(int, Handle, unsigned short*, int, unsigned int, Thread*)+0xd3 [0x00007f86c35dd452] StringTable::intern(Handle, unsigned short*, int, Thread*)+0x182 [0x00007f86c35dd921] StringTable::intern(oopDesc*, Thread*)+0x131 (malloc=2628520KB +2601784KB #328565 +325223) Has anyone seen this before? Here is my java version and gc settings: java version "1.8.0_45" Java(TM) SE Runtime Environment (build 1.8.0_45-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode) -Xmx16384M -Xms16384M -XX:+AggressiveOpts -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC -XX:G1HeapRegionSize=8M -XX:G1NewSizePercent=20 -XX:G1MaxNewSizePercent=80 -XX:MaxGCPauseMillis=250 -XX:MaxMetaspaceSize=1G -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.zhang at oracle.com Wed Feb 22 19:04:36 2017 From: yu.zhang at oracle.com (yu.zhang at oracle.com) Date: Wed, 22 Feb 2017 11:04:36 -0800 Subject: Native memory leak in StringTable::intern using G1 In-Reply-To: References: Message-ID: <5a6265b5-9e96-1427-845b-2bf508d1edd2@oracle.com> Will, Does your application generate a lot of interned string? Another way to confirm is with jmap -heap 'interned Stings' should be printed. Did full gc happen during the run? Thanks Jenny On 02/21/2017 10:52 AM, Will Bertelsen wrote: > Hi All, > > I've been experimenting with G1 in production and have noticed a large > native memory leak that eventually exhausts all memory on the system. > I ran it overnight with NMT enabled and this was the biggest offender: > > [0x00007f86c31cf205] Hashtable (MemoryType)9>::new_entry(unsigned int, oopDesc*)+0x165 > [0x00007f86c35dd263] StringTable::basic_add(int, Handle, unsigned > short*, int, unsigned int, Thread*)+0xd3 > [0x00007f86c35dd452] StringTable::intern(Handle, unsigned short*, int, > Thread*)+0x182 > [0x00007f86c35dd921] StringTable::intern(oopDesc*, Thread*)+0x131 > (malloc=2628520KB +2601784KB #328565 +325223) > > Has anyone seen this before? > > Here is my java version and gc settings: > > java version "1.8.0_45" > Java(TM) SE Runtime Environment (build 1.8.0_45-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode) > > -Xmx16384M > -Xms16384M > -XX:+AggressiveOpts > -XX:+UnlockExperimentalVMOptions > > -XX:+UseG1GC > -XX:G1HeapRegionSize=8M > -XX:G1NewSizePercent=20 > -XX:G1MaxNewSizePercent=80 > > -XX:MaxGCPauseMillis=250 > -XX:MaxMetaspaceSize=1G > > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From willb at eero.com Wed Feb 22 21:35:01 2017 From: willb at eero.com (Will Bertelsen) Date: Wed, 22 Feb 2017 13:35:01 -0800 Subject: Native memory leak in StringTable::intern using G1 In-Reply-To: <5a6265b5-9e96-1427-845b-2bf508d1edd2@oracle.com> References: <5a6265b5-9e96-1427-845b-2bf508d1edd2@oracle.com> Message-ID: My application doesn't explicitly intern anything, though our libraries might. However, when running jmap as you suggested no interned strings are reported. And no. Full GC never occurred in the 2 or so days we ran G1 before the OS killed our proc due to system memory exhaustion. Here is the output of jmap: JVM version is 25.45-b02 using thread-local object allocation. Garbage-First (G1) GC with 13 thread(s) Heap Configuration: MinHeapFreeRatio = 40 MaxHeapFreeRatio = 70 MaxHeapSize = 17179869184 (16384.0MB) NewSize = 1363144 (1.2999954223632812MB) MaxNewSize = 13740539904 (13104.0MB) OldSize = 5452592 (5.1999969482421875MB) NewRatio = 2 SurvivorRatio = 8 MetaspaceSize = 21807104 (20.796875MB) CompressedClassSpaceSize = 1073741824 (1024.0MB) MaxMetaspaceSize = 1073741824 (1024.0MB) G1HeapRegionSize = 8388608 (8.0MB) Heap Usage: G1 Heap: regions = 2048 capacity = 17179869184 (16384.0MB) used = 7899518456 (7533.5678634643555MB) free = 9280350728 (8850.432136535645MB) 45.98124916665256% used G1 Young Generation: Eden Space: regions = 518 capacity = 7700742144 (7344.0MB) used = 4345298944 (4144.0MB) free = 3355443200 (3200.0MB) 56.42701525054466% used Survivor Space: regions = 58 capacity = 486539264 (464.0MB) used = 486539264 (464.0MB) free = 0 (0.0MB) 100.0% used G1 Old Generation: regions = 367 capacity = 8992587776 (8576.0MB) used = 3067680248 (2925.5678634643555MB) free = 5924907528 (5650.4321365356445MB) 34.11343124375414% used On Wed, Feb 22, 2017 at 11:04 AM, yu.zhang at oracle.com wrote: > Will, > > Does your application generate a lot of interned string? > > Another way to confirm is with jmap -heap 'interned Stings' should > be printed. Did full gc happen during the run? > > Thanks > > Jenny > > On 02/21/2017 10:52 AM, Will Bertelsen wrote: > > Hi All, > > I've been experimenting with G1 in production and have noticed a large > native memory leak that eventually exhausts all memory on the system. I ran > it overnight with NMT enabled and this was the biggest offender: > > [0x00007f86c31cf205] Hashtable::new_entry(unsigned > int, oopDesc*)+0x165 > [0x00007f86c35dd263] StringTable::basic_add(int, Handle, unsigned short*, > int, unsigned int, Thread*)+0xd3 > [0x00007f86c35dd452] StringTable::intern(Handle, unsigned short*, int, > Thread*)+0x182 > [0x00007f86c35dd921] StringTable::intern(oopDesc*, Thread*)+0x131 > (malloc=2628520KB +2601784KB #328565 +325223) > > Has anyone seen this before? > > Here is my java version and gc settings: > > java version "1.8.0_45" > Java(TM) SE Runtime Environment (build 1.8.0_45-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode) > > -Xmx16384M > -Xms16384M > -XX:+AggressiveOpts > -XX:+UnlockExperimentalVMOptions > > -XX:+UseG1GC > -XX:G1HeapRegionSize=8M > -XX:G1NewSizePercent=20 > -XX:G1MaxNewSizePercent=80 > > -XX:MaxGCPauseMillis=250 > -XX:MaxMetaspaceSize=1G > > > _______________________________________________ > hotspot-gc-use mailing listhotspot-gc-use at openjdk.java.nethttp://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From thomas.schatzl at oracle.com Thu Feb 23 11:53:33 2017 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 23 Feb 2017 12:53:33 +0100 Subject: Native memory leak in StringTable::intern using G1 In-Reply-To: References: Message-ID: <1487850813.7074.31.camel@oracle.com> Hi, On Tue, 2017-02-21 at 10:52 -0800, Will Bertelsen wrote: > Hi All, > > I've been experimenting with G1 in production and have noticed a > large native memory leak that eventually exhausts all memory on the > system. I ran it overnight with NMT enabled and this was the biggest > offender: > > [0x00007f86c31cf205] Hashtable (MemoryType)9>::new_entry(unsigned int, oopDesc*)+0x165 > [0x00007f86c35dd263] StringTable::basic_add(int, Handle, unsigned > short*, int, unsigned int, Thread*)+0xd3 > [0x00007f86c35dd452] StringTable::intern(Handle, unsigned short*, > int, Thread*)+0x182 > [0x00007f86c35dd921] StringTable::intern(oopDesc*, Thread*)+0x131 > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?(malloc=2628520KB +2601784KB #328565 > +325223) > > > Has anyone seen this before? This is just a workaround, but if the application runs for so long that it never does a full gc or do a marking cycle (did it?), you could manually trigger string table cleanup by issuing a system.gc with jmap now and then. If you set -XX:+ExplicitGCInvokesConcurrent, it will not be a stop-the- world gc. There is no equivalent to CMSTriggerInterval in G1 which starts a regular concurrent collection cycle every now and then (which is basically the same band-aid). > Here is my java version and gc settings: > > java version "1.8.0_45" > Java(TM) SE Runtime Environment (build 1.8.0_45-b14) > Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode) You may want to update. While looking for something similar in the bug tracker, I found e.g.?https://bugs.openjdk.java.net/browse/JDK-8133193? fixed in 8u72. Thanks, ? Thomas From willb at eero.com Fri Feb 24 23:44:26 2017 From: willb at eero.com (Will Bertelsen) Date: Fri, 24 Feb 2017 15:44:26 -0800 Subject: Native memory leak in StringTable::intern using G1 In-Reply-To: <1487850813.7074.31.camel@oracle.com> References: <1487850813.7074.31.camel@oracle.com> Message-ID: No, in this configuration it only did young and mixed GCs before it was killed by the system. I've fallen back to CMS for now, but when we upgrade java I might try G1 again to see if this is resolved. On Thu, Feb 23, 2017 at 3:53 AM, Thomas Schatzl wrote: > Hi, > > On Tue, 2017-02-21 at 10:52 -0800, Will Bertelsen wrote: > > Hi All, > > > > I've been experimenting with G1 in production and have noticed a > > large native memory leak that eventually exhausts all memory on the > > system. I ran it overnight with NMT enabled and this was the biggest > > offender: > > > > [0x00007f86c31cf205] Hashtable > (MemoryType)9>::new_entry(unsigned int, oopDesc*)+0x165 > > [0x00007f86c35dd263] StringTable::basic_add(int, Handle, unsigned > > short*, int, unsigned int, Thread*)+0xd3 > > [0x00007f86c35dd452] StringTable::intern(Handle, unsigned short*, > > int, Thread*)+0x182 > > [0x00007f86c35dd921] StringTable::intern(oopDesc*, Thread*)+0x131 > > (malloc=2628520KB +2601784KB #328565 > > +325223) > > > > > > Has anyone seen this before? > > This is just a workaround, but if the application runs for so long that > it never does a full gc or do a marking cycle (did it?), you could > manually trigger string table cleanup by issuing a system.gc with jmap > now and then. > If you set -XX:+ExplicitGCInvokesConcurrent, it will not be a stop-the- > world gc. 
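A minimal in-process variant of that band-aid might look like the following (a sketch only: the class name and interval are invented, and it assumes the JVM runs with -XX:+ExplicitGCInvokesConcurrent as described above, so the explicit GC becomes a concurrent cycle rather than a stop-the-world full GC):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class PeriodicGcBandAid {
    // Periodically hint a GC so a concurrent cycle, and with it the string
    // table cleanup, runs even if the application never does a full GC or
    // starts marking on its own.
    public static void install(long intervalMinutes) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "periodic-gc-band-aid");
                thread.setDaemon(true);
                return thread;
            });
        scheduler.scheduleAtFixedRate(System::gc,
            intervalMinutes, intervalMinutes, TimeUnit.MINUTES);
    }
}

Triggering it externally now and then, for example with jcmd <pid> GC.run or the jmap route mentioned above, achieves the same without touching the application.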
> > There is no equivalent to CMSTriggerInterval in G1 which starts a > regular concurrent collection cycle every now and then (which is > basically the same band-aid). > > > Here is my java version and gc settings: > > > > java version "1.8.0_45" > > Java(TM) SE Runtime Environment (build 1.8.0_45-b14) > > Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode) > > You may want to update. While looking for something similar in the bug > tracker, I found e.g. https://bugs.openjdk.java.net/browse/JDK-8133193 > fixed in 8u72. > > Thanks, > Thomas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tg at freigmbh.de Tue Feb 28 12:45:44 2017 From: tg at freigmbh.de (Thorsten Goetzke) Date: Tue, 28 Feb 2017 13:45:44 +0100 Subject: Unreachable Memory not freed, Nashorn Demo Message-ID: <9171bcd9-5212-edb6-59f6-aa17b60b50e2@freigmbh.de> Hello, Back in January i posted about Unreachable Objects not claimed by the gc, i am finally able to produce a micro, see below. When I run the class below using -Xmx4g and take a memory snaphsot (hprof or yourkit, doesnt matter), I will see 2 LeakImpl Objects. These Objects have no reported path to root, yet they won't be collected. If i lower the heap space to -Xmx2g the Application throws java.lang.OutOfMemoryError: Java heap space. @Jenny Zhang should I create a new bugreport, or will you take care of this? Best Regards, Thorsten Goetzke package de.frei.demo; import jdk.nashorn.api.scripting.NashornScriptEngine; import jdk.nashorn.api.scripting.NashornScriptEngineFactory; import javax.script.CompiledScript; import javax.script.ScriptException; import javax.script.SimpleBindings; import java.util.function.Function; public final class LeakDemo { private static NashornScriptEngine ENGINE = getNashornScriptEngine(); private static CompiledScript SCRIPT; public static void main(String[] args) throws Exception { simulateLoad(); simulateLoad(); System.gc(); Thread.sleep(1000000); } private static void simulateLoad() throws ScriptException { final CompiledScript compiledScript = getCompiledScript(ENGINE); compiledScript.eval(new SimplestBindings(new LeakImpl())); } private static NashornScriptEngine getNashornScriptEngine() { final NashornScriptEngineFactory factory = new NashornScriptEngineFactory(); final NashornScriptEngine scriptEngine = (NashornScriptEngine) factory.getScriptEngine(); return scriptEngine; } private static CompiledScript getCompiledScript(final NashornScriptEngine scriptEngine) throws ScriptException { if (SCRIPT == null) { SCRIPT = scriptEngine.compile(" var pivot = getItem(\"pivot\");"); } return SCRIPT; } public interface Leak { LiveItem getItem(String id); } public static final class LeakImpl implements Leak { private final byte[] payload = new byte[1024 * 1024 * 1024]; @Override public LiveItem getItem(final String id) { return new LiveItem() { }; } } public interface LiveItem { } public static final class SimplestBindings extends SimpleBindings { public SimplestBindings(Leak leak) { put("getItem",(Function< String, LiveItem>) leak::getItem); } } } From yu.zhang at oracle.com Tue Feb 28 17:06:32 2017 From: yu.zhang at oracle.com (Jenny Zhang) Date: Tue, 28 Feb 2017 09:06:32 -0800 Subject: Unreachable Memory not freed, Nashorn Demo In-Reply-To: <9171bcd9-5212-edb6-59f6-aa17b60b50e2@freigmbh.de> References: <9171bcd9-5212-edb6-59f6-aa17b60b50e2@freigmbh.de> Message-ID: Thorsten Thanks very much for the micro. 
I have update it to https://bugs.openjdk.java.net/browse/JDK-8173594 Thanks Jenny On 2/28/2017 4:45 AM, Thorsten Goetzke wrote: > Hello, > > Back in January i posted about Unreachable Objects not claimed by the > gc, i am finally able to produce a micro, see below. When I run the > class below using -Xmx4g and take a memory snaphsot (hprof or yourkit, > doesnt matter), I will see 2 LeakImpl Objects. These Objects have no > reported path to root, yet they won't be collected. If i lower the > heap space to -Xmx2g the Application throws > java.lang.OutOfMemoryError: Java heap space. > @Jenny Zhang should I create a new bugreport, or will you take care of > this? > > Best Regards, > Thorsten Goetzke > > package de.frei.demo; > > import jdk.nashorn.api.scripting.NashornScriptEngine; > import jdk.nashorn.api.scripting.NashornScriptEngineFactory; > > import javax.script.CompiledScript; > import javax.script.ScriptException; > import javax.script.SimpleBindings; > import java.util.function.Function; > > > public final class LeakDemo { > > private static NashornScriptEngine ENGINE = > getNashornScriptEngine(); > private static CompiledScript SCRIPT; > > public static void main(String[] args) throws Exception { > simulateLoad(); > simulateLoad(); > System.gc(); > Thread.sleep(1000000); > > } > > private static void simulateLoad() throws ScriptException { > final CompiledScript compiledScript = getCompiledScript(ENGINE); > compiledScript.eval(new SimplestBindings(new LeakImpl())); > } > > private static NashornScriptEngine getNashornScriptEngine() { > final NashornScriptEngineFactory factory = new > NashornScriptEngineFactory(); > final NashornScriptEngine scriptEngine = (NashornScriptEngine) > factory.getScriptEngine(); > return scriptEngine; > } > > > > private static CompiledScript getCompiledScript(final > NashornScriptEngine scriptEngine) throws ScriptException { > if (SCRIPT == null) { > SCRIPT = scriptEngine.compile(" var pivot = > getItem(\"pivot\");"); > } > return SCRIPT; > } > > public interface Leak { > > LiveItem getItem(String id); > } > > > public static final class LeakImpl implements Leak { > private final byte[] payload = new byte[1024 * 1024 * 1024]; > > > @Override > public LiveItem getItem(final String id) { > return new LiveItem() { > }; > } > > > } > > public interface LiveItem { > } > > public static final class SimplestBindings extends SimpleBindings { > public SimplestBindings(Leak leak) { > > put("getItem",(Function< String, LiveItem>) leak::getItem); > } > } > } > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use From poonam.bajaj at oracle.com Tue Feb 28 17:56:42 2017 From: poonam.bajaj at oracle.com (Poonam Bajaj Parhar) Date: Tue, 28 Feb 2017 09:56:42 -0800 Subject: Unreachable Memory not freed, Nashorn Demo In-Reply-To: References: <9171bcd9-5212-edb6-59f6-aa17b60b50e2@freigmbh.de> Message-ID: Hello Thorsten, I ran this test program with jdk9-ea and created a Heap Dump after the first Full GC using -XX:+HeapDumpAfterFullGC. In that heap dump, I can see 2 instances of LeakImpl: Class Name | Objects | Shallow Heap | Retained Heap ---------------------------------------------------------- LeakDemo$LeakImpl| 2 | 32 | ---------------------------------------------------------- the first one is reachable as a local variable from the main thread which is fine: Class Name | Ref. Objects | Shallow Heap | Ref. 
Shallow Heap | Retained Heap ---------------------------------------------------------------------------------------------------------------- java.lang.Thread @ 0x84f211f8 Thread | 1 | 120 | 16 | 736 '- LeakDemo$LeakImpl @ 0x850d89f0| 1 | 16 | 16 | 16 ---------------------------------------------------------------------------------------------------------------- the other one is reachable through the referent "jdk.nashorn.internal.objects.Global" of a WeakReference: Class Name | Ref. Objects | Shallow Heap | Ref. Shallow Heap | Retained Heap ----------------------------------------------------------------------------------------------------------- class jdk.internal.loader.ClassLoaders @ 0x84f268f8 System Class | 1 | 16 | 16 | 16 '- PLATFORM_LOADER jdk.internal.loader.ClassLoaders$PlatformClassLoader @ 0x84f2a610 | 1 | 96 | 16 | 199,624 '- classes java.util.Vector @ 0x850b2b70 | 1 | 32 | 16 | 68,104 '- elementData java.lang.Object[640] @ 0x850b2b90 | 1 | 2,576 | 16 | 68,072 '- [196] class jdk.nashorn.internal.scripts.JD @ 0x84f49960 | 1 | 8 | 16 | 4,560 '- map$ jdk.nashorn.internal.runtime.PropertyMap @ 0x850d4a88 | 1 | 64 | 16 | 4,552 '- protoHistory java.util.WeakHashMap @ 0x850d5418 | 1 | 48 | 16 | 2,208 '- table java.util.WeakHashMap$Entry[16] @ 0x850d5448 | 1 | 80 | 16 | 2,112 *'- [10] java.util.WeakHashMap$Entry @ 0x850d5498 | 1 | 40 | 16 | 2,032* '- referent jdk.nashorn.internal.objects.Global @ 0x85137a18 | 1 | 544 | 16 | 39,920 '- initscontext javax.script.SimpleScriptContext @ 0x8515c910 | 1 | 32 | 16 | 280 '- engineScope LeakDemo$SimplestBindings @ 0x8515c930 | 1 | 16 | 16 | 248 '- map java.util.HashMap @ 0x8515c940 | 1 | 48 | 16 | 232 '- table java.util.HashMap$Node[16] @ 0x8515c970 | 1 | 80 | 16 | 184 '- [9] java.util.HashMap$Node @ 0x8515c9f8 | 1 | 32 | 16 | 48 '- value LeakDemo$SimplestBindings$$Lambda$118 @ 0x8515ca18| 1 | 16 | 16 | 16 '- arg$1 LeakDemo$LeakImpl @ 0x8515c600 | 1 | 16 | 16 | 1,073,741,856 ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- From the GC logs, the referent is present in an old region: [2.044s][info ][gc,metaspace ] GC(6) Metaspace: 13522K->13518K(1062912K) [2.047s][info ][gc,start ] GC(6) Heap Dump (after full gc) Dumping heap to java_pid20428.hprof ... Heap dump file created [1081050745 bytes in 25.084 secs] [27.137s][info ][gc ] GC(6) Heap Dump (after full gc) 25089.382ms [27.137s][info ][gc ] GC(6) Pause Full (Allocation Failure) 1028M->1028M(1970M) 25171.038ms Also: [10.651s][trace][gc,region] GC(6) G1HR POST-COMPACTION(OLD) [0x0000000085100000, 0x0000000085161f20, 0x0000000085200000] This Full GC didn't discover this WeakReference and didn't clear its referent. It needs to be investigated if it gets cleared and collected in the subsequent GCs. Thanks, Poonam On 2/28/2017 9:06 AM, Jenny Zhang wrote: > Thorsten > > Thanks very much for the micro. I have update it to > > https://bugs.openjdk.java.net/browse/JDK-8173594 > > Thanks > Jenny > > On 2/28/2017 4:45 AM, Thorsten Goetzke wrote: >> Hello, >> >> Back in January i posted about Unreachable Objects not claimed by the >> gc, i am finally able to produce a micro, see below. When I run the >> class below using -Xmx4g and take a memory snaphsot (hprof or >> yourkit, doesnt matter), I will see 2 LeakImpl Objects. These Objects >> have no reported path to root, yet they won't be collected. 
If i >> lower the heap space to -Xmx2g the Application throws >> java.lang.OutOfMemoryError: Java heap space. >> @Jenny Zhang should I create a new bugreport, or will you take care >> of this? >> >> Best Regards, >> Thorsten Goetzke >> >> package de.frei.demo; >> >> import jdk.nashorn.api.scripting.NashornScriptEngine; >> import jdk.nashorn.api.scripting.NashornScriptEngineFactory; >> >> import javax.script.CompiledScript; >> import javax.script.ScriptException; >> import javax.script.SimpleBindings; >> import java.util.function.Function; >> >> >> public final class LeakDemo { >> >> private static NashornScriptEngine ENGINE = >> getNashornScriptEngine(); >> private static CompiledScript SCRIPT; >> >> public static void main(String[] args) throws Exception { >> simulateLoad(); >> simulateLoad(); >> System.gc(); >> Thread.sleep(1000000); >> >> } >> >> private static void simulateLoad() throws ScriptException { >> final CompiledScript compiledScript = getCompiledScript(ENGINE); >> compiledScript.eval(new SimplestBindings(new LeakImpl())); >> } >> >> private static NashornScriptEngine getNashornScriptEngine() { >> final NashornScriptEngineFactory factory = new >> NashornScriptEngineFactory(); >> final NashornScriptEngine scriptEngine = >> (NashornScriptEngine) factory.getScriptEngine(); >> return scriptEngine; >> } >> >> >> >> private static CompiledScript getCompiledScript(final >> NashornScriptEngine scriptEngine) throws ScriptException { >> if (SCRIPT == null) { >> SCRIPT = scriptEngine.compile(" var pivot = >> getItem(\"pivot\");"); >> } >> return SCRIPT; >> } >> >> public interface Leak { >> >> LiveItem getItem(String id); >> } >> >> >> public static final class LeakImpl implements Leak { >> private final byte[] payload = new byte[1024 * 1024 * 1024]; >> >> >> @Override >> public LiveItem getItem(final String id) { >> return new LiveItem() { >> }; >> } >> >> >> } >> >> public interface LiveItem { >> } >> >> public static final class SimplestBindings extends SimpleBindings { >> public SimplestBindings(Leak leak) { >> >> put("getItem",(Function< String, LiveItem>) leak::getItem); >> } >> } >> } >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL:
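For contrast with the behaviour reported above, the expected baseline for weak referents can be sketched in a few lines (an illustration only, separate from the Nashorn demo; the class name is invented):

import java.lang.ref.WeakReference;

public final class WeakReferenceBaseline {
    public static void main(String[] args) {
        // An object reachable only through a WeakReference normally has its
        // referent cleared once a (full) GC runs.
        WeakReference<byte[]> ref = new WeakReference<>(new byte[16 * 1024 * 1024]);
        System.gc();
        System.out.println("referent cleared after GC: " + (ref.get() == null));
    }
}

Why the jdk.nashorn.internal.objects.Global referent, and the LeakImpl reachable through it, survives the Full GC in the demo is what still needs to be investigated under the bug Jenny attached the micro to (JDK-8173594).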