From christopherberner at gmail.com  Mon Mar  2 17:44:38 2015
From: christopherberner at gmail.com (Christopher Berner)
Date: Mon, 2 Mar 2015 09:44:38 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
Message-ID:

Hello,

I work on the Presto project (https://github.com/facebook/presto) and am trying to understand the behavior of G1. We run a 45GB heap on the worker machines with "-XX:G1HeapRegionSize=32M", and it works smoothly, except that every day a few machines hit a "to-space exhausted" failure and either die with an OutOfMemory error, or do a full GC with a pause so long that it fails our health checks and the process is restarted by our service manager.

Looking at the GC logs, the sequence of events is always the same. The young gen is quite large (~50% of the heap) and every collection is fast, but then a "to-space exhausted" failure hits, which appears to increase the heap used (see log below). After that the young gen is tiny and it never recovers.

Two questions:
1) Why does heap used increase in the middle of the GC cycle?
2) Looking at some of the logs, it appears that a full GC starts, but an OutOfMemory error is also thrown concurrently (they show up a hundred lines or so apart in stdout). Why would there be an OutOfMemory error before the full GC finished?

Thanks for any help!
Christopher

2015-03-02T00:56:32.131-0800: 199078.406: [GC pause (GCLocker Initiated GC) (young)
 199078.407: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 16136, predicted base time: 30.29 ms, remaining time: 169.71 ms, target pause time: 200.00 ms]
 199078.407: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 805 regions, survivors: 11 regions, predicted young region time: 56.53 ms]
 199078.407: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 805 regions, survivors: 11 regions, old: 0 regions, predicted pause time: 86.82 ms, target pause time: 200.00 ms]
, 0.0722119 secs]
   [Parallel Time: 46.7 ms, GC Workers: 28]
      [GC Worker Start (ms): Min: 199078406.9, Avg: 199078407.2, Max: 199078407.5, Diff: 0.6]
      [Ext Root Scanning (ms): Min: 0.8, Avg: 1.4, Max: 3.9, Diff: 3.1, Sum: 39.7]
      [Update RS (ms): Min: 0.0, Avg: 2.1, Max: 3.4, Diff: 3.4, Sum: 58.9]
         [Processed Buffers: Min: 0, Avg: 6.5, Max: 22, Diff: 22, Sum: 182]
      [Scan RS (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 5.3]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.4, Diff: 0.4, Sum: 0.7]
      [Object Copy (ms): Min: 40.1, Avg: 41.3, Max: 43.7, Diff: 3.6, Sum: 1155.3]
      [Termination (ms): Min: 0.8, Avg: 0.9, Max: 1.1, Diff: 0.3, Sum: 25.8]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2, Sum: 3.9]
      [GC Worker Total (ms): Min: 45.7, Avg: 46.1, Max: 46.3, Diff: 0.6, Sum: 1289.7]
      [GC Worker End (ms): Min: 199078453.2, Avg: 199078453.3, Max: 199078453.4, Diff: 0.2]
   [Code Root Fixup: 0.3 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 3.0 ms]
   [Other: 22.2 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 18.0 ms]
      [Ref Enq: 0.5 ms]
      [Redirty Cards: 0.9 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 1.7 ms]
   [Eden: 25.2G(25.1G)->0.0B(25.2G) Survivors: 352.0M->320.0M Heap: 39.7G(45.0G)->14.6G(45.0G)]
 [Times: user=1.37 sys=0.00, real=0.08 secs]

2015-03-02T01:38:44.545-0800: 201610.820: [GC pause (GCLocker Initiated GC) (young)
 201610.820: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 56252, predicted base time: 35.00 ms, remaining time: 165.00 ms, target pause time: 200.00 ms]
 201610.820: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 807 regions, survivors: 10 regions, predicted young region time: 60.67 ms]
 201610.820: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 807 regions, survivors: 10 regions, old: 0 regions, predicted pause time: 95.67 ms, target pause time: 200.00 ms]
 201611.305: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: region allocation request failed, allocation request: 3058176 bytes]
 201611.319: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 3058176 bytes, attempted expansion amount: 33554432 bytes]
 201611.319: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap already fully expanded]
 201619.914: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 44291850240 bytes, allocation request: 0 bytes, threshold: 21743271900 bytes (45.00 %), source: end of GC]
 (to-space exhausted), 9.0961593 secs]
   [Parallel Time: 8209.7 ms, GC Workers: 28]
      [GC Worker Start (ms): Min: 201610864.0, Avg: 201610864.2, Max: 201610864.4, Diff: 0.5]
      [Ext Root Scanning (ms): Min: 1.2, Avg: 1.7, Max: 4.7, Diff: 3.6, Sum: 47.8]
      [Update RS (ms): Min: 0.0, Avg: 4.7, Max: 6.0, Diff: 6.0, Sum: 131.1]
         [Processed Buffers: Min: 0, Avg: 27.4, Max: 48, Diff: 48, Sum: 766]
      [Scan RS (ms): Min: 0.1, Avg: 0.3, Max: 1.2, Diff: 1.1, Sum: 7.1]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.5, Diff: 0.5, Sum: 0.8]
      [Object Copy (ms): Min: 8200.9, Avg: 8202.2, Max: 8207.2, Diff: 6.3, Sum: 229661.8]
      [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 7.0]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 2.8]
      [GC Worker Total (ms): Min: 8209.0, Avg: 8209.2, Max: 8209.5, Diff: 0.6, Sum: 229858.3]
      [GC Worker End (ms): Min: 201619073.3, Avg: 201619073.4, Max: 201619073.5, Diff: 0.2]
   [Code Root Fixup: 0.3 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 3.0 ms]
   [Other: 883.1 ms]
      [Evacuation Failure: 788.4 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 45.0 ms]
      [Ref Enq: 0.6 ms]
      [Redirty Cards: 1.4 ms]
      [Humongous Reclaim: 0.1 ms]
      [Free CSet: 0.6 ms]
   [Eden: 25.2G(25.2G)->0.0B(32.0M) Survivors: 320.0M->3264.0M Heap: 39.8G(45.0G)->44.1G(45.0G)]
 [Times: user=46.07 sys=2.21, real=9.10 secs]

From simone.bordet at gmail.com  Mon Mar  2 18:15:53 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Mon, 2 Mar 2015 19:15:53 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID:

Hi,

On Mon, Mar 2, 2015 at 6:44 PM, Christopher Berner wrote:
> I work on the Presto project (https://github.com/facebook/presto) and am
> trying to understand the behavior of G1. We run a 45GB heap on the worker
> machines with "-XX:G1HeapRegionSize=32M", and it works smoothly,

Just out of curiosity, you seem to have IHOP=45% and an eden that is 55% of the heap (25 GiB out of 45 GiB). Is there any reason why you keep IHOP this low, or are you just running with the defaults?

To the hotspot gc experts, is there any way to limit the Eden size without impacting the ergonomics? Does -XX:MaxNewSize impact ergonomics?

--
Simone Bordet
http://bordet.blogspot.com
---
Finally, no matter how good the architecture and design are,
to deliver bug-free software with optimal performance and reliability,
the implementation technique must be flawless.   Victoria Livschitz

From christopherberner at gmail.com  Mon Mar  2 18:52:45 2015
From: christopherberner at gmail.com (Christopher Berner)
Date: Mon, 2 Mar 2015 10:52:45 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID:

We're just running with the default IHOP.

On Mon, Mar 2, 2015 at 10:15 AM, Simone Bordet wrote:
> Just out of curiosity, you seem to have IHOP=45% and an eden that is
> 55% of the heap (25 GiB out of 45 GiB).
> Is there any reason why you keep IHOP this low or you're just running
> with defaults ?
> [...]

From yu.zhang at oracle.com  Mon Mar  2 22:44:06 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Mon, 02 Mar 2015 14:44:06 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID: <54F4E7B6.90704@oracle.com>

Christopher,

8. What is 'to-space exhausted'? Why is it slow? How can it be avoided?

'to-space exhausted' happens when there is not enough space to copy objects to during evacuation. It is slow because G1 has to do a lot of work to make sure the heap is in a ready-to-use state. There are several ways you can try to avoid it: trigger the mixed GC earlier by decreasing -XX:InitiatingHeapOccupancyPercent (default 45), adjust the young gen size, increase G1ReservePercent, etc. There is no one-size-fits-all tuning.

From the log snippet you posted, my guess is that most of the time the objects die in the young gen.
But sometimes they live longer and are promoted to the old gen, and there is not enough space in the old gen.

Thanks,
Jenny

On 3/2/2015 10:52 AM, Christopher Berner wrote:
> We're just running with the default IHOP
> [...]

_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use at openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

From yu.zhang at oracle.com  Mon Mar  2 22:45:08 2015
From: yu.zhang at oracle.com (Yu Zhang)
Date: Mon, 02 Mar 2015 14:45:08 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID: <54F4E7F4.9030105@oracle.com>

I am starting a FAQ page; I added this question: https://blogs.oracle.com/g1gc/

9. What is the recommended way to limit Eden size for G1?

The recommended way is to set -XX:MaxGCPauseMillis. G1 will adjust the young gen size to try to meet the pause goal.
The young gen size is between 5 and 60 percent of the heap size. To control it further, you can use the experimental flags:

-XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=<5> -XX:G1MaxNewSizePercent=<60>

G1 will also pick up other settings, such as NewRatio, NewSize, MaxNewSize and -Xmn:

-Xmn: the same as NewSize=MaxNewSize.

If only -XX:NewSize is set, the young gen size is between the specified NewSize and G1MaxNewSizePercent.

If only -XX:MaxNewSize is set, the young gen size is between the specified G1NewSizePercent and MaxNewSize.

If both -XX:NewSize and -XX:MaxNewSize are used, the young gen will be between those two sizes; but when the heap size changes, the young gen size will not change accordingly.

If -XX:NewRatio is used, the young gen size is heap size / (NewRatio + 1). NewRatio is ignored if it is used with NewSize and MaxNewSize.

Thanks,
Jenny

On 3/2/2015 10:15 AM, Simone Bordet wrote:
> To the hotspot gc experts, is there any way to limit the Eden size
> without impacting on the ergonomics ?
> Does -XX:MaxNewSize impact ergonomics ?
> [...]

From christopherberner at gmail.com  Tue Mar  3 02:41:08 2015
From: christopherberner at gmail.com (Christopher Berner)
Date: Mon, 2 Mar 2015 18:41:08 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To: <54F4E7F4.9030105@oracle.com>
References: <54F4E7F4.9030105@oracle.com>
Message-ID:

Thanks!
I'll try adjusting the pause target, and if that doesn't help I'll try those other settings.

On Mon, Mar 2, 2015 at 2:45 PM, Yu Zhang wrote:
> I am starting a FAQ page, I added this question
> https://blogs.oracle.com/g1gc/
> 9. What is the recommended way to limit Eden size for g1?
> [...]

From simone.bordet at gmail.com  Tue Mar  3 07:16:56 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Tue, 3 Mar 2015 08:16:56 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To: <54F4E7F4.9030105@oracle.com>
References: <54F4E7F4.9030105@oracle.com>
Message-ID:

Jenny,

On Mon, Mar 2, 2015 at 11:45 PM, Yu Zhang wrote:
> G1 will pick up other settings, such as NewRatio, NewSize, MaxNewSize, -Xmn
> [...]

I take it that all of these options disable the ergonomics, and therefore G1's attempts to respect MaxGCPauseMillis? Or will setting one or some of these still make G1 try to respect MaxGCPauseMillis?

Thanks!
--
Simone Bordet
http://bordet.blogspot.com

From thomas.schatzl at oracle.com  Tue Mar  3 11:18:01 2015
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 03 Mar 2015 12:18:01 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References: <54F4E7F4.9030105@oracle.com>
Message-ID: <1425381481.3315.62.camel@oracle.com>

Hi Simone,

On Tue, 2015-03-03 at 08:16 +0100, Simone Bordet wrote:
> I take that all of these options disable ergonomics and therefore the
> attempts of G1 to respect MaxGCPauseMillis ?
> Or setting one or some of these will still make G1 try to respect
> MaxGCPauseMillis ?
> [...]

G1 will attempt to respect the pause time as much as possible, except if you set the min and max limits to the same value. That is the case if you do so explicitly for both, or if NewRatio is set.
Basically, by setting one or the other value, you fix that bound to a certain value.

I recommend setting only G1MaxNewSizePercent (or MaxNewSize) for the case mentioned in the original post. It is appropriate when there may be sudden changes in the survival rate of eden that the GC cannot predict, which seems to be the case here, because it avoids missing the pause time goal excessively.

Thanks,
Thomas

From simone.bordet at gmail.com  Tue Mar  3 11:23:49 2015
From: simone.bordet at gmail.com (Simone Bordet)
Date: Tue, 3 Mar 2015 12:23:49 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To: <1425381481.3315.62.camel@oracle.com>
References: <54F4E7F4.9030105@oracle.com> <1425381481.3315.62.camel@oracle.com>
Message-ID:

Hi,

On Tue, Mar 3, 2015 at 12:18 PM, Thomas Schatzl wrote:
> G1 will attempt to respect pause time as much as possible, except if you
> set the min and max limits to the same value.
> [...]

Thanks for this clarification!

--
Simone Bordet
http://bordet.blogspot.com

From chkwok at digibites.nl  Tue Mar  3 11:43:25 2015
From: chkwok at digibites.nl (Chi Ho Kwok)
Date: Tue, 3 Mar 2015 12:43:25 +0100
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID:

Hi,

When there are live objects during an eden collection, they must be copied to a new, empty region. With a huge eden size, this may require more space than there is available, causing a to-space exhaustion.
We always run with a fixed new generation size to avoid this kind of issue: when ergonomics thinks it can hit the pause time target with a very large eden, it will allocate a very large eden because that can be more efficient, and that's a bit too unpredictable for us in production.

We usually set a NewRatio of 4 to 10. When set to 4, the eden is fixed at 1/5th of the full heap, or ~9GB. This also pretty much guarantees a small, static eden collection pause; in your case, ~21ms (60/807 * 288 regions). Your promotion failure happened when 25.5G of eden produced 3.2G of survivors; with an eden of 9G, this would only be ~1.2G, which shouldn't be any issue if the old collector runs regularly. The old collector is only triggered after a young collection, by the way, so having young collections spaced closer together (smaller eden -> eden fills more quickly) gives it more chances to run and to add almost-empty regions to the next mixed GC run.

Con: GC will run more often, with smaller pauses, and promote more objects to the old generation, which require more work to collect (a concurrent scan is required). But as your collections run once per many minutes, this extra overhead is basically zero. Our prod young collectors run multiple times per second on a 4G eden, so you're not pushing the limits of the collector at all.

Kind regards,

--
Chi Ho Kwok
Digibites Technology
chkwok at digibites.nl

On 2 March 2015 at 18:44, Christopher Berner wrote:
> I work on the Presto project (https://github.com/facebook/presto) and am
> trying to understand the behavior of G1. We run a 45GB heap on the worker
> machines with "-XX:G1HeapRegionSize=32M", and it works smoothly, except
> that every day a few machines hit a "to-space exhausted" failure
> [...]

From christopherberner at gmail.com  Wed Mar  4 20:56:06 2015
From: christopherberner at gmail.com (Christopher Berner)
Date: Wed, 4 Mar 2015 12:56:06 -0800
Subject: G1 "to-space exhausted" causes used heap space to increase?
In-Reply-To:
References:
Message-ID:

Adjusting -XX:G1MaxNewSizePercent seemed to work better than changing the target pause time, at least for us. Thanks for all the help!

On Tue, Mar 3, 2015 at 3:43 AM, Chi Ho Kwok wrote:
> When there are live objects during an eden collection, they must be copied
> to a new, empty region. With a huge eden size, this may require more space
> than there is available, causing a to-space exhaustion.
> [...]
>> Sum: 0.8] >> >> [Object Copy (ms): Min: 8200.9, Avg: 8202.2, Max: 8207.2, Diff: >> 6.3, Sum: 229661.8] >> >> [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: >> 7.0] >> >> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, >> Sum: 2.8] >> >> [GC Worker Total (ms): Min: 8209.0, Avg: 8209.2, Max: 8209.5, Diff: >> 0.6, Sum: 229858.3] >> >> [GC Worker End (ms): Min: 201619073.3, Avg: 201619073.4, Max: >> 201619073.5, Diff: 0.2] >> >> [Code Root Fixup: 0.3 ms] >> >> [Code Root Purge: 0.0 ms] >> >> [Clear CT: 3.0 ms] >> >> [Other: 883.1 ms] >> >> [Evacuation Failure: 788.4 ms] >> >> [Choose CSet: 0.0 ms] >> >> [Ref Proc: 45.0 ms] >> >> [Ref Enq: 0.6 ms] >> >> [Redirty Cards: 1.4 ms] >> >> [Humongous Reclaim: 0.1 ms] >> >> [Free CSet: 0.6 ms] >> >> [Eden: 25.2G(25.2G)->0.0B(32.0M) Survivors: 320.0M->3264.0M Heap: >> 39.8G(45.0G)->44.1G(45.0G)] >> >> [Times: user=46.07 sys=2.21, real=9.10 secs] >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.zhang at oracle.com Thu Mar 5 19:00:53 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Thu, 05 Mar 2015 11:00:53 -0800 Subject: G1GC, Java8u40ea, Metaspace questions In-Reply-To: <54E5CD2B.7030201@finkzeit.at> References: <821215C9-36AC-41BB-A9A6-1E136341778F@finkzeit.at> <54DE3254.9030503@oracle.com> <54DE41A7.6050004@finkzeit.at> <54DE5495.2010501@oracle.com> <54E394FB.3040204@oracle.com> <54E39BCC.7090802@finkzeit.at> <54E41127.3040002@oracle.com> <54E5CD2B.7030201@finkzeit.at> Message-ID: <54F8A7E5.5080606@oracle.com> Wolfgang, Thanks for reporting this. I can reproduce this behavior with a micro. After consulting with Stefan and Jon, it is the current behavior. For now you can keep MaxMetaspaceFreeRatio low to bring HWM down. 
We might file an enhancement bug on this.

You do not need a mixed gc to clean metaspace.
Thanks,
Jenny

On 2/19/2015 3:46 AM, Wolfgang Pedot wrote:
> One more, something just came to me:
>
> Class unloading happens during the concurrent marking-cycle so the
> mixed collects that would free up unused classloaders in oldGen happen
> after that, right?
> That would mean the classes can only be cleaned up at the next cycle
> and stay in Metaspace until then. My test causes only
> Metaspace-triggered concurrent cycles so the garbage collector is
> always behind by one cycle and therefore the amount of classes that can
> be unloaded can be different each time, regardless of the percentage
> of wasted heap. I guess I have to extend my test-scenario in a way
> that also causes at least some heap-driven concurrent cycles and see
> what happens then.
> Still does not explain why I hardly ever see HWM go down but it
> explains some of my more confusing test-results...
>
> regards
> Wolfgang
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From wolfgang.pedot at finkzeit.at  Thu Mar  5 20:41:34 2015
From: wolfgang.pedot at finkzeit.at (Wolfgang Pedot)
Date: Thu, 05 Mar 2015 21:41:34 +0100
Subject: G1GC, Java8u40ea, Metaspace questions
In-Reply-To: <54F8A7E5.5080606@oracle.com>
References: <821215C9-36AC-41BB-A9A6-1E136341778F@finkzeit.at> <54DE3254.9030503@oracle.com> <54DE41A7.6050004@finkzeit.at> <54DE5495.2010501@oracle.com> <54E394FB.3040204@oracle.com> <54E39BCC.7090802@finkzeit.at> <54E41127.3040002@oracle.com> <54E5CD2B.7030201@finkzeit.at> <54F8A7E5.5080606@oracle.com>
Message-ID: <54F8BF7E.2000805@finkzeit.at>

Jenny,

thanks for getting back to me with this info. I think I found a good
setting for now and I am letting a smaller system run with that under
more normal use (most concurrent cycles triggered by heap with only some
Metaspace-spikes).
Definitely looking forward to using this "for real" after 8u40 is released.
As for my thoughts below:
As far as I know otherwise unused classes are kept alive by their
ClassLoaders, which are stored in the heap, right?
So if ClassLoaders get promoted to oldGen, mixed GCs are required to
clean them up before the classes can be unloaded in the next concurrent
cycle. That would explain why it usually takes an additional concurrent
cycle (triggered by heap occupation) after a spike of class generation
before Metaspace usage returns to normal. Or maybe stuff that keeps the
ClassLoaders alive needs to be collected first...

regards
Wolfgang


Am 05.03.2015 20:00, schrieb Yu Zhang:
> Wolfgang,
>
> Thanks for reporting this. I can reproduce this behavior with a micro.
> After consulting with Stefan and Jon, it is the current behavior. For
> now you can keep MaxMetaspaceFreeRatio low to bring HWM down. We
> might file an enhancement bug on this.
>
> You do not need a mixed gc to clean metaspace.
> Thanks,
> Jenny
> On 2/19/2015 3:46 AM, Wolfgang Pedot wrote:
>> One more, something just came to me:
>>
>> Class unloading happens during the concurrent marking-cycle so the
>> mixed collects that would free up unused classloaders in oldGen
>> happen after that, right?
>> That would mean the classes can only be cleaned up at the next cycle
>> and stay in Metaspace until then. My test causes only
>> Metaspace-triggered concurrent cycles so the garbage collector is
>> always behind by one cycle and therefore the amount of classes that
>> can be unloaded can be different each time, regardless of the
>> percentage of wasted heap. I guess I have to extend my test-scenario
>> in a way that also causes at least some heap-driven concurrent cycles
>> and see what happens then.
>> Still does not explain why I hardly ever see HWM go down but it
>> explains some of my more confusing test-results...
>>
>> regards
>> Wolfgang
>>
>

-- 
Kind regards

Wolfgang Pedot
F&E
Fink Zeitsysteme GmbH | Möslestraße 19-21 | 6844 Altach | Österreich
Tel: +43 5576 72388 | Fax: +43 5576 72388 14
wolfgang.pedot at finkzeit.at | www.finkzeit.at
Landesgericht Feldkirch, 72223k | USt.Id: ATU36401407

We provide our services exclusively on the basis of our general terms and
conditions and our service and usage agreement, which we have published on
our website at www.finkzeit.at/rechtliches.

From stefan.karlsson at oracle.com  Fri Mar  6 08:14:32 2015
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Fri, 06 Mar 2015 09:14:32 +0100
Subject: G1GC, Java8u40ea, Metaspace questions
In-Reply-To: <54F8BF7E.2000805@finkzeit.at>
References: <821215C9-36AC-41BB-A9A6-1E136341778F@finkzeit.at> <54DE3254.9030503@oracle.com> <54DE41A7.6050004@finkzeit.at> <54DE5495.2010501@oracle.com> <54E394FB.3040204@oracle.com> <54E39BCC.7090802@finkzeit.at> <54E41127.3040002@oracle.com> <54E5CD2B.7030201@finkzeit.at> <54F8A7E5.5080606@oracle.com> <54F8BF7E.2000805@finkzeit.at>
Message-ID: <54F961E8.1010000@oracle.com>

Hi Wolfgang,

On 2015-03-05 21:41, Wolfgang Pedot wrote:
> Jenny,
>
> thanks for getting back to me with this info. I think I found a good
> setting for now and I am letting a smaller system run with that under
> more normal use (most concurrent cycles triggered by heap with only
> some Metaspace-spikes).
> Definitely looking forward to using this "for real" after 8u40 is released.

8u40 has now been released.

>
> As for my thoughts below:
> As far as I know otherwise unused Classes are kept alive by their
> ClassLoaders which are stored in the heap, right?

There are different ways to hold classes alive:
1) You have a live reference to the java.lang.ClassLoader (or subclass)
object.
2) You have a live reference to any of the java.lang.Class objects
belonging to the ClassLoader.
3) You have an instance of a class that is described by any of the
java.lang.Class objects belonging to the ClassLoader.
4) You have a "dependency" between a class in another ClassLoader,
referring to a class in the ClassLoader that is kept alive. E.g. from
class resolution in the constant pool, super classes, interfaces,
JSR 292 specific code.

You have to break all of these chains before your classes and class
loader will be eligible for class unloading.

> So if ClassLoaders get promoted to oldGen, mixed GCs are required to
> clean them up before the classes can be unloaded in the next
> concurrent cycle. That would explain why it usually takes an
> additional concurrent cycle (triggered by heap occupation) after a
> spike of class generation before Metaspace usage returns to normal. Or
> maybe stuff that keeps the ClassLoaders alive needs to be collected
> first...

We only require one marking cycle to clean out metadata. Maybe something
is holding references to your class loader, classes, or instances, but
then gets cleaned out during the second GC. Things to look out for are,
for example, SoftReferences and Finalizers.

After the remark phase, at the end of the concurrent marking phase, we
have enough information to unload the classes. Most of the JVM internal
data structures are cleaned out during the remark phase; the actual
metaspace memory is handed back during the cleanup phase. If the JVM
manages to clean out an entire "virtual space area" of metadata, the
memory will be handed back to the OS and the amount of committed memory
will be decreased. If not, it puts the committed memory onto the free
lists so that it can be used by other metaspaces.

StefanK

>
> regards
> Wolfgang
>
>
>
> Am 05.03.2015 20:00, schrieb Yu Zhang:
>> Wolfgang,
>>
>> Thanks for reporting this. I can reproduce this behavior with a micro.
>> After consulting with Stefan and Jon, it is the current behavior.
>> For now you can keep MaxMetaspaceFreeRatio low to bring HWM down. We
>> might file an enhancement bug on this.
>>
>> You do not need a mixed gc to clean metaspace.
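The first three of the reference chains listed above can be seen directly from running Java code. A minimal sketch; the class and field names here are invented for illustration and do not come from the thread:

```java
// Illustrates three of the reference chains that keep a class, and with it
// its ClassLoader's metaspace, reachable: a reference to the loader itself,
// a reference to one of its java.lang.Class objects, and a live instance.
public class KeepAliveDemo {
    static Object pinnedInstance; // chain 3: instance -> Class -> ClassLoader

    public static void main(String[] args) {
        ClassLoader loader = KeepAliveDemo.class.getClassLoader(); // chain 1
        Class<?> clazz = KeepAliveDemo.class;                      // chain 2
        pinnedInstance = new KeepAliveDemo();

        // All three chains reach the same loader object, so holding on to
        // any one of them pins every class that loader has defined.
        System.out.println(clazz.getClassLoader() == loader);                    // true
        System.out.println(pinnedInstance.getClass().getClassLoader() == loader); // true
    }
}
```

Only after every such chain is dropped, including the cross-loader dependencies of item 4, does a concurrent cycle have a chance to unload the classes.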
>> Thanks,
>> Jenny
>> On 2/19/2015 3:46 AM, Wolfgang Pedot wrote:
>>> One more, something just came to me:
>>>
>>> Class unloading happens during the concurrent marking-cycle so the
>>> mixed collects that would free up unused classloaders in oldGen
>>> happen after that, right?
>>> That would mean the classes can only be cleaned up at the next cycle
>>> and stay in Metaspace until then. My test causes only
>>> Metaspace-triggered concurrent cycles so the garbage collector is
>>> always behind by one cycle and therefore the amount of classes that
>>> can be unloaded can be different each time, regardless of the
>>> percentage of wasted heap. I guess I have to extend my test-scenario
>>> in a way that also causes at least some heap-driven concurrent
>>> cycles and see what happens then.
>>> Still does not explain why I hardly ever see HWM go down but it
>>> explains some of my more confusing test-results...
>>>
>>> regards
>>> Wolfgang
>>>
>>
>

From narmak101 at gmail.com  Tue Mar 24 21:48:51 2015
From: narmak101 at gmail.com (Kamran Khawaja)
Date: Tue, 24 Mar 2015 17:48:51 -0400
Subject: Using G1 with Apache Solr
Message-ID: 

I'm running Solr 4.7.2 with Java 7u75 with the following JVM params:

-verbose:gc
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintAdaptiveSizePolicy
-XX:+PrintReferenceGC
-Xmx3072m
-Xms3072m
-XX:+UseG1GC
-XX:+UseLargePages
-XX:+AggressiveOpts
-XX:+ParallelRefProcEnabled
-XX:G1HeapRegionSize=8m
-XX:InitiatingHeapOccupancyPercent=35

What I'm currently seeing is that many of the gc pauses are under an
acceptable 0.25 seconds, but I'm seeing way too many full GCs with an
average stop time of 3.2 seconds.

You can find the gc logs here:
https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0

I initially tested without specifying the HeapRegionSize but that
resulted in the "humongous" message in the gc logs and a ton of full gc
pauses.

Any pointers or areas to further investigate would be appreciated.
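For context on the "humongous" messages: G1 treats any single allocation of at least half a region as humongous, so the chosen G1HeapRegionSize directly decides which allocations take that path. A sketch of the arithmetic; the helper and the 5 MB example value are illustrative, not taken from the log:

```java
// G1 classifies an allocation as humongous when it is at least half the
// region size; humongous objects are placed in contiguous old regions,
// and frequent humongous allocation is a common trigger for full GCs.
public class HumongousThreshold {
    static boolean isHumongous(long allocationBytes, long regionBytes) {
        return allocationBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long fiveMb = 5L * 1024 * 1024;
        // With -XX:G1HeapRegionSize=8m the threshold is 4 MB:
        System.out.println(isHumongous(fiveMb, 8L * 1024 * 1024));  // true
        // With 32 MB regions the same allocation is an ordinary one:
        System.out.println(isHumongous(fiveMb, 32L * 1024 * 1024)); // false
    }
}
```

This is why raising the region size can make the "humongous" log messages disappear without changing the application's allocation pattern.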
Thanks, -- Kam -------------- next part -------------- An HTML attachment was scrubbed... URL: From java at elyograg.org Wed Mar 25 06:47:34 2015 From: java at elyograg.org (Shawn Heisey) Date: Wed, 25 Mar 2015 00:47:34 -0600 Subject: Using G1 with Apache Solr In-Reply-To: References: Message-ID: <55125A06.7080107@elyograg.org> On 3/24/2015 3:48 PM, Kamran Khawaja wrote: > I'm running Solr 4.7.2 with Java 7u75 with the following JVM params: > > -verbose:gc > -XX:+PrintGCDateStamps > -XX:+PrintGCDetails > -XX:+PrintAdaptiveSizePolicy > -XX:+PrintReferenceGC > -Xmx3072m > -Xms3072m > -XX:+UseG1GC > -XX:+UseLargePages > -XX:+AggressiveOpts > -XX:+ParallelRefProcEnabled > -XX:G1HeapRegionSize=8m > -XX:InitiatingHeapOccupancyPercent=35 > > > What I'm currently seeing is that many of the gc pauses are under an > acceptable 0.25 seconds but seeing way too many full GCs with an average > stop time of 3.2 seconds. > > You can find the gc logs > here: https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0 > > I initially tested without specifying the HeapRegionSize but that > resulted in the "humongous" message in the gc logs and a ton of full gc > pauses. When I replied the first time, I only sent it to Kamran. I quickly realized that I'd made that error, but I did not remember that the original message was on this list, so I sent the reply again, assuming that I saw the original on the solr-user mailing list. Now I am bringing the silliness full-circle by sending the same reply here. Some additional info: When I initially brought my settings up on this list a few months ago, I got the recommendation to try changing InitiatingHeapOccupancyPercent to 70-75 from the default of 45 ... so setting it to 35 might not be the best idea. I do currently have it set to 75 (not reflected on the wiki), but I haven't done any further analysis. 
I have now upgraded java on those machines to 8u40 with the following settings, I hope to have a useful gc.log soon for comparison purposes. -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=75 -XX:+UseLargePages -XX:+AggressiveOpts ---- original reply ---- This is similar to the settings I've been working on that I've documented on my wiki page, with better results than you are seeing, and a larger heap than you have configured: https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector You have one additional option that I don't -- InitiatingHeapOccupancyPercent. I would suggest running without that option to see how it affects your GC times. I'm curious what OS you're running under, whether the OS and Java are 64-bit, and whether you have actually enabled huge pages in your operating system. If it's Linux and you have enabled huge pages, have you turned off transparent huge pages as documented by Oracle: https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge On my servers, I do *not* have huge pages configured in the operating system, so the UseLargePages java option isn't doing anything. One final thing ... Oracle developers have claimed that Java 8u40 has some major improvements to the G1 collector, particularly for programs that allocate very large objects. Can you try 8u40? 
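On Linux the active THP mode is the bracketed token in /sys/kernel/mm/transparent_hugepage/enabled (older RHEL kernels expose it under /sys/kernel/mm/redhat_transparent_hugepage instead). A small sketch that reads it; the parsing helper is illustrative:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ThpStatus {
    // The kernel reports something like "always madvise [never]";
    // the bracketed token is the mode currently in effect.
    static String activeMode(String sysfsLine) {
        int open = sysfsLine.indexOf('[');
        int close = sysfsLine.indexOf(']');
        if (open < 0 || close < open) {
            return "unknown";
        }
        return sysfsLine.substring(open + 1, close);
    }

    public static void main(String[] args) throws Exception {
        Path p = Paths.get("/sys/kernel/mm/transparent_hugepage/enabled");
        if (Files.exists(p)) {
            String line = new String(Files.readAllBytes(p)).trim();
            System.out.println("THP mode: " + activeMode(line));
        } else {
            System.out.println("THP sysfs entry not found on this system");
        }
    }
}
```

A reading of "never" (or "madvise") means the always-on THP behavior discussed here is already disabled.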
Thanks,
Shawn

From thomas.schatzl at oracle.com  Wed Mar 25 14:28:12 2015
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 25 Mar 2015 15:28:12 +0100
Subject: Using G1 with Apache Solr
In-Reply-To: 
References: 
Message-ID: <1427293692.3163.34.camel@oracle.com>

Hi Kamran,

On Tue, 2015-03-24 at 17:48 -0400, Kamran Khawaja wrote:
> I'm running Solr 4.7.2 with Java 7u75 with the following JVM params:
> -verbose:gc
> -XX:+PrintGCDateStamps
> -XX:+PrintGCDetails
> -XX:+PrintAdaptiveSizePolicy
> -XX:+PrintReferenceGC
> -Xmx3072m
> -Xms3072m
> -XX:+UseG1GC
> -XX:+UseLargePages
> -XX:+AggressiveOpts
> -XX:+ParallelRefProcEnabled
> -XX:G1HeapRegionSize=8m
> -XX:InitiatingHeapOccupancyPercent=35
>
> What I'm currently seeing is that many of the gc pauses are under an
> acceptable 0.25 seconds but seeing way too many full GCs with an
> average stop time of 3.2 seconds.
>
> You can find the gc logs
> here: https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0
>
> I initially tested without specifying the HeapRegionSize but that
> resulted in the "humongous" message in the gc logs and a ton of full
> gc pauses.
>
> Any pointers or areas to further investigate would be appreciated.

The problem seems to be a somewhat inconsistent survival rate in the
young gen. Most of the time, >5% of the young gen survives, while every
now and then >33% (or more) survives. Just before these full GCs the
heap already seems to be fairly full, and the existing mechanisms cannot
handle this.

There are a few things you could try:

- disable PLAB resizing (-XX:-ResizePLAB), as this may decrease the
amount of space that is actually required for copying.

- increase the evacuation reserve (-XX:G1ReservePercent=15; default is
10), whose purpose is exactly to provide a safety buffer for such cases.

- cap the maximum young generation size, so that even when a large part
of the young generation survives, this part is not that big. E.g.
G1MaxNewSizePercent=25 (which limits young gen size to 768M which seems okay to me; default is 60; you also need to set -XX: +UnlockExperimentalVMOptions in front of that) Thanks, Thomas From narmak101 at gmail.com Wed Mar 25 18:05:46 2015 From: narmak101 at gmail.com (Kamran Khawaja) Date: Wed, 25 Mar 2015 14:05:46 -0400 Subject: Using G1 with Apache Solr In-Reply-To: <55125A06.7080107@elyograg.org> References: <55125A06.7080107@elyograg.org> Message-ID: Solr is being run on a CentOS 7 server. Both the os and java are 64 bit. I see that THP is enabled on the server. I'll have to discuss with the rest of my team about disabling THP and upgrading to java 8 but I'll post back when I have some results from my testing. Thanks, -- Kamran Khawaja On Wed, Mar 25, 2015 at 2:47 AM, Shawn Heisey wrote: > On 3/24/2015 3:48 PM, Kamran Khawaja wrote: > > I'm running Solr 4.7.2 with Java 7u75 with the following JVM params: > > > > -verbose:gc > > -XX:+PrintGCDateStamps > > -XX:+PrintGCDetails > > -XX:+PrintAdaptiveSizePolicy > > -XX:+PrintReferenceGC > > -Xmx3072m > > -Xms3072m > > -XX:+UseG1GC > > -XX:+UseLargePages > > -XX:+AggressiveOpts > > -XX:+ParallelRefProcEnabled > > -XX:G1HeapRegionSize=8m > > -XX:InitiatingHeapOccupancyPercent=35 > > > > > > What I'm currently seeing is that many of the gc pauses are under an > > acceptable 0.25 seconds but seeing way too many full GCs with an average > > stop time of 3.2 seconds. > > > > You can find the gc logs > > here: https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0 > > > > I initially tested without specifying the HeapRegionSize but that > > resulted in the "humongous" message in the gc logs and a ton of full gc > > pauses. > > When I replied the first time, I only sent it to Kamran. I quickly > realized that I'd made that error, but I did not remember that the > original message was on this list, so I sent the reply again, assuming > that I saw the original on the solr-user mailing list. 
Now I am > bringing the silliness full-circle by sending the same reply here. > > Some additional info: > > When I initially brought my settings up on this list a few months ago, I > got the recommendation to try changing InitiatingHeapOccupancyPercent to > 70-75 from the default of 45 ... so setting it to 35 might not be the > best idea. I do currently have it set to 75 (not reflected on the > wiki), but I haven't done any further analysis. > > I have now upgraded java on those machines to 8u40 with the following > settings, I hope to have a useful gc.log soon for comparison purposes. > > -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m > -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=75 > -XX:+UseLargePages -XX:+AggressiveOpts > > > ---- original reply ---- > > This is similar to the settings I've been working on that I've > documented on my wiki page, with better results than you are seeing, and > a larger heap than you have configured: > > https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector > > You have one additional option that I don't -- > InitiatingHeapOccupancyPercent. I would suggest running without that > option to see how it affects your GC times. > > I'm curious what OS you're running under, whether the OS and Java are > 64-bit, and whether you have actually enabled huge pages in your > operating system. If it's Linux and you have enabled huge pages, have > you turned off transparent huge pages as documented by Oracle: > > > https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge > > On my servers, I do *not* have huge pages configured in the operating > system, so the UseLargePages java option isn't doing anything. > > One final thing ... Oracle developers have claimed that Java 8u40 has > some major improvements to the G1 collector, particularly for programs > that allocate very large objects. Can you try 8u40? 
> > Thanks, > Shawn > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlie.hunt at oracle.com Wed Mar 25 20:24:38 2015 From: charlie.hunt at oracle.com (charlie hunt) Date: Wed, 25 Mar 2015 15:24:38 -0500 Subject: Using G1 with Apache Solr In-Reply-To: References: <55125A06.7080107@elyograg.org> Message-ID: <85F3242A-B523-49FE-BAF2-1F710D9BDC94@oracle.com> If on Linux, most definitely disable THP (transparent huge pages). You will likely not have a good experience with any GC with THP enabled. charlie > On Mar 25, 2015, at 1:05 PM, Kamran Khawaja wrote: > > Solr is being run on a CentOS 7 server. Both the os and java are 64 bit. I see that THP is enabled on the server. > I'll have to discuss with the rest of my team about disabling THP and upgrading to java 8 but I'll post back when I have some results from my testing. > > > Thanks, > > -- > Kamran Khawaja > > > On Wed, Mar 25, 2015 at 2:47 AM, Shawn Heisey > wrote: > On 3/24/2015 3:48 PM, Kamran Khawaja wrote: > > I'm running Solr 4.7.2 with Java 7u75 with the following JVM params: > > > > -verbose:gc > > -XX:+PrintGCDateStamps > > -XX:+PrintGCDetails > > -XX:+PrintAdaptiveSizePolicy > > -XX:+PrintReferenceGC > > -Xmx3072m > > -Xms3072m > > -XX:+UseG1GC > > -XX:+UseLargePages > > -XX:+AggressiveOpts > > -XX:+ParallelRefProcEnabled > > -XX:G1HeapRegionSize=8m > > -XX:InitiatingHeapOccupancyPercent=35 > > > > > > What I'm currently seeing is that many of the gc pauses are under an > > acceptable 0.25 seconds but seeing way too many full GCs with an average > > stop time of 3.2 seconds. 
> > > > You can find the gc logs > > here: https://www.dropbox.com/s/v04b336v2k5l05e/g1_gc_7u75.log.gz?dl=0 > > > > I initially tested without specifying the HeapRegionSize but that > > resulted in the "humongous" message in the gc logs and a ton of full gc > > pauses. > > When I replied the first time, I only sent it to Kamran. I quickly > realized that I'd made that error, but I did not remember that the > original message was on this list, so I sent the reply again, assuming > that I saw the original on the solr-user mailing list. Now I am > bringing the silliness full-circle by sending the same reply here. > > Some additional info: > > When I initially brought my settings up on this list a few months ago, I > got the recommendation to try changing InitiatingHeapOccupancyPercent to > 70-75 from the default of 45 ... so setting it to 35 might not be the > best idea. I do currently have it set to 75 (not reflected on the > wiki), but I haven't done any further analysis. > > I have now upgraded java on those machines to 8u40 with the following > settings, I hope to have a useful gc.log soon for comparison purposes. > > -XX:+UseG1GC -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m > -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=75 > -XX:+UseLargePages -XX:+AggressiveOpts > > > ---- original reply ---- > > This is similar to the settings I've been working on that I've > documented on my wiki page, with better results than you are seeing, and > a larger heap than you have configured: > > https://wiki.apache.org/solr/ShawnHeisey#G1_.28Garbage_First.29_Collector > > You have one additional option that I don't -- > InitiatingHeapOccupancyPercent. I would suggest running without that > option to see how it affects your GC times. > > I'm curious what OS you're running under, whether the OS and Java are > 64-bit, and whether you have actually enabled huge pages in your > operating system. 
If it's Linux and you have enabled huge pages, have
> you turned off transparent huge pages as documented by Oracle:
>
> https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge
>
> On my servers, I do *not* have huge pages configured in the operating
> system, so the UseLargePages java option isn't doing anything.
>
> One final thing ... Oracle developers have claimed that Java 8u40 has
> some major improvements to the G1 collector, particularly for programs
> that allocate very large objects. Can you try 8u40?
>
> Thanks,
> Shawn
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ejones at twitter.com  Thu Mar 26 15:04:06 2015
From: ejones at twitter.com (Evan Jones)
Date: Thu, 26 Mar 2015 11:04:06 -0400
Subject: GC / safepoint pauses consuming more real time than user plus system
Message-ID: 

I finally figured out the source of problematic garbage collection
pauses that take more real time than user plus system time, on our
systems that are otherwise unloaded: it turns out that writes to the
mmap-ed hsperfdata file can block when the system is under heavy disk
IO. Since safepoint and GC threads increment counters in this file, it
causes long safepoint and garbage collection pauses.

In case anyone ever observes pauses that look like this, you may want
to add the -XX:+PerfDisableSharedMem JVM flag and see if that resolves
them. It has worked for our services. See the following for more
detail: http://www.evanjones.ca/jvm-mmap-pause.html

Here is an example "suspicious" pause.
I was seeing many of these, across basically all of Twitter's services, which caused me to investigate the issue. 2014-12-10T12:38:44.419+0000: 58758.830: [GC (Allocation Failure)[ParNew: 11868438K->103534K(13212096K), 0.7651580 secs] 12506389K->741669K(17406400K), 0.7652510 secs] [Times: user=0.36 sys=0.01, real=0.77 secs] -------------- next part -------------- An HTML attachment was scrubbed... URL: From gabi_io at yahoo.com Fri Mar 27 15:30:30 2015 From: gabi_io at yahoo.com (Medan Gavril) Date: Fri, 27 Mar 2015 15:30:30 +0000 (UTC) Subject: G1 root cause and tuning Message-ID: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> Hi , I saw your G1 presentation and I found it good and interesting. I am new to G1 tuning and I would need you suggestions if you ?have time. In our app, when we have a FULL GC :? ? 1. it restarts the application? ? 2. we cannot get the right data to understand the root cause We switched from CMS to G1 in order to avoid long FULL GCs. JRE 1.17 update 17 it is being used. GC params: wrapper.java.additional.1=-serverwrapper.java.additional.2=-XX:+PrintCommandLineFlagswrapper.java.additional.3=-XX:+UseG1GC wrapper.java.additional.7=-XX:MaxGCPauseMillis=2500 wrapper.java.additional.8=-Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffffewrapper.java.additional.9=-Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffffewrapper.java.additional.10=-XX:+HeapDumpOnOutOfMemoryErrorwrapper.java.additional.11=-verbose:gcwrapper.java.additional.12=-XX:+PrintGCDetailswrapper.java.additional.13=-Ducmdb.home=%SERVER_BASE_DIR% wrapper.java.additional.52=-XX:+PrintGCTimeStampswrapper.java.additional.53=-XX:+PrintGCApplicationStoppedTimewrapper.java.additional.55=-XX:+PrintAdaptiveSizePolicy The error from wrapper log is: INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ? ?[Eden: 692M(968M)->0B(972M) Survivors: 56M->52M Heap: 8127M(22480M)->7436M(22480M)]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?[Times: user=1.51 sys=0.02, real=0.19 secs]?INFO ? 
| jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.265: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 189267984 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.265: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 188743680 bytes, attempted expansion amount: 188743680 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.265: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | Total time for which application threads were stopped: 0.2031307 secondsINFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 189267984 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 188743680 bytes, attempted expansion amount: 188743680 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 189267984 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 189267984 bytes, attempted expansion amount: 192937984 bytes]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | ?93238.285: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]INFO ? | jvm 1 ? ?| 2015/03/25 15:23:43.268 | 93238.285: [Full GCERROR ?| wrapper ?| 2015/03/25 15:25:57.694 | JVM appears hung: Timed out waiting for signal from JVM.ERROR ?| wrapper ?| 2015/03/25 15:25:58.021 | JVM did not exit? INFO ? | jvm 1 ? 
| 2015/03/25 15:14:41.289 |  92696.335: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 10603200512 bytes, allocation request: 14584896 bytes, threshold: 10607394780 bytes (45.00 %), source: concurrent humongous allocation]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 |  92696.337: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: requested by GC cause, GC cause: G1 Humongous Allocation]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 |  92696.337: [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: concurrent cycle initiation requested]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 | 92696.337: [GC pause (young) 92696.338: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 165.76 ms, remaining time: 2334.24 ms, target pause time: 2500.00 ms]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 |  92696.338: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 114 regions, survivors: 8 regions, predicted young region time: 32.04 ms]
INFO   | jvm 1    | 2015/03/25 15:14:41.289 |  92696.338: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 114 regions, survivors: 8 regions, old: 0 regions, predicted pause time: 197.80 ms, target pause time: 2500.00 ms]
INFO   | jvm 1    | 2015/03/25 15:14:41.398 |  (initial-mark), 0.15117107 secs]

We increased the wrapper timeout but still got no useful data about the FULL GC.

Any suggestion is highly appreciated. Currently I suggested adding "PrintHeapAtGCExtended".

Best Regards,
Gabi Medan

-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: logs.zip Type: application/octet-stream Size: 1295341 bytes Desc: not available URL:

From yu.zhang at oracle.com Fri Mar 27 23:18:31 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Fri, 27 Mar 2015 16:18:31 -0700 Subject: G1 root cause and tuning In-Reply-To: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> Message-ID: <5515E547.2060901@oracle.com>

Medan,

I could not find the humongous allocation in the logs you attached, but from the snippet you provided it seems the humongous object allocations (the biggest is ~10g) might be the issue. If you can provide a cleaner gc log (with -Xloggc:gc.log) without the wrapper information, it would be easier to analyze.

Thanks,
Jenny

On 3/27/2015 8:30 AM, Medan Gavril wrote:
> [snip: original message quoted in full above]

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From rmohta.coder at gmail.com Mon Mar 30 09:47:05 2015 From: rmohta.coder at gmail.com (Rohit Mohta) Date: Mon, 30 Mar 2015 10:47:05 +0100 Subject: Java 7 Default GC for server Message-ID:

Hi All,

I have a few questions about the default GC in server mode:

(a) Is the formula for calculating the number of GC threads 3 + (5 * cores/8)?

(b) In the GC logs, I can see the lines below printed even when the application is idle. Is this something to do with JIT or some other JVM internal operation?
2015-03-05T14:42:18.320+0000: 520807.126: Total time for which application threads were stopped: 0.0000500 seconds
2015-03-05T14:42:18.320+0000: 520807.126: Application time: 0.0000240 seconds
2015-03-05T14:42:18.320+0000: 520807.126: Total time for which application threads were stopped: 0.0000500 seconds
2015-03-05T14:42:58.405+0000: 520847.212: Application time: 40.0857170 seconds
2015-03-05T14:42:58.406+0000: 520847.212: Total time for which application threads were stopped: 0.0001980 seconds
2015-03-05T14:42:58.406+0000: 520847.212: Application time: 0.0000250 seconds
2015-03-05T14:42:58.406+0000: 520847.212: Total time for which application threads were stopped: 0.0000520 seconds
2015-03-05T14:43:28.406+0000: 520877.213: Application time: 30.0001550 seconds

(c) We have about 15 JVMs on a single server. The Linux server has 24 cores and about 37GB of RAM. When we restart all the JVMs, they start with a good heap size, about 700MB+. And we have no issues with that. After a day or so, some of the processes drop down to less than 100MB of heap size and they start doing very frequent minor and major GC. We have a lot of unused memory on the server. Why won't GC cause expansion? I know we can set Xms to a minimum value, but we are curious to know why a few of them go from 750MB to 100MB, whereas some of them stay around 500MB. Does this have to do with SizeIncrement, SizeSupplement or AdaptiveSize values?

Thanks,
Rohit

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From rmohta.coder at gmail.com Mon Mar 30 09:54:37 2015 From: rmohta.coder at gmail.com (Rohit Mohta) Date: Mon, 30 Mar 2015 10:54:37 +0100 Subject: Java 7 Print Tenuring Distribution Message-ID:

Hi,

We are using JDK 7 in server mode. There is no explicit GC configuration, so it's using the default GC collector. We are trying to print the tenuring distribution in the logs, but it won't print.
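For reference, the flag needs the full -XX: prefix on the command line (-XX:+PrintTenuringDistribution). A small sketch (hypothetical class name, not code from the thread) that asks the running VM how it actually parsed the flag:

```java
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

public class TenuringFlagCheck {
    public static void main(String[] args) {
        HotSpotDiagnosticMXBean hs =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Prints the value the VM parsed for the flag; "false" unless the JVM
        // was started with -XX:+PrintTenuringDistribution (note the -XX: prefix).
        System.out.println(hs.getVMOption("PrintTenuringDistribution").getValue());
    }
}
```

Note also that, as far as I can tell, the default parallel collector with adaptive sizing prints only the desired-survivor-size/threshold lines rather than the full per-age histogram.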
We have tried +PrintTenuringDistribution and also -PrintTenuringDistribution, but neither works. Is this not configured to work with the default Parallel GC?

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From jon.masamitsu at oracle.com Mon Mar 30 20:59:51 2015 From: jon.masamitsu at oracle.com (Jon Masamitsu) Date: Mon, 30 Mar 2015 13:59:51 -0700 Subject: Java 7 Default GC for server In-Reply-To: References: Message-ID: <5519B947.4020906@oracle.com>

On 03/30/2015 02:47 AM, Rohit Mohta wrote:
> Hi All,
>
> I have a few questions about the default GC in server mode:
>
> (a) Is the formula for calculating the number of GC threads
> 3 + (5 * cores/8)?

For N hardware threads: if N <= 8, GC threads = N; if N > 8, GC threads = 8 + (N - 8) * 5 / 8.

> (b) In the GC logs, I can see the lines below printed even when the
> application is idle. Is this something to do with JIT or some other
> JVM internal operation?

Yes, some other (than GC) JVM operation that requires a safepoint.

> [safepoint log lines snipped; quoted in full above]
>
> (c) We have about 15 JVMs on a single server. The Linux server has 24
> cores and about 37GB of RAM. When we restart all the JVMs, they start
> with a good heap size, about 700MB+.
> And we have no issues with that.
> After a day or so, some of the processes drop down to less than 100MB
> of heap size and they start doing very frequent minor and major GC. We
> have a lot of unused memory on the server.

I don't recall seeing that happen before. Maybe another on the list has an idea.

> Why won't GC cause expansion?
> I know we can set Xms to a minimum value, but we are curious to know
> why a few of them go from 750MB to 100MB, whereas some of them stay
> around 500MB. Does this have to do with SizeIncrement, SizeSupplement
> or AdaptiveSize values?

Try -XX:+PrintAdaptiveSizePolicy and see if that tells you why the heap is not growing.

Jon

> Thanks,
> Rohit
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From yu.zhang at oracle.com Tue Mar 31 00:19:54 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Mon, 30 Mar 2015 17:19:54 -0700 Subject: G1 root cause and tuning In-Reply-To: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> Message-ID: <5519E82A.5050808@oracle.com>

Medan,

Thanks for the logs. The log messages are somewhat mangled; some of the records are incomplete. There is one full GC in wrapper.log.14; the others have no full GC. This workload has a lot of humongous object allocations, up to ~10g in size. Though G1 can reclaim some humongous objects at young GC, a lot of the reclamation is done during full GC.

But this log snip does not make sense to me.
152549.805: [GC pause (young) 152549.805: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 115.49 ms, remaining time: 2384.51 ms, target pause time: 2500.00 ms]
152549.805: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 155 regions, survivors: 23 regions, predicted young region time: 71.62 ms]
152549.805: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 155 regions, survivors: 23 regions, old: 0 regions, predicted pause time: 187.12 ms, target pause time: 2500.00 ms]
, 0.12006414 secs]
   [Parallel Time: 93.1 ms]
      [GC Worker Start (ms): 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.9 152549804.9 152549805.0 152549805.0 Avg: 152549804.8, Min: 152549804.8, Max: 152549805.0, Diff: 0.2]
      [Ext Root Scanning (ms): 13.3 13.4 18.6 13.2 17.0 18.5 15.3 17.2 14.8 0.1 17.0 17.2 11.4 Avg: 14.4, Min: 0.1, Max: 18.6, Diff: 18.5]
      [Update RS (ms): 51.9 52.5 48.9 52.0 51.3 48.4 50.8 50.0 50.7 47.1 49.1 49.6 52.2 Avg: 50.3, Min: 47.1, Max: 52.5, Diff: 5.4]
         [Processed Buffers : 27 20 14 29 21 18 18 29 17 8 17 24 24 Sum: 266, Avg: 20, Min: 8, Max: 29, Diff: 21]
      [Scan RS (ms): 0.3 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Avg: 0.1, Min: 0.0, Max: 0.3, Diff: 0.3]
      [Object Copy (ms): 21.7 21.4 19.7 21.8 18.9 20.3 21.1 20.0 21.6 18.2 21.0 20.2 23.5 Avg: 20.7, Min: 18.2, Max: 23.5, Diff: 5.3]
      [Termination (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0]
         [Termination Attempts : 41 40 33 38 51 32 43 31 32 45 34 1 42 Sum: 463, Avg: 35, Min: 1, Max: 51, Diff: 50]
      [GC Worker End (ms): 152549892.1 152549892.0 152549892.1 152549892.0 152549892.1 152549892.1 152549892.0 152549892.1 152549892.0 152549892.0 152549892.0 152549892.1 152549892.1 Avg: 152549892.1, Min: 152549892.0, Max: 152549892.1, Diff: 0.1]
      [GC Worker (ms): 87.3 87.3 87.3 87.3 87.3 87.3 87.2 87.3 87.2 87.1 87.1 87.1 87.1 Avg: 87.2, Min: 87.1, Max: 87.3,
Diff: 0.2]
      [GC Worker Other (ms): 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 27.8 6.0 6.1 6.1 Avg: 7.6, Min: 5.9, Max: 27.8, Diff: 21.9]
   [Clear CT: 0.2 ms]
   [Other: 26.7 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 24.7 ms]
      [Ref Enq: 0.1 ms]
      [Free CSet: 1.0 ms]
   [Eden: 620M(932M)->0B(960M) Survivors: 92M->64M Heap: 7037M(22480M)->6476M(22480M)]
   [Times: user=1.34 sys=0.00, real=0.12 secs]
152549.925: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 305776936 bytes]
152549.925: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 260046848 bytes, attempted expansion amount: 260046848 bytes]
152549.925: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]
Total time for which application threads were stopped: 0.1240664 seconds
152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 305776936 bytes]
152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 260046848 bytes, attempted expansion amount: 260046848 bytes]
152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]
152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 305776936 bytes]
152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 305776936 bytes, attempted expansion amount: 306184192 bytes]
152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed]
152549.958: [Full GC

Before allocating the humongous object, only ~6g out of the 22g heap is used, yet allocating a ~300m object caused a full GC? I do not have an explanation for this.

Thanks,
Jenny

On 3/27/2015 8:30 AM, Medan Gavril wrote:
> [snip: original message quoted in full above]

-------------- next part -------------- An HTML attachment was scrubbed... URL:

From charlie.hunt at oracle.com Tue Mar 31 01:41:03 2015 From: charlie.hunt at oracle.com (charlie hunt) Date: Mon, 30 Mar 2015 20:59:03 -0500 Subject: G1 root cause and tuning In-Reply-To: <5519E82A.5050808@oracle.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> <5519E82A.5050808@oracle.com> Message-ID: <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com>

Hi Jenny,

One possibility is that there are not enough available contiguous regions to satisfy a 300+ MB humongous allocation.

If we assume a 22 GB Java heap (a little larger than the 22480M shown in the log), with 2048 G1 regions (the default, as you know), the region size would be about 11 MB.
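As a back-of-the-envelope check, that arithmetic can be sketched directly (an illustrative sketch using the 22480M heap and the 305776936-byte request from the log, not code from the thread):

```java
public class HumongousRegionMath {
    public static void main(String[] args) {
        long heapBytes = 22480L * 1024 * 1024;   // 22480M heap from the log
        long regionBytes = heapBytes / 2048;     // ~2048 regions by default
        long requestBytes = 305776936L;          // humongous request from the log
        // Humongous objects need contiguous regions; ceiling-divide to count them.
        long regionsNeeded = (requestBytes + regionBytes - 1) / regionBytes;
        System.out.println(regionBytes);         // 11509760 bytes, ~11 MB
        System.out.println(regionsNeeded);       // 27 contiguous regions
    }
}
```

G1 actually rounds the region size to a power of two, which here would be 8 MB and would raise the count to about 37 contiguous regions.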
That implies there needs to be about 30 contiguous G1 regions available to satisfy the humongous allocation request. An unrelated question ? do other GCs have a similar pattern of a rather large percentage of time in Ref Proc relative to the overall pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that?s the case, then if -XX:+ParallelRefProcEnabled is not already set, there may be some low hanging tuning fruit. But, it is not going to address the frequent humongous allocation problem. It is also interesting in that the pause time goal is 2500 ms, yet the actual pause time is 120 ms, and eden is being sized at less than 1 GB out of a 22 GB Java heap. Are the frequent humongous allocations messing with the heap sizing heuristics? hths, charlie > On Mar 30, 2015, at 7:19 PM, Yu Zhang wrote: > > Medan, > > Thanks for the logs. The log messages are somewhat mangled, some of the records are not complete. > There is 1 Full gc in wrapper.log.14. Others do not have full gc. This workload has a lot of humongous objects allocations, up to 10g size. Though g1 can reclaim some humongous objects at young gc, a lot of reclamations are done during full gc. > > But this log snip does not make sense to me. 
> > 152549.805: [GC pause (young) 152549.805: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 115.49 ms, remaining time: 2384.51 ms, target pause time: 2500.00 ms] > 152549.805: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 155 regions, survivors: 23 regions, predicted young region time: 71.62 ms] > 152549.805: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 155 regions, survivors: 23 regions, old: 0 regions, predicted pause time: 187.12 ms, target pause time: 2500.00 ms] > , 0.12006414 secs] > [Parallel Time: 93.1 ms] > [GC Worker Start (ms): 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 152549804.9 152549804.9 152549805.0 152549805.0 > Avg: 152549804.8, Min: 152549804.8, Max: 152549805.0, Diff: 0.2] > [Ext Root Scanning (ms): 13.3 13.4 18.6 13.2 17.0 18.5 15.3 17.2 14.8 0.1 17.0 17.2 11.4 > Avg: 14.4, Min: 0.1, Max: 18.6, Diff: 18.5] > [Update RS (ms): 51.9 52.5 48.9 52.0 51.3 48.4 50.8 50.0 50.7 47.1 49.1 49.6 52.2 > Avg: 50.3, Min: 47.1, Max: 52.5, Diff: 5.4] > [Processed Buffers : 27 20 14 29 21 18 18 29 17 8 17 24 24 > Sum: 266, Avg: 20, Min: 8, Max: 29, Diff: 21] > [Scan RS (ms): 0.3 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 > Avg: 0.1, Min: 0.0, Max: 0.3, Diff: 0.3] > [Object Copy (ms): 21.7 21.4 19.7 21.8 18.9 20.3 21.1 20.0 21.6 18.2 21.0 20.2 23.5 > Avg: 20.7, Min: 18.2, Max: 23.5, Diff: 5.3] > [Termination (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 > Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0] > [Termination Attempts : 41 40 33 38 51 32 43 31 32 45 34 1 42 > Sum: 463, Avg: 35, Min: 1, Max: 51, Diff: 50] > [GC Worker End (ms): 152549892.1 152549892.0 152549892.1 152549892.0 152549892.1 152549892.1 152549892.0 152549892.1 152549892.0 152549892.0 152549892.0 152549892.1 152549892.1 > Avg: 152549892.1, Min: 152549892.0, Max: 152549892.1, Diff: 0.1] > [GC Worker (ms): 87.3 87.3 87.3 87.3 87.3 87.3 87.2 87.3 87.2 87.1 
87.1 87.1 87.1 > Avg: 87.2, Min: 87.1, Max: 87.3, Diff: 0.2] > [GC Worker Other (ms): 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 27.8 6.0 6.1 6.1 > Avg: 7.6, Min: 5.9, Max: 27.8, Diff: 21.9] > [Clear CT: 0.2 ms] > [Other: 26.7 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 24.7 ms] > [Ref Enq: 0.1 ms] > [Free CSet: 1.0 ms] > [Eden: 620M(932M)->0B(960M) Survivors: 92M->64M Heap: 7037M(22480M)->6476M(22480M)] > [Times: user=1.34 sys=0.00, real=0.12 secs] > 152549.925: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 305776936 bytes] > 152549.925: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 260046848 bytes, attempted expansion amount: 260046848 bytes] > 152549.925: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] > Total time for which application threads were stopped: 0.1240664 seconds > 152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 305776936 bytes] > 152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 260046848 bytes, attempted expansion amount: 260046848 bytes] > 152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] > 152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 305776936 bytes] > 152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 305776936 bytes, attempted expansion amount: 306184192 bytes] > 152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] > 152549.958: [Full GC > > Before allocating humongous object, 6 out of 22g heap is used, but allocation 300m object caused a full gc? I do not have an explanation for this. 
> > Thanks, > Jenny > On 3/27/2015 8:30 AM, Medan Gavril wrote: >> Hi , >> >> I saw your G1 presentation and I found it good and interesting. I am new to G1 tuning and I would need you suggestions if you have time. >> >> In our app, when we have a FULL GC : >> 1. it restarts the application >> 2. we cannot get the right data to understand the root cause >> >> We switched from CMS to G1 in order to avoid long FULL GCs. >> >> JRE 1.17 update 17 it is being used. >> >> GC params: >> >> wrapper.java.additional.1=-server >> wrapper.java.additional.2=-XX:+PrintCommandLineFlags >> wrapper.java.additional.3=-XX:+UseG1GC >> wrapper.java.additional.7=-XX:MaxGCPauseMillis=2500 >> wrapper.java.additional.8=-Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffffe >> wrapper.java.additional.9=-Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffffe >> wrapper.java.additional.10=-XX:+HeapDumpOnOutOfMemoryError >> wrapper.java.additional.11=-verbose:gc >> wrapper.java.additional.12=-XX:+PrintGCDetails >> wrapper.java.additional.13=-Ducmdb.home=%SERVER_BASE_DIR% >> >> wrapper.java.additional.52=-XX:+PrintGCTimeStamps >> wrapper.java.additional.53=-XX:+PrintGCApplicationStoppedTime >> wrapper.java.additional.55=-XX:+PrintAdaptiveSizePolicy >> >> The error from wrapper log is: >> >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | [Eden: 692M(968M)->0B(972M) Survivors: 56M->52M Heap: 8127M(22480M)->7436M(22480M)] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | [Times: user=1.51 sys=0.02, real=0.19 secs] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 189267984 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 188743680 bytes, attempted expansion amount: 188743680 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion 
operation failed] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | Total time for which application threads were stopped: 0.2031307 seconds >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: humongous allocation request failed, allocation request: 189267984 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 188743680 bytes, attempted expansion amount: 188743680 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: allocation request failed, allocation request: 189267984 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) expand the heap, requested expansion amount: 189267984 bytes, attempted expansion amount: 192937984 bytes] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap expansion operation failed] >> INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [Full GC >> ERROR | wrapper | 2015/03/25 15:25:57.694 | JVM appears hung: Timed out waiting for signal from JVM. 
>> ERROR | wrapper | 2015/03/25 15:25:58.021 | JVM did not exit >> >> >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.335: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 10603200512 bytes, allocation request: 14584896 bytes, threshold: 10607394780 bytes (45.00 %), source: concurrent humongous allocation] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: requested by GC cause, GC cause: G1 Humongous Allocation] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: concurrent cycle initiation requested] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: [GC pause (young) 92696.338: [G1Ergonomics (CSet Construction) start choosing CSet, predicted base time: 165.76 ms, remaining time: 2334.24 ms, target pause time: 2500.00 ms] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.338: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 114 regions, survivors: 8 regions, predicted young region time: 32.04 ms] >> INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.338: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 114 regions, survivors: 8 regions, old: 0 regions, predicted pause time: 197.80 ms, target pause time: 2500.00 ms] >> INFO | jvm 1 | 2015/03/25 15:14:41.398 | (initial-mark), 0.15117107 secs] >> >> We increased the wrapper timeout but still no useful data about the FULL GC. >> >> Any suggestion is highly appreciated. 
Currently I suggested to add "PrintHeapAtGCExtended " >> >> Best Regards, >> Gabi Medan >> >> >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > > _______________________________________________ > hotspot-gc-use mailing list > hotspot-gc-use at openjdk.java.net > http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use -------------- next part -------------- An HTML attachment was scrubbed... URL: From yu.zhang at oracle.com Tue Mar 31 02:23:12 2015 From: yu.zhang at oracle.com (Yu Zhang) Date: Mon, 30 Mar 2015 19:23:12 -0700 Subject: G1 root cause and tuning In-Reply-To: <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> <5519E82A.5050808@oracle.com> <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> Message-ID: <551A0510.4080604@oracle.com> Charlie, Thanks for the comments. please see my response inline. Thanks, Jenny On 3/30/2015 6:41 PM, charlie hunt wrote: > Hi Jenny, > > One possibility is that there is not enough available contiguous > regions to satisfy a 300+ MB humongous allocation. > > If we assume a 22 GB Java heap, (a little larger than the 22480M shown > in the log), with 2048 G1 regions (default as you know), the region > size would be about 11 MB. That implies there needs to be about 30 > contiguous G1 regions available to satisfy the humongous allocation > request. Good point! > > An unrelated question - do other GCs have a similar pattern of a > rather large percentage of time in Ref Proc relative to the overall > pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that's > the case, then if -XX:+ParallelRefProcEnabled is not already set, > there may be some low hanging tuning fruit. But, it is not going to > address the frequent humongous allocation problem.
It is also > interesting in that the pause time goal is 2500 ms, yet the actual > pause time is 120 ms, and eden is being sized at less than 1 GB out of > a 22 GB Java heap. Are the frequent humongous allocations messing > with the heap sizing heuristics? Most of the time, the RefProc is below 10ms, but jumps to 20-60ms, so it might help with enabling parallelrefproc. I do not remember in jdk7, if it is on by default or not. This log is really strange, as most of the time, the heap usage is ~9g out of 22g, then the humongous allocations jumps in. As the log entries are mangled, it is hard to connect the dots. > > hths, > > charlie > >> On Mar 30, 2015, at 7:19 PM, Yu Zhang > > wrote: >> >> Medan, >> >> Thanks for the logs. The log messages are somewhat mangled, some of >> the records are not complete. >> There is 1 Full gc in wrapper.log.14. Others do not have full gc. >> This workload has a lot of humongous objects allocations, up to 10g >> size. Though g1 can reclaim some humongous objects at young gc, a lot >> of reclamations are done during full gc. >> >> But this log snip does not make sense to me. 
>> >> 152549.805: [GC pause (young) 152549.805: [G1Ergonomics (CSet >> Construction) start choosing CSet, predicted base time: 115.49 ms, >> remaining time: 2384.51 ms, target pause time: 2500.00 ms] >> 152549.805: [G1Ergonomics (CSet Construction) add young regions to >> CSet, eden: 155 regions, survivors: 23 regions, predicted young >> region time: 71.62 ms] >> 152549.805: [G1Ergonomics (CSet Construction) finish choosing CSet, >> eden: 155 regions, survivors: 23 regions, old: 0 regions, predicted >> pause time: 187.12 ms, target pause time: 2500.00 ms] >> , 0.12006414 secs] >> [Parallel Time: 93.1 ms] >> [GC Worker Start (ms): 152549804.8 152549804.8 152549804.8 >> 152549804.8 152549804.8 152549804.8 152549804.8 152549804.8 >> 152549804.8 152549804.9 152549804.9 152549805.0 152549805.0 >> Avg: 152549804.8, Min: 152549804.8, Max: 152549805.0, Diff: 0.2] >> [Ext Root Scanning (ms): 13.3 13.4 18.6 13.2 17.0 18.5 >> 15.3 17.2 14.8 0.1 17.0 17.2 11.4 >> Avg: 14.4, Min: 0.1, Max: 18.6, Diff: 18.5] >> [Update RS (ms): 51.9 52.5 48.9 52.0 51.3 48.4 50.8 >> 50.0 50.7 47.1 49.1 49.6 52.2 >> * Avg: 50.3, Min: 47.1, Max: 52.5, Diff: 5.4] >> [Processed Buffers : 27 20 14 29 21 18 18 29 17 8 17 24 24 >> Sum: 266, Avg: 20, Min: 8, Max: 29, Diff: 21] >> [Scan RS (ms): 0.3 0.0 0.0 0.2 0.0 0.0 0.0 0.0 0.0 >> 0.0 0.0 0.0 0.0 >> Avg: 0.1, Min: 0.0, Max: 0.3, Diff: 0.3] >> [Object Copy (ms): 21.7 21.4 19.7 21.8 18.9 20.3 21.1 >> 20.0 21.6 18.2 21.0 20.2 23.5 >> Avg: 20.7, Min: 18.2, Max: 23.5, Diff: 5.3] >> [Termination (ms): 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 >> 0.0 0.0 0.0 0.0 >> Avg: 0.0, Min: 0.0, Max: 0.0, Diff: 0.0] >> [Termination Attempts : 41 40 33 38 51 32 43 31 32 45 34 1 42 >> Sum: 463, Avg: 35, Min: 1, Max: 51, Diff: 50] >> [GC Worker End (ms): 152549892.1 152549892.0 152549892.1 >> 152549892.0 152549892.1 152549892.1 152549892.0 152549892.1 >> 152549892.0 152549892.0 152549892.0 152549892.1 152549892.1 >> Avg: 152549892.1, Min: 152549892.0, Max: 152549892.1, Diff: 
0.1] >> [GC Worker (ms): 87.3 87.3 87.3 87.3 87.3 87.3 87.2 >> 87.3 87.2 87.1 87.1 87.1 87.1 >> Avg: 87.2, Min: 87.1, Max: 87.3, Diff: 0.2] >> [GC Worker Other (ms): 5.9 5.9 5.9 5.9 5.9 5.9 5.9 5.9 >> 5.9 27.8 6.0 6.1 6.1 >> Avg: 7.6, Min: 5.9, Max: 27.8, Diff: 21.9] >> [Clear CT: 0.2 ms] >> [Other: 26.7 ms] >> [Choose CSet: 0.0 ms] >> [Ref Proc: 24.7 ms] >> [Ref Enq: 0.1 ms] >> [Free CSet: 1.0 ms] >> [Eden: 620M(932M)->0B(960M) Survivors: 92M->64M Heap: >> 7037M(22480M)->6476M(22480M)] >> [Times: user=1.34 sys=0.00, real=0.12 secs] >> 152549.925: [G1Ergonomics (Heap Sizing) attempt heap expansion, >> reason: humongous allocation request failed, allocation request: >> 305776936 bytes] >> 152549.925: [G1Ergonomics (Heap Sizing) expand the heap, requested >> expansion amount: 260046848 bytes, attempted expansion amount: >> 260046848 bytes] >> 152549.925: [G1Ergonomics (Heap Sizing) did not expand the heap, >> reason: heap expansion operation failed] >> Total time for which application threads were stopped: 0.1240664 seconds >> 152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, >> reason: humongous allocation request failed, allocation request: >> 305776936 bytes] >> 152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested >> expansion amount: 260046848 bytes, attempted expansion amount: >> 260046848 bytes] >> 152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, >> reason: heap expansion operation failed] >> 152549.958: [G1Ergonomics (Heap Sizing) attempt heap expansion, >> reason: allocation request failed, allocation request: 305776936 bytes] >> 152549.958: [G1Ergonomics (Heap Sizing) expand the heap, requested >> expansion amount: 305776936 bytes, attempted expansion amount: >> 306184192 bytes] >> 152549.958: [G1Ergonomics (Heap Sizing) did not expand the heap, >> reason: heap expansion operation failed] >> 152549.958: [Full GC >> >> Before allocating humongous object, 6 out of 22g heap is used, but >> allocation 300m object 
caused a full gc? I do not have an >> explanation for this. >> >> * >> Thanks, >> Jenny >> On 3/27/2015 8:30 AM, Medan Gavril wrote: >>> Hi , >>> >>> I saw your G1 presentation and I found it good and interesting. I am >>> new to G1 tuning and I would need you suggestions if you have time. >>> >>> In our app, when we have a FULL GC : >>> 1. it restarts the application >>> 2. we cannot get the right data to understand the root cause >>> >>> We switched from CMS to G1 in order to avoid long FULL GCs. >>> >>> JRE 1.17 update 17 it is being used. >>> >>> GC params: >>> >>> wrapper.java.additional.1=-server >>> wrapper.java.additional.2=-XX:+PrintCommandLineFlags >>> wrapper.java.additional.3=-XX:+UseG1GC >>> wrapper.java.additional.7=-XX:MaxGCPauseMillis=2500 >>> wrapper.java.additional.8=-Dsun.rmi.dgc.client.gcInterval=0x7ffffffffffffffe >>> wrapper.java.additional.9=-Dsun.rmi.dgc.server.gcInterval=0x7ffffffffffffffe >>> wrapper.java.additional.10=-XX:+HeapDumpOnOutOfMemoryError >>> wrapper.java.additional.11=-verbose:gc >>> wrapper.java.additional.12=-XX:+PrintGCDetails >>> wrapper.java.additional.13=-Ducmdb.home=%SERVER_BASE_DIR% >>> >>> wrapper.java.additional.52=-XX:+PrintGCTimeStamps >>> wrapper.java.additional.53=-XX:+PrintGCApplicationStoppedTime >>> wrapper.java.additional.55=-XX:+PrintAdaptiveSizePolicy >>> >>> The error from wrapper log is: >>> >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | [Eden: >>> 692M(968M)->0B(972M) Survivors: 56M->52M Heap: >>> 8127M(22480M)->7436M(22480M)]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | [Times: user=1.51 >>> sys=0.02, real=0.19 secs] / >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: >>> [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: >>> humongous allocation request failed, allocation request: 189267984 >>> bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: >>> [G1Ergonomics (Heap Sizing) expand the heap, requested expansion >>> amount: 188743680 bytes, attempted expansion amount: 
188743680 bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.265: >>> [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap >>> expansion operation failed]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | Total time for which >>> application threads were stopped: 0.2031307 seconds/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: >>> humongous allocation request failed, allocation request: 189267984 >>> bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) expand the heap, requested expansion >>> amount: 188743680 bytes, attempted expansion amount: 188743680 bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap >>> expansion operation failed]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) attempt heap expansion, reason: >>> allocation request failed, allocation request: 189267984 bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) expand the heap, requested expansion >>> amount: 189267984 bytes, attempted expansion amount: 192937984 bytes]/ >>> /INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: >>> [G1Ergonomics (Heap Sizing) did not expand the heap, reason: heap >>> expansion operation failed]/ >>> */INFO | jvm 1 | 2015/03/25 15:23:43.268 | 93238.285: [Full GC/* >>> /ERROR | wrapper | 2015/03/25 15:25:57.694 | JVM appears hung: >>> *Timed out waiting for signal from JVM.*/ >>> /ERROR | wrapper | 2015/03/25 15:25:58.021 | JVM did not exit / >>> >>> >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.335: >>> [G1Ergonomics (Concurrent Cycles) request concurrent cycle >>> initiation, reason: occupancy higher than threshold, occupancy: >>> 10603200512 bytes, allocation request: 14584896 bytes, threshold: >>> 10607394780 bytes (45.00 %), source: concurrent humongous 
allocation]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: >>> [G1Ergonomics (Concurrent Cycles) request concurrent cycle >>> initiation, reason: requested by GC cause, GC cause: G1 Humongous >>> Allocation]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: >>> [G1Ergonomics (Concurrent Cycles) initiate concurrent cycle, reason: >>> concurrent cycle initiation requested]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.337: [GC pause >>> (young) 92696.338: [G1Ergonomics (CSet Construction) start choosing >>> CSet, predicted base time: 165.76 ms, remaining time: 2334.24 ms, >>> target pause time: 2500.00 ms]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.338: >>> [G1Ergonomics (CSet Construction) add young regions to CSet, eden: >>> 114 regions, survivors: 8 regions, predicted young region time: >>> 32.04 ms]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.289 | 92696.338: >>> [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 114 >>> regions, survivors: 8 regions, old: 0 regions, predicted pause time: >>> 197.80 ms, target pause time: 2500.00 ms]/ >>> /INFO | jvm 1 | 2015/03/25 15:14:41.398 | (initial-mark), >>> 0.15117107 secs]/ >>> / >>> / >>> /We increased the wrapper timeout but still no useful data about the >>> FULL GC./ >>> / >>> / >>> /Any suggestion is highly appreciated. Currently I suggested to add >>> "PrintHeapAtGCExtended "/ >>> / >>> / >>> /Best Regards, >>> Gabi Medan/ >>> >>> >>> >>> _______________________________________________ >>> hotspot-gc-use mailing list >>> hotspot-gc-use at openjdk.java.net >>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use >> >> _______________________________________________ >> hotspot-gc-use mailing list >> hotspot-gc-use at openjdk.java.net >> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use > -------------- next part -------------- An HTML attachment was scrubbed... 
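For reference, the region arithmetic discussed in this thread (a humongous object needs contiguous regions, so larger regions mean fewer contiguous regions to find) can be sketched in a few lines. The helper below is illustrative only, not a JVM API; the 305776936-byte allocation request is taken from the log above, and the half-region humongous threshold and ceiling division follow G1's documented behavior:

```java
// Illustrative sketch (not a JVM API): how many contiguous G1 regions a
// humongous allocation occupies for a given region size. In G1, an object
// is "humongous" when it is at least half a region, and it is laid out in
// a run of contiguous regions.
public class G1RegionMath {
    // Regions needed to hold an allocation of 'bytes' with 'regionSize'-byte regions.
    static long regionsNeeded(long bytes, long regionSize) {
        return (bytes + regionSize - 1) / regionSize; // ceiling division
    }

    // Is an allocation humongous for the given region size?
    static boolean isHumongous(long bytes, long regionSize) {
        return bytes >= regionSize / 2;
    }

    public static void main(String[] args) {
        long request = 305_776_936L; // failed allocation request from the log above
        for (long regionSize : new long[]{4L << 20, 8L << 20, 16L << 20, 32L << 20}) {
            System.out.println((regionSize >> 20) + "M regions: humongous="
                    + isHumongous(request, regionSize)
                    + ", contiguous regions needed=" + regionsNeeded(request, regionSize));
        }
    }
}
```

With 4M regions the 300+ MB request needs 73 contiguous free regions, versus only 10 with 32M regions, which is consistent with the advice in this thread to raise G1HeapRegionSize. Note that G1 rounds the region size to a power of two between 1M and 32M, so the "about 11 MB" estimate would in practice be 8M or 16M.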
URL: From thomas.schatzl at oracle.com Tue Mar 31 11:30:55 2015 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 31 Mar 2015 13:30:55 +0200 Subject: G1 root cause and tuning In-Reply-To: <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> <5519E82A.5050808@oracle.com> <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> Message-ID: <1427801455.3432.78.camel@oracle.com> Hi all, On Mon, 2015-03-30 at 20:41 -0500, charlie hunt wrote: > Hi Jenny, > > One possibility is that there is not enough available contiguous > regions to satisfy a 300+ MB humongous allocation. > > If we assume a 22 GB Java heap, (a little larger than the 22480M shown > in the log), with 2048 G1 regions (default as you know), the region > size would be about 11 MB. That implies there needs to be about 30 > contiguous G1 regions available to satisfy the humongous allocation > request. > > An unrelated question - do other GCs have a similar pattern of a > rather large percentage of time in Ref Proc relative to the overall > pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that's > the case, then if -XX:+ParallelRefProcEnabled is not already set, > there may be some low hanging tuning fruit. But, it is not going to > address the frequent humongous allocation problem. It is also > interesting in that the pause time goal is 2500 ms, yet the actual > pause time is 120 ms, and eden is being sized at less than 1 GB out of > a 22 GB Java heap. Are the frequent humongous allocations messing > with the heap sizing heuristics? While I have no solution for the problem, we are aware of these problems: - https://bugs.openjdk.java.net/browse/JDK-7068229 for dynamically enabling MT reference processing - https://bugs.openjdk.java.net/browse/JDK-8038487 to use mixed GC instead of Full GC to clear out space for failing humongous object allocations. I am not sure about what jdk release "JRE 1.17 update 17" actually is.
From the given strings in the PrintGCDetails output, it seems to be something quite old, I would guess jdk6? In that case, if possible I would recommend trying a newer version that improves humongous object handling significantly (e.g. 8u40 is latest official). Another option that works in all versions I am aware of is increasing heap region size with -XX:G1HeapRegionSize=<X>M, where X is 8, 16 or 32; it seems that 4M region size has been chosen by ergonomics. Start with the smaller of the suggested values. Thanks, Thomas From charlie.hunt at oracle.com Tue Mar 31 12:35:15 2015 From: charlie.hunt at oracle.com (charlie hunt) Date: Tue, 31 Mar 2015 07:35:15 -0500 Subject: G1 root cause and tuning In-Reply-To: <1427801455.3432.78.camel@oracle.com> References: <1685276618.3563093.1427470230963.JavaMail.yahoo@mail.yahoo.com> <5519E82A.5050808@oracle.com> <4564A6BF-D8DC-4200-86E2-5E9C1C75F194@oracle.com> <1427801455.3432.78.camel@oracle.com> Message-ID: <2667A0C8-0624-44A7-A8DE-C4BD32D2B154@oracle.com> To add to Thomas's good suggestions, I suppose one other alternative is to make application changes to break up the 300+ MB allocation into smaller MB allocations. This would offer a better opportunity for that humongous allocation to be satisfied. hths, charlie > On Mar 31, 2015, at 6:30 AM, Thomas Schatzl wrote: > > Hi all, > > On Mon, 2015-03-30 at 20:41 -0500, charlie hunt wrote: >> Hi Jenny, >> >> One possibility is that there is not enough available contiguous >> regions to satisfy a 300+ MB humongous allocation. >> >> If we assume a 22 GB Java heap, (a little larger than the 22480M shown >> in the log), with 2048 G1 regions (default as you know), the region >> size would be about 11 MB. That implies there needs to be about 30 >> contiguous G1 regions available to satisfy the humongous allocation >> request. >> >> An unrelated question - do other GCs have a similar pattern of a >> rather large percentage of time in Ref Proc relative to the overall >> pause time, i.e.
24.7 ms / 120 ms ~ 20% of the pause time. If that's >> the case, then if -XX:+ParallelRefProcEnabled is not already set, >> there may be some low hanging tuning fruit. But, it is not going to >> address the frequent humongous allocation problem. It is also >> interesting in that the pause time goal is 2500 ms, yet the actual >> pause time is 120 ms, and eden is being sized at less than 1 GB out of >> a 22 GB Java heap. Are the frequent humongous allocations messing >> with the heap sizing heuristics? > > While I have no solution for the problem, we are aware of these problems: > > - https://bugs.openjdk.java.net/browse/JDK-7068229 for dynamically > enabling MT reference processing > > - https://bugs.openjdk.java.net/browse/JDK-8038487 to use mixed GC > instead of Full GC to clear out space for failing humongous object > allocations. > > I am not sure about what jdk release "JRE 1.17 update 17" actually is. > From the given strings in the PrintGCDetails output, it seems to be > something quite old, I would guess jdk6? > > In that case, if possible I would recommend trying a newer version that > improves humongous object handling significantly (e.g. 8u40 is latest > official). > > Another option that works in all versions I am aware of is increasing > heap region size with -XX:G1HeapRegionSize=<X>M, where X is 8, 16 or 32; > it seems that 4M region size has been chosen by ergonomics. > Start with the smaller of the suggested values. > > Thanks, > Thomas -------------- next part -------------- An HTML attachment was scrubbed...
URL: From charlie.hunt at oracle.com Tue Mar 31 12:52:13 2015 From: charlie.hunt at oracle.com (charlie hunt) Date: Tue, 31 Mar 2015 07:52:13 -0500 Subject: G1 root cause and tuning In-Reply-To: <1791857739.2422310.1427805776536.JavaMail.yahoo@mail.yahoo.com> References: <2667A0C8-0624-44A7-A8DE-C4BD32D2B154@oracle.com> <1791857739.2422310.1427805776536.JavaMail.yahoo@mail.yahoo.com> Message-ID: <3A50F3D7-6C40-4D3E-B8D9-4822F111DB8A@oracle.com> Just as a clarification, the -XX:+ParallelRefProcEnabled will help reduce the time spent in reference processing. It will not help address the issue of seeing Full GCs as a result of frequent humongous object allocations, or a humongous allocation where there are not sufficient contiguous regions available to satisfy the humongous allocation request. Thomas's suggestion to increase the region size may help with the Full GCs as a result of humongous object allocations. thanks, charlie > On Mar 31, 2015, at 7:42 AM, Medan Gavril wrote: > > Hi Charlie, > > Currently we can only go to java 7 update 7x (latest). > > We will try the following changes: > 1. -XX:G1HeapRegionSize=8 (then increase) > 2. -XX:+ParallelRefProcEnabled > > Please let me know if you have any other suggestion. > > Best Regards, > Gabi Medan > > > On Tuesday, March 31, 2015 3:35 PM, charlie hunt wrote: > > > To add to Thomas's good suggestions, I suppose one other alternative is to make application changes to break up the 300+ MB allocation into smaller MB allocations. This would offer a better opportunity for that humongous allocation to be satisfied. > > hths, > > charlie > >> On Mar 31, 2015, at 6:30 AM, Thomas Schatzl > wrote: >> >> Hi all, >> >> On Mon, 2015-03-30 at 20:41 -0500, charlie hunt wrote: >>> Hi Jenny, >>> >>> One possibility is that there is not enough available contiguous >>> regions to satisfy a 300+ MB humongous allocation.
>>> >>> If we assume a 22 GB Java heap, (a little larger than the 22480M shown >>> in the log), with 2048 G1 regions (default as you know), the region >>> size would be about 11 MB. That implies there needs to be about 30 >>> contiguous G1 regions available to satisfy the humongous allocation >>> request. >>> >>> An unrelated question - do other GCs have a similar pattern of a >>> rather large percentage of time in Ref Proc relative to the overall >>> pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that's >>> the case, then if -XX:+ParallelRefProcEnabled is not already set, >>> there may be some low hanging tuning fruit. But, it is not going to >>> address the frequent humongous allocation problem. It is also >>> interesting in that the pause time goal is 2500 ms, yet the actual >>> pause time is 120 ms, and eden is being sized at less than 1 GB out of >>> a 22 GB Java heap. Are the frequent humongous allocations messing >>> with the heap sizing heuristics? >> >> While I have no solution for the problem, we are aware of these problems: >> >> - https://bugs.openjdk.java.net/browse/JDK-7068229 for dynamically >> enabling MT reference processing >> >> - https://bugs.openjdk.java.net/browse/JDK-8038487 to use mixed GC >> instead of Full GC to clear out space for failing humongous object >> allocations. >> >> I am not sure about what jdk release "JRE 1.17 update 17" actually is. >> From the given strings in the PrintGCDetails output, it seems to be >> something quite old, I would guess jdk6? >> >> In that case, if possible I would recommend trying a newer version that >> improves humongous object handling significantly (e.g. 8u40 is latest >> official). >> >> Another option that works in all versions I am aware of is increasing >> heap region size with -XX:G1HeapRegionSize=<X>M, where X is 8, 16 or 32; >> it seems that 4M region size has been chosen by ergonomics. >> Start with the smaller of the suggested values.
>> >> Thanks, >> Thomas > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gabi_io at yahoo.com Tue Mar 31 05:47:35 2015 From: gabi_io at yahoo.com (Medan Gavril) Date: Mon, 30 Mar 2015 22:47:35 -0700 Subject: G1 root cause and tuning In-Reply-To: <551A0510.4080604@oracle.com> Message-ID: <1427780855.80050.YahooMailAndroidMobile@web161701.mail.bf1.yahoo.com> Hi guys, Thanks a lot for your comments. So what should be the next step? Enable -XX:+ParallelRefProcEnabled? Any other suggestions? About the logs... were they parsed ok? Do you want to add them to gc.log? However I do not want it to be overwritten at restart. Best regards, Gabi Medan Sent from Yahoo Mail on Android From:"Yu Zhang" Date:Tue, Mar 31, 2015 at 5:23 am Subject:Re: G1 root cause and tuning Charlie, Thanks for the comments. please see my response inline. Thanks, Jenny On 3/30/2015 6:41 PM, charlie hunt wrote: -------------- next part -------------- An HTML attachment was scrubbed... URL: From gabi_io at yahoo.com Tue Mar 31 12:42:56 2015 From: gabi_io at yahoo.com (Medan Gavril) Date: Tue, 31 Mar 2015 12:42:56 +0000 (UTC) Subject: G1 root cause and tuning In-Reply-To: <2667A0C8-0624-44A7-A8DE-C4BD32D2B154@oracle.com> References: <2667A0C8-0624-44A7-A8DE-C4BD32D2B154@oracle.com> Message-ID: <1791857739.2422310.1427805776536.JavaMail.yahoo@mail.yahoo.com> Hi Charlie, Currently we can only go to java 7 update 7x (latest). We will try the following changes: 1. -XX:G1HeapRegionSize=8 (then increase) 2. -XX:+ParallelRefProcEnabled Please let me know if you have any other suggestion. Best Regards, Gabi Medan On Tuesday, March 31, 2015 3:35 PM, charlie hunt wrote: To add to Thomas's good suggestions, I suppose one other alternative is to make application changes to break up the 300+ MB allocation into smaller MB allocations. This would offer a better opportunity for that humongous allocation to be satisfied.
hths, charlie On Mar 31, 2015, at 6:30 AM, Thomas Schatzl wrote: Hi all, On Mon, 2015-03-30 at 20:41 -0500, charlie hunt wrote: Hi Jenny, One possibility is that there is not enough available contiguous regions to satisfy a 300+ MB humongous allocation. If we assume a 22 GB Java heap, (a little larger than the 22480M shown in the log), with 2048 G1 regions (default as you know), the region size would be about 11 MB. That implies there needs to be about 30 contiguous G1 regions available to satisfy the humongous allocation request. An unrelated question - do other GCs have a similar pattern of a rather large percentage of time in Ref Proc relative to the overall pause time, i.e. 24.7 ms / 120 ms ~ 20% of the pause time. If that's the case, then if -XX:+ParallelRefProcEnabled is not already set, there may be some low hanging tuning fruit. But, it is not going to address the frequent humongous allocation problem. It is also interesting in that the pause time goal is 2500 ms, yet the actual pause time is 120 ms, and eden is being sized at less than 1 GB out of a 22 GB Java heap. Are the frequent humongous allocations messing with the heap sizing heuristics? While I have no solution for the problem, we are aware of these problems: - https://bugs.openjdk.java.net/browse/JDK-7068229 for dynamically enabling MT reference processing - https://bugs.openjdk.java.net/browse/JDK-8038487 to use mixed GC instead of Full GC to clear out space for failing humongous object allocations. I am not sure about what jdk release "JRE 1.17 update 17" actually is. From the given strings in the PrintGCDetails output, it seems to be something quite old, I would guess jdk6? In that case, if possible I would recommend trying a newer version that improves humongous object handling significantly (e.g. 8u40 is latest official).
Another option that works in all versions I am aware of is increasing heap region size with -XX:G1HeapRegionSize=<X>M, where X is 8, 16 or 32; it seems that 4M region size has been chosen by ergonomics. Start with the smaller of the suggested values. Thanks, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL:
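As a footnote to the suggestion of breaking up the 300+ MB allocation on the application side: one common shape for this is a chunked buffer, where many small arrays stand in for one giant one, so no single allocation crosses the humongous threshold (half a region). The class below is a hypothetical sketch of that idea, not code from the application under discussion; the 1 MB chunk size is an assumption chosen to stay below half of even a 4M region:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: replace one 300+ MB byte[] (a humongous allocation
// needing many contiguous G1 regions) with a list of small chunks, each an
// ordinary young-gen allocation well below the humongous threshold.
public class ChunkedBuffer {
    static final int CHUNK_SIZE = 1 << 20; // 1 MB, below half of a 4 MB region

    final List<byte[]> chunks = new ArrayList<>();
    final long size;

    ChunkedBuffer(long totalBytes) {
        this.size = totalBytes;
        long remaining = totalBytes;
        while (remaining > 0) {
            int n = (int) Math.min(CHUNK_SIZE, remaining);
            chunks.add(new byte[n]); // each chunk is a small, non-humongous allocation
            remaining -= n;
        }
    }

    byte get(long index) {
        return chunks.get((int) (index / CHUNK_SIZE))[(int) (index % CHUNK_SIZE)];
    }

    void set(long index, byte value) {
        chunks.get((int) (index / CHUNK_SIZE))[(int) (index % CHUNK_SIZE)] = value;
    }

    public static void main(String[] args) {
        // ~5 MB logical buffer backed by 6 small chunks instead of one array.
        ChunkedBuffer buf = new ChunkedBuffer(5L * CHUNK_SIZE + 123);
        buf.set(buf.size - 1, (byte) 42);
        System.out.println(buf.chunks.size() + " chunks, last byte = " + buf.get(buf.size - 1));
    }
}
```

The trade-off is an extra indirection per access; whether that is acceptable depends on the application, which is why this is an alternative to, not a replacement for, the region-size tuning suggested above.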