G1 "to-space exhausted" causes used heap space to increase?
Christopher Berner
christopherberner at gmail.com
Wed Mar 4 20:56:06 UTC 2015
Adjusting -XX:G1MaxNewSizePercent seemed to work better than changing the
target pause time, at least for us. Thanks for all the help!
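
In case it helps anyone else, the shape of the command line we ended up with is
roughly the following (the 15 is only an illustration, not a tuned value;
G1MaxNewSizePercent is an experimental flag, so it has to be unlocked first):

    -Xms45g -Xmx45g -XX:+UseG1GC -XX:G1HeapRegionSize=32M \
    -XX:+UnlockExperimentalVMOptions -XX:G1MaxNewSizePercent=15

Capping the maximum young size this way still lets ergonomics shrink the eden
when it wants to, it just puts an upper bound on how large the eden can grow.
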
On Tue, Mar 3, 2015 at 3:43 AM, Chi Ho Kwok <chkwok at digibites.nl> wrote:
> Hi,
>
> When there are live objects during an eden collection, they must be copied
> to new, empty regions. With a huge eden, this may require more space than
> is available, causing a to-space exhaustion. We always run with a fixed new
> generation size to avoid this kind of issue; when the ergonomics decide
> they can hit the pause time target with a very large eden, they will
> allocate one because it can be more efficient, and that's a bit too
> unpredictable for us in production.
>
> We usually set a NewRatio of 4 to 10. When set to 4, the eden is fixed at
> roughly 1/5th of the full heap, or ~9GB in your case. That also pretty much
> guarantees a small, static eden collection pause, in your case ~21ms
> (scaling your observed ~60ms for 807 eden regions down to ~288 regions:
> 60/807*288). Your promotion failure happened when a 25.5G eden produced
> 3.2G of survivors; with a 9G eden that would only be about 1.2G, which
> shouldn't be an issue as long as the old collector runs regularly. The old
> collector is only triggered after a young collection, by the way, so having
> young collections spaced closer together (smaller eden -> fills up more
> quickly) gives it more chances to run and to add nearly empty regions to
> the next mixed GC.
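>
> As a rough sketch for a 45G heap like yours (values only illustrative),
> either of these pins the young generation instead of letting ergonomics
> grow it:
>
>     -Xms45g -Xmx45g -XX:+UseG1GC -XX:G1HeapRegionSize=32M -XX:NewRatio=4
>     -Xms45g -Xmx45g -XX:+UseG1GC -XX:G1HeapRegionSize=32M -Xmn9g
>
> Keep in mind that with G1, fixing the young generation size like this means
> the collector no longer resizes it to meet the pause time target.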
>
> Con: GC will run more often, with smaller pauses, and will promote more
> objects to the old generation, which takes more work to collect (a
> concurrent scan is required). But since your collections currently run once
> every several minutes, that extra overhead is basically zero. Our production
> young collections run multiple times per second on a 4G eden, so you're
> nowhere near the limits of the collector.
>
>
> Kind regards,
>
> --
> Chi Ho Kwok
> Digibites Technology
> chkwok at digibites.nl
>
> On 2 March 2015 at 18:44, Christopher Berner <christopherberner at gmail.com>
> wrote:
>
>> Hello,
>>
>> I work on the Presto project (https://github.com/facebook/presto) and am
>> trying to understand the behavior of G1. We run a 45GB heap on the worker
>> machines with "-XX:G1HeapRegionSize=32M", and it works smoothly, except
>> that every day a few machines hit a "to-space exhausted" failure and then
>> either die with an OutOfMemory error or do a full GC with such a long
>> pause that it fails our health checks and the process is restarted by our
>> service manager.
>>
>> Looking at the GC logs, the sequence of events is always the same. The
>> young gen is quite large (~50% of the heap) and every collection is fast,
>> but then it hits a "to-space exhausted" failure which appears to increase
>> the used heap (see log below). After that the young gen is tiny and it
>> never recovers.
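>>
>> (For reference, the excerpts below are from our GC log; the output format
>> should roughly correspond to running with -XX:+PrintGCDetails
>> -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy,
>> in case it looks unfamiliar.)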
>>
>> Two questions: 1) Why does the used heap increase in the middle of the GC
>> cycle? 2) Looking at some of the logs, it appears that a full GC starts,
>> but an OutOfMemory error is also thrown concurrently (they show up a
>> hundred lines or so apart in stdout). Why would there be an OutOfMemory
>> error before the full GC has finished?
>>
>> Thanks for any help!
>> Christopher
>>
>> 2015-03-02T00:56:32.131-0800: 199078.406: [GC pause (GCLocker Initiated GC) (young)
>> 199078.407: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 16136, predicted base time: 30.29 ms, remaining time: 169.71 ms, target pause time: 200.00 ms]
>>
>> 199078.407: [G1Ergonomics (CSet Construction) add young regions to CSet,
>> eden: 805 regions, survivors: 11 regions, predicted young region time:
>> 56.53 ms]
>>
>> 199078.407: [G1Ergonomics (CSet Construction) finish choosing CSet,
>> eden: 805 regions, survivors: 11 regions, old: 0 regions, predicted pause
>> time: 86.82 ms, target pause time: 200.00 ms]
>>
>> , 0.0722119 secs]
>>
>> [Parallel Time: 46.7 ms, GC Workers: 28]
>>
>> [GC Worker Start (ms): Min: 199078406.9, Avg: 199078407.2, Max:
>> 199078407.5, Diff: 0.6]
>>
>> [Ext Root Scanning (ms): Min: 0.8, Avg: 1.4, Max: 3.9, Diff: 3.1,
>> Sum: 39.7]
>>
>> [Update RS (ms): Min: 0.0, Avg: 2.1, Max: 3.4, Diff: 3.4, Sum: 58.9]
>>
>> [Processed Buffers: Min: 0, Avg: 6.5, Max: 22, Diff: 22, Sum:
>> 182]
>>
>> [Scan RS (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 5.3]
>>
>> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.4, Diff: 0.4,
>> Sum: 0.7]
>>
>> [Object Copy (ms): Min: 40.1, Avg: 41.3, Max: 43.7, Diff: 3.6, Sum:
>> 1155.3]
>>
>> [Termination (ms): Min: 0.8, Avg: 0.9, Max: 1.1, Diff: 0.3, Sum:
>> 25.8]
>>
>> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.3, Diff: 0.2,
>> Sum: 3.9]
>>
>> [GC Worker Total (ms): Min: 45.7, Avg: 46.1, Max: 46.3, Diff: 0.6,
>> Sum: 1289.7]
>>
>> [GC Worker End (ms): Min: 199078453.2, Avg: 199078453.3, Max:
>> 199078453.4, Diff: 0.2]
>>
>> [Code Root Fixup: 0.3 ms]
>>
>> [Code Root Purge: 0.0 ms]
>>
>> [Clear CT: 3.0 ms]
>>
>> [Other: 22.2 ms]
>>
>> [Choose CSet: 0.0 ms]
>>
>> [Ref Proc: 18.0 ms]
>>
>> [Ref Enq: 0.5 ms]
>>
>> [Redirty Cards: 0.9 ms]
>>
>> [Humongous Reclaim: 0.0 ms]
>>
>> [Free CSet: 1.7 ms]
>>
>> [Eden: 25.2G(25.1G)->0.0B(25.2G) Survivors: 352.0M->320.0M Heap:
>> 39.7G(45.0G)->14.6G(45.0G)]
>>
>> [Times: user=1.37 sys=0.00, real=0.08 secs]
>>
>> 2015-03-02T01:38:44.545-0800: 201610.820: [GC pause (GCLocker Initiated GC) (young)
>> 201610.820: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 56252, predicted base time: 35.00 ms, remaining time: 165.00 ms, target pause time: 200.00 ms]
>>
>> 201610.820: [G1Ergonomics (CSet Construction) add young regions to CSet,
>> eden: 807 regions, survivors: 10 regions, predicted young region time:
>> 60.67 ms]
>>
>> 201610.820: [G1Ergonomics (CSet Construction) finish choosing CSet,
>> eden: 807 regions, survivors: 10 regions, old: 0 regions, predicted pause
>> time: 95.67 ms, target pause time: 200.00 ms]
>>
>> 201611.305: [G1Ergonomics (Heap Sizing) attempt heap expansion, reason:
>> region allocation request failed, allocation request: 3058176 bytes]
>>
>> 201611.319: [G1Ergonomics (Heap Sizing) expand the heap, requested
>> expansion amount: 3058176 bytes, attempted expansion amount: 33554432 bytes]
>>
>> 201611.319: [G1Ergonomics (Heap Sizing) did not expand the heap, reason:
>> heap already fully expanded]
>>
>> 201619.914: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 44291850240 bytes, allocation request: 0 bytes, threshold: 21743271900 bytes (45.00 %), source: end of GC]
>>
>> (to-space exhausted), 9.0961593 secs]
>>
>> [Parallel Time: 8209.7 ms, GC Workers: 28]
>>
>> [GC Worker Start (ms): Min: 201610864.0, Avg: 201610864.2, Max:
>> 201610864.4, Diff: 0.5]
>>
>> [Ext Root Scanning (ms): Min: 1.2, Avg: 1.7, Max: 4.7, Diff: 3.6,
>> Sum: 47.8]
>>
>> [Update RS (ms): Min: 0.0, Avg: 4.7, Max: 6.0, Diff: 6.0, Sum:
>> 131.1]
>>
>> [Processed Buffers: Min: 0, Avg: 27.4, Max: 48, Diff: 48, Sum:
>> 766]
>>
>> [Scan RS (ms): Min: 0.1, Avg: 0.3, Max: 1.2, Diff: 1.1, Sum: 7.1]
>>
>> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.5, Diff: 0.5,
>> Sum: 0.8]
>>
>> [Object Copy (ms): Min: 8200.9, Avg: 8202.2, Max: 8207.2, Diff:
>> 6.3, Sum: 229661.8]
>>
>> [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum:
>> 7.0]
>>
>> [GC Worker Other (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2,
>> Sum: 2.8]
>>
>> [GC Worker Total (ms): Min: 8209.0, Avg: 8209.2, Max: 8209.5, Diff:
>> 0.6, Sum: 229858.3]
>>
>> [GC Worker End (ms): Min: 201619073.3, Avg: 201619073.4, Max:
>> 201619073.5, Diff: 0.2]
>>
>> [Code Root Fixup: 0.3 ms]
>>
>> [Code Root Purge: 0.0 ms]
>>
>> [Clear CT: 3.0 ms]
>>
>> [Other: 883.1 ms]
>>
>> [Evacuation Failure: 788.4 ms]
>>
>> [Choose CSet: 0.0 ms]
>>
>> [Ref Proc: 45.0 ms]
>>
>> [Ref Enq: 0.6 ms]
>>
>> [Redirty Cards: 1.4 ms]
>>
>> [Humongous Reclaim: 0.1 ms]
>>
>> [Free CSet: 0.6 ms]
>>
>> [Eden: 25.2G(25.2G)->0.0B(32.0M) Survivors: 320.0M->3264.0M Heap:
>> 39.8G(45.0G)->44.1G(45.0G)]
>>
>> [Times: user=46.07 sys=2.21, real=9.10 secs]
>>
>>
>>
>