G1 to-space-overflow on one server (but not on another identical server under identical load)

John Cuthbertson john.cuthbertson at oracle.com
Fri Nov 30 10:21:52 PST 2012


Hi Ryan,

I would definitely be interested in both logs. I would be interested in 
plotting any differences in the promotion rates or start times and 
durations of marking cycles. Can you also give me details of your 
command-line flags?

There's a couple of experiments to try:

* Lowering your IHOP (InitiatiangHeapOccupancyPercent - default value is 
45) to start the marking cycle a little bit earlier
* Increase the value of G1ReservePercent (default value 10).

Also do you know approximately if your application creates large arrays? 
If so then you could be running into issues with "holes" in the heap 
associated with these (in G1 parlance) humongous objects. This can be 
reduced by explicitly setting the value for G1HeapRegionSize. By my 
guess your region size will be either 8m or 16m (based upon the size of 
your heap). Increasing the value can increase the packing density since 
some humongous objects are no longer humongous; reducing it can increase 
the number of humongous objects but reduce the size and number of 
"holes" since the humongous objects may occupy a larger number of 
regions but the wasted space at the end of these objects is smaller.

After looking at your logs I may have a few more suggestions.

BTW we have a couple of CRs (and ideas) to reduce the duration of pauses 
that experience a to-space overflow/exhaustion. I'm not sure they're 
public yet but they are:

<https://jbs.oracle.com/bugs/browse/JDK-8003235>8003235       G1: 
Parellelize displaced header restoration during evacuation failures
8003237       G1: Reduce unnecessary (and failing) allocation attempts 
when handling an evacuation failure

The first one should address the high "other" time; the second should 
help to reduce the object copy time

Thanks.

JohnC

On 11/29/12 19:29, Ryan Gardner wrote:
> We're doing some load testing of an instance of a solr search that has 
> a pretty frequent replication... the object allocation rates are all 
> over the map, and tuning it with CMS was very difficult and we weren't 
> able to meet our latency targets - so we decided to try G1. 
>
> I'm running some tests now, and I have two identical servers that are 
> getting identical traffic. on one server, here's an excerpt from the 
> gc log: 
>
> Here are the VM arguments that I'm using:   
>   -XX:+PrintGCDateStamps -XX:+PrintGCDetails 
> -XX:+ExplicitGCInvokesConcurrent -XX:MaxGCPauseMillis=150 
> -XX:+AggressiveOpts -XX:+UseG1GC -Xmx24G -Xms24G
>
> Is there some option I can set to stack the deck in favor of avoiding 
> a "to-space-overflow"?
>
> If anyone is interested, I can provide the full GC logs from both of 
> the servers - 
>
> 2012-11-30T01:42:21.857+0000: 4571.106: [GC pause (young), 0.08561600 
> secs]
>    [Parallel Time:  62.0 ms]
>       [GC Worker Start (ms):  4571107.0  4571107.0  4571107.0 
>  4571107.0  4571107.1  4571107.1  4571107.2  4571107.2  4571107.2 
>  4571107.3  4571107.3  4571107.3  4571107.4  4571107.4  4571107.4 
>  4571107.5  4571107.5  4571107.5
>        Avg: 4571107.2, Min: 4571107.0, Max: 4571107.5, Diff:   0.5]
>       [Ext Root Scanning (ms):  1.2  1.2  1.1  1.1  1.1  1.0  1.0  1.0 
>  1.0  1.0  0.9  1.2  0.8  0.9  1.0  0.9  0.8  0.8
>        Avg:   1.0, Min:   0.8, Max:   1.2, Diff:   0.5]
>       [Update RS (ms):  8.1  8.3  8.1  8.1  8.0  8.1  8.2  7.9  8.1 
>  8.1  8.2  7.9  8.0  8.0  7.8  8.1  8.1  8.9
>        Avg:   8.1, Min:   7.8, Max:   8.9, Diff:   1.1]
>          [Processed Buffers : 13 14 15 13 15 11 17 15 11 14 14 11 12 
> 13 13 11 11 11
>           Sum: 234, Avg: 13, Min: 11, Max: 17, Diff: 6]
>       [Scan RS (ms):  0.4  0.3  0.5  0.5  0.7  0.6  0.6  0.7  0.6  0.5 
>  0.5  0.5  0.7  0.6  0.6  0.4  0.6  0.1
>        Avg:   0.5, Min:   0.1, Max:   0.7, Diff:   0.6]
>       [Object Copy (ms):  47.2  47.1  46.9  47.0  47.0  48.6  46.9 
>  47.0  46.9  46.9  47.0  46.9  47.0  47.7  47.1  47.0  46.9  48.3
>        Avg:  47.2, Min:  46.9, Max:  48.6, Diff:   1.7]
>       [Termination (ms):  1.9  1.8  1.9  1.9  1.9  0.3  1.8  1.8  1.9 
>  1.9  1.8  1.8  1.9  1.2  1.7  1.9  1.8  0.0
>        Avg:   1.6, Min:   0.0, Max:   1.9, Diff:   1.9]
>          [Termination Attempts : 9 8 14 13 16 16 8 1 17 20 12 14 18 11 
> 18 18 17 18
>           Sum: 248, Avg: 13, Min: 1, Max: 20, Diff: 19]
>       [GC Worker End (ms):  4571166.1  4571165.7  4571165.7  4571166.0 
>  4571166.0  4571165.9  4571165.8  4571166.0  4571165.7  4571165.7 
>  4571165.7  4571165.7  4571165.7  4571165.9  4571166.1  4571166.2 
>  4571165.7  4571165.9
>        Avg: 4571165.8, Min: 4571165.7, Max: 4571166.2, Diff:   0.5]
>       [GC Worker (ms):  59.1  58.7  58.7  58.9  58.9  58.8  58.6  58.8 
>  58.5  58.4  58.4  58.4  58.3  58.6  58.7  58.7  58.2  58.4
>        Avg:  58.6, Min:  58.2, Max:  59.1, Diff:   0.9]
>       [GC Worker Other (ms):  3.3  3.4  3.4  3.4  3.4  3.5  3.6  3.6 
>  3.6  3.7  3.7  3.7  3.7  3.8  3.8  3.8  3.9  3.9
>        Avg:   3.6, Min:   3.3, Max:   3.9, Diff:   0.5]
>    [Clear CT:   2.3 ms]
>    [Other:  21.2 ms]
>       [Choose CSet:   0.2 ms]
>       [Ref Proc:   5.8 ms]
>       [Ref Enq:   0.1 ms]
>       [Free CSet:  13.6 ms]
>    [Eden: 6728M(6728M)->0B(4472M) Survivors: 376M->616M Heap: 
> 10758M(24576M)->4264M(24576M)]
>  [Times: user=1.08 sys=0.00, real=0.08 secs] 
> 2012-11-30T01:42:27.493+0000: 4576.742: [GC pause (young), 0.06992900 
> secs]
>    [Parallel Time:  52.4 ms]
>       [GC Worker Start (ms):  4576742.7  4576742.8  4576742.8 
>  4576742.8  4576742.9  4576742.9  4576743.0  4576743.0  4576743.1 
>  4576743.1  4576743.1  4576743.2  4576743.2  4576743.2  4576743.3 
>  4576743.3  4576743.3  4576743.3
>        Avg: 4576743.1, Min: 4576742.7, Max: 4576743.3, Diff:   0.6]
>       [Ext Root Scanning (ms):  1.4  1.3  1.4  1.1  1.0  1.2  1.1  1.0 
>  1.2  0.9  1.1  1.1  1.2  1.1  1.0  1.0  0.8  0.7
>        Avg:   1.1, Min:   0.7, Max:   1.4, Diff:   0.7]
>       [Update RS (ms):  0.8  0.7  0.6  0.8  0.7  2.4  0.7  0.8  0.7 
>  1.1  0.7  0.8  0.4  0.4  0.4  0.7  0.8  1.2
>        Avg:   0.8, Min:   0.4, Max:   2.4, Diff:   2.0]
>          [Processed Buffers : 2 2 1 7 4 1 6 2 1 1 2 1 4 1 1 1 1 1
>           Sum: 39, Avg: 2, Min: 1, Max: 7, Diff: 6]
>       [Scan RS (ms):  0.1  0.4  0.3  0.3  0.3  0.1  0.4  0.2  0.0  0.0 
>  0.1  0.0  0.2  0.3  0.4  0.1  0.2  0.0
>        Avg:   0.2, Min:   0.0, Max:   0.4, Diff:   0.3]
>       [Object Copy (ms):  47.2  47.4  47.3  47.3  47.4  45.7  47.4 
>  47.4  47.2  47.4  47.4  47.1  47.5  47.2  47.2  47.1  47.4  47.0
>        Avg:  47.2, Min:  45.7, Max:  47.5, Diff:   1.8]
>       [Termination (ms):  0.3  0.1  0.2  0.2  0.3  0.3  0.0  0.0  0.2 
>  0.0  0.1  0.3  0.0  0.2  0.2  0.3  0.1  0.3
>        Avg:   0.2, Min:   0.0, Max:   0.3, Diff:   0.3]
>          [Termination Attempts : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>           Sum: 18, Avg: 1, Min: 1, Max: 1, Diff: 0]
>       [GC Worker End (ms):  4576792.6  4576792.6  4576792.6  4576792.7 
>  4576792.8  4576792.7  4576792.8  4576792.9  4576792.8  4576792.7 
>  4576792.6  4576792.6  4576792.8  4576792.6  4576792.6  4576792.6 
>  4576792.6  4576792.6
>        Avg: 4576792.7, Min: 4576792.6, Max: 4576792.9, Diff:   0.3]
>       [GC Worker (ms):  49.8  49.8  49.8  49.9  49.9  49.8  49.8  49.8 
>  49.7  49.5  49.5  49.4  49.6  49.3  49.3  49.3  49.3  49.3
>        Avg:  49.6, Min:  49.3, Max:  49.9, Diff:   0.6]
>       [GC Worker Other (ms):  2.6  2.6  2.6  2.7  2.7  2.8  2.8  2.9 
>  3.0  3.0  3.0  3.0  3.1  3.1  3.1  3.1  3.2  3.2
>        Avg:   2.9, Min:   2.6, Max:   3.2, Diff:   0.6]
>    [Clear CT:   1.6 ms]
>    [Other:  15.9 ms]
>       [Choose CSet:   0.2 ms]
>       [Ref Proc:   3.5 ms]
>       [Ref Enq:   0.1 ms]
>       [Free CSet:  10.0 ms]
>    [Eden: 4472M(4472M)->0B(16704M) Survivors: 616M->288M Heap: 
> 8736M(24576M)->4279M(24576M)]
>  [Times: user=0.92 sys=0.01, real=0.07 secs] 
> 2012-11-30T01:43:06.366+0000: 4615.615: [GC pause (young) (to-space 
> overflow), 7.55314600 secs]
>    [Parallel Time: 5994.5 ms]
>       [GC Worker Start (ms):  4615616.3  4615616.4  4615616.4 
>  4615616.4  4615616.4  4615616.4  4615616.5  4615616.5  4615616.5 
>  4615616.6  4615616.6  4615616.7  4615616.7  4615616.7  4615616.8 
>  4615616.8  4615616.8  4615616.9
>        Avg: 4615616.6, Min: 4615616.3, Max: 4615616.9, Diff:   0.6]
>       [Ext Root Scanning (ms):  1.2  1.3  1.3  1.2  1.2  1.2  1.6  1.0 
>  1.1  1.0  0.9  0.9  1.4  1.1  0.9  0.9  0.9  1.0
>        Avg:   1.1, Min:   0.9, Max:   1.6, Diff:   0.8]
>       [Update RS (ms):  9.8  10.1  9.5  9.5  10.0  9.6  10.4  9.5  9.8 
>  9.7  10.9  9.7  9.1  9.9  10.2  9.7  9.5  9.6
>        Avg:   9.8, Min:   9.1, Max:  10.9, Diff:   1.8]
>          [Processed Buffers : 16 17 14 19 14 13 14 14 15 18 13 13 12 
> 14 13 12 14 14
>           Sum: 259, Avg: 14, Min: 12, Max: 19, Diff: 7]
>       [Scan RS (ms):  1.1  0.6  1.3  1.3  0.8  1.0  0.1  1.3  1.0  1.1 
>  0.1  1.2  1.3  0.7  0.5  1.0  1.2  0.9
>        Avg:   0.9, Min:   0.1, Max:   1.3, Diff:   1.2]
>       [Object Copy (ms):  5979.6  5978.9  5979.2  5979.1  5979.5 
>  5979.3  5978.7  5979.6  5979.2  5979.5  5979.2  5979.3  5978.9 
>  5979.0  5979.1  5979.2  5979.0  5979.5
>        Avg: 5979.2, Min: 5978.7, Max: 5979.6, Diff:   1.0]
>       [Termination (ms):  0.1  0.7  0.2  0.5  0.1  0.3  0.6  0.2  0.3 
>  0.0  0.2  0.2  0.6  0.5  0.4  0.3  0.6  0.0
>        Avg:   0.3, Min:   0.0, Max:   0.7, Diff:   0.7]
>          [Termination Attempts : 3 1 1 3 3 5 3 4 4 4 4 1 3 3 3 5 6 4
>           Sum: 60, Avg: 3, Min: 1, Max: 6, Diff: 5]
>       [GC Worker End (ms):  4621608.7  4621608.0  4621608.0  4621608.7 
>  4621608.4  4621608.5  4621608.0  4621608.3  4621608.6  4621608.0 
>  4621608.2  4621608.0  4621608.0  4621608.1  4621608.3  4621608.4 
>  4621608.0  4621608.1
>        Avg: 4621608.2, Min: 4621608.0, Max: 4621608.7, Diff:   0.7]
>       [GC Worker (ms):  5992.4  5991.6  5991.6  5992.3  5992.0  5992.1 
>  5991.5  5991.8  5992.1  5991.4  5991.5  5991.3  5991.3  5991.3 
>  5991.6  5991.6  5991.2  5991.2
>        Avg: 5991.7, Min: 5991.2, Max: 5992.4, Diff:   1.2]
>       [GC Worker Other (ms):  2.9  2.9  3.0  3.0  3.0  3.0  3.0  3.0 
>  3.1  3.1  3.2  3.2  3.2  3.3  3.3  3.3  3.4  3.5
>        Avg:   3.1, Min:   2.9, Max:   3.5, Diff:   0.6]
>    [Clear CT:   3.3 ms]
>    [Other: 1555.4 ms]
>       [Choose CSet:   0.4 ms]
>       [Ref Proc:  57.3 ms]
>       [Ref Enq:   0.3 ms]
>       [Free CSet:  19.4 ms]
>    [Eden: 16704M(16704M)->0B(2784M) Survivors: 288M->2128M Heap: 
> 21362M(24576M)->22239M(24576M)]
>  [Times: user=35.37 sys=0.33, real=7.55 secs] 
> 2012-11-30T01:43:16.740+0000: 4625.989: [GC pause (young) (to-space 
> overflow) (initial-mark), 23.61614000 secs]
>    [Parallel Time: 21781.4 ms]
>       [GC Worker Start (ms):  4625989.9  4625990.0  4625990.1 
>  4625990.1  4625990.2  4625990.2  4625990.2  4625990.3  4625990.3 
>  4625990.3  4625990.4  4625990.4  4625990.4  4625990.4  4625990.5 
>  4625990.5  4625990.5  4625990.5
>        Avg: 4625990.3, Min: 4625989.9, Max: 4625990.5, Diff:   0.6]
>       [Ext Root Scanning (ms):  1.7  2.0  1.4  1.2  1.4  1.4  1.4  1.4 
>  1.1  1.2  1.2  1.0  1.1  1.1  1.1  0.9  1.1  1.1
>        Avg:   1.3, Min:   0.9, Max:   2.0, Diff:   1.1]
>       [Update RS (ms):  0.0  0.0  0.0  0.0  0.0  0.0  0.4  0.4  0.5 
>  0.6  0.6  1.4  0.5  0.9  0.6  0.7  0.5  0.7
>        Avg:   0.4, Min:   0.0, Max:   1.4, Diff:   1.4]
>          [Processed Buffers : 0 0 0 0 0 0 16 4 11 22 30 39 30 26 9 62 21 9
>           Sum: 279, Avg: 15, Min: 0, Max: 62, Diff: 62]
>       [Scan RS (ms):  0.1  0.1  0.1  0.1  0.1  0.1  0.8  0.6  0.6  0.5 
>  0.6  0.0  0.8  0.4  0.5  0.7  0.6  0.4
>        Avg:   0.4, Min:   0.0, Max:   0.8, Diff:   0.7]
>       [Object Copy (ms):  21776.6  21776.1  21776.5  21776.6  21776.2 
>  21776.4  21775.0  21774.7  21775.3  21775.3  21775.0  21774.8 
>  21775.4  21774.6  21775.2  21774.6  21774.7  21775.0
>        Avg: 21775.5, Min: 21774.6, Max: 21776.6, Diff:   2.1]
>       [Termination (ms):  0.0  0.3  0.4  0.4  0.5  0.3  0.7  1.0  0.6 
>  0.4  0.6  0.7  0.2  1.0  0.5  1.0  1.0  0.7
>        Avg:   0.6, Min:   0.0, Max:   1.0, Diff:   1.0]
>          [Termination Attempts : 1 1 1 2 2 1 1 1 3 1 1 1 1 1 1 2 2 1
>           Sum: 24, Avg: 1, Min: 1, Max: 3, Diff: 2]
>       [GC Worker End (ms):  4647768.4  4647768.5  4647768.5  4647768.4 
>  4647768.4  4647768.7  4647768.8  4647768.5  4647768.5  4647768.4 
>  4647768.7  4647768.5  4647768.4  4647768.4  4647768.6  4647768.6 
>  4647768.4  4647768.4
>        Avg: 4647768.5, Min: 4647768.4, Max: 4647768.8, Diff:   0.3]
>       [GC Worker (ms):  21778.5  21778.5  21778.4  21778.3  21778.3 
>  21778.5  21778.5  21778.2  21778.2  21778.1  21778.3  21778.1 
>  21778.0  21778.0  21778.1  21778.1  21777.9  21777.9
>        Avg: 21778.2, Min: 21777.9, Max: 21778.5, Diff:   0.6]
>       [GC Worker Other (ms):  2.9  3.0  3.0  3.1  3.1  3.2  3.2  3.2 
>  3.3  3.3  3.3  3.3  3.4  3.4  3.4  3.5  3.5  3.5
>        Avg:   3.3, Min:   2.9, Max:   3.5, Diff:   0.6]
>    [Clear CT:   1.5 ms]
>    [Other: 1833.3 ms]
>       [Choose CSet:   0.0 ms]
>       [Ref Proc:   0.8 ms]
>       [Ref Enq:   0.0 ms]
>       [Free CSet:   3.6 ms]
>    [Eden: 1384M(2784M)->0B(4912M) Survivors: 2128M->0B Heap: 
> 23623M(24576M)->23623M(24576M)]
>  [Times: user=33.81 sys=0.54, real=23.62 secs] 
> 2012-11-30T01:43:40.357+0000: 4649.606: [GC 
> concurrent-root-region-scan-start]
> 2012-11-30T01:43:40.357+0000: 4649.606: [GC 
> concurrent-root-region-scan-end, 0.0000690]
> 2012-11-30T01:43:40.357+0000: 4649.606: [GC concurrent-mark-start]
> 2012-11-30T01:43:40.358+0000: 4649.607: [GC pause (young), 1.66905500 
> secs]
>    [Parallel Time: 1667.5 ms]
>       [GC Worker Start (ms):  4649607.0  4649607.0  4649607.1 
>  4649607.2  4649607.2  4649607.3  4649607.3  4649607.3  4649607.4 
>  4649607.8  4649607.4  4649607.4  4649607.5  4649607.5  4649607.6 
>  4649607.6  4649607.6  4649607.7
>        Avg: 4649607.4, Min: 4649607.0, Max: 4649607.8, Diff:   0.8]
>       [Ext Root Scanning (ms):  1.6  2.1  2.3  1.5  2.1  1.6  2.0  1.9 
>  1.8  1.7  1.9  2.1  1.6  1.7  1.4  1.6  1.2  1.3
>        Avg:   1.7, Min:   1.2, Max:   2.3, Diff:   1.1]
>       [SATB Filtering (ms):  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 
>  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
>        Avg:   0.0, Min:   0.0, Max:   0.0, Diff:   0.0]
>       [Update RS (ms):  1662.9  1662.8  1662.5  1663.0  1662.6  1663.1 
>  1662.6  1662.7  1662.9  1662.5  1662.6  1662.4  1662.9  1662.8 
>  1663.1  1662.8  1663.2  1663.1
>        Avg: 1662.8, Min: 1662.4, Max: 1663.2, Diff:   0.7]
>          [Processed Buffers : 10947 10965 17033 10676 10631 10692 
> 10524 10706 16833 10703 10751 12889 10735 18332 18581 19972 16982 19856
>           Sum: 247808, Avg: 13767, Min: 10524, Max: 19972, Diff: 9448]
>       [Scan RS (ms):  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 
>  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
>        Avg:   0.0, Min:   0.0, Max:   0.0, Diff:   0.0]
>       [Object Copy (ms):  0.5  0.1  0.1  0.3  0.2  0.1  0.1  0.1  0.0 
>  0.1  0.2  0.1  0.1  0.1  0.1  0.1  0.1  0.1
>        Avg:   0.1, Min:   0.0, Max:   0.5, Diff:   0.5]
>       [Termination (ms):  0.0  0.0  0.1  0.1  0.0  0.1  0.1  0.1  0.0 
>  0.0  0.1  0.1  0.1  0.1  0.1  0.0  0.1  0.0
>        Avg:   0.0, Min:   0.0, Max:   0.1, Diff:   0.1]
>          [Termination Attempts : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>           Sum: 18, Avg: 1, Min: 1, Max: 1, Diff: 0]
>       [GC Worker End (ms):  4651272.1  4651272.1  4651272.2  4651272.1 
>  4651272.1  4651272.2  4651272.1  4651272.1  4651272.1  4651272.1 
>  4651272.1  4651272.1  4651272.1  4651272.2  4651272.1  4651272.1 
>  4651272.1  4651272.1
>        Avg: 4651272.1, Min: 4651272.1, Max: 4651272.2, Diff:   0.0]
>       [GC Worker (ms):  1665.1  1665.1  1665.0  1665.0  1664.9  1664.9 
>  1664.8  1664.8  1664.8  1664.4  1664.8  1664.7  1664.7  1664.6 
>  1664.6  1664.5  1664.5  1664.5
>        Avg: 1664.8, Min: 1664.4, Max: 1665.1, Diff:   0.8]
>       [GC Worker Other (ms):  2.4  2.4  2.5  2.6  2.6  2.6  2.7  2.7 
>  2.7  3.2  2.8  2.8  2.8  2.9  3.0  3.0  3.0  3.1
>        Avg:   2.8, Min:   2.4, Max:   3.2, Diff:   0.8]
>    [Complete CSet Marking:   0.0 ms]
>    [Clear CT:   0.5 ms]
>    [Other:   1.1 ms]
>       [Choose CSet:   0.0 ms]
>       [Ref Proc:   0.8 ms]
>       [Ref Enq:   0.0 ms]
>       [Free CSet:   0.2 ms]
>    [Eden: 0B(4912M)->0B(4912M) Survivors: 0B->0B Heap: 
> 23623M(24576M)->23623M(24576M)]
> ------------------------------------------------------------------------
>
> _______________________________________________
> hotspot-gc-use mailing list
> hotspot-gc-use at openjdk.java.net
> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20121130/51937093/attachment-0001.html 


More information about the hotspot-gc-use mailing list