G1 GC consuming all CPU time
Claes Redestad
claes.redestad at oracle.com
Tue Jun 3 22:57:45 UTC 2014
Hi,
my guess is that it's due to the concurrent GC threads tripping over
themselves: the microbenchmark creates one big linked structure that
occupies most of the old gen, and then you're doing intense pointer
updates which will trigger scans and updates of remembered sets etc.
I don't know half the details and am mostly just guessing. :-)

I converted your micro to a JMH micro to make experimenting easier[1]
(hope you don't mind), then verified that the regression reproduces:
Parallel:
java -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1 .*G1GraphBench.*
~1625 ops/ms
G1:
java -XX:+UseG1GC -jar target/microbenchmarks.jar -wi 3 -i 10 -f 1
.*G1GraphBench.*
~12 ops/ms
Testing my hunch, let's try forcing the concurrent refinement to use
only one thread:
java -XX:+UseG1GC -XX:-G1UseAdaptiveConcRefinement
-XX:G1ConcRefinementThreads=1 -jar target/microbenchmarks.jar -wi 3 -i
10 -f 1 .*G1GraphBench.*
~1550 ops/ms
I guess we have a winner! I won't hazard an answer to your questions
about how this should be resolved - perhaps the adaptive policy could
detect this corner case and scale down the number of refinement
threads when they start interfering with each other.
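
For anyone who wants to dig further, a sketch of the obvious next
experiment (same JMH invocation as above, varying only the refinement
thread count - I haven't measured these intermediate points myself):

java -XX:+UseG1GC -XX:-G1UseAdaptiveConcRefinement
-XX:G1ConcRefinementThreads=2 -jar target/microbenchmarks.jar -wi 3 -i
10 -f 1 .*G1GraphBench.*

Stepping -XX:G1ConcRefinementThreads through 2, 4, 8 should show where
the refinement threads start getting in each other's way.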
/Claes
[1]
package org.sample;

import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

import java.util.Random;

@State(Scope.Thread)
public class G1GraphBench {

    private static class Node {
        private Node next;
        private Node[] others = new Node[10];
    }

    Node head = buildGraph();

    private static Node buildGraph() {
        // Create a collection of Node objects
        Node[] array = new Node[10000000];
        for (int i = 0; i < array.length; i++) {
            array[i] = new Node();
        }

        // Each Node refers to 10 other random Nodes
        Random random = new Random(12);
        for (int i = 0; i < array.length; i++) {
            for (int j = 0; j < array[i].others.length; j++) {
                int k = random.nextInt(array.length);
                array[i].others[j] = array[k];
            }
        }

        // The first Node serves as the head of a queue
        return array[0];
    }

    @GenerateMicroBenchmark
    public Node nodeBench() {
        // Pop a Node off the head of the queue
        Node node = head;
        head = node.next;
        node.next = null;

        // Add the other Nodes to the head of the queue
        for (Node other : node.others) {
            other.next = head;
            head = other;
        }
        return head;
    }
}
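
(In case anyone wants to reproduce: the benchmark project was set up
as a standard JMH Maven project, something along these lines - the
archetype coordinates are the standard JMH ones, though the version
current at the time may differ:

mvn archetype:generate -DinteractiveMode=false
-DarchetypeGroupId=org.openjdk.jmh
-DarchetypeArtifactId=jmh-java-benchmark-archetype
-DgroupId=org.sample -DartifactId=g1-graph-bench -Dversion=1.0

Building the project with "mvn clean package" then produces the
target/microbenchmarks.jar used in the commands above.)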
On 2014-06-03 23:43, Peter Harvey wrote:
> Thanks for the response. Here are the additional logs.
>
> 0.094: [GC pause (young), 0.0347877 secs]
> [Parallel Time: 34.1 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 94.2, Avg: 104.4, Max: 126.4,
> Diff: 32.2]
> [Ext Root Scanning (ms): Min: 0.0, Avg: 3.3, Max: 25.0,
> Diff: 25.0, Sum: 26.6]
> [Update RS (ms): Min: 0.0, Avg: 2.1, Max: 5.3, Diff: 5.3,
> Sum: 16.7]
> [Processed Buffers: Min: 0, Avg: 2.3, Max: 9, Diff: 9,
> Sum: 18]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum:
> 0.0]
> [Object Copy (ms): Min: 1.8, Avg: 18.3, Max: 29.9, Diff:
> 28.2, Sum: 146.4]
> [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1,
> Sum: 0.6]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
> 0.0, Sum: 0.1]
> [GC Worker Total (ms): Min: 1.9, Avg: 23.8, Max: 34.1, Diff:
> 32.2, Sum: 190.4]
> [GC Worker End (ms): Min: 128.2, Avg: 128.3, Max: 128.3,
> Diff: 0.0]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 0.0 ms]
> [Other: 0.6 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 0.3 ms]
> [Ref Enq: 0.0 ms]
> [Free CSet: 0.0 ms]
> [Eden: 24.0M(24.0M)->0.0B(11.0M) Survivors: 0.0B->3072.0K Heap:
> 62.1M(254.0M)->62.2M(254.0M)]
> [Times: user=0.09 sys=0.03, real=0.04 secs]
> 0.131: [GC pause (young), 0.0295093 secs]
> [Parallel Time: 28.1 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 130.9, Avg: 135.5, Max: 158.7,
> Diff: 27.8]
> [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff:
> 0.4, Sum: 1.2]
> [Update RS (ms): Min: 0.0, Avg: 11.4, Max: 27.5, Diff: 27.5,
> Sum: 90.8]
> [Processed Buffers: Min: 0, Avg: 23.8, Max: 42, Diff: 42,
> Sum: 190]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum:
> 0.0]
> [Object Copy (ms): Min: 0.0, Avg: 11.7, Max: 17.1, Diff:
> 17.1, Sum: 93.8]
> [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3,
> Sum: 1.7]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
> 0.0, Sum: 0.1]
> [GC Worker Total (ms): Min: 0.2, Avg: 23.5, Max: 28.1, Diff:
> 27.8, Sum: 187.7]
> [GC Worker End (ms): Min: 159.0, Avg: 159.0, Max: 159.0,
> Diff: 0.0]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 0.1 ms]
> [Other: 1.3 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 0.1 ms]
> [Ref Enq: 0.0 ms]
> [Free CSet: 0.0 ms]
> [Eden: 11.0M(11.0M)->0.0B(23.0M) Survivors: 3072.0K->2048.0K
> Heap: 73.2M(254.0M)->82.7M(508.0M)]
> [Times: user=0.19 sys=0.00, real=0.03 secs]
> 0.166: [GC pause (young), 0.0385523 secs]
> [Parallel Time: 35.9 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 166.4, Avg: 169.8, Max: 192.4,
> Diff: 25.9]
> [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff:
> 0.4, Sum: 1.9]
> [Update RS (ms): Min: 0.0, Avg: 10.9, Max: 31.9, Diff: 31.9,
> Sum: 87.2]
> [Processed Buffers: Min: 0, Avg: 14.6, Max: 26, Diff: 26,
> Sum: 117]
> [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum:
> 0.1]
> [Object Copy (ms): Min: 3.5, Avg: 21.4, Max: 27.0, Diff:
> 23.4, Sum: 171.1]
> [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1,
> Sum: 0.4]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
> 0.0, Sum: 0.1]
> [GC Worker Total (ms): Min: 10.0, Avg: 32.6, Max: 35.9,
> Diff: 25.9, Sum: 260.7]
> [GC Worker End (ms): Min: 202.3, Avg: 202.4, Max: 202.4,
> Diff: 0.0]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 0.0 ms]
> [Other: 2.6 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 0.1 ms]
> [Ref Enq: 0.0 ms]
> [Free CSet: 0.0 ms]
> [Eden: 23.0M(23.0M)->0.0B(46.0M) Survivors: 2048.0K->4096.0K
> Heap: 105.7M(508.0M)->110.1M(1016.0M)]
> [Times: user=0.19 sys=0.00, real=0.04 secs]
> 0.222: [GC pause (young), 0.0558720 secs]
> [Parallel Time: 53.0 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 222.0, Avg: 222.2, Max: 222.5,
> Diff: 0.5]
> [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff:
> 0.4, Sum: 1.5]
> [Update RS (ms): Min: 7.7, Avg: 8.7, Max: 10.9, Diff: 3.2,
> Sum: 69.4]
> [Processed Buffers: Min: 7, Avg: 8.5, Max: 12, Diff: 5,
> Sum: 68]
> [Scan RS (ms): Min: 0.0, Avg: 0.3, Max: 0.6, Diff: 0.6, Sum:
> 2.3]
> [Object Copy (ms): Min: 41.7, Avg: 43.6, Max: 44.3, Diff:
> 2.7, Sum: 348.5]
> [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
> Sum: 0.0]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
> 0.0, Sum: 0.2]
> [GC Worker Total (ms): Min: 52.4, Avg: 52.7, Max: 52.9,
> Diff: 0.5, Sum: 421.8]
> [GC Worker End (ms): Min: 274.9, Avg: 274.9, Max: 274.9,
> Diff: 0.0]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 0.0 ms]
> [Other: 2.8 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 0.1 ms]
> [Ref Enq: 0.0 ms]
> [Free CSet: 0.0 ms]
> [Eden: 46.0M(46.0M)->0.0B(74.0M) Survivors: 4096.0K->7168.0K
> Heap: 156.1M(1016.0M)->158.6M(1625.0M)]
> [Times: user=0.48 sys=0.01, real=0.06 secs]
> 0.328: [GC pause (young), 0.0853794 secs]
> [Parallel Time: 82.8 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 327.9, Avg: 330.8, Max: 351.1,
> Diff: 23.2]
> [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff:
> 0.3, Sum: 2.0]
> [Update RS (ms): Min: 0.0, Avg: 5.5, Max: 8.3, Diff: 8.3,
> Sum: 43.9]
> [Processed Buffers: Min: 0, Avg: 2.3, Max: 3, Diff: 3,
> Sum: 18]
> [Scan RS (ms): Min: 0.0, Avg: 2.2, Max: 3.3, Diff: 3.3, Sum:
> 17.4]
> [Object Copy (ms): Min: 59.5, Avg: 71.8, Max: 73.7, Diff:
> 14.2, Sum: 574.7]
> [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
> Sum: 0.2]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
> 0.0, Sum: 0.2]
> [GC Worker Total (ms): Min: 59.5, Avg: 79.8, Max: 82.7,
> Diff: 23.2, Sum: 638.4]
> [GC Worker End (ms): Min: 410.6, Avg: 410.7, Max: 410.7,
> Diff: 0.0]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 0.1 ms]
> [Other: 2.6 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 0.1 ms]
> [Ref Enq: 0.0 ms]
> [Free CSet: 0.1 ms]
> [Eden: 74.0M(74.0M)->0.0B(94.0M) Survivors: 7168.0K->11.0M
> Heap: 232.6M(1625.0M)->237.6M(2112.0M)]
> [Times: user=0.59 sys=0.00, real=0.09 secs]
> 0.447: [GC pause (young), 0.1239103 secs]
> [Parallel Time: 121.5 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 447.5, Avg: 447.7, Max: 448.5,
> Diff: 0.9]
> [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff:
> 0.3, Sum: 1.9]
> [Update RS (ms): Min: 26.5, Avg: 28.2, Max: 28.7, Diff: 2.2,
> Sum: 225.7]
> [Processed Buffers: Min: 38, Avg: 39.8, Max: 44, Diff: 6,
> Sum: 318]
> [Scan RS (ms): Min: 0.3, Avg: 0.7, Max: 1.9, Diff: 1.6, Sum:
> 5.3]
> [Object Copy (ms): Min: 92.1, Avg: 92.2, Max: 92.3, Diff:
> 0.2, Sum: 737.5]
> [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1,
> Sum: 0.3]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
> 0.0, Sum: 0.2]
> [GC Worker Total (ms): Min: 120.6, Avg: 121.4, Max: 121.5,
> Diff: 0.9, Sum: 970.8]
> [GC Worker End (ms): Min: 569.0, Avg: 569.0, Max: 569.0,
> Diff: 0.0]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 0.1 ms]
> [Other: 2.3 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 0.1 ms]
> [Ref Enq: 0.0 ms]
> [Free CSet: 0.1 ms]
> [Eden: 94.0M(94.0M)->0.0B(111.0M) Survivors: 11.0M->14.0M Heap:
> 331.6M(2112.0M)->334.6M(2502.0M)]
> [Times: user=0.80 sys=0.05, real=0.12 secs]
> 0.599: [GC pause (young), 0.1479438 secs]
> [Parallel Time: 145.7 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 599.4, Avg: 599.5, Max: 599.8,
> Diff: 0.4]
> [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff:
> 0.3, Sum: 1.9]
> [Update RS (ms): Min: 41.8, Avg: 43.0, Max: 44.0, Diff: 2.1,
> Sum: 343.6]
> [Processed Buffers: Min: 67, Avg: 70.9, Max: 73, Diff: 6,
> Sum: 567]
> [Scan RS (ms): Min: 0.0, Avg: 0.8, Max: 1.9, Diff: 1.9, Sum:
> 6.2]
> [Object Copy (ms): Min: 101.3, Avg: 101.6, Max: 101.7, Diff:
> 0.3, Sum: 812.6]
> [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
> Sum: 0.1]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
> 0.0, Sum: 0.2]
> [GC Worker Total (ms): Min: 145.2, Avg: 145.6, Max: 145.6,
> Diff: 0.4, Sum: 1164.6]
> [GC Worker End (ms): Min: 745.1, Avg: 745.1, Max: 745.1,
> Diff: 0.0]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 0.1 ms]
> [Other: 2.2 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 0.1 ms]
> [Ref Enq: 0.0 ms]
> [Free CSet: 0.1 ms]
> [Eden: 111.0M(111.0M)->0.0B(124.0M) Survivors: 14.0M->16.0M
> Heap: 445.6M(2502.0M)->448.6M(2814.0M)]
> [Times: user=1.20 sys=0.05, real=0.15 secs]
> 0.787: [GC pause (young), 0.1625321 secs]
> [Parallel Time: 160.0 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 786.6, Avg: 786.7, Max: 786.9,
> Diff: 0.4]
> [Ext Root Scanning (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff:
> 0.3, Sum: 1.8]
> [Update RS (ms): Min: 46.4, Avg: 47.0, Max: 49.0, Diff: 2.5,
> Sum: 376.0]
> [Processed Buffers: Min: 75, Avg: 78.0, Max: 79, Diff: 4,
> Sum: 624]
> [Scan RS (ms): Min: 0.0, Avg: 0.9, Max: 1.5, Diff: 1.5, Sum:
> 7.4]
> [Object Copy (ms): Min: 110.6, Avg: 111.7, Max: 112.0, Diff:
> 1.4, Sum: 893.5]
> [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
> Sum: 0.1]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff:
> 0.1, Sum: 0.3]
> [GC Worker Total (ms): Min: 159.6, Avg: 159.9, Max: 160.0,
> Diff: 0.4, Sum: 1279.0]
> [GC Worker End (ms): Min: 946.5, Avg: 946.5, Max: 946.6,
> Diff: 0.1]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 0.1 ms]
> [Other: 2.4 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 0.1 ms]
> [Ref Enq: 0.0 ms]
> [Free CSet: 0.2 ms]
> [Eden: 124.0M(124.0M)->0.0B(135.0M) Survivors: 16.0M->18.0M
> Heap: 572.6M(2814.0M)->576.6M(3064.0M)]
> [Times: user=1.37 sys=0.00, real=0.16 secs]
> 0.981: [GC pause (young), 0.2063055 secs]
> [Parallel Time: 204.1 ms, GC Workers: 8]
> [GC Worker Start (ms): Min: 980.8, Avg: 980.9, Max: 981.0,
> Diff: 0.2]
> [Ext Root Scanning (ms): Min: 0.1, Avg: 0.3, Max: 0.3, Diff:
> 0.2, Sum: 2.1]
> [Update RS (ms): Min: 55.9, Avg: 57.8, Max: 58.8, Diff: 2.9,
> Sum: 462.8]
> [Processed Buffers: Min: 100, Avg: 101.5, Max: 103, Diff:
> 3, Sum: 812]
> [Scan RS (ms): Min: 0.0, Avg: 1.0, Max: 3.1, Diff: 3.1, Sum:
> 8.3]
> [Object Copy (ms): Min: 144.7, Avg: 144.8, Max: 144.9, Diff:
> 0.1, Sum: 1158.3]
> [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0,
> Sum: 0.3]
> [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff:
> 0.0, Sum: 0.2]
> [GC Worker Total (ms): Min: 203.8, Avg: 204.0, Max: 204.0,
> Diff: 0.2, Sum: 1631.9]
> [GC Worker End (ms): Min: 1184.9, Avg: 1184.9, Max: 1184.9,
> Diff: 0.0]
> [Code Root Fixup: 0.0 ms]
> [Clear CT: 0.1 ms]
> [Other: 2.1 ms]
> [Choose CSet: 0.0 ms]
> [Ref Proc: 0.1 ms]
> [Ref Enq: 0.0 ms]
> [Free CSet: 0.1 ms]
> [Eden: 135.0M(135.0M)->0.0B(143.0M) Survivors: 18.0M->20.0M
> Heap: 711.6M(3064.0M)->714.6M(3264.0M)]
> [Times: user=1.40 sys=0.11, real=0.21 secs]
> CPU Load Is -1.0
>
> Start
> Stop
> Sleep
> CPU Load Is 0.9166222455142531
>
> Start
> Stop
> Sleep
> CPU Load Is 0.907013989900451
>
> Start
> Stop
> Sleep
> CPU Load Is 0.9085635227776081
>
> Start
> Stop
> Sleep
> CPU Load Is 0.909945506396622
>
>
>
> Note that all the logged GC activity occurs during the construction of my
> graph of Nodes, which is /before/ my algorithm (modifyGraph) starts. There
> is no log of GC activity once the algorithm starts, but there is
> significant (100%) CPU usage.
>
> My questions are:
>
> * Why is the G1 garbage collector consuming so much CPU time? What
> is it doing?
> * Why is the G1 garbage collector not logging anything? The only
> reason I even know it's the garbage collector consuming my CPU
> time is that (a) I only see this behaviour when the G1 collector
> is enabled and (b) the load on the CPU correlates with the value
> of -XX:ParallelGCThreads.
> * Are there particular object-graph structures that the G1 garbage
> collector will struggle with? Should complex graphs be considered
> bad coding practice?
> * How can I write my code to avoid this behaviour in the G1 garbage
> collector? For example, if all my Nodes are in an array, will this
> fix it? (A rough sketch of this idea follows this list.)
> * Should this be considered a bug in the G1 garbage collector? This
> is far beyond 'a small increase in CPU usage'.
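>
> To make the array idea in the fourth question concrete, here is a rough,
> hypothetical sketch (not tested) of an index-based variant, where each
> Node becomes an int index into flat arrays. The hot loop then performs
> only primitive int stores, which do not go through G1's reference write
> barrier at all:
>
> import java.util.Arrays;
> import java.util.Random;
>
> public class IndexGraph {
>     private static final int N = 10000000;
>     private static final int DEGREE = 10;
>
>     // next[i] is the queue successor of node i; -1 plays the role of null
>     private static final int[] next = new int[N];
>
>     // others[i * DEGREE + j] is the j-th random neighbour of node i
>     private static final int[] others = new int[N * DEGREE];
>
>     public static void main(String[] args) {
>         // Build the same randomly wired graph as the Node version
>         Arrays.fill(next, -1);
>         Random random = new Random(12);
>         for (int i = 0; i < others.length; i++) {
>             others[i] = random.nextInt(N);
>         }
>         int head = 0;
>         while (true) {
>             head = modifyGraph(head);
>         }
>     }
>
>     private static int modifyGraph(int head) {
>         for (int i = 0; i < 1000000; i++) {
>             // Pop a node off the head of the queue
>             int node = head;
>             head = next[node];
>             next[node] = -1;
>
>             // Push its neighbours onto the head of the queue
>             for (int j = 0; j < DEGREE; j++) {
>                 int other = others[node * DEGREE + j];
>                 next[other] = head;  // int store: no cards dirtied, no remembered sets
>                 head = other;
>             }
>         }
>         return head;
>     }
> }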
>
> Just to demonstrate the issue further, I timed my calls to
> modifyGraph() and trialled different GC parameters:
>
> * -XX:+UseG1GC -XX:ParallelGCThreads=1 took 82.393 seconds and CPU
> load was 0.1247
> * -XX:+UseG1GC -XX:ParallelGCThreads=4 took 19.829 seconds and CPU
> load was 0.5960
> * -XX:+UseG1GC -XX:ParallelGCThreads=8 took 14.815 seconds and CPU
> load was 0.9184
> * -XX:+UseConcMarkSweepGC took 0.322 seconds and CPU load was 0.1119
> regardless of the setting of -XX:ParallelGCThreads
>
> So using the CMS GC made my application ~46x faster (14.815 seconds
> versus 0.322 seconds) and placed roughly 1/8th of the load on the CPU
> (0.1119 versus 0.9184).
>
> If my code represents some kind of hypothetical worst case for the G1
> garbage collector, I think it should be documented and/or fixed somehow.
>
> Regards,
> Peter.
>
>
>
> On Tue, Jun 3, 2014 at 3:16 PM, Tao Mao <yiyeguhu at gmail.com> wrote:
>
> And, use -XX:+PrintGCDetails -XX:+PrintGCTimeStamps to get more
> log. Thanks. -Tao
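>
> For example, with the test class below (named Node), the full
> invocation would be something like:
>
> java -XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps Node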
>
>
> On Tue, Jun 3, 2014 at 2:13 PM, Tao Mao <yiyeguhu at gmail.com> wrote:
>
> Hi Peter,
>
> What was your actual question?
> Try -XX:ParallelGCThreads=<value> if you want less CPU usage
> from GC.
>
> Thanks.
> Tao
>
>
> On Tue, Jun 3, 2014 at 11:49 AM, Peter Harvey <harvey at actenum.com> wrote:
>
> Small correction. The last example of output was with
> "-XX:+UseConcMarkSweepGC -verbose:gc".
>
>
> On Tue, Jun 3, 2014 at 12:41 PM, Peter Harvey <harvey at actenum.com> wrote:
>
> I have an algorithm (at bottom of email) which builds
> a graph of 'Node' objects with random connections
> between them. It then repeatedly processes a queue of
> those Nodes, adding new Nodes to the queue as it goes.
> This is a single-threaded algorithm that will never
> terminate. Our actual production code is much more
> complex, but I've trimmed it down as much as possible.
>
> On Windows 7 with JRE 7u60, enabling the G1 garbage
> collector will cause the JRE to consume all 8 cores of
> my CPU. No other garbage collector does this. You can
> see the differences in CPU load in the example output
> below. It's also worth noting that "-verbose:gc" with
> the G1 garbage collector prints nothing after my
> algorithm starts. Presumably the G1 garbage collector
> is doing something (concurrent mark?), but it's not
> printing anything about it.
>
> When run with VM args "-XX:+UseG1GC -verbose:gc" I get
> output like this (note the huge CPU load value which
> should not be this high for a single-threaded
> algorithm on an 8 core CPU):
>
> [GC pause (young) 62M->62M(254M), 0.0394214 secs]
> [GC pause (young) 73M->83M(508M), 0.0302781 secs]
> [GC pause (young) 106M->111M(1016M), 0.0442273 secs]
> [GC pause (young) 157M->161M(1625M), 0.0660902 secs]
> [GC pause (young) 235M->240M(2112M), 0.0907231 secs]
> [GC pause (young) 334M->337M(2502M), 0.1356917 secs]
> [GC pause (young) 448M->450M(2814M), 0.1219090 secs]
> [GC pause (young) 574M->577M(3064M), 0.1778062 secs]
> [GC pause (young) 712M->715M(3264M), 0.1878443 secs]
> CPU Load Is -1.0
>
> Start
> Stop
> Sleep
> CPU Load Is 0.9196154547182949
>
> Start
> Stop
> Sleep
> CPU Load Is 0.9150735995043818
>
> ...
>
>
>
> When run with VM args "-XX:+UseParallelGC -verbose:gc"
> I get output like this:
>
> [GC 65536K->64198K(249344K), 0.0628289 secs]
> [GC 129734K->127974K(314880K), 0.1583369 secs]
> [Full GC 127974K->127630K(451072K), 0.9675224 secs]
> [GC 258702K->259102K(451072K), 0.3543645 secs]
> [Full GC 259102K->258701K(732672K), 1.8085702 secs]
> [GC 389773K->390181K(790528K), 0.3332060 secs]
> [GC 579109K->579717K(803328K), 0.5126388 secs]
> [Full GC 579717K->578698K(1300480K), 4.0647303 secs]
> [GC 780426K->780842K(1567232K), 0.4364933 secs]
> CPU Load Is -1.0
>
> Start
> Stop
> Sleep
> CPU Load Is 0.03137771539054431
>
> Start
> Stop
> Sleep
> CPU Load Is 0.032351299224373145
>
> ...
>
>
>
> When run with VM args "-verbose:gc" I get output like
> this:
>
> [GC 69312K->67824K(251136K), 0.1533803 secs]
> [GC 137136K->135015K(251136K), 0.0970460 secs]
> [GC 137245K(251136K), 0.0095245 secs]
> [GC 204327K->204326K(274368K), 0.1056259 secs]
> [GC 273638K->273636K(343680K), 0.1081515 secs]
> [GC 342948K->342946K(412992K), 0.1181966 secs]
> [GC 412258K->412257K(482304K), 0.1126966 secs]
> [GC 481569K->481568K(551808K), 0.1156015 secs]
> [GC 550880K->550878K(620928K), 0.1184089 secs]
> [GC 620190K->620189K(690048K), 0.1209312 secs]
> [GC 689501K->689499K(759552K), 0.1199338 secs]
> [GC 758811K->758809K(828864K), 0.1162532 secs]
> CPU Load Is -1.0
>
> Start
> Stop
> Sleep
> CPU Load Is 0.10791719146608299
>
> Start
> [GC 821213K(828864K), 0.1966807 secs]
> Stop
> Sleep
> CPU Load Is 0.1540065314146181
>
> Start
> Stop
> Sleep
> [GC 821213K(1328240K), 0.1962688 secs]
> CPU Load Is 0.08427292195744103
>
> ...
>
>
>
> Why is the G1 garbage collector consuming so much CPU
> time? Is it stuck in the mark phase as I am modifying
> the graph structure?
>
> I'm not a subscriber to the list, so please CC me in
> any response.
>
> Thanks,
> Peter.
>
> --
>
> import java.lang.management.ManagementFactory;
> import com.sun.management.OperatingSystemMXBean;
> import java.util.Random;
>
> @SuppressWarnings("restriction")
> public class Node {
>     private static OperatingSystemMXBean os =
>         (OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
>
>     private Node next;
>
>     private Node[] others = new Node[10];
>
>     public static void main(String[] args) throws InterruptedException {
>
>         // Build a graph of Nodes
>         Node head = buildGraph();
>
>         while (true) {
>             // Print CPU load for this process
>             System.out.println("CPU Load Is " + os.getProcessCpuLoad());
>             System.out.println();
>
>             // Modify the graph
>             System.out.println("Start");
>             head = modifyGraph(head);
>             System.out.println("Stop");
>
>             // Sleep, as otherwise we tend to DoS the host computer...
>             System.out.println("Sleep");
>             Thread.sleep(1000);
>         }
>     }
>
>     private static Node buildGraph() {
>
>         // Create a collection of Node objects
>         Node[] array = new Node[10000000];
>         for (int i = 0; i < array.length; i++) {
>             array[i] = new Node();
>         }
>
>         // Each Node refers to 10 other random Nodes
>         Random random = new Random(12);
>         for (int i = 0; i < array.length; i++) {
>             for (int j = 0; j < array[i].others.length; j++) {
>                 int k = random.nextInt(array.length);
>                 array[i].others[j] = array[k];
>             }
>         }
>
>         // The first Node serves as the head of a queue
>         return array[0];
>     }
>
>     private static Node modifyGraph(Node head) {
>
>         // Perform a million iterations
>         for (int i = 0; i < 1000000; i++) {
>
>             // Pop a Node off the head of the queue
>             Node node = head;
>             head = node.next;
>             node.next = null;
>
>             // Add the other Nodes to the head of the queue
>             for (Node other : node.others) {
>                 other.next = head;
>                 head = other;
>             }
>         }
>         return head;
>     }
>
> }
>
> --
> *Actenum Corporation*
> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com
>
>
>
>
> --
> *Actenum Corporation*
> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com
>
>
>
>
>
>
> --
> *Actenum Corporation*
> Peter Harvey | Cell: 780.729.8192 | harvey at actenum.com | www.actenum.com