Very long allocation stall
Uberto Barbini
uberto.gama at gmail.com
Thu Sep 27 15:13:43 UTC 2018
Thank you very much, Per, for the detailed explanation.
From the gc.log I see:
Live 4849M
Allocation rate (avg/max): 2128 / 5490
Since my program is able to run without OOM with a 1 GB heap, I'm a bit
confused about what exactly Live means.
In the meantime I'll also look at the peak allocation rate, which seems a
bit higher than it needs to be (my guesstimate was 10k runs/s * 100 moves per run
* 1 KB per move = 1 GB per sec).
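As a sanity check, here is a minimal standalone sketch (illustrative class, not part of the application) of that back-of-envelope estimate next to the avg/max figures from the gc,stats output below; the runs/moves/bytes inputs are only my guesses, not measured values:

// Back-of-envelope allocation-rate check; the three inputs are my guesses,
// the avg/max figures are copied from the "Memory: Allocation Rate" row
// in the gc,stats output below.
public class AllocRateEstimate {
    public static void main(String[] args) {
        long runsPerSecond = 10_000; // guess: runs (playouts) per second
        long movesPerRun   = 100;    // guess: moves explored per run
        long bytesPerMove  = 1_000;  // guess: bytes allocated per move

        double estimateMBps = runsPerSecond * movesPerRun * bytesPerMove / 1e6; // ~1000 MB/s

        double measuredAvgMBps = 2128; // avg from gc,stats
        double measuredMaxMBps = 5490; // max from gc,stats

        System.out.printf("estimate %.0f MB/s, measured avg %.0f MB/s (%.1fx), max %.0f MB/s (%.1fx)%n",
                estimateMBps,
                measuredAvgMBps, measuredAvgMBps / estimateMBps,
                measuredMaxMBps, measuredMaxMBps / estimateMBps);
    }
}

So the measured average is roughly twice my estimate, and the peak is more than five times it.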
Thank you again
Uberto
----
[2018-09-26T19:31:01.512+0100][info][gc ] Allocation Stall (ForkJoinPool.commonPool-worker-5) 2218.161ms
[2018-09-26T19:31:01.514+0100][info][gc,phases ] GC(139) Concurrent Select Relocation Set 2.322ms
[2018-09-26T19:31:01.518+0100][info][gc,phases ] GC(139) Concurrent Prepare Relocation Set 4.155ms
[2018-09-26T19:31:01.519+0100][info][gc,phases ] GC(139) Pause Relocate Start 0.083ms
[2018-09-26T19:31:01.519+0100][info][gc ] Allocation Stall (ForkJoinPool.commonPool-worker-9) 2224.524ms
[2018-09-26T19:31:01.519+0100][info][gc ] Allocation Stall (ForkJoinPool.commonPool-worker-15) 2224.466ms
[2018-09-26T19:31:01.519+0100][info][gc ] Allocation Stall (ForkJoinPool.commonPool-worker-3) 2224.340ms
[2018-09-26T19:31:01.519+0100][info][gc ] Allocation Stall (ForkJoinPool.commonPool-worker-11) 2224.326ms
[2018-09-26T19:31:01.519+0100][info][gc ] Allocation Stall (ForkJoinPool.commonPool-worker-5) 6.433ms
[2018-09-26T19:31:01.530+0100][info][gc ] Allocation Stall (ForkJoinPool.commonPool-worker-7) 2231.541ms
[2018-09-26T19:31:01.519+0100][info][gc ] Allocation Stall (ForkJoinPool.commonPool-worker-13) 2224.212ms
[2018-09-26T19:31:01.581+0100][info][gc,phases ] GC(139) Concurrent Relocate 62.703ms
[2018-09-26T19:31:01.581+0100][info][gc,load ] GC(139) Load: 7.51/8.25/7.38
[2018-09-26T19:31:01.581+0100][info][gc,mmu ] GC(139) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/0.0%, 100ms/0.0%
[2018-09-26T19:31:01.581+0100][info][gc,marking ] GC(139) Mark: 2 stripe(s), 3 proactive flush(es), 1 terminate flush(es), 0 completion(s), 0 continuation(s)
[2018-09-26T19:31:01.581+0100][info][gc,reloc ] GC(139) Relocation: Incomplete
[2018-09-26T19:31:01.581+0100][info][gc,nmethod ] GC(139) NMethods: 385 registered, 180 unregistered
[2018-09-26T19:31:01.581+0100][info][gc,metaspace] GC(139) Metaspace: 8M used, 8M capacity, 9M committed, 10M reserved
[2018-09-26T19:31:01.581+0100][info][gc,ref ] GC(139) Soft: 103 encountered, 0 discovered, 0 enqueued
[2018-09-26T19:31:01.582+0100][info][gc,ref ] GC(139) Weak: 148 encountered, 75 discovered, 0 enqueued
[2018-09-26T19:31:01.582+0100][info][gc,ref ] GC(139) Final: 0 encountered, 0 discovered, 0 enqueued
[2018-09-26T19:31:01.582+0100][info][gc,ref ] GC(139) Phantom: 42 encountered, 36 discovered, 0 enqueued
[2018-09-26T19:31:01.582+0100][info][gc,heap ] GC(139)                Mark Start       Mark End      Relocate Start    Relocate End         High             Low
[2018-09-26T19:31:01.582+0100][info][gc,heap ] GC(139)  Capacity:   12288M (100%)   12288M (100%)   12288M (100%)   12288M (100%)   12288M (100%)   12288M (100%)
[2018-09-26T19:31:01.582+0100][info][gc,heap ] GC(139)   Reserve:      42M (0%)        42M (0%)        42M (0%)        42M (0%)        42M (0%)        36M (0%)
[2018-09-26T19:31:01.582+0100][info][gc,heap ] GC(139)      Free:    3442M (28%)       0M (0%)         0M (0%)      3550M (29%)     3798M (31%)       0M (0%)
[2018-09-26T19:31:01.582+0100][info][gc,heap ] GC(139)      Used:    8804M (72%)   12246M (100%)   12246M (100%)    8696M (71%)    12252M (100%)    8448M (69%)
[2018-09-26T19:31:01.582+0100][info][gc,heap ] GC(139)      Live:         -          4849M (39%)     4849M (39%)     4849M (39%)         -               -
[2018-09-26T19:31:01.582+0100][info][gc,heap ] GC(139) Allocated:         -          3442M (28%)     3444M (28%)     3830M (31%)         -               -
[2018-09-26T19:31:01.582+0100][info][gc,heap ] GC(139)   Garbage:         -          3954M (32%)     3952M (32%)       54M (0%)          -               -
[2018-09-26T19:31:01.582+0100][info][gc,heap ] GC(139) Reclaimed:         -               -             2M (0%)      3900M (32%)         -               -
[2018-09-26T19:31:01.582+0100][info][gc ] GC(139) Garbage Collection (Allocation Rate) 8804M(72%)->8696M(71%)
[2018-09-26T19:31:01.681+0100][info][gc,start ] GC(140) Garbage Collection (Allocation Rate)
[2018-09-26T19:31:01.682+0100][info][gc,phases ] GC(140) Pause Mark Start 0.094ms
[2018-09-26T19:31:02.682+0100][info][gc,stats ] === Garbage Collection Statistics =======================================================================================================================
[2018-09-26T19:31:02.682+0100][info][gc,stats ]                                                              Last 10s              Last 10m              Last 10h               Total
[2018-09-26T19:31:02.682+0100][info][gc,stats ]                                                              Avg / Max             Avg / Max             Avg / Max             Avg / Max
[2018-09-26T19:31:02.682+0100][info][gc,stats ]   Collector: Garbage Collection Cycle            3063.847 / 3099.858   2011.178 / 3671.006   2011.178 / 3671.006   2011.178 / 3671.006   ms
[2018-09-26T19:31:02.682+0100][info][gc,stats ]  Contention: Mark Segment Reset Contention              1 / 3                 1 / 8                 1 / 8                 1 / 8         ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]  Contention: Mark SeqNum Reset Contention               0 / 1                 0 / 2                 0 / 2                 0 / 2         ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]  Contention: Relocation Contention                      0 / 0                 2 / 51                2 / 51                2 / 51        ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]    Critical: Allocation Stall                    1922.303 / 2231.541   1448.690 / 2297.381   1448.690 / 2297.381   1448.690 / 2297.381   ms
[2018-09-26T19:31:02.682+0100][info][gc,stats ]    Critical: Allocation Stall                           2 / 8                 1 / 12                1 / 12                1 / 12        ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]    Critical: GC Locker Stall                        0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000     ms
[2018-09-26T19:31:02.682+0100][info][gc,stats ]    Critical: GC Locker Stall                            0 / 0                 0 / 0                 0 / 0                 0 / 0         ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Allocation Rate                         1428 / 2914           2128 / 5490           2128 / 5490           2128 / 5490       MB/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Heap Used After Mark                   12246 / 12246         10840 / 12246         10840 / 12246         10840 / 12246      MB
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Heap Used After Relocation              8659 / 9040           6584 / 9040           6584 / 9040           6584 / 9040       MB
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Heap Used Before Mark                   9076 / 9298           7126 / 9478           7126 / 9478           7126 / 9478       MB
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Heap Used Before Relocation            12246 / 12246         10850 / 12246         10850 / 12246         10850 / 12246      MB
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Out Of Memory                              0 / 0                 0 / 0                 0 / 0                 0 / 0         ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Page Cache Flush                           0 / 0                 0 / 32                0 / 32                0 / 32        MB/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Page Cache Hit L1                        719 / 1457           1053 / 2604           1053 / 2604           1053 / 2604       ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Page Cache Hit L2                          0 / 0                 0 / 0                 0 / 0                 0 / 0         ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Page Cache Miss                            0 / 0                19 / 1940             19 / 1940             19 / 1940      ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Undo Object Allocation Failed              0 / 0                 0 / 4                 0 / 4                 0 / 4         ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Undo Object Allocation Succeeded           0 / 0                 2 / 51                2 / 51                2 / 51        ops/s
[2018-09-26T19:31:02.682+0100][info][gc,stats ]      Memory: Undo Page Allocation                       0 / 3                 0 / 10                0 / 10                0 / 10        ops/s
On Thu, 27 Sep 2018 at 09:39, Per Liden <per.liden at oracle.com> wrote:
> Hi,
>
> On 09/26/2018 10:43 PM, Uberto Barbini wrote:
> > Hi Per,
> >
> > thanks for your reply.
> > My comments inline
> >
> > > Do you have any idea of what can be the possible cause?
> >
> > If you see allocation stalls it's an indication that ZGC can't keep up
> > with the allocation rate of your application. I.e. the headroom in the
> > heap (max heap size - live-set size) is too small. To avoid this,
> > increase the max heap size until you don't get allocation stalls and you
> > have a healthy/acceptable GC frequency.
> >
> >
> > Although the application is pretty aggressive on the allocation, the
> > live-set is relatively small (<1gb), so I thought 6gb of heap should be
> > enough.
> > I've doubled it to 12gb of heap and performance improved a bit but there
> > are still lots of Allocation Stalls.
> >
> > Another (secondary) option to play with is -XX:ConcGCThreads to give
> > more CPU-time to the GC (essentially making it faster so each cycle will
> > be shorter). You can use -Xlog:gc+init to see how many threads ZGC
> > automatically selects now and go from there.
> >
> >
> > Changing the number of threads improved things a lot, but it didn't really
> > solve the problem, only pushed it further away.
> > Using -XX:ConcGCThreads=4, in 60 seconds it is able to calculate about
> > 480k positions, the same as G1 (but with better latency) and almost
> > double what it did before.
> > But when increasing the search time to 120 seconds per move, the Allocation
> > Stalls popped up again and the app stopped again.
> > Increasing to more than 4 didn't change anything, since I have only 4 cores
> > in my CPU.
>
> Ok, so it sounds like you've over-provisioned your system. With 4
> concurrent GC threads on a 4 core machine, the GC and your application
> will compete for CPU-time. This is typically bad for latency. On the
> other hand, it seems like you're trying to optimize for throughput
> rather than latency here?
>
> >
> > I'm not familiar enough with ZGC to attempt an explanation, but in my
> > application only one thread keeps most of the long-lived memory
> > allocations (the tree of positions), whilst the other threads are used to
> > evaluate each new position, which means: 1) allocate many objects, 2)
> > calculate the result, 3) return the results to the main thread and
> > deallocate all objects. Is it possible that this behavior confounds ZGC?
>
> No, that behavior is fine. From what I can tell, your application just
> has a very high allocation rate. You need a heap headroom that is at
> least "allocation_rate * gc_cycle_time" to avoid allocation stalls. For
> example, let's say it takes 2 seconds to complete a GC cycle and your
> application is allocating 10GB/s, then you need at least a 20GB headroom.
>
> In your case, you say you have a live-set that is <1GB, so a complete GC
> cycle should be pretty short, like well below 1 second. If your ~11GB of
> headroom isn't enough, then you must be allocating more than ~11GB/s?
>
> You can use -Xlog:gc+alloc=debug to continuously print the allocation
> rate. You can also see it in the statistics counters with -Xlog:gc+stats
> (or just -Xlog:gc*). You can see the GC cycle time, live-set size there
> too.
>
> cheers,
> Per
>
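To put Per's headroom rule ("allocation_rate * gc_cycle_time") against the numbers in the statistics table above, here is a minimal standalone sketch (illustrative class, not part of the application); the heap size, live-set, cycle times and allocation rates are the GC(139) figures quoted earlier, and the calculation itself is only the rough rule of thumb from Per's reply:

// Sketch of the headroom rule of thumb from Per's reply: to avoid
// allocation stalls, heap headroom (max heap - live set) has to cover
// what the application allocates during one full GC cycle.
// All figures are the GC(139) numbers from the gc,heap/gc,stats output above.
public class HeadroomCheck {
    public static void main(String[] args) {
        double heapMB       = 12288;  // heap capacity (12 GB max heap)
        double liveSetMB    = 4849;   // "Live" at Mark End
        double headroomMB   = heapMB - liveSetMB;

        double cycleAvgSec  = 2011.178 / 1000; // "Garbage Collection Cycle" avg
        double cycleMaxSec  = 3671.006 / 1000; // "Garbage Collection Cycle" max
        double allocAvgMBps = 2128;            // "Allocation Rate" avg
        double allocMaxMBps = 5490;            // "Allocation Rate" max

        double neededAvgMB   = allocAvgMBps * cycleAvgSec; // typical demand per cycle
        double neededWorstMB = allocMaxMBps * cycleMaxSec; // worst-case demand per cycle

        System.out.printf("headroom %.0f MB, needed avg %.0f MB, needed worst case %.0f MB%n",
                headroomMB, neededAvgMB, neededWorstMB);
        // prints roughly: headroom 7439 MB, needed avg 4280 MB, needed worst case 20154 MB
    }
}

So on average the 12 GB heap leaves plenty of slack, but a burst at the peak rate during a slow cycle needs far more headroom than is available, which would explain why the stalls only show up once the searches get longer.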