Long pause time of Final Mark in latest 11u shenandoah
Liang Mao
maoliang.ml at alibaba-inc.com
Mon Aug 31 04:31:32 UTC 2020
Hi Roman,
Thanks very much for your quick reply! I tried the flush interval but it
didn't work. And then I looked into the code and reduced the number of
ShenandoahSATBBufferSize which resolved the problem. But now I don't have
the testing environment and cannot do more verificatoins. Will let you
know once I have a double check.
BTW, do you have any experience of performance impact if we reduce the
default value of ShenandoahSATBBufferSize? I guess is it possible that
the "snapshots" in Shenandoah would be more than G1 because G1 satb
only handles old objects?
Thanks,
Liang
----------------------------------------------------------------------
Hello Liang,
It sticks out that this is a really short concurrent-mark phase (69ms)
but a really long final-mark phase (680ms). It seems likely that most
of the reachable graph gets 'hidden' in SATB queues, which are only
processed at final-mark. Also, the SATB flush interval is 100ms, so we
don't get to flush the SATB queues at all, before hitting final-mark.
It may be worth experimenting with lower
ShenandoahSATBBufferFlushInterval, e.g.
-XX:ShenandoahSATBBufferFlushInterval=30 or so.
We are considering to add machinery to force-flush SATB queues during
conc-mark, before hitting final-mark, so that we decrease the
likelyhood of unmarked subgraphs hiding in SATB buffers.
Let me know if decreasing the SATB flush interval helps!
Thanks for the report!
Cheers,
Roman
------------------------------------------------------------------
From:MAO, Liang <maoliang.ml at alibaba-inc.com>
Send Time:2020 Aug. 27 (Thu.) 14:56
To:shenandoah-dev at openjdk.java.net <shenandoah-dev at openjdk.java.net>
Subject:Long pause time of Final Mark in latest 11u shenandoah
Hi Shenandoah developers,
We are investigating the product version of Shenandoah GC in 11u and it showed
a good result except an unexpected behavior of long pause time in final mark
(finish queues actually). I didn't find any open discussions of this issue yet
and cannot see any suspicious points in code. Could you please help?
Below is the problematic GC information in log. Usually the final mark only took
~2 ms. The long pause time of final mark happened in a rate of ~1% in this application.
The java option is "-Xmx24g -Xms24g -XX:+UseShenandoahGC".
[2020-08-27T13:42:26.940+0800][gc ] Trigger: Free (2451M) is below minimum threshold (2457M)
[2020-08-27T13:42:26.940+0800][gc,ergo ] Free: 2451M, Max: 8192K regular, 1288M humongous, Frag: 47% external, 0% internal; Reserve: 1232M, Max: 8192K
[2020-08-27T13:42:26.940+0800][gc,start ] GC(2167) Concurrent reset
[2020-08-27T13:42:26.940+0800][gc,task ] GC(2167) Using 4 of 8 workers for concurrent reset
[2020-08-27T13:42:26.940+0800][gc,ergo ] GC(2167) Pacer for Reset. Non-Taxable: 24576M
[2020-08-27T13:42:26.946+0800][gc ] GC(2167) Concurrent reset 5.489ms
[2020-08-27T13:42:26.947+0800][gc,start ] GC(2167) Pause Init Mark
[2020-08-27T13:42:26.947+0800][gc,task ] GC(2167) Using 8 of 8 workers for init marking
[2020-08-27T13:42:26.948+0800][gc,ergo ] GC(2167) Pacer for Mark. Expected Live: 3577M, Free: 2451M, Non-Taxable: 245M, Alloc Tax Rate: 1.8x
[2020-08-27T13:42:26.948+0800][gc ] GC(2167) Pause Init Mark 1.093ms
[2020-08-27T13:42:26.948+0800][gc,start ] GC(2167) Concurrent marking
[2020-08-27T13:42:26.948+0800][gc,task ] GC(2167) Using 4 of 8 workers for concurrent marking
[2020-08-27T13:42:27.018+0800][gc ] GC(2167) Concurrent marking 69.710ms
[2020-08-27T13:42:27.019+0800][gc,start ] GC(2167) Pause Final Mark
[2020-08-27T13:42:27.019+0800][gc,task ] GC(2167) Using 8 of 8 workers for final marking
[2020-08-27T13:42:27.698+0800][gc,ergo ] GC(2167) Adaptive CSet Selection. Target Free: 3481M, Actual Free: 15270M, Max CSet: 1024M, Min Garbage: 0B
[2020-08-27T13:42:27.698+0800][gc,ergo ] GC(2167) Collectable Garbage: 16415M (96%), Immediate: 11966M (70%), CSet: 4448M (26%)
[2020-08-27T13:42:27.699+0800][gc,ergo ] GC(2167) Pacer for Evacuation. Used CSet: 5469M, Free: 14101M, Non-Taxable: 1410M, Alloc Tax Rate: 1.1x
[2020-08-27T13:42:27.699+0800][gc ] GC(2167) Pause Final Mark 679.962ms
[2020-08-27T13:42:27.699+0800][gc,start ] GC(2167) Concurrent cleanup
[2020-08-27T13:42:27.699+0800][gc ] GC(2167) Concurrent cleanup 20942M->8984M(24576M) 0.288ms
[2020-08-27T13:42:27.699+0800][gc,ergo ] GC(2167) Free: 14100M, Max: 8192K regular, 1288M humongous, Frag: 91% external, 0% internal; Reserve: 1231M, Max: 8192K
[2020-08-27T13:42:27.699+0800][gc,start ] GC(2167) Concurrent evacuation
[2020-08-27T13:42:27.699+0800][gc,task ] GC(2167) Using 4 of 8 workers for concurrent evacuation
[2020-08-27T13:42:27.879+0800][gc ] GC(2167) Concurrent evacuation 180.091ms
[2020-08-27T13:42:27.881+0800][gc,start ] GC(2167) Pause Init Update Refs
[2020-08-27T13:42:27.881+0800][gc,ergo ] GC(2167) Pacer for Update Refs. Used: 10377M, Free: 13756M, Non-Taxable: 1375M, Alloc Tax Rate: 1.1x
[2020-08-27T13:42:27.881+0800][gc ] GC(2167) Pause Init Update Refs 0.154ms
[2020-08-27T13:42:27.881+0800][gc,start ] GC(2167) Concurrent update references
[2020-08-27T13:42:27.881+0800][gc,task ] GC(2167) Using 4 of 8 workers for concurrent reference update
[2020-08-27T13:42:28.570+0800][gc ] GC(2167) Concurrent update references 688.702ms[2020-08-27T13:42:28.571+0800][gc,start ] GC(2167) Pause Final Update Refs
[2020-08-27T13:42:28.571+0800][gc,task ] GC(2167) Using 8 of 8 workers for final reference update
[2020-08-27T13:42:28.571+0800][gc ] GC(2167) Pause Final Update Refs 0.402ms
[2020-08-27T13:42:28.571+0800][gc,start ] GC(2167) Concurrent cleanup
[2020-08-27T13:42:28.571+0800][gc ] GC(2167) Concurrent cleanup 10886M->5416M(24576M) 0.156ms
[2020-08-27T13:42:28.572+0800][gc,stats ] Pause Final Mark (G) 681245 us
[2020-08-27T13:42:28.572+0800][gc,stats ] Pause Final Mark (N) 679951 us
[2020-08-27T13:42:28.572+0800][gc,stats ] Finish Queues 678515 us
[2020-08-27T13:42:28.572+0800][gc,stats ] System Purge 130 us
[2020-08-27T13:42:28.572+0800][gc,stats ] Weak Roots 129 us
[2020-08-27T13:42:28.572+0800][gc,stats ] Update Region States 30 us
[2020-08-27T13:42:28.572+0800][gc,stats ] Retire TLABs 102 us
[2020-08-27T13:42:28.572+0800][gc,stats ] Choose Collection Set 197 us
[2020-08-27T13:42:28.572+0800][gc,stats ] Rebuild Free Set 26 us
[2020-08-27T13:42:28.572+0800][gc,stats ] Initial Evacuation 893 us, parallelism: 5.36x
Thanks,
Liang
More information about the shenandoah-dev
mailing list