Concurrent Mark sweep increases after increase in SoftRef enqueued number increases

Sundara Mohan M m.sundar85 at gmail.com
Mon Apr 11 20:50:37 UTC 2022


Hi,
    I am running ZGC in production with following settings
[2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Initializing The Z
Garbage Collector
[2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Version: 17.0.2+8
(release)
[2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA Support: Enabled
[2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA Nodes: 2
[2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] CPUs: 72 total, 72
available
[2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Memory: 191871M
[2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Large Page Support:
Disabled
[2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] GC Workers: 9/44
(static)
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address Space Type:
Contiguous/Unrestricted/Complete
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address Space Size:
1572864M x 3 = 4718592M
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap Backing File:
/memfd:java_heap
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap Backing
Filesystem: tmpfs (0x1021994)
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Min Capacity: 98304M
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Initial Capacity:
98304M
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Max Capacity: 98304M
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Medium Page Size: 32M
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Pre-touch: Disabled
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Available space on
backing filesystem: N/A
[2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Uncommit: Implicitly
Disabled (-Xms equals -Xmx)
[2022-04-05T22:30:48.788+0000][11.672s][info][gc,init] Runtime Workers: 44

In 24 hours I have seen 4 allocation stalls, each is happening over 5 mins
with >1s stall on multiple threads.
Whenever allocation stalls, the following pattern is observed
1. There is an increase in SoftRef enqueued values
2. Concurrent Mark time increase from 3s to ~12s


[2022-04-10T19:46:22.380+0000][422145.264s][info][gc,ref      ] GC(17874)
Soft: 22330 encountered, 19702 discovered, 11571 enqueued
[2022-04-10T19:46:34.534+0000][422157.418s][info][gc,ref      ] GC(17875)
Soft: 10781 encountered, 7560 discovered, 19 enqueued
[2022-04-10T19:46:47.755+0000][422170.639s][info][gc,ref      ] GC(17876)
Soft: 10863 encountered, 3289 discovered, 44 enqueued
...
[2022-04-11T05:31:55.763+0000][457278.647s][info][gc,ref      ] GC(19267)
Soft: 9664 encountered, 868 discovered, 12 enqueued
[2022-04-11T05:32:09.553+0000][457292.437s][info][gc,ref      ] GC(19268)
Soft: 9846 encountered, 6202 discovered, 98 enqueued
[2022-04-11T05:32:23.798+0000][457306.682s][info][gc,ref      ] GC(19269)
Soft: 9859 encountered, 6715 discovered, 132 enqueued
[2022-04-11T05:32:37.903+0000][457320.787s][info][gc,ref      ] GC(19270)
Soft: 9873 encountered, 7585 discovered, 42 enqueued
[2022-04-11T05:32:52.609+0000][457335.493s][info][gc,ref      ] GC(19271)
Soft: 10073 encountered, 6466 discovered, 57 enqueued
[2022-04-11T11:13:14.758+0000][477757.642s][info][gc,ref      ] GC(20578)
Soft: 25760 encountered, 20850 discovered, 15635 enqueued

Can someone share info about what else might cause Concurrent Mark time to
go higher?
Will cleaning SoftReference quickly using SoftRefLRUPolicyMSPerMB flag or
not using SoftRef might help in this case?

Can send complete logs if it helps.


Thanks
Sundar


More information about the zgc-dev mailing list