From m.sundar85 at gmail.com Mon Apr 11 20:50:37 2022 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Mon, 11 Apr 2022 13:50:37 -0700 Subject: Concurrent Mark sweep increases after increase in SoftRef enqueued number increases Message-ID: Hi, I am running ZGC in production with following settings [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Initializing The Z Garbage Collector [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Version: 17.0.2+8 (release) [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA Support: Enabled [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA Nodes: 2 [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] CPUs: 72 total, 72 available [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Memory: 191871M [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Large Page Support: Disabled [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] GC Workers: 9/44 (static) [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address Space Type: Contiguous/Unrestricted/Complete [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address Space Size: 1572864M x 3 = 4718592M [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap Backing File: /memfd:java_heap [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap Backing Filesystem: tmpfs (0x1021994) [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Min Capacity: 98304M [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Initial Capacity: 98304M [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Max Capacity: 98304M [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Medium Page Size: 32M [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Pre-touch: Disabled [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Available space on backing filesystem: N/A [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Uncommit: Implicitly Disabled (-Xms equals -Xmx) [2022-04-05T22:30:48.788+0000][11.672s][info][gc,init] Runtime Workers: 44 In 24 hours I have seen 4 allocation stalls, each is happening over 5 mins with >1s stall on multiple threads. Whenever allocation stalls, the following pattern is observed 1. There is an increase in SoftRef enqueued values 2. Concurrent Mark time increase from 3s to ~12s [2022-04-10T19:46:22.380+0000][422145.264s][info][gc,ref ] GC(17874) Soft: 22330 encountered, 19702 discovered, 11571 enqueued [2022-04-10T19:46:34.534+0000][422157.418s][info][gc,ref ] GC(17875) Soft: 10781 encountered, 7560 discovered, 19 enqueued [2022-04-10T19:46:47.755+0000][422170.639s][info][gc,ref ] GC(17876) Soft: 10863 encountered, 3289 discovered, 44 enqueued ... [2022-04-11T05:31:55.763+0000][457278.647s][info][gc,ref ] GC(19267) Soft: 9664 encountered, 868 discovered, 12 enqueued [2022-04-11T05:32:09.553+0000][457292.437s][info][gc,ref ] GC(19268) Soft: 9846 encountered, 6202 discovered, 98 enqueued [2022-04-11T05:32:23.798+0000][457306.682s][info][gc,ref ] GC(19269) Soft: 9859 encountered, 6715 discovered, 132 enqueued [2022-04-11T05:32:37.903+0000][457320.787s][info][gc,ref ] GC(19270) Soft: 9873 encountered, 7585 discovered, 42 enqueued [2022-04-11T05:32:52.609+0000][457335.493s][info][gc,ref ] GC(19271) Soft: 10073 encountered, 6466 discovered, 57 enqueued [2022-04-11T11:13:14.758+0000][477757.642s][info][gc,ref ] GC(20578) Soft: 25760 encountered, 20850 discovered, 15635 enqueued Can someone share info about what else might cause Concurrent Mark time to go higher? 
Will cleaning SoftReference quickly using SoftRefLRUPolicyMSPerMB flag or not using SoftRef might help in this case? Can send complete logs if it helps. Thanks Sundar From per.liden at oracle.com Tue Apr 12 09:14:29 2022 From: per.liden at oracle.com (Per Liden) Date: Tue, 12 Apr 2022 11:14:29 +0200 Subject: Concurrent Mark sweep increases after increase in SoftRef enqueued number increases In-Reply-To: References: Message-ID: <17fe26cc-33a8-f3e8-2a6e-013a5038ab74@oracle.com> Hi, When an allocation stall occurs, ZGC will start to clear SoftRefs. In other words, clearing of SoftRefs you see is likely just a side effect of you running into an allocation stall. A longer concurrent mark phase could be caused by a number of things. For example, the application might have a larger live-set, the machine might have been overloaded, etc. However, it could also be that ZGC's heuristics selecting number of threads to use got things wrong, and used too few. That would also prolong the concurrent mark phase. You could try the option -XX:-UseDynamicNumberOfGCThreads to disable this heuristic, and instead always use a fixed number of threads. /Per On 2022-04-11 22:50, Sundara Mohan M wrote: > Hi, > I am running ZGC in production with following settings > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Initializing The Z > Garbage Collector > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Version: 17.0.2+8 > (release) > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA Support: Enabled > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA Nodes: 2 > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] CPUs: 72 total, 72 > available > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Memory: 191871M > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Large Page Support: > Disabled > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] GC Workers: 9/44 > (static) > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address Space Type: > Contiguous/Unrestricted/Complete > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address Space Size: > 1572864M x 3 = 4718592M > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap Backing File: > /memfd:java_heap > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap Backing > Filesystem: tmpfs (0x1021994) > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Min Capacity: 98304M > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Initial Capacity: > 98304M > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Max Capacity: 98304M > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Medium Page Size: 32M > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Pre-touch: Disabled > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Available space on > backing filesystem: N/A > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Uncommit: Implicitly > Disabled (-Xms equals -Xmx) > [2022-04-05T22:30:48.788+0000][11.672s][info][gc,init] Runtime Workers: 44 > > In 24 hours I have seen 4 allocation stalls, each is happening over 5 mins > with >1s stall on multiple threads. > Whenever allocation stalls, the following pattern is observed > 1. There is an increase in SoftRef enqueued values > 2. 
Concurrent Mark time increase from 3s to ~12s > > > [2022-04-10T19:46:22.380+0000][422145.264s][info][gc,ref ] GC(17874) > Soft: 22330 encountered, 19702 discovered, 11571 enqueued > [2022-04-10T19:46:34.534+0000][422157.418s][info][gc,ref ] GC(17875) > Soft: 10781 encountered, 7560 discovered, 19 enqueued > [2022-04-10T19:46:47.755+0000][422170.639s][info][gc,ref ] GC(17876) > Soft: 10863 encountered, 3289 discovered, 44 enqueued > ... > [2022-04-11T05:31:55.763+0000][457278.647s][info][gc,ref ] GC(19267) > Soft: 9664 encountered, 868 discovered, 12 enqueued > [2022-04-11T05:32:09.553+0000][457292.437s][info][gc,ref ] GC(19268) > Soft: 9846 encountered, 6202 discovered, 98 enqueued > [2022-04-11T05:32:23.798+0000][457306.682s][info][gc,ref ] GC(19269) > Soft: 9859 encountered, 6715 discovered, 132 enqueued > [2022-04-11T05:32:37.903+0000][457320.787s][info][gc,ref ] GC(19270) > Soft: 9873 encountered, 7585 discovered, 42 enqueued > [2022-04-11T05:32:52.609+0000][457335.493s][info][gc,ref ] GC(19271) > Soft: 10073 encountered, 6466 discovered, 57 enqueued > [2022-04-11T11:13:14.758+0000][477757.642s][info][gc,ref ] GC(20578) > Soft: 25760 encountered, 20850 discovered, 15635 enqueued > > Can someone share info about what else might cause Concurrent Mark time to > go higher? > Will cleaning SoftReference quickly using SoftRefLRUPolicyMSPerMB flag or > not using SoftRef might help in this case? > > Can send complete logs if it helps. > > > Thanks > Sundar From m.sundar85 at gmail.com Tue Apr 12 15:59:18 2022 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Tue, 12 Apr 2022 08:59:18 -0700 Subject: Concurrent Mark sweep increases after increase in SoftRef enqueued number increases In-Reply-To: <17fe26cc-33a8-f3e8-2a6e-013a5038ab74@oracle.com> References: <17fe26cc-33a8-f3e8-2a6e-013a5038ab74@oracle.com> Message-ID: Hi, Thanks for clarifying on SoftRefs. Already using -XX:-UseDynamicNumberOfGCThreads and a fixed number of concurrent threads (-XX:ConcGCThreads=9, host has 48 threads). I did notice the live set has increased whenever stall happens, is there any easy way to get the object stats when stall happens? Right now trying to collect heap dump when a stall happens. Thanks Sundar On Tue, Apr 12, 2022 at 2:14 AM Per Liden wrote: > Hi, > > When an allocation stall occurs, ZGC will start to clear SoftRefs. In > other words, clearing of SoftRefs you see is likely just a side effect > of you running into an allocation stall. > > A longer concurrent mark phase could be caused by a number of things. > For example, the application might have a larger live-set, the machine > might have been overloaded, etc. However, it could also be that ZGC's > heuristics selecting number of threads to use got things wrong, and used > too few. That would also prolong the concurrent mark phase. You could > try the option -XX:-UseDynamicNumberOfGCThreads to disable this > heuristic, and instead always use a fixed number of threads. 
> > /Per > > On 2022-04-11 22:50, Sundara Mohan M wrote: > > Hi, > > I am running ZGC in production with following settings > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Initializing The Z > > Garbage Collector > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Version: 17.0.2+8 > > (release) > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA Support: > Enabled > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA Nodes: 2 > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] CPUs: 72 total, 72 > > available > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Memory: 191871M > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Large Page Support: > > Disabled > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] GC Workers: 9/44 > > (static) > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address Space Type: > > Contiguous/Unrestricted/Complete > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address Space Size: > > 1572864M x 3 = 4718592M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap Backing File: > > /memfd:java_heap > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap Backing > > Filesystem: tmpfs (0x1021994) > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Min Capacity: > 98304M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Initial Capacity: > > 98304M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Max Capacity: > 98304M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Medium Page Size: > 32M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Pre-touch: Disabled > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Available space on > > backing filesystem: N/A > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Uncommit: > Implicitly > > Disabled (-Xms equals -Xmx) > > [2022-04-05T22:30:48.788+0000][11.672s][info][gc,init] Runtime Workers: > 44 > > > > In 24 hours I have seen 4 allocation stalls, each is happening over 5 > mins > > with >1s stall on multiple threads. > > Whenever allocation stalls, the following pattern is observed > > 1. There is an increase in SoftRef enqueued values > > 2. Concurrent Mark time increase from 3s to ~12s > > > > > > [2022-04-10T19:46:22.380+0000][422145.264s][info][gc,ref ] GC(17874) > > Soft: 22330 encountered, 19702 discovered, 11571 enqueued > > [2022-04-10T19:46:34.534+0000][422157.418s][info][gc,ref ] GC(17875) > > Soft: 10781 encountered, 7560 discovered, 19 enqueued > > [2022-04-10T19:46:47.755+0000][422170.639s][info][gc,ref ] GC(17876) > > Soft: 10863 encountered, 3289 discovered, 44 enqueued > > ... > > [2022-04-11T05:31:55.763+0000][457278.647s][info][gc,ref ] GC(19267) > > Soft: 9664 encountered, 868 discovered, 12 enqueued > > [2022-04-11T05:32:09.553+0000][457292.437s][info][gc,ref ] GC(19268) > > Soft: 9846 encountered, 6202 discovered, 98 enqueued > > [2022-04-11T05:32:23.798+0000][457306.682s][info][gc,ref ] GC(19269) > > Soft: 9859 encountered, 6715 discovered, 132 enqueued > > [2022-04-11T05:32:37.903+0000][457320.787s][info][gc,ref ] GC(19270) > > Soft: 9873 encountered, 7585 discovered, 42 enqueued > > [2022-04-11T05:32:52.609+0000][457335.493s][info][gc,ref ] GC(19271) > > Soft: 10073 encountered, 6466 discovered, 57 enqueued > > [2022-04-11T11:13:14.758+0000][477757.642s][info][gc,ref ] GC(20578) > > Soft: 25760 encountered, 20850 discovered, 15635 enqueued > > > > Can someone share info about what else might cause Concurrent Mark time > to > > go higher? 
> > Will cleaning SoftReference quickly using SoftRefLRUPolicyMSPerMB flag or > > not using SoftRef might help in this case? > > > > Can send complete logs if it helps. > > > > > > Thanks > > Sundar > From per.liden at oracle.com Tue Apr 19 08:27:45 2022 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Apr 2022 10:27:45 +0200 Subject: Concurrent Mark sweep increases after increase in SoftRef enqueued number increases In-Reply-To: References: <17fe26cc-33a8-f3e8-2a6e-013a5038ab74@oracle.com> Message-ID: <71c1fb2b-b034-5299-3f57-9ce198376ffe@oracle.com> Hi, On 4/12/22 17:59, Sundara Mohan M wrote: > Hi, > Thanks for clarifying on SoftRefs. > Already using -XX:-UseDynamicNumberOfGCThreads and a fixed number of > concurrent threads (-XX:ConcGCThreads=9, host has 48 threads). > I did notice the live set has increased whenever stall happens, is there > any easy way to get the object stats when stall happens? > Right now trying to collect heap dump when a stall happens. Another way to get a quick overview of what's on the heap is: jcmd <pid> GC.class_histogram /Per > > Thanks > Sundar > > > > On Tue, Apr 12, 2022 at 2:14 AM Per Liden > wrote: > > Hi, > > When an allocation stall occurs, ZGC will start to clear SoftRefs. In > other words, clearing of SoftRefs you see is likely just a side effect > of you running into an allocation stall. > > A longer concurrent mark phase could be caused by a number of things. > For example, the application might have a larger live-set, the machine > might have been overloaded, etc. However, it could also be that ZGC's > heuristics selecting number of threads to use got things wrong, and > used > too few. That would also prolong the concurrent mark phase. You could > try the option -XX:-UseDynamicNumberOfGCThreads to disable this > heuristic, and instead always use a fixed number of threads. > > /Per > > On 2022-04-11 22:50, Sundara Mohan M wrote: > > Hi, > >
I am running ZGC in production with following settings > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] > Initializing The Z > > Garbage Collector > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Version: > 17.0.2+8 > > (release) > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA > Support: Enabled > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] NUMA Nodes: 2 > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] CPUs: 72 > total, 72 > > available > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Memory: 191871M > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] Large Page > Support: > > Disabled > > [2022-04-05T22:30:37.154+0000][0.038s][info][gc,init] GC Workers: > 9/44 > > (static) > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address > Space Type: > > Contiguous/Unrestricted/Complete > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Address > Space Size: > > 1572864M x 3 = 4718592M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap > Backing File: > > /memfd:java_heap > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Heap Backing > > Filesystem: tmpfs (0x1021994) > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Min > Capacity: 98304M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Initial > Capacity: > > 98304M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Max > Capacity: 98304M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Medium Page > Size: 32M > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Pre-touch: > Disabled > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Available > space on > > backing filesystem: N/A > > [2022-04-05T22:30:37.157+0000][0.040s][info][gc,init] Uncommit: > Implicitly > > Disabled (-Xms equals -Xmx) > > [2022-04-05T22:30:48.788+0000][11.672s][info][gc,init] Runtime > Workers: 44 > > > > In 24 hours I have seen 4 allocation stalls, each is happening > over 5 mins > > with >1s stall on multiple threads. > > Whenever allocation stalls, the following pattern is observed > > 1. There is an increase in SoftRef enqueued values > > 2. Concurrent Mark time increase from 3s to ~12s > > > > > > [2022-04-10T19:46:22.380+0000][422145.264s][info][gc,ref? ? ? ] > GC(17874) > > Soft: 22330 encountered, 19702 discovered, 11571 enqueued > > [2022-04-10T19:46:34.534+0000][422157.418s][info][gc,ref? ? ? ] > GC(17875) > > Soft: 10781 encountered, 7560 discovered, 19 enqueued > > [2022-04-10T19:46:47.755+0000][422170.639s][info][gc,ref? ? ? ] > GC(17876) > > Soft: 10863 encountered, 3289 discovered, 44 enqueued > > ... > > [2022-04-11T05:31:55.763+0000][457278.647s][info][gc,ref? ? ? ] > GC(19267) > > Soft: 9664 encountered, 868 discovered, 12 enqueued > > [2022-04-11T05:32:09.553+0000][457292.437s][info][gc,ref? ? ? ] > GC(19268) > > Soft: 9846 encountered, 6202 discovered, 98 enqueued > > [2022-04-11T05:32:23.798+0000][457306.682s][info][gc,ref? ? ? ] > GC(19269) > > Soft: 9859 encountered, 6715 discovered, 132 enqueued > > [2022-04-11T05:32:37.903+0000][457320.787s][info][gc,ref? ? ? ] > GC(19270) > > Soft: 9873 encountered, 7585 discovered, 42 enqueued > > [2022-04-11T05:32:52.609+0000][457335.493s][info][gc,ref? ? ? ] > GC(19271) > > Soft: 10073 encountered, 6466 discovered, 57 enqueued > > [2022-04-11T11:13:14.758+0000][477757.642s][info][gc,ref? ? ? 
] > GC(20578) > > Soft: 25760 encountered, 20850 discovered, 15635 enqueued > > > > Can someone share info about what else might cause Concurrent > Mark time to > > go higher? > > Will cleaning SoftReference quickly using SoftRefLRUPolicyMSPerMB > flag or > > not using SoftRef might help in this case? > > > > Can send complete logs if it helps. > > > > > > Thanks > > Sundar > From bhavana.kilambi at foss.arm.com Tue Apr 19 12:54:57 2022 From: bhavana.kilambi at foss.arm.com (Bhavana Kilambi) Date: Tue, 19 Apr 2022 13:54:57 +0100 Subject: Performance analysis of ZGC load barrier for oop arraycopy Message-ID: <182dfa56-e89e-a31d-d6f8-07834e7bc5d8@foss.arm.com> Hello, I would like to share some analysis work that I've done on the load barrier on arraycopy in ZGC. This PR - https://github.com/openjdk/jdk/pull/6594 introduced stress tests for arraycopy where ObjectArrayCopy ended up in a timeout failure on a Windows-x64 machine with ZGC. This prompted us to perform some performance analysis/testing to understand the behaviour of ZGC and other GCs for arraycopy of objects. Used a simple JMH testcase with a call to System.arraycopy() to copy an entire array of 1024 object references to another array and ran it with the six available garbage collectors in OpenJDK 17,18 (Epsilon, G1, Z, Shenandoah, Serial, Parallel) on Neoverse N1 and Skylake systems. The performance of all the GCs except ZGC was more or less similar but the runtime with ZGC was ~8x that of G1GC (taken as representative of the rest of the GCs) on the N1 system and ~10x on the Skylake system (with OpenJDK17). The actual hot loop is this - inline void ZBarrier::load_barrier_on_oop_array(volatile oop* p, size_t length) { for (volatile const oop* const end = p + length; p < end; p++) { load_barrier_on_oop_field(p); } } Tried to optimize this loop by unrolling it and hoisting the load of the bad_mask out of this loop and these changes showed significant improvement in the runtimes of the JMH testcase and for a couple of real world workloads. A detailed analysis report with comparative analysis between GCs, profiles and code changes are present in the attached document. Would like to know your thoughts on this. Thank you, Bhavana From bhavana.kilambi at foss.arm.com Tue Apr 19 13:52:32 2022 From: bhavana.kilambi at foss.arm.com (Bhavana Kilambi) Date: Tue, 19 Apr 2022 14:52:32 +0100 Subject: Performance analysis of ZGC load barrier for oop arraycopy In-Reply-To: <182dfa56-e89e-a31d-d6f8-07834e7bc5d8@foss.arm.com> References: <182dfa56-e89e-a31d-d6f8-07834e7bc5d8@foss.arm.com> Message-ID: <96e52a89-f1aa-6ce2-28ab-0fcfb31a90c2@foss.arm.com> Hello, My apologies for sending out an attachment in my previous email, when I was supposed to provide a link for the same. Will keep it mind going further. The document is copied here - http://cr.openjdk.java.net/~smonteith/BK/ZGC%20arraycopy%20improvements.pdf Thank you, Bhavana On 4/19/22 13:54, Bhavana Kilambi wrote: > Hello, > > I would like to share some analysis work that I've done on the load > barrier on arraycopy in ZGC. > > This PR - https://github.com/openjdk/jdk/pull/6594 introduced stress > tests for arraycopy where ObjectArrayCopy ended up in a timeout > failure on a Windows-x64 machine with ZGC. This prompted us to perform > some performance analysis/testing to understand the behaviour of ZGC > and other GCs for arraycopy of objects. 
> > Used a simple JMH testcase with a call to System.arraycopy() to copy > an entire array of 1024 object references to another array and ran it > with the six available garbage collectors in OpenJDK 17,18 (Epsilon, > G1, Z, Shenandoah, Serial, Parallel) on Neoverse N1 and Skylake > systems. The performance of all the GCs except ZGC was more or less > similar but the runtime with ZGC was ~8x that of G1GC (taken as > representative of the rest of the GCs) on the N1 system and ~10x on > the Skylake system (with OpenJDK17). > > The actual hot loop is this - > > > inline void ZBarrier::load_barrier_on_oop_array(volatile oop* p, > size_t length) { > > for (volatile const oop* const end = p + length; p < end; p++) { > > load_barrier_on_oop_field(p); > > } > > } > > Tried to optimize this loop by unrolling it and hoisting the load of > the bad_mask out of this loop and these changes showed significant > improvement in the runtimes of the JMH testcase and for a couple of > real world workloads. > > A detailed analysis report with comparative analysis between GCs, > profiles and code changes are present in the attached document. > > Would like to know your thoughts on this. > > > Thank you, > > Bhavana From erik.osterlund at oracle.com Tue Apr 19 16:02:58 2022 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Tue, 19 Apr 2022 16:02:58 +0000 Subject: Performance analysis of ZGC load barrier for oop arraycopy In-Reply-To: <182dfa56-e89e-a31d-d6f8-07834e7bc5d8@foss.arm.com> References: <182dfa56-e89e-a31d-d6f8-07834e7bc5d8@foss.arm.com> Message-ID: <0C0FE38B-7B0E-4599-89F3-2EDE87CE0900@oracle.com> Hi Bhavana, In the upcoming generational version of ZGC, I have vectorized the barriers directly in the emitted arraycopy stubs. So we only call into the runtime when we encounter bad oops. I haven't implemented AVX512 or SVE barriers yet though. I hope that makes things better. /Erik > On 19 Apr 2022, at 14:55, Bhavana Kilambi wrote: > > Hello, > > I would like to share some analysis work that I've done on the load barrier on arraycopy in ZGC. > > This PR - https://github.com/openjdk/jdk/pull/6594 introduced stress tests for arraycopy where ObjectArrayCopy ended up in a timeout failure on a Windows-x64 machine with ZGC. This prompted us to perform some performance analysis/testing to understand the behaviour of ZGC and other GCs for arraycopy of objects. > > Used a simple JMH testcase with a call to System.arraycopy() to copy an entire array of 1024 object references to another array and ran it with the six available garbage collectors in OpenJDK 17,18 (Epsilon, G1, Z, Shenandoah, Serial, Parallel) on Neoverse N1 and Skylake systems. The performance of all the GCs except ZGC was more or less similar but the runtime with ZGC was ~8x that of G1GC (taken as representative of the rest of the GCs) on the N1 system and ~10x on the Skylake system (with OpenJDK17). > > The actual hot loop is this - > > > inline void ZBarrier::load_barrier_on_oop_array(volatile oop* p, size_t length) { > > for (volatile const oop* const end = p + length; p < end; p++) { > > load_barrier_on_oop_field(p); > > } > > } > > Tried to optimize this loop by unrolling it and hoisting the load of the bad_mask out of this loop and these changes showed significant improvement in the runtimes of the JMH testcase and for a couple of real world workloads. > > A detailed analysis report with comparative analysis between GCs, profiles and code changes are present in the attached document. > > Would like to know your thoughts on this.
> > > Thank you, > > Bhavana From roy.sunny.zhang007 at gmail.com Sat Apr 23 11:44:56 2022 From: roy.sunny.zhang007 at gmail.com (Roy Zhang) Date: Sat, 23 Apr 2022 19:44:56 +0800 Subject: (help)Mark stack overflow in JDK11 Message-ID: Dear ZGC experts, Currently we have a JVM crash issue caused by Mark stack overflow, and need to increase ZMarkStacksMax (default value is 8G), but even if I increase it to 16G (or even 24G), we still have this issue. 1. Could u please kindly let me know how to troubleshoot the Mark stack overflow issue? Is it caused by too many live objects? 2. If I have to stay on JDK11 (as we know, ZMarkStacksMax is deprecated in JDK12 via dynamic base address for mark stack space), will it help if we switch to G1GC or Shenandoah GC? Really appreciate ur great help! Thanks, Roy From erik.osterlund at oracle.com Sat Apr 23 18:49:14 2022 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Sat, 23 Apr 2022 18:49:14 +0000 Subject: (help)Mark stack overflow in JDK11 In-Reply-To: References: Message-ID: <0D68EB62-A225-48B3-AAB4-59CF38946ADF@oracle.com> Hi Roy, I wonder how large the heap size is? I also wonder if you had the chance to try JDK 17? Quite a few things have changed in this area since 11. Thanks, /Erik > On 23 Apr 2022, at 13:45, Roy Zhang wrote: > > Dear ZGC experts, > > Currently we have JVM crash issue caused by Mark stack overflow, and need > to increase ZMarkStacksMax (default value is 8G), but even if I increase it > to 16G (or even 24G), we still have this issue. > > 1. Could u please kindly let me know how to troubleshoot the Mark stack > overflow issue? Is it caused by too many live objects? > 2. If I have to stuck on JDK11(as we know, ZMarkStacksMax is deprecated in > JDK12 via dynamic base address for mark stack space), will it help if we > switch to G1GC or Shenandoah GC? > > Really appreciate ur great help! > > Thanks, > Roy From roy.sunny.zhang007 at gmail.com Sun Apr 24 00:28:38 2022 From: roy.sunny.zhang007 at gmail.com (Roy Zhang) Date: Sun, 24 Apr 2022 08:28:38 +0800 Subject: (help)Mark stack overflow in JDK11 In-Reply-To: <0D68EB62-A225-48B3-AAB4-59CF38946ADF@oracle.com> References: <0D68EB62-A225-48B3-AAB4-59CF38946ADF@oracle.com> Message-ID: Dear Erik, Thanks for ur quick reply! The heap size is 650G; it is a huge application. So what should we do? Reduce heap size? Use G1GC or Shenandoah GC? Regarding JDK17, unfortunately, we have to stay on JDK11 for some time... Thanks, Roy On Sun, Apr 24, 2022 at 2:49 AM Erik Osterlund wrote: > Hi Roy, > > I wonder how large is the heap size? > I also wonder if you had the chance to try JDK 17? Quite a few things have > changed in this area since 11. > > Thanks, > /Erik > > > On 23 Apr 2022, at 13:45, Roy Zhang > wrote: > > > > Dear ZGC experts, > > > > Currently we have JVM crash issue caused by Mark stack overflow, and need > > to increase ZMarkStacksMax (default value is 8G), but even if I increase > it > > to 16G (or even 24G), we still have this issue. > > > > 1. Could u please kindly let me know how to troubleshoot the Mark stack > > overflow issue? Is it caused by too many live objects? > > 2. If I have to stuck on JDK11(as we know, ZMarkStacksMax is deprecated > in > > JDK12 via dynamic base address for mark stack space), will it help if we > > switch to G1GC or Shenandoah GC? > > > > Really appreciate ur great help!
> > > > Thanks, > > Roy > From roy.sunny.zhang007 at gmail.com Sun Apr 24 07:39:28 2022 From: roy.sunny.zhang007 at gmail.com (Roy Zhang) Date: Sun, 24 Apr 2022 15:39:28 +0800 Subject: (help)Mark stack overflow in JDK11 In-Reply-To: References: <0D68EB62-A225-48B3-AAB4-59CF38946ADF@oracle.com> Message-ID: BTW, where do we store Mark stack data? off-heap? Thanks, Roy On Sun, Apr 24, 2022 at 8:28 AM Roy Zhang wrote: > Dear Erik, > > Thanks for ur quick reply! > The heap size is 650G, it is a huge application. So what should we do? > Reduce heap size? Use G1GC or Shenandoah GC? > > Regarding JDK17, unfortunately, we have to stuck on JDK11 for some time... > > Thanks, > Roy > > On Sun, Apr 24, 2022 at 2:49 AM Erik Osterlund > wrote: > >> Hi Roy, >> >> I wonder how large is the heap size? >> I also wonder if you had the chance to try JDK 17? Quite a few things >> have changed in this area since 11. >> >> Thanks, >> /Erik >> >> > On 23 Apr 2022, at 13:45, Roy Zhang >> wrote: >> > >> > Dear ZGC experts, >> > >> > Currently we have JVM crash issue caused by Mark stack overflow, and >> need >> > to increase ZMarkStacksMax (default value is 8G), but even if I >> increase it >> > to 16G (or even 24G), we still have this issue. >> > >> > 1. Could u please kindly let me know how to troubleshoot the Mark stack >> > overflow issue? Is it caused by too many live objects? >> > 2. If I have to stuck on JDK11(as we know, ZMarkStacksMax is >> deprecated in >> > JDK12 via dynamic base address for mark stack space), will it help if we >> > switch to G1GC or Shenandoah GC? >> > >> > Really appreciate ur great help! >> > >> > Thanks, >> > Roy >> > From per.liden at oracle.com Thu Apr 28 15:14:28 2022 From: per.liden at oracle.com (Per Liden) Date: Thu, 28 Apr 2022 17:14:28 +0200 Subject: Resigning as ZGC Project Lead Message-ID: I hereby resign as ZGC Project Lead. It has been a super exciting time for me to drive this project from a proof-of-concept to a state-of-the-art GC that today powers mission critical workloads around the world. However, I will be taking a step back from ZGC development to do other things, and so it feels natural to hand over the project lead role to someone else. According to the OpenJDK Bylaws [1], a new Project Lead may be nominated by the Group Leads of a Project's Sponsoring Groups. Such a nomination must be approved by a Three-Vote Consensus of these Group Leads. In this case, that means Vladimir Kozlov from the HotSpot Group appoints a new Project Lead. I would personally recommend Stefan Karlsson as new Project Lead. Stefan has played an invaluable role in this project since its inception and is a natural successor. /Per Liden [1] http://openjdk.java.net/bylaws#project-lead From per.liden at oracle.com Thu Apr 28 20:57:14 2022 From: per.liden at oracle.com (Per Liden) Date: Thu, 28 Apr 2022 22:57:14 +0200 Subject: Resigning as ZGC Project Lead In-Reply-To: <268649458.18186052.1651160195008.JavaMail.zimbra@u-pem.fr> References: <268649458.18186052.1651160195008.JavaMail.zimbra@u-pem.fr> Message-ID: <18f1d347-5c45-2a4f-0f97-eaadc026eb7b@oracle.com> Thanks Rémi! /Per On 4/28/22 17:36, Remi Forax wrote: > ----- Original Message ----- >> From: "Per Liden" >> To: zgc-dev at openjdk.java.net, "discuss" >> Cc: "Vladimir Kozlov" , "Stefan Karlsson" , per at malloc.se >> Sent: Thursday, April 28, 2022 5:14:28 PM >> Subject: Resigning as ZGC Project Lead > >> I hereby resign as ZGC Project Lead.
>> >> It has been a super exciting time for me to drive this project from a >> proof-of-concept to a state-of-the-art GC that today powers mission >> critical workloads around the world. However, I will be taking a step >> back from ZGC development to do other things, and so it feels natural to >> hand over the project lead role to someone else. > > Hi Per, > you will be missed. > > Personally, I really liked your presentations on ZGC and the articles you have published. > I was waiting for your explanation of generational ZGC, c'est la vie! > > Thanks for all your work on ZGC and good luck with your future projects. > >> >> According to the OpenJDK Bylaws [1], a new Project Lead may be nominated >> by the Group Leads of a Project's Sponsoring Groups. Such a nomination >> must be approved by a Three-Vote Consensus of these Group Leads. In this >> case, that means Vladimir Kozlov from the HotSpot Group appoints a new >> Project Lead. I would personally recommend Stefan Karlsson as new >> Project Lead. Stefan has played an invaluable role in this project since >> its inception and is a natural successor. >> >> /Per Liden >> >> [1] http://openjdk.java.net/bylaws#project-lead > > Rémi > From stefan.karlsson at oracle.com Fri Apr 29 10:45:35 2022 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 29 Apr 2022 12:45:35 +0200 Subject: New ZGC Project Lead: Stefan Karlsson In-Reply-To: References: Message-ID: (Vladimir's mail bounced) On 2022-04-28 18:08, Vladimir Kozlov wrote: > I hereby nominate Stefan Karlsson as new ZGC Project Lead. > > Current Project Lead Per Liden recommended him and Stefan accepted. > > According to the Bylaws [1], after the current Project Lead resigns [2], a new > Project Lead should be nominated. > > As Group Lead of the Sponsoring Group [3], I approve this nomination. > > According to the Bylaws definition of Three-Vote Consensus [4], this is > sufficient to approve the nomination. > > Thanks, > Vladimir Kozlov > > [1] http://openjdk.java.net/bylaws#project-lead > [2] > https://mail.openjdk.java.net/pipermail/zgc-dev/2022-April/001133.html > [3] http://openjdk.java.net/census#zgc > [4] http://openjdk.java.net/bylaws#three-vote-consensus From Monica.Beckwith at microsoft.com Sat Apr 30 18:21:29 2022 From: Monica.Beckwith at microsoft.com (Monica Beckwith) Date: Sat, 30 Apr 2022 18:21:29 +0000 Subject: Resigning as ZGC Project Lead In-Reply-To: References: Message-ID: Thanks @Per Liden for your leadership in getting ZGC to where it is today. I wish you the best in your future endeavors. I think ZGC will be in great hands with @Stefan Karlsson at the helm! You have my full support. Thanks both for the great work. Monica Sent via a smartphone Get Outlook for Android ________________________________ From: discuss on behalf of Per Liden Sent: Thursday, April 28, 2022, 10:15 AM To: zgc-dev at openjdk.java.net ; discuss at openjdk.java.net Cc: Vladimir Kozlov ; stefan.karlsson at oracle.com ; per at malloc.se Subject: Resigning as ZGC Project Lead I hereby resign as ZGC Project Lead. It has been a super exciting time for me to drive this project from a proof-of-concept to a state-of-the-art GC that today powers mission critical workloads around the world. However, I will be taking a step back from ZGC development to do other things, and so it feels natural to hand over the project lead role to someone else. According to the OpenJDK Bylaws [1], a new Project Lead may be nominated by the Group Leads of a Project's Sponsoring Groups.
Such a nomination must be approved by a Three-Vote Consensus of these Group Leads. In this case, that means Vladimir Kozlov from the HotSpot Group appoints a new Project Lead. I would personally recommend Stefan Karlsson as new Project Lead. Stefan has played an invaluable role in this project since its inception and is a natural successor. /Per Liden [1] http://openjdk.java.net/bylaws#project-lead