From mary.joseph at fiixsoftware.com Tue Feb 2 19:30:40 2021
From: mary.joseph at fiixsoftware.com (Mary Sunitha Joseph)
Date: Tue, 2 Feb 2021 14:30:40 -0500
Subject: Running into Allocation Stalls during class unloading
Message-ID:

Hi team,

Our production application runs on a 320G heap and uses ZGC with large
pages enabled. We have not done any tuning and are using ZGC with
defaults. Since upgrading to JDK 15.0.1 we've started to notice that
once a day the app experiences allocation stalls (during peak hours),
and this happens when there is a huge drop in the number of classes
loaded. We have a bi-monthly release cycle and can see that the
allocation stalls start small a business day after a release and slowly
increase as the week progresses.

At the moment the app seems to be doing fine, but it could escalate at
any time by the looks of it. At the same time there is an increase in
the app's response time and a small spike in heap usage, which seem
like side effects. Any pointers in terms of tuning would be much
appreciated.

The app currently always uses at least 200G of heap space, which leaves
about 37% headroom for ZGC.

Regards
Mary

--
Mary Sunitha Joseph (She/her)
Lead Developer
Fiix Software
p: 1 (855) 884-5619
e: mary.joseph at fiixsoftware.com
w: www.fiixsoftware.com

From charlie.hunt at oracle.com Tue Feb 2 21:02:37 2021
From: charlie.hunt at oracle.com (charlie hunt)
Date: Tue, 2 Feb 2021 15:02:37 -0600
Subject: Running into Allocation Stalls during class unloading
In-Reply-To:
References:
Message-ID:

Hi Mary,

Thanks for reaching out.

Since you are observing allocation stalls, there are a couple of
options to consider.

1.) If you have CPU cycles available, you can increase the number of
concurrent GC threads. You can see the default number ZGC is currently
using by running:

    java -XX:+UseZGC -XX:+PrintFlagsFinal -version | grep -i concgcthreads

Increasing the number of concurrent GC threads should allow ZGC to
finish its concurrent work before the Java heap space is exhausted,
which is what results in allocation stalls. But additional concurrent
GC threads will use more CPU.

2.) Another option is to size the Java heap larger, if you have the
available RAM on the system. By increasing the size of the Java heap,
you also increase the time the concurrent GC threads have to do their
work and free space before the Java heap is exhausted.

3.) Another option is to profile the application and look for
opportunities to reduce unnecessary object allocations. This will
reduce the rate at which the application fills the available free
space in the Java heap, and thus allows ZGC's concurrent GC threads to
keep up and avoid allocation stalls.

Fwiw, I tend to like the first two options better, since I would rather
see folks write their Java application(s) in their most natural form
and let the JVM figure out how to best run the Java app.

Also, as a general comment, having 37% headroom for ZGC to operate in
is not a lot of space. Whether that is enough space largely depends on
the application, i.e. its allocation rate, object lifetimes, amount of
live data in the Java heap, etc., and on whether the concurrent GC
threads can keep up with the pace of allocations given the amount of
Java heap space that's available.

hths,

charlie

From mary.joseph at fiixsoftware.com Wed Feb 3 14:10:59 2021
From: mary.joseph at fiixsoftware.com (Mary Sunitha Joseph)
Date: Wed, 3 Feb 2021 09:10:59 -0500
Subject: Running into Allocation Stalls during class unloading
In-Reply-To:
References:
Message-ID:

Hi Charlie,

Thank you for going over our case and for the recommendations.
Increasing the heap size or the number of ZGC threads is definitely
something we can try out.

I'm also trying to understand whether allocation stalls at the time of
class unloading are an expected occurrence. It's almost as if at that
point ZGC's entire focus is on class unloading and not on clearing out
the heap, which leads to that spike in heap usage and the subsequent
allocation stalls. Perhaps class unloading and freeing heap are not
concurrent with each other, and ZGC is only able to do one or the
other?

Regards,
Mary
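The headroom figure discussed above, and the effect of Charlie's option 2 (a larger heap), can be checked with a little arithmetic. The following is only a rough Python sketch: the 320G and 200G figures come from the thread, while the 400G heap and the function name `headroom_fraction` are hypothetical and used purely for illustration.

```python
def headroom_fraction(max_heap_gb, used_gb):
    """Return the fraction of the maximum heap that is not in use."""
    if max_heap_gb <= 0:
        raise ValueError("max_heap_gb must be positive")
    if used_gb < 0 or used_gb > max_heap_gb:
        raise ValueError("used_gb must be between 0 and max_heap_gb")
    return (max_heap_gb - used_gb) / max_heap_gb


if __name__ == "__main__":
    # Figures from the thread: 320G heap, at least 200G in use.
    print(f"current headroom: {headroom_fraction(320, 200):.1%}")
    # Hypothetical larger heap (an assumption, not a recommendation).
    print(f"with a 400G heap: {headroom_fraction(400, 200):.1%}")
```

With the same 200G in use, a larger heap raises the share of free space that the concurrent GC threads have to work with. Whether that is enough still depends on the allocation rate and object lifetimes, as noted in the reply.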
From charlie.hunt at oracle.com Wed Feb 3 14:40:53 2021
From: charlie.hunt at oracle.com (charlie hunt)
Date: Wed, 3 Feb 2021 08:40:53 -0600
Subject: Running into Allocation Stalls during class unloading
In-Reply-To:
References:
Message-ID: <7c368c52-1f03-1286-012c-7605ea506a90@oracle.com>

Hi Mary,

No, allocation stalls during concurrent class unloading are not an
expected occurrence.

Here is one possibility; I will try to explain what I am thinking.

With the introduction of concurrent class unloading, the elapsed time
it takes ZGC to complete a concurrent collection cycle may be slightly
longer than when class unloading was done in a GC pause. If your
application happened to be very close to the point where ZGC was only
just ahead of "losing the race" and exhausting Java heap space, then
the additional concurrent class unloading work may be just enough for
ZGC to lose that race.

One thing to keep in mind here is that concurrent class unloading
removes a GC pause, namely the pause that used to do class unloading.

hths,

charlie
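Charlie's "losing the race" explanation can be illustrated with a deliberately simplified model: a stall is expected when a concurrent cycle takes longer than the time the application needs to fill the free heap. The sketch below assumes a constant allocation rate and that no memory is reclaimed until the cycle finishes. All the numbers (2 GB/s, 58 s, 5 s) and the function name are hypothetical; only the 120G of free heap follows from the figures in the thread.

```python
def stall_margin_seconds(free_heap_gb, alloc_rate_gb_per_s, cycle_seconds):
    """Seconds to spare before the heap runs out (simplified model).

    A positive result means the GC cycle finishes before the free heap
    is exhausted; a negative result means an allocation stall is expected.
    """
    if alloc_rate_gb_per_s <= 0:
        raise ValueError("alloc_rate_gb_per_s must be positive")
    time_to_exhaustion = free_heap_gb / alloc_rate_gb_per_s
    return time_to_exhaustion - cycle_seconds


if __name__ == "__main__":
    free_gb = 320 - 200   # from the thread: 320G heap, 200G in use
    rate = 2.0            # hypothetical allocation rate in GB/s
    cycle = 58.0          # hypothetical cycle length in seconds
    extra = 5.0           # hypothetical extra class unloading work

    print("margin without extra work:", stall_margin_seconds(free_gb, rate, cycle))
    print("margin with extra work:   ", stall_margin_seconds(free_gb, rate, cycle + extra))
```

In this model a cycle that was just ahead of the allocation rate falls behind once a few seconds of extra concurrent work are added, which is the situation described above. Real ZGC behaviour is more involved, so treat this as an intuition aid only.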
From per.liden at oracle.com Wed Feb 3 15:38:20 2021
From: per.liden at oracle.com (Per Liden)
Date: Wed, 3 Feb 2021 16:38:20 +0100
Subject: Running into Allocation Stalls during class unloading
In-Reply-To: <7c368c52-1f03-1286-012c-7605ea506a90@oracle.com>
References: <7c368c52-1f03-1286-012c-7605ea506a90@oracle.com>
Message-ID: <4e5f0987-abc8-f19d-3953-7c6d5c8c82f9@oracle.com>

Hi Mary,

Would it be possible for you to share GC logs (preferably generated
using -Xlog:gc*)? That would help us understand whether what you are
experiencing is a consequence of what Charlie describes.

If that's the case, i.e. a prolonged concurrent GC cycle because there
are lots of classes to unload, it could be that the GC is simply
kicking in a bit too late. If so, you might want to have a look at the
-XX:SoftMaxHeapSize option. This option tells ZGC's heuristics that
they shouldn't aim at using the whole heap (-Xmx), but instead some
lower number (-XX:SoftMaxHeapSize). This will increase ZGC's tolerance
to variance in GC cycle length and make it more resilient to
allocation spikes.

cheers,
Per
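As a rough illustration of how the options mentioned so far fit together on a command line, the Python sketch below assembles a list of JVM flags. The flags themselves (-XX:+UseZGC, -Xmx, -XX:SoftMaxHeapSize, -XX:ConcGCThreads, -Xlog:gc*) are real HotSpot options, but the helper `zgc_flags`, the 80% soft limit and the thread count of 16 are assumptions chosen for the example, not values recommended anywhere in this thread.

```python
def zgc_flags(max_heap_gb, soft_max_percent, conc_gc_threads=None):
    """Build a list of JVM flags for a ZGC run with a soft heap limit.

    soft_max_percent is the share of -Xmx (1 to 100) that ZGC's
    heuristics should aim to stay below.
    """
    if not 1 <= soft_max_percent <= 100:
        raise ValueError("soft_max_percent must be between 1 and 100")
    # Integer arithmetic keeps the result a whole number of gigabytes.
    soft_max_gb = max_heap_gb * soft_max_percent // 100
    flags = [
        "-XX:+UseZGC",
        f"-Xmx{max_heap_gb}g",
        f"-XX:SoftMaxHeapSize={soft_max_gb}g",
        "-Xlog:gc*",  # the logging Per asks for
    ]
    if conc_gc_threads is not None:
        # Optional: Charlie's option 1, more concurrent GC threads.
        flags.append(f"-XX:ConcGCThreads={conc_gc_threads}")
    return flags


if __name__ == "__main__":
    # 320G heap from the thread; 80% and 16 threads are hypothetical.
    print(" ".join(["java"] + zgc_flags(320, 80, conc_gc_threads=16)))
```

The soft limit leaves the remaining heap available as a buffer: ZGC may still grow up to -Xmx rather than stall, but its heuristics start cycles as if the heap were smaller.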
From mary.joseph at fiixsoftware.com Tue Feb 9 02:39:04 2021
From: mary.joseph at fiixsoftware.com (Mary Sunitha Joseph)
Date: Mon, 8 Feb 2021 21:39:04 -0500
Subject: Running into Allocation Stalls during class unloading
In-Reply-To:
References:
Message-ID:

Hi Charlie, Per,

I went over our verbose GC logs and found a couple of things:

1. Heap usage by the application during peak hours is always close to
an average of 55%.

2. Each GC cycle barely manages to reclaim up to 10%, and there are
times when the Concurrent Process Non-Strong References phase takes up
to a minute to complete. These are the times when the allocation stalls
occur, and the GC cycle soon after is unable to reclaim any space.

E.g., extract:

HIGH ALLOC RATE
[2021-02-02T17:01:32.641+0000][info][gc ] GC(1276) Garbage Collection (Allocation Rate) 186282M(57%)->205548M(63%)

PROCESSING PHASE
[2021-02-02T17:03:09.369+0000][info][gc,phases ] GC(1277) Concurrent Process Non-Strong References 71206.950ms

ALLOC STALL
[2021-02-02T17:03:09.372+0000][info][gc ] Allocation Stall (scrubbed) 318.487ms

HIGH ALLOC RATE
[2021-02-02T17:03:09.756+0000][info][gc ] GC(1277) Garbage Collection (Allocation Rate) 205634M(63%)->245938M(75%)

PROCESSING PHASE
[2021-02-02T17:04:46.862+0000][info][gc,phases ] GC(1278) Concurrent Process Non-Strong References 72611.793ms

ALLOC STALL
[2021-02-02T17:04:46.864+0000][info][gc ] Allocation Stall (scrubbed) 554.293ms

GC is then able to reclaim some space
[2021-02-02T17:04:48.332+0000][info][gc ] GC(1278) Garbage Collection (Allocation Rate) 245974M(75%)->204610M(62%)

Towards the end of the week, this phase takes up to 2.3 minutes.

3. There are object initialization hotspots in the application that
need optimizing, but even with 320GB of memory and at most 50-60% heap
usage, it looks like ZGC needs more room to maneuver. The Concurrent
Process Non-Strong References phase taking minutes to complete is a bit
worrisome, and we should probably look at optimizing things at our end
to reduce this.

Regards,
Mary
> ------------------------------
>
> Message: 2
> Date: Wed, 3 Feb 2021 08:40:53 -0600
> From: charlie hunt
> To: zgc-dev at openjdk.java.net
> Subject: Re: Running into Allocation Stalls during class unloading
> Message-ID: <7c368c52-1f03-1286-012c-7605ea506a90 at oracle.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Hi Mary,
>
> No, allocation stalls where there is concurrent class unloading is not
> an expected occurrence.
>
> What may be a possibility here, and I will try to explain what I am
> thinking.
>
> With the introduction of concurrent class unloading, the elapsed time it
> takes ZGC to complete a concurrent collection cycle may be slightly
> longer than when class unloading was a GC pause. If your application
> happened to be very close to a point where ZGC was just ahead of "losing
> the race" and exhausting Java heap space, then the additional concurrent
> class unloading work may be just enough for ZGC to lose that race. One
> thing to keep in mind here is that concurrent class unloading removes a
> GC pause, the pause that did class unloading.
>
> hths,
>
> charlie
>
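Editorial sketch of the "race" Charlie describes, as rough arithmetic. Only the 320G heap and the roughly 200G in use come from the thread; the allocation rate and cycle times below are made-up numbers chosen to show how a slightly longer concurrent cycle can tip the balance. It is a crude model, not a description of ZGC's actual heuristics.

```shell
#!/bin/sh
# Figures taken from the thread: 320G heap, about 200G in use.
heap_gb=320
used_gb=200
free_gb=$(( heap_gb - used_gb ))                        # 120

# Hypothetical figures, assumed purely for illustration.
alloc_rate_gb_per_s=2     # assumed allocation rate at peak
cycle_s=55                # assumed GC cycle time without class unloading work
unload_extra_s=10         # assumed extra time spent unloading classes

# Time until the free space is used up at the assumed allocation rate.
time_to_exhaust_s=$(( free_gb / alloc_rate_gb_per_s ))  # 60

for cycle in "$cycle_s" "$(( cycle_s + unload_extra_s ))"; do
  if [ "$cycle" -lt "$time_to_exhaust_s" ]; then
    echo "cycle=${cycle}s: GC finishes before the heap fills"
  else
    echo "cycle=${cycle}s: heap fills first, allocation stall"
  fi
done
```

Under these assumed numbers the 55s cycle wins the race and the 65s cycle loses it, which matches the shape of the explanation above: the extra class unloading work does not have to be large to matter when the margin is small.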
> ------------------------------
>
> Message: 3
> Date: Wed, 3 Feb 2021 16:38:20 +0100
> From: Per Liden
> To: charlie hunt ,
> mary.joseph at fiixsoftware.com
> Cc: zgc-dev at openjdk.java.net
> Subject: Re: Running into Allocation Stalls during class unloading
> Message-ID: <4e5f0987-abc8-f19d-3953-7c6d5c8c82f9 at oracle.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Hi Mary,
>
> Would it be possible for you to share GC logs (preferably generated
> using -Xlog:gc*)? That would help us understand if what you experience
> here is a consequence of what Charlie describes.
>
> If that's the case, i.e. a prolonged concurrent GC cycle because of lots
> of classes to unload, it could be that the GC is simply kicking in a bit
> too late. If so, you might want to have a look at the
> -XX:SoftMaxHeapSize option. This option tells ZGC's heuristics that it
> shouldn't aim at using the whole heap (-Xmx), but instead some lower
> number (-XX:SoftMaxHeapSize). This will increase ZGC's tolerance to
> variance in GC cycle length and make it more resilient to allocation
> spikes.
>
> cheers,
> Per
>
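Editorial sketch of how the SoftMaxHeapSize and GC logging suggestions might look on a command line. The 320G heap is from the thread; the 85% soft limit, the log file name `gc.log`, and `app.jar` are assumptions for illustration only, and the right soft limit depends on the application. The script prints the command instead of running it.

```shell
#!/bin/sh
# Heap size from the thread; the 85% soft limit is an assumed starting point.
heap_gb=320
soft_pct=85
soft_max_gb=$(( heap_gb * soft_pct / 100 ))   # 272

# SoftMaxHeapSize takes a size value (no "+"); -Xmx stays the hard limit.
# app.jar and gc.log are placeholders.
cmd="java -XX:+UseZGC -Xmx${heap_gb}g -XX:SoftMaxHeapSize=${soft_max_gb}g -Xlog:gc*:file=gc.log -jar app.jar"
echo "$cmd"
```

When typing the command directly, quoting the logging option (`'-Xlog:gc*:file=gc.log'`) keeps the shell from expanding the `*`. The ZGC documentation describes SoftMaxHeapSize as a manageable flag, so it should also be adjustable on a running JVM with `jcmd <pid> VM.set_flag SoftMaxHeapSize <bytes>`; treat that as something to verify on your JDK build.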
> End of zgc-dev Digest, Vol 36, Issue 2
> **************************************
>

--
Mary Sunitha Joseph (She/her)
Lead Developer
Fiix Software
p: 1 (855) 884-5619
e: mary.joseph at fiixsoftware.com
w: www.fiixsoftware.com

From mary.joseph at fiixsoftware.com Tue Feb 9 03:00:42 2021
From: mary.joseph at fiixsoftware.com (Mary Sunitha Joseph)
Date: Mon, 8 Feb 2021 22:00:42 -0500
Subject: Running into Allocation Stalls during class unloading
In-Reply-To:
References:
Message-ID:

Hi Per,

Due to a message size restriction I am unable to even compress and send over
one day's worth of verbose logs. Is there another way I could share them
with you?

Regards,
Mary
> >>> > >>> At the moment the app seems to be doing fine but it could escalate > >> anytime > >>> by the looks of it. There is an increase in the app's response time as > >> well > >>> at the same time and a small spike in heap which seem like side > effects. > >>> Any pointers in terms of tuning would be much appreciated. > >>> > >>> The app currently always makes use of at least 200G of heap space which > >>> leaves a 37% head space for ZGC. > >>> > >>> > >>> Regards > >>> Mary > >> > >> End of zgc-dev Digest, Vol 36, Issue 1 > >> ************************************** > >> > > > > > ------------------------------ > > Message: 3 > Date: Wed, 3 Feb 2021 16:38:20 +0100 > From: Per Liden > To: charlie hunt , > mary.joseph at fiixsoftware.com > Cc: zgc-dev at openjdk.java.net > Subject: Re: Running into Allocation Stalls during class unloading > Message-ID: <4e5f0987-abc8-f19d-3953-7c6d5c8c82f9 at oracle.com> > Content-Type: text/plain; charset=utf-8; format=flowed > > Hi Mary, > > Would it be possible for you to share GC logs (preferably generated > using -Xlog:gc*)? That would help us understand if what you experience > here is a consequence of what Charlie describes. > > If that's the case, i.e. a prolonged concurrent GC cycle because of lots > of classes to unload, it could be that the GC is simply kicking in a bit > too late. If so, you might want to have a look at the > -XX:+SoftMaxHeapSize option. This option tells ZGC's heuristics that it > shouldn't aim at using the whole heap (-Xmx), but instead some lower > number (-XX:SoftMaxHeapSize). This will increase ZGC's tolerance to > variance in GC cycle length and make it more resilient to allocation > spikes. > > cheers, > Per > > On 2/3/21 3:40 PM, charlie hunt wrote: > > Hi Mary, > > > > No, allocation stalls where there is concurrent class unloading is not > > an expected occurrence. > > > > What may be a possibility here, and I will try to explain what I am > > thinking. 
> > > > With the introduction of concurrent class unloading, the elapsed time it > > takes ZGC to complete a concurrent collection cycle may be slightly > > longer than when class unloading was a GC pause. If your application > > happened to be very close to a point where ZGC was just ahead of "losing > > the race" and exhausting Java heap space, then the additional concurrent > > class unloading work may be just enough for ZGC to lose that race. One > > thing to keep in mind here is that concurrent class unloading removes a > > GC pause, the pause that did class unloading. > > > > hths, > > > > charlie > > > > On 2/3/21 8:10 AM, Mary Sunitha Joseph wrote: > >> Hi Charlie, > >> > >> Thank you for going over our case and for the recommendations. > Increasing > >> the heap size or the number of ZGC threads is definitely something we > can > >> try out. > >> > >> I'm also trying to understand if the allocation stalls during the time > of > >> class unloading is an expected occurrence. It's almost as if at that > >> point > >> ZGC's entire focus is on class unloading and not clearing out the heap > >> which leads to that spike in heap and subsequent allocation stalls. > >> Perhaps > >> class unloading and freeing heap are not concurrent themselves and ZGC > is > >> able to do one or the other? > >> > >> Regards, > >> Mary > >> > >> On Wed, Feb 3, 2021 at 6:55 AM > wrote: > >> > >>> Send zgc-dev mailing list submissions to > >>> ???????? zgc-dev at openjdk.java.net > >>> > >>> To subscribe or unsubscribe via the World Wide Web, visit > >>> ???????? https://mail.openjdk.java.net/mailman/listinfo/zgc-dev > >>> or, via email, send a message with subject or body 'help' to > >>> ???????? zgc-dev-request at openjdk.java.net > >>> > >>> You can reach the person managing the list at > >>> ???????? zgc-dev-owner at openjdk.java.net > >>> > >>> When replying, please edit your Subject line so it is more specific > >>> than "Re: Contents of zgc-dev digest..." 
> >>> > >>> > >>> Today's Topics: > >>> > >>> ??? 1. Running into Allocation Stalls during class unloading > >>> ?????? (Mary Sunitha Joseph) > >>> ??? 2. Re: Running into Allocation Stalls during class unloading > >>> ?????? (charlie hunt) > >>> > >>> > >>> ---------------------------------------------------------------------- > >>> > >>> Message: 1 > >>> Date: Tue, 2 Feb 2021 14:30:40 -0500 > >>> From: Mary Sunitha Joseph > >>> To: zgc-dev at openjdk.java.net > >>> Subject: Running into Allocation Stalls during class unloading > >>> Message-ID: > >>> ???????? >>> MXcA at mail.gmail.com> > >>> Content-Type: text/plain; charset="UTF-8" > >>> > >>> Hi team, > >>> > >>> Our Production application runs on a 320G heap and uses ZGC with large > >>> pages enabled. We have not done any tuning and are using ZGC with > >>> defaults. > >>> Since upgrading to JDK 15.0.1 we've started to notice that once a day > >>> the > >>> app experiences allocation stalls (during peak hours) and this > >>> happens when > >>> there is a huge drop in the number of classes loaded. We have a > >>> bi-monthly > >>> release cycle and can see that the allocation stalls start small a > >>> business > >>> day after a? release and slowly increase as the week progresses. > >>> > >>> At the moment the app seems to be doing fine but it could escalate > >>> anytime > >>> by the looks of it. There is an increase in the app's response time > >>> as well > >>> at the same time and a small spike in heap which seem like side > effects. > >>> Any pointers in terms of tuning would be much appreciated. > >>> > >>> The app currently always makes use of at least 200G of heap space which > >>> leaves a 37% head space for ZGC. 
> >>> > >>> > >>> Regards > >>> Mary > >>> -- > >>> > >>> Mary Sunitha Joseph (She/her) > >>> > >>> Lead Developer > >>> > >>> Fiix Software > >>> > >>> p: 1 (855) 884-5619 > >>> > >>> e: mary.joseph at fiixsoftware.com > >>> > >>> w: www.fiixsoftware.com > >>> > >>> < > >>> > https://www.fiixsoftware.com/foresight/#insights?utm_source=signature&utm_medium=email > >>> > >>> > >>> ------------------------------ > >>> > >>> Message: 2 > >>> Date: Tue, 2 Feb 2021 15:02:37 -0600 > >>> From: charlie hunt > >>> To: zgc-dev at openjdk.java.net > >>> Subject: Re: Running into Allocation Stalls during class unloading > >>> Message-ID: > >>> Content-Type: text/plain; charset=utf-8; format=flowed > >>> > >>> Hi Mary, > >>> > >>> Thanks for reaching out. > >>> > >>> Since you are observing allocation stalls, there a couple options to > >>> consider. > >>> > >>> 1.) If you have CPU cycles available, you can increase the number of > >>> concurrent GC threads. You can see the default number ZGC is currently > >>> using by doing:? java -XX:+UseZGC -XX:+PrintFlagsFinal -version | grep > >>> -i concgcthreads.? Increasing the number of concurrent GC threads > should > >>> allow ZGC to do its concurrent work before the Java heap space becomes > >>> exhausted resulting in allocation stalls. But, additional concurrent GC > >>> threads will use more CPU. > >>> > >>> 2.) Another option is to size the Java heap larger, if you have the > >>> available RAM on the system. By increasing the size of the Java heap, > >>> you also increase the time the concurrent GC threads can do their work > >>> to free space before exhausting Java heap space (which results in > >>> allocation stalls). > >>> > >>> 3.) Another option is profile the application and look for > opportunities > >>> to reduce unnecessary object allocations. This will reduce the speed at > >>> which the Java heap fills that available free space and thus allows > >>> ZGC's concurrent GC threads to keep up and avoid allocation stalls. 
> >>> > >>> Fwiw, I tend to like the first two options better since I would rather > >>> see folks write their Java application(s) in their most natural form > and > >>> let the JVM figure out how to best run the Java app. > >>> > >>> Also, as a general comment, having 37% head room for ZGC to operate is > >>> not a lot of space. Whether that is enough space largely depends on the > >>> application, i.e. its allocation rate, object lifetimes, amount live > >>> data in the Java heap, etc., and whether concurrent GC threads can keep > >>> up with the pace of allocations with the amount of Java heap space > >>> that's available. > >>> > >>> hths, > >>> > >>> charlie > >>> > >>> On 2/2/21 1:30 PM, Mary Sunitha Joseph wrote: > >>>> Hi team, > >>>> > >>>> Our Production application runs on a 320G heap and uses ZGC with large > >>>> pages enabled. We have not done any tuning and are using ZGC with > >>> defaults. > >>>> Since upgrading to JDK 15.0.1 we've started to notice that once a > >>>> day the > >>>> app experiences allocation stalls (during peak hours) and this happens > >>> when > >>>> there is a huge drop in the number of classes loaded. We have a > >>> bi-monthly > >>>> release cycle and can see that the allocation stalls start small a > >>> business > >>>> day after a? release and slowly increase as the week progresses. > >>>> > >>>> At the moment the app seems to be doing fine but it could escalate > >>> anytime > >>>> by the looks of it. There is an increase in the app's response time as > >>> well > >>>> at the same time and a small spike in heap which seem like side > >>>> effects. > >>>> Any pointers in terms of tuning would be much appreciated. > >>>> > >>>> The app currently always makes use of at least 200G of heap space > which > >>>> leaves a 37% head space for ZGC. 
--
Mary Sunitha Joseph (She/her)
Lead Developer
Fiix Software
p: 1 (855) 884-5619
e: mary.joseph at fiixsoftware.com
w: www.fiixsoftware.com

From charlie.hunt at oracle.com Tue Feb 9 23:44:17 2021
From: charlie.hunt at oracle.com (charlie hunt)
Date: Tue, 9 Feb 2021 17:44:17 -0600
Subject: Running into Allocation Stalls during class unloading
In-Reply-To:
References:
Message-ID: <7a8922c7-2f51-6e7c-032c-ff8a0911346e@oracle.com>

Hi Mary,

Based on the data you have shared (thanks for sharing!), it looks like
the previous suggestions from both Per and me should help.

1.) Increase the number of concurrent GC threads. This should help
shorten the elapsed time you are seeing in the "Process Non-Strong
References" phase (and other concurrent phases too). Increasing
concurrent GC threads will use additional CPU, so you should expect to
see some increased CPU usage during the concurrent phases.

2.) Set a SoftMaxHeapSize, as Per suggested (which is a great idea I
had not thought about). This should help ZGC deal with those spikes you
mentioned. The end result is that ZGC should complete the "Process
Non-Strong References" phase with more headroom / available Java heap
space, and so avoid an allocation stall.

3.) And, obviously, increase the Java heap size and/or reduce
unnecessary allocations.

As mentioned earlier, any one of these, or any combination of them,
should help.

Keep us posted on your progress.

thanks,

charlie

On 2/8/21 8:39 PM, Mary Sunitha Joseph wrote:
> Hi Charlie, Per
>
> I went over our verbose GC logs and found a couple of things:
>
> 1. Heap usage by the application during peak hours is always close to
> an average of 55%.
>
> 2. Each GC cycle barely manages to reclaim up to 10%, and there are
> times when the Concurrent Process Non-Strong References phase takes up
> to a minute to complete. These are the times when the allocation
> stalls occur, and the GC cycle soon after is unable to reclaim any
> space.
>
> Eg extract:
>
> HIGH ALLOC RATE
>
> [2021-02-02T17:01:32.641+0000][info][gc ] GC(1276) Garbage Collection (Allocation Rate) 186282M(57%)->205548M(63%)
>
> PROCESSING PHASE
>
> [2021-02-02T17:03:09.369+0000][info][gc,phases ] GC(1277) Concurrent Process Non-Strong References 71206.950ms
>
> ALLOC STALL
>
> [2021-02-02T17:03:09.372+0000][info][gc ] Allocation Stall (scrubbed) 318.487ms
>
> HIGH ALLOC RATE
>
> [2021-02-02T17:03:09.756+0000][info][gc ] GC(1277) Garbage Collection (Allocation Rate) 205634M(63%)->245938M(75%)
>
> PROCESSING PHASE
>
> [2021-02-02T17:04:46.862+0000][info][gc,phases ] GC(1278) Concurrent Process Non-Strong References 72611.793ms
>
> ALLOC STALL
>
> [2021-02-02T17:04:46.864+0000][info][gc ] Allocation Stall (scrubbed) 554.293ms
>
> GC is then able to reclaim some space
>
> [2021-02-02T17:04:48.332+0000][info][gc ] GC(1278) Garbage Collection (Allocation Rate) 245974M(75%)->204610M(62%)
>
> Towards the end of the week, this phase takes up to 2.3 minutes.
>
> 3. There are object initialization hotspots in the application that
> need optimizing, but even with 320GB of memory and at most 50-60% heap
> usage, it looks like ZGC needs more room to maneuver.
>
> The Concurrent Process Non-Strong References phase taking minutes to
> complete is a bit worrisome, and we should probably look at optimizing
> things at our end to reduce this.
>
> Regards,
> Mary
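
For anyone wanting to track these numbers over a release cycle, here is a
minimal Java sketch that pulls the "Concurrent Process Non-Strong
References" durations and the allocation stall times out of a GC log. It
assumes the line shapes shown in the extract above; the class and method
names (GcLogScan, nonStrongMillis, stallMillis and so on) are made up for
illustration and are not part of the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scans GC log lines shaped like the extract in this thread.
final class GcLogScan {
    // e.g. "GC(1277) Concurrent Process Non-Strong References 71206.950ms"
    private static final Pattern NON_STRONG = Pattern.compile(
            "GC\\(\\d+\\)\\s+Concurrent Process Non-Strong References\\s+([0-9.]+)ms");
    // e.g. "Allocation Stall (scrubbed) 318.487ms"; the parentheses hold the stalled thread's name
    private static final Pattern STALL = Pattern.compile(
            "Allocation Stall \\([^)]*\\)\\s+([0-9.]+)ms");

    // Sample lines copied from the extract above.
    static final List<String> SAMPLE = List.of(
            "[2021-02-02T17:03:09.369+0000][info][gc,phases ] GC(1277) Concurrent Process Non-Strong References 71206.950ms",
            "[2021-02-02T17:03:09.372+0000][info][gc ] Allocation Stall (scrubbed) 318.487ms",
            "[2021-02-02T17:04:46.862+0000][info][gc,phases ] GC(1278) Concurrent Process Non-Strong References 72611.793ms",
            "[2021-02-02T17:04:46.864+0000][info][gc ] Allocation Stall (scrubbed) 554.293ms");

    /** Phase duration in ms, or -1 if the line is not a Non-Strong References line. */
    static double nonStrongMillis(String line) {
        Matcher m = NON_STRONG.matcher(line);
        return m.find() ? Double.parseDouble(m.group(1)) : -1;
    }

    /** Stall duration in ms, or -1 if the line is not an Allocation Stall line. */
    static double stallMillis(String line) {
        Matcher m = STALL.matcher(line);
        return m.find() ? Double.parseDouble(m.group(1)) : -1;
    }

    /** Longest Non-Strong References phase seen, or -1 if there was none. */
    static double maxNonStrongMillis(List<String> lines) {
        double max = -1;
        for (String line : lines) {
            max = Math.max(max, nonStrongMillis(line));
        }
        return max;
    }

    /** Sum of all allocation stall durations in ms. */
    static double totalStallMillis(List<String> lines) {
        double total = 0;
        for (String line : lines) {
            double ms = stallMillis(line);
            if (ms >= 0) {
                total += ms;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Pass a log file path as the first argument, or fall back to the sample lines.
        List<String> lines = args.length > 0 ? Files.readAllLines(Path.of(args[0])) : SAMPLE;
        System.out.printf("longest non-strong phase: %.3f ms%n", maxNonStrongMillis(lines));
        System.out.printf("total allocation stall:   %.3f ms%n", totalStallMillis(lines));
    }
}
```

Running this once a day against the log and watching whether the longest
phase creeps up through the week would show whether more concurrent GC
threads or a soft max heap size is actually making a difference.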
>> >> Regards, >> Mary >> >> On Wed, Feb 3, 2021 at 6:55 AM wrote: >> >>> Send zgc-dev mailing list submissions to >>> zgc-dev at openjdk.java.net >>> >>> To subscribe or unsubscribe via the World Wide Web, visit >>> https://mail.openjdk.java.net/mailman/listinfo/zgc-dev >>> or, via email, send a message with subject or body 'help' to >>> zgc-dev-request at openjdk.java.net >>> >>> You can reach the person managing the list at >>> zgc-dev-owner at openjdk.java.net >>> >>> When replying, please edit your Subject line so it is more specific >>> than "Re: Contents of zgc-dev digest..." >>> >>> >>> Today's Topics: >>> >>> 1. Running into Allocation Stalls during class unloading >>> (Mary Sunitha Joseph) >>> 2. Re: Running into Allocation Stalls during class unloading >>> (charlie hunt) >>> >>> >>> ---------------------------------------------------------------------- >>> >>> Message: 1 >>> Date: Tue, 2 Feb 2021 14:30:40 -0500 >>> From: Mary Sunitha Joseph >>> To: zgc-dev at openjdk.java.net >>> Subject: Running into Allocation Stalls during class unloading >>> Message-ID: >>> >> MXcA at mail.gmail.com> >>> Content-Type: text/plain; charset="UTF-8" >>> >>> Hi team, >>> >>> Our Production application runs on a 320G heap and uses ZGC with large >>> pages enabled. We have not done any tuning and are using ZGC with >> defaults. >>> Since upgrading to JDK 15.0.1 we've started to notice that once a day the >>> app experiences allocation stalls (during peak hours) and this happens >> when >>> there is a huge drop in the number of classes loaded. We have a >> bi-monthly >>> release cycle and can see that the allocation stalls start small a >> business >>> day after a release and slowly increase as the week progresses. >>> >>> At the moment the app seems to be doing fine but it could escalate >> anytime >>> by the looks of it. There is an increase in the app's response time as >> well >>> at the same time and a small spike in heap which seem like side effects. 
>>> Any pointers in terms of tuning would be much appreciated. >>> >>> The app currently always makes use of at least 200G of heap space which >>> leaves a 37% head space for ZGC. >>> >>> >>> Regards >>> Mary >>> -- >>> >>> Mary Sunitha Joseph (She/her) >>> >>> Lead Developer >>> >>> Fiix Software >>> >>> p: 1 (855) 884-5619 >>> >>> e: mary.joseph at fiixsoftware.com >>> >>> w: www.fiixsoftware.com >>> >>> < >>> >> https://www.fiixsoftware.com/foresight/#insights?utm_source=signature&utm_medium=email >>> >>> ------------------------------ >>> >>> Message: 2 >>> Date: Tue, 2 Feb 2021 15:02:37 -0600 >>> From: charlie hunt >>> To: zgc-dev at openjdk.java.net >>> Subject: Re: Running into Allocation Stalls during class unloading >>> Message-ID: >>> Content-Type: text/plain; charset=utf-8; format=flowed >>> >>> Hi Mary, >>> >>> Thanks for reaching out. >>> >>> Since you are observing allocation stalls, there a couple options to >>> consider. >>> >>> 1.) If you have CPU cycles available, you can increase the number of >>> concurrent GC threads. You can see the default number ZGC is currently >>> using by doing:? java -XX:+UseZGC -XX:+PrintFlagsFinal -version | grep >>> -i concgcthreads.? Increasing the number of concurrent GC threads should >>> allow ZGC to do its concurrent work before the Java heap space becomes >>> exhausted resulting in allocation stalls. But, additional concurrent GC >>> threads will use more CPU. >>> >>> 2.) Another option is to size the Java heap larger, if you have the >>> available RAM on the system. By increasing the size of the Java heap, >>> you also increase the time the concurrent GC threads can do their work >>> to free space before exhausting Java heap space (which results in >>> allocation stalls). >>> >>> 3.) Another option is profile the application and look for opportunities >>> to reduce unnecessary object allocations. 
This will reduce the speed at >>> which the Java heap fills that available free space and thus allows >>> ZGC's concurrent GC threads to keep up and avoid allocation stalls. >>> >>> Fwiw, I tend to like the first two options better since I would rather >>> see folks write their Java application(s) in their most natural form and >>> let the JVM figure out how to best run the Java app. >>> >>> Also, as a general comment, having 37% head room for ZGC to operate is >>> not a lot of space. Whether that is enough space largely depends on the >>> application, i.e. its allocation rate, object lifetimes, amount live >>> data in the Java heap, etc., and whether concurrent GC threads can keep >>> up with the pace of allocations with the amount of Java heap space >>> that's available. >>> >>> hths, >>> >>> charlie >>> >>> On 2/2/21 1:30 PM, Mary Sunitha Joseph wrote: >>>> Hi team, >>>> >>>> Our Production application runs on a 320G heap and uses ZGC with large >>>> pages enabled. We have not done any tuning and are using ZGC with >>> defaults. >>>> Since upgrading to JDK 15.0.1 we've started to notice that once a day >> the >>>> app experiences allocation stalls (during peak hours) and this happens >>> when >>>> there is a huge drop in the number of classes loaded. We have a >>> bi-monthly >>>> release cycle and can see that the allocation stalls start small a >>> business >>>> day after a release and slowly increase as the week progresses. >>>> >>>> At the moment the app seems to be doing fine but it could escalate >>> anytime >>>> by the looks of it. There is an increase in the app's response time as >>> well >>>> at the same time and a small spike in heap which seem like side >> effects. >>>> Any pointers in terms of tuning would be much appreciated. >>>> >>>> The app currently always makes use of at least 200G of heap space which >>>> leaves a 37% head space for ZGC. 
>>>> >>>> >>>> Regards >>>> Mary >>> >>> End of zgc-dev Digest, Vol 36, Issue 1 >>> ************************************** >>> >> >> -- >> >> Mary Sunitha Joseph (She/her) >> >> Lead Developer >> >> Fiix Software >> >> p: 1 (855) 884-5619 >> >> e: mary.joseph at fiixsoftware.com >> >> w: www.fiixsoftware.com >> >> < >> https://www.fiixsoftware.com/foresight/#insights?utm_source=signature&utm_medium=email >> >> ------------------------------ >> >> Message: 2 >> Date: Wed, 3 Feb 2021 08:40:53 -0600 >> From: charlie hunt >> To: zgc-dev at openjdk.java.net >> Subject: Re: Running into Allocation Stalls during class unloading >> Message-ID: <7c368c52-1f03-1286-012c-7605ea506a90 at oracle.com> >> Content-Type: text/plain; charset=utf-8; format=flowed >> >> Hi Mary, >> >> No, allocation stalls where there is concurrent class unloading is not >> an expected occurrence. >> >> What may be a possibility here, and I will try to explain what I am >> thinking. >> >> With the introduction of concurrent class unloading, the elapsed time it >> takes ZGC to complete a concurrent collection cycle may be slightly >> longer than when class unloading was a GC pause. If your application >> happened to be very close to a point where ZGC was just ahead of "losing >> the race" and exhausting Java heap space, then the additional concurrent >> class unloading work may be just enough for ZGC to lose that race. One >> thing to keep in mind here is that concurrent class unloading removes a >> GC pause, the pause that did class unloading. >> >> hths, >> >> charlie >> >> On 2/3/21 8:10 AM, Mary Sunitha Joseph wrote: >>> Hi Charlie, >>> >>> Thank you for going over our case and for the recommendations. Increasing >>> the heap size or the number of ZGC threads is definitely something we can >>> try out. >>> >>> I'm also trying to understand if the allocation stalls during the time of >>> class unloading is an expected occurrence. 
It's almost as if at that >> point >>> ZGC's entire focus is on class unloading and not clearing out the heap >>> which leads to that spike in heap and subsequent allocation stalls. >> Perhaps >>> class unloading and freeing heap are not concurrent themselves and ZGC is >>> able to do one or the other? >>> >>> Regards, >>> Mary >>> >>> On Wed, Feb 3, 2021 at 6:55 AM wrote: >>> >>>> Send zgc-dev mailing list submissions to >>>> zgc-dev at openjdk.java.net >>>> >>>> To subscribe or unsubscribe via the World Wide Web, visit >>>> https://mail.openjdk.java.net/mailman/listinfo/zgc-dev >>>> or, via email, send a message with subject or body 'help' to >>>> zgc-dev-request at openjdk.java.net >>>> >>>> You can reach the person managing the list at >>>> zgc-dev-owner at openjdk.java.net >>>> >>>> When replying, please edit your Subject line so it is more specific >>>> than "Re: Contents of zgc-dev digest..." >>>> >>>> >>>> Today's Topics: >>>> >>>> 1. Running into Allocation Stalls during class unloading >>>> (Mary Sunitha Joseph) >>>> 2. Re: Running into Allocation Stalls during class unloading >>>> (charlie hunt) >>>> >>>> >>>> ---------------------------------------------------------------------- >>>> >>>> Message: 1 >>>> Date: Tue, 2 Feb 2021 14:30:40 -0500 >>>> From: Mary Sunitha Joseph >>>> To: zgc-dev at openjdk.java.net >>>> Subject: Running into Allocation Stalls during class unloading >>>> Message-ID: >>>> >>> MXcA at mail.gmail.com> >>>> Content-Type: text/plain; charset="UTF-8" >>>> >>>> Hi team, >>>> >>>> Our Production application runs on a 320G heap and uses ZGC with large >>>> pages enabled. We have not done any tuning and are using ZGC with >> defaults. >>>> Since upgrading to JDK 15.0.1 we've started to notice that once a day >> the >>>> app experiences allocation stalls (during peak hours) and this happens >> when >>>> there is a huge drop in the number of classes loaded. 
We have a >> bi-monthly >>>> release cycle and can see that the allocation stalls start small a >> business >>>> day after a release and slowly increase as the week progresses. >>>> >>>> At the moment the app seems to be doing fine but it could escalate >> anytime >>>> by the looks of it. There is an increase in the app's response time as >> well >>>> at the same time and a small spike in heap which seem like side effects. >>>> Any pointers in terms of tuning would be much appreciated. >>>> >>>> The app currently always makes use of at least 200G of heap space which >>>> leaves a 37% head space for ZGC. >>>> >>>> >>>> Regards >>>> Mary >>>> -- >>>> >>>> Mary Sunitha Joseph (She/her) >>>> >>>> Lead Developer >>>> >>>> Fiix Software >>>> >>>> p: 1 (855) 884-5619 >>>> >>>> e: mary.joseph at fiixsoftware.com >>>> >>>> w: www.fiixsoftware.com >>>> >>>> < >>>> >> https://www.fiixsoftware.com/foresight/#insights?utm_source=signature&utm_medium=email >>>> ------------------------------ >>>> >>>> Message: 2 >>>> Date: Tue, 2 Feb 2021 15:02:37 -0600 >>>> From: charlie hunt >>>> To: zgc-dev at openjdk.java.net >>>> Subject: Re: Running into Allocation Stalls during class unloading >>>> Message-ID: >>>> Content-Type: text/plain; charset=utf-8; format=flowed >>>> >>>> Hi Mary, >>>> >>>> Thanks for reaching out. >>>> >>>> Since you are observing allocation stalls, there a couple options to >>>> consider. >>>> >>>> 1.) If you have CPU cycles available, you can increase the number of >>>> concurrent GC threads. You can see the default number ZGC is currently >>>> using by doing:? java -XX:+UseZGC -XX:+PrintFlagsFinal -version | grep >>>> -i concgcthreads.? Increasing the number of concurrent GC threads should >>>> allow ZGC to do its concurrent work before the Java heap space becomes >>>> exhausted resulting in allocation stalls. But, additional concurrent GC >>>> threads will use more CPU. >>>> >>>> 2.) 
Another option is to size the Java heap larger, if you have the >>>> available RAM on the system. By increasing the size of the Java heap, >>>> you also increase the time the concurrent GC threads can do their work >>>> to free space before exhausting Java heap space (which results in >>>> allocation stalls). >>>> >>>> 3.) Another option is profile the application and look for opportunities >>>> to reduce unnecessary object allocations. This will reduce the speed at >>>> which the Java heap fills that available free space and thus allows >>>> ZGC's concurrent GC threads to keep up and avoid allocation stalls. >>>> >>>> Fwiw, I tend to like the first two options better since I would rather >>>> see folks write their Java application(s) in their most natural form and >>>> let the JVM figure out how to best run the Java app. >>>> >>>> Also, as a general comment, having 37% head room for ZGC to operate is >>>> not a lot of space. Whether that is enough space largely depends on the >>>> application, i.e. its allocation rate, object lifetimes, amount live >>>> data in the Java heap, etc., and whether concurrent GC threads can keep >>>> up with the pace of allocations with the amount of Java heap space >>>> that's available. >>>> >>>> hths, >>>> >>>> charlie >>>> >>>> On 2/2/21 1:30 PM, Mary Sunitha Joseph wrote: >>>>> Hi team, >>>>> >>>>> Our Production application runs on a 320G heap and uses ZGC with large >>>>> pages enabled. We have not done any tuning and are using ZGC with >>>> defaults. >>>>> Since upgrading to JDK 15.0.1 we've started to notice that once a day >> the >>>>> app experiences allocation stalls (during peak hours) and this happens >>>> when >>>>> there is a huge drop in the number of classes loaded. We have a >>>> bi-monthly >>>>> release cycle and can see that the allocation stalls start small a >>>> business >>>>> day after a release and slowly increase as the week progresses. 
>> ------------------------------
>>
>> Message: 3
>> Date: Wed, 3 Feb 2021 16:38:20 +0100
>> From: Per Liden
>> To: charlie hunt ,
>>         mary.joseph at fiixsoftware.com
>> Cc: zgc-dev at openjdk.java.net
>> Subject: Re: Running into Allocation Stalls during class unloading
>> Message-ID: <4e5f0987-abc8-f19d-3953-7c6d5c8c82f9 at oracle.com>
>> Content-Type: text/plain; charset=utf-8; format=flowed
>>
>> Hi Mary,
>>
>> Would it be possible for you to share GC logs (preferably generated
>> using -Xlog:gc*)? That would help us understand if what you experience
>> here is a consequence of what Charlie describes.
>>
>> If that's the case, i.e. a prolonged concurrent GC cycle because of lots
>> of classes to unload, it could be that the GC is simply kicking in a bit
>> too late. If so, you might want to have a look at the
>> -XX:SoftMaxHeapSize option. This option tells ZGC's heuristics that it
>> shouldn't aim at using the whole heap (-Xmx), but instead some lower
>> number (-XX:SoftMaxHeapSize). This will increase ZGC's tolerance to
>> variance in GC cycle length and make it more resilient to allocation
>> spikes.
>>
>> cheers,
>> Per
>>
>> On 2/3/21 3:40 PM, charlie hunt wrote:
>>> Hi Mary,
>>>
>>> No, allocation stalls while there is concurrent class unloading are
>>> not an expected occurrence.
>>>
>>> Here is what may be a possibility; I will try to explain what I am
>>> thinking.
>>>
>>> With the introduction of concurrent class unloading, the elapsed time it
>>> takes ZGC to complete a concurrent collection cycle may be slightly
>>> longer than when class unloading was a GC pause. If your application
>>> happened to be very close to a point where ZGC was just ahead of "losing
>>> the race" and exhausting Java heap space, then the additional concurrent
>>> class unloading work may be just enough for ZGC to lose that race. One
>>> thing to keep in mind here is that concurrent class unloading removes a
>>> GC pause, the pause that did class unloading.
>>>
>>> hths,
>>>
>>> charlie
>>>
>>> On 2/3/21 8:10 AM, Mary Sunitha Joseph wrote:
>>>> Hi Charlie,
>>>>
>>>> Thank you for going over our case and for the recommendations.
>>>> Increasing the heap size or the number of ZGC threads is definitely
>>>> something we can try out.
>>>>
>>>> I'm also trying to understand if the allocation stalls during the time
>>>> of class unloading are an expected occurrence. It's almost as if at
>>>> that point ZGC's entire focus is on class unloading and not clearing
>>>> out the heap, which leads to that spike in heap and subsequent
>>>> allocation stalls. Perhaps class unloading and freeing heap are not
>>>> concurrent themselves and ZGC is able to do one or the other?
>>>>
>>>> Regards,
>>>> Mary
>>>>
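The thread's tuning suggestions (more concurrent GC threads, a soft heap limit, a larger heap) all come down to JVM flags, and the quoted `-XX:+PrintFlagsFinal` command shows their defaults from the shell. As a minimal sketch of the same check from inside a running JVM, the snippet below reads the effective values through `HotSpotDiagnosticMXBean`. The class name `ZgcFlagCheck` and the helper `flagValue` are hypothetical names for illustration, the API is HotSpot-specific, and flags a given JVM build does not know are reported as unavailable rather than guessed.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class ZgcFlagCheck {
    // Returns the effective value of a HotSpot flag, or a marker string
    // if this JVM does not know the flag (e.g. an older JDK).
    public static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return "(not available in this JVM)";
        }
    }

    public static void main(String[] args) {
        // Flags discussed in the thread, plus the hard heap limit (-Xmx).
        String[] names = {"UseZGC", "ConcGCThreads", "SoftMaxHeapSize", "MaxHeapSize"};
        for (String name : names) {
            System.out.println(name + " = " + flagValue(name));
        }
    }
}
```

Run with something like `java -XX:+UseZGC -XX:ConcGCThreads=<n> -XX:SoftMaxHeapSize=<size> ZgcFlagCheck`, where `<n>` and `<size>` are placeholders to be chosen from the application's own measurements, not recommendations.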
>>
>> End of zgc-dev Digest, Vol 36, Issue 2
>> **************************************
>>
>
From mary.joseph at fiixsoftware.com Wed Feb 10 21:27:05 2021
From: mary.joseph at fiixsoftware.com (Mary Sunitha Joseph)
Date: Wed, 10 Feb 2021 16:27:05 -0500
Subject: Running into Allocation Stalls during class unloading
In-Reply-To:
References:
Message-ID:

Thank you for confirming, we will try out the recommendations and let you
know how things go! Thanks again for your time.

Regards,
Mary

On Tue, Feb 9, 2021 at 6:44 PM wrote:

> Message: 1
> Date: Tue, 9 Feb 2021 17:44:17 -0600
> From: charlie hunt
> To: Mary Sunitha Joseph ,
>         zgc-dev at openjdk.java.net
> Subject: Re: Running into Allocation Stalls during class unloading
> Message-ID: <7a8922c7-2f51-6e7c-032c-ff8a0911346e at oracle.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Hi Mary,
>
> Based on the data you have shared (thanks for sharing!), it looks like
> the previous suggestions from both Per and me should help.
>
> 1.) Increase number of concurrent GC threads -- This should help shorten
> the elapsed time you are seeing in the "Process Non-Strong References"
> phase (and other concurrent phases too). Increasing concurrent GC
> threads will use additional CPU, so you should expect to see some
> increased CPU usage during the concurrent phases.
>
> 2.) Setting a SoftMaxHeapSize, as Per suggested (which is a great idea I
> had not thought about). This should help ZGC deal with those spikes you
> mentioned. The end result being that it should help ZGC complete the
> "Process Non-Strong References" phase in a state where there is more
> headroom / available Java heap space (and avoid an allocation stall).
>
> 3.) And, obviously, increasing Java heap size, and/or reducing
> unnecessary allocations.
>
> As mentioned earlier, any one, or any number of these should help.
>
> Keep us posted on your progress.
>
> thanks,
>
> charlie
>
> On 2/8/21 8:39 PM, Mary Sunitha Joseph wrote:
> > Hi Charlie, Per
> >
> > I went over our verbose GC logs and found a couple of things:
> >
> > 1. Heap usage by the application during peak hours is always close to
> > an avg of 55%.
> >
> > 2. Each GC cycle barely manages to reclaim up to 10% and there are
> > times when the Concurrent Process Non-Strong References phase takes up
> > to a minute to complete. These are the times when the allocation stalls
> > occur and the GC cycle soon after is unable to reclaim any space.
> >
> > Example extract:
> >
> > HIGH ALLOC RATE
> >
> > [2021-02-02T17:01:32.641+0000][info][gc ] GC(1276) Garbage Collection (Allocation Rate) 186282M(57%)->205548M(63%)
> >
> > PROCESSING PHASE
> >
> > [2021-02-02T17:03:09.369+0000][info][gc,phases ] GC(1277) Concurrent Process Non-Strong References 71206.950ms
> >
> > ALLOC STALL
> >
> > [2021-02-02T17:03:09.372+0000][info][gc ] Allocation Stall (scrubbed) 318.487ms
> >
> > HIGH ALLOC RATE
> >
> > [2021-02-02T17:03:09.756+0000][info][gc ] GC(1277) Garbage Collection (Allocation Rate) 205634M(63%)->245938M(75%)
> >
> > PROCESSING PHASE
> >
> > [2021-02-02T17:04:46.862+0000][info][gc,phases ] GC(1278) Concurrent Process Non-Strong References 72611.793ms
> >
> > ALLOC STALL
> >
> > [2021-02-02T17:04:46.864+0000][info][gc ] Allocation Stall (scrubbed) 554.293ms
> >
> > GC is then able to reclaim some space
> >
> > [2021-02-02T17:04:48.332+0000][info][gc ] GC(1278) Garbage Collection (Allocation Rate) 245974M(75%)->204610M(62%)
> >
> > Towards the end of the week, this phase takes up to 2.3 minutes.
> >
> > 3. There are object initialization hotspots in the application that
> > need optimizing, but even with 320GB of memory and at most 50-60% heap
> > usage, it looks like ZGC needs more room to maneuver.
> >
> > The Concurrent Process Non-Strong References phase taking minutes to
> > complete is a bit worrisome and we should probably look at optimizing
> > things at our end to reduce this.
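The log review described above (finding long "Concurrent Process Non-Strong References" phases, the allocation stalls next to them, and how much each cycle reclaimed) can be automated. The following is a minimal sketch, assuming the `-Xlog:gc*` line shapes shown in the extract; the class `ZgcLogScan`, its method names, and the regular expressions are illustrative assumptions, not part of any JDK tool, and the sample lines are the ones quoted in the thread with the email wrapping removed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ZgcLogScan {
    // Patterns modelled on the log lines quoted in the thread (assumed shapes).
    private static final Pattern PHASE =
        Pattern.compile("Concurrent Process Non-Strong References\\s+([0-9.]+)ms");
    private static final Pattern STALL =
        Pattern.compile("Allocation Stall\\s+\\(.*?\\)\\s+([0-9.]+)ms");
    private static final Pattern CYCLE =
        Pattern.compile("Garbage Collection\\s+\\(.*?\\)\\s+(\\d+)M\\(\\d+%\\)->(\\d+)M\\(\\d+%\\)");

    // Sample lines taken from the extract in the thread.
    public static final List<String> SAMPLE = Arrays.asList(
        "[2021-02-02T17:01:32.641+0000][info][gc ] GC(1276) Garbage Collection (Allocation Rate) 186282M(57%)->205548M(63%)",
        "[2021-02-02T17:03:09.369+0000][info][gc,phases ] GC(1277) Concurrent Process Non-Strong References 71206.950ms",
        "[2021-02-02T17:03:09.372+0000][info][gc ] Allocation Stall (scrubbed) 318.487ms",
        "[2021-02-02T17:03:09.756+0000][info][gc ] GC(1277) Garbage Collection (Allocation Rate) 205634M(63%)->245938M(75%)",
        "[2021-02-02T17:04:46.862+0000][info][gc,phases ] GC(1278) Concurrent Process Non-Strong References 72611.793ms",
        "[2021-02-02T17:04:46.864+0000][info][gc ] Allocation Stall (scrubbed) 554.293ms",
        "[2021-02-02T17:04:48.332+0000][info][gc ] GC(1278) Garbage Collection (Allocation Rate) 245974M(75%)->204610M(62%)");

    // Longest "Process Non-Strong References" phase, in milliseconds.
    public static double maxPhaseMs(List<String> lines) {
        double max = 0.0;
        for (String line : lines) {
            Matcher m = PHASE.matcher(line);
            if (m.find()) max = Math.max(max, Double.parseDouble(m.group(1)));
        }
        return max;
    }

    // Sum of all allocation stall durations, in milliseconds.
    public static double totalStallMs(List<String> lines) {
        double total = 0.0;
        for (String line : lines) {
            Matcher m = STALL.matcher(line);
            if (m.find()) total += Double.parseDouble(m.group(1));
        }
        return total;
    }

    // Heap change per cycle (after minus before, in M); negative means space was reclaimed.
    public static List<Long> heapDeltasMb(List<String> lines) {
        List<Long> deltas = new ArrayList<>();
        for (String line : lines) {
            Matcher m = CYCLE.matcher(line);
            if (m.find()) deltas.add(Long.parseLong(m.group(2)) - Long.parseLong(m.group(1)));
        }
        return deltas;
    }

    public static void main(String[] args) {
        System.out.println("longest phase (ms): " + maxPhaseMs(SAMPLE));
        System.out.println("total stall (ms):   " + totalStallMs(SAMPLE));
        System.out.println("heap deltas (M):    " + heapDeltasMb(SAMPLE));
    }
}
```

On the sample, the first two cycles end with more heap in use than they started with, and only the third shows a negative delta, which matches the observation that space is reclaimed only after the stalls.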
> >
> > Regards,
> > Mary
> >>>>> > >>>>> > >>>>> Regards > >>>>> Mary > >>>>> -- > >>>>> > >>>>> Mary Sunitha Joseph (She/her) > >>>>> > >>>>> Lead Developer > >>>>> > >>>>> Fiix Software > >>>>> > >>>>> p: 1 (855) 884-5619 > >>>>> > >>>>> e: mary.joseph at fiixsoftware.com > >>>>> > >>>>> w: www.fiixsoftware.com > >>>>> > >>>>> < > >>>>> > >> > https://www.fiixsoftware.com/foresight/#insights?utm_source=signature&utm_medium=email > >>>>> > >>>>> ------------------------------ > >>>>> > >>>>> Message: 2 > >>>>> Date: Tue, 2 Feb 2021 15:02:37 -0600 > >>>>> From: charlie hunt > >>>>> To: zgc-dev at openjdk.java.net > >>>>> Subject: Re: Running into Allocation Stalls during class unloading > >>>>> Message-ID: > >>>>> Content-Type: text/plain; charset=utf-8; format=flowed > >>>>> > >>>>> Hi Mary, > >>>>> > >>>>> Thanks for reaching out. > >>>>> > >>>>> Since you are observing allocation stalls, there a couple options to > >>>>> consider. > >>>>> > >>>>> 1.) If you have CPU cycles available, you can increase the number of > >>>>> concurrent GC threads. You can see the default number ZGC is > currently > >>>>> using by doing:? java -XX:+UseZGC -XX:+PrintFlagsFinal -version | > grep > >>>>> -i concgcthreads.? Increasing the number of concurrent GC threads > >> should > >>>>> allow ZGC to do its concurrent work before the Java heap space > becomes > >>>>> exhausted resulting in allocation stalls. But, additional concurrent > GC > >>>>> threads will use more CPU. > >>>>> > >>>>> 2.) Another option is to size the Java heap larger, if you have the > >>>>> available RAM on the system. By increasing the size of the Java heap, > >>>>> you also increase the time the concurrent GC threads can do their > work > >>>>> to free space before exhausting Java heap space (which results in > >>>>> allocation stalls). > >>>>> > >>>>> 3.) Another option is profile the application and look for > >> opportunities > >>>>> to reduce unnecessary object allocations. 
This will reduce the speed > at > >>>>> which the Java heap fills that available free space and thus allows > >>>>> ZGC's concurrent GC threads to keep up and avoid allocation stalls. > >>>>> > >>>>> Fwiw, I tend to like the first two options better since I would > rather > >>>>> see folks write their Java application(s) in their most natural form > >> and > >>>>> let the JVM figure out how to best run the Java app. > >>>>> > >>>>> Also, as a general comment, having 37% head room for ZGC to operate > is > >>>>> not a lot of space. Whether that is enough space largely depends on > the > >>>>> application, i.e. its allocation rate, object lifetimes, amount live > >>>>> data in the Java heap, etc., and whether concurrent GC threads can > keep > >>>>> up with the pace of allocations with the amount of Java heap space > >>>>> that's available. > >>>>> > >>>>> hths, > >>>>> > >>>>> charlie > >>>>> > >>>>> On 2/2/21 1:30 PM, Mary Sunitha Joseph wrote: > >>>>>> Hi team, > >>>>>> > >>>>>> Our Production application runs on a 320G heap and uses ZGC with > large > >>>>>> pages enabled. We have not done any tuning and are using ZGC with > >>>>> defaults. > >>>>>> Since upgrading to JDK 15.0.1 we've started to notice that once a > >>>>>> day the > >>>>>> app experiences allocation stalls (during peak hours) and this > happens > >>>>> when > >>>>>> there is a huge drop in the number of classes loaded. We have a > >>>>> bi-monthly > >>>>>> release cycle and can see that the allocation stalls start small a > >>>>> business > >>>>>> day after a? release and slowly increase as the week progresses. > >>>>>> > >>>>>> At the moment the app seems to be doing fine but it could escalate > >>>>> anytime > >>>>>> by the looks of it. There is an increase in the app's response time > as > >>>>> well > >>>>>> at the same time and a small spike in heap which seem like side > >>>>>> effects. > >>>>>> Any pointers in terms of tuning would be much appreciated. 
--
Mary Sunitha Joseph (She/her)
Lead Developer
Fiix Software
p: 1 (855) 884-5619
e: mary.joseph at fiixsoftware.com
w: www.fiixsoftware.com

From stefan.karlsson at oracle.com Thu Feb 11 08:22:43 2021
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 11 Feb 2021 09:22:43 +0100
Subject: Running into Allocation Stalls during class unloading
In-Reply-To: <7a8922c7-2f51-6e7c-032c-ff8a0911346e@oracle.com>
References: <7a8922c7-2f51-6e7c-032c-ff8a0911346e@oracle.com>
Message-ID:

On 2021-02-10 00:44, charlie hunt wrote:
> Hi Mary,
>
> Based on the data you have shared, (thanks for sharing!), it looks
> like the previous suggestions from both Per and I should help.
>
> 1. Increase number of concurrent GC threads -- This should help
> shorten the elapsed time you are seeing in "Process Non-Strong
> References" phase, (and other concurrent phases too). Increasing
> concurrent GC threads will use additional CPU. So you should expect to
> see some increased CPU usage during the concurrent phases.

I think that these extremely long "Process Non-Strong References" phases
need to be investigated further. If they are much longer than a typical
GC cycle duration, then I think it will be hard to completely prevent
this by tuning the GC.

It would be interesting to see these numbers from after the time you got
the long "Process Non-Strong References" phase:

[10,198s][info][gc,stats    ]    Subphase: Concurrent Classes Purge                      0,031 / 0,054         0,031 / 0,054         0,031 / 0,054         0,031 / 0,054       ms
[10,198s][info][gc,stats    ]    Subphase: Concurrent Classes Unlink                     0,204 / 0,511         0,204 / 0,511         0,204 / 0,511         0,204 / 0,511       ms
...
[10,198s][info][gc,stats    ]    Subphase: Concurrent References Enqueue                 0,000 / 0,011         0,000 / 0,011         0,000 / 0,011         0,000 / 0,011       ms
[10,198s][info][gc,stats    ]    Subphase: Concurrent References Process                 0,031 / 0,071         0,031 / 0,071         0,031 / 0,071         0,031 / 0,071       ms

This will show us which sub-phase is taking a long time.

It would also be interesting to see if you have a huge number of
java.lang.ref.*References, by looking at these numbers before and after
the long event:

[10,179s][info][gc,ref      ] GC(1504) Soft: 49 encountered, 0 discovered, 0 enqueued
[10,179s][info][gc,ref      ] GC(1504) Weak: 132 encountered, 33 discovered, 0 enqueued
[10,179s][info][gc,ref      ] GC(1504) Final: 0 encountered, 0 discovered, 0 enqueued
[10,179s][info][gc,ref      ] GC(1504) Phantom: 3 encountered, 2 discovered, 0 enqueued

Thanks,
StefanK

>
> 2.) Setting a SoftMaxHeapSize, as Per suggested (which is a great idea
> I had not thought about). This should help ZGC deal with those spikes
> you mentioned. The end result being that it should help ZGC complete
> the "Process Non-Strong References" phase in a state where there is
> more headroom / available Java heap space (and avoid an allocation
> stall).
>
> 3.) And, obviously, increasing Java heap size, and/or reducing
> unnecessary allocations.
>
> As mentioned earlier, any one, or any number of these should help.
>
> Keep us posted on your progress.
>
> thanks,
>
> charlie
>
> On 2/8/21 8:39 PM, Mary Sunitha Joseph wrote:
>> Hi Charlie, Per
>>
>> I went over our verbose GC logs and found a couple of things :
>>
>> 1. Heap usage by the application during peak hours is always close to
>> an avg of 55%.
>>
>> 2. Each GC cycle barely manages to reclaim upto 10% and there are
>> times when the Concurrent Process Non-Strong reference phase takes
>> upto a minute to complete. These are the times when the allocation
>> stalls occur and the GC cycle soon after is unable to reclaim any
>> space.
>>
>> Eg extract:
>>
>> HIGH ALLOC RATE
>>
>> [2021-02-02T17:01:32.641+0000][info][gc          ] GC(1276) Garbage Collection (Allocation Rate) 186282M(57%)->205548M(63%)
>>
>> PROCESSING PHASE
>>
>> [2021-02-02T17:03:09.369+0000][info][gc,phases   ] GC(1277) Concurrent Process Non-Strong References 71206.950ms
>>
>> ALLOC STALL
>>
>> [2021-02-02T17:03:09.372+0000][info][gc          ] Allocation Stall (scrubbed) 318.487ms
>>
>> HIGH ALLOC RATE
>>
>> [2021-02-02T17:03:09.756+0000][info][gc          ] GC(1277) Garbage Collection (Allocation Rate) 205634M(63%)->245938M(75%)
>>
>> PROCESSING PHASE
>>
>> [2021-02-02T17:04:46.862+0000][info][gc,phases   ] GC(1278) Concurrent Process Non-Strong References 72611.793ms
>>
>> ALLOC STALL
>>
>> [2021-02-02T17:04:46.864+0000][info][gc          ] Allocation Stall (scrubbed) 554.293ms
>>
>> GC is then able to reclaim some space
>>
>> [2021-02-02T17:04:48.332+0000][info][gc          ] GC(1278) Garbage Collection (Allocation Rate) 245974M(75%)->204610M(62%)
>>
>> Towards the end of the week, this phase takes upto 2.3 mins.
>>
>> 3. There are object initialization hotspots in the application that
>> need optimizing, but even with 320GB of memory and at most 50-60% heap
>> usage, it looks like ZGC needs more room to maneuver.
>>
>> The Concurrent Process Non-strong References taking minutes to
>> complete is a bit worrisome and we should probably look at optimizing
>> things at our end to reduce this.
>>
>> Regards,
>> Mary
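
The GC log lines quoted in this thread can be scanned mechanically for the two events being discussed: long "Concurrent Process Non-Strong References" phases and "Allocation Stall" entries. What follows is only a minimal sketch, assuming the -Xlog:gc* line layout seen in the extracts above. The function name summarize, the 10-second threshold, and the SAMPLE list are hypothetical choices made for illustration; they are not part of the JDK or of anything the posters ran.

```python
import re

# Assumed line shape (taken from the extract in this thread):
# [2021-02-02T17:03:09.369+0000][info][gc,phases   ] GC(1277) Concurrent Process Non-Strong References 71206.950ms
PHASE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\[info\]\[gc,phases\s*\] GC\((?P<gc>\d+)\) "
    r"Concurrent Process Non-Strong References (?P<ms>[\d.,]+)ms"
)

# Assumed line shape:
# [2021-02-02T17:03:09.372+0000][info][gc          ] Allocation Stall (scrubbed) 318.487ms
STALL_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\[info\]\[gc\s*\] Allocation Stall \([^)]*\) (?P<ms>[\d.,]+)ms"
)


def to_ms(text):
    """Parse a duration; some locales log a decimal comma instead of a dot."""
    return float(text.replace(",", "."))


def summarize(lines, threshold_ms=10_000.0):
    """Return (long_phases, stalls) found in an iterable of GC log lines.

    long_phases: list of (gc_id, duration_ms) at or above threshold_ms
    stalls:      list of (timestamp, duration_ms)
    """
    long_phases, stalls = [], []
    for line in lines:
        m = PHASE_RE.search(line)
        if m:
            if to_ms(m["ms"]) >= threshold_ms:
                long_phases.append((int(m["gc"]), to_ms(m["ms"])))
            continue
        m = STALL_RE.search(line)
        if m:
            stalls.append((m["ts"], to_ms(m["ms"])))
    return long_phases, stalls


# Hypothetical sample built from the lines quoted in the thread.
SAMPLE = [
    "[2021-02-02T17:01:32.641+0000][info][gc          ] GC(1276) Garbage Collection (Allocation Rate) 186282M(57%)->205548M(63%)",
    "[2021-02-02T17:03:09.369+0000][info][gc,phases   ] GC(1277) Concurrent Process Non-Strong References 71206.950ms",
    "[2021-02-02T17:03:09.372+0000][info][gc          ] Allocation Stall (scrubbed) 318.487ms",
    "[2021-02-02T17:04:46.862+0000][info][gc,phases   ] GC(1278) Concurrent Process Non-Strong References 72611.793ms",
    "[2021-02-02T17:04:46.864+0000][info][gc          ] Allocation Stall (scrubbed) 554.293ms",
]

if __name__ == "__main__":
    phases, stalls = summarize(SAMPLE)
    for gc_id, ms in phases:
        print(f"GC({gc_id}) non-strong reference processing took {ms / 1000:.1f}s")
    for ts, ms in stalls:
        print(f"{ts} allocation stall of {ms:.0f}ms")
```

Run against a full log file (for example by passing an open file object to summarize), this would show whether stalls cluster right after the long phases, which is the correlation the thread is trying to establish. It does not replace the gc,stats and gc,ref output requested above.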
zgc-dev at openjdk.java.net >>> >>> To subscribe or unsubscribe via the World Wide Web, visit >>> https://mail.openjdk.java.net/mailman/listinfo/zgc-dev >>> or, via email, send a message with subject or body 'help' to >>> ???????? zgc-dev-request at openjdk.java.net >>> >>> You can reach the person managing the list at >>> ???????? zgc-dev-owner at openjdk.java.net >>> >>> When replying, please edit your Subject line so it is more specific >>> than "Re: Contents of zgc-dev digest..." >>> >>> >>> Today's Topics: >>> >>> ??? 1. Re: Running into Allocation Stalls during class unloading >>> ?????? (Mary Sunitha Joseph) >>> ??? 2. Re: Running into Allocation Stalls during class unloading >>> ?????? (charlie hunt) >>> ??? 3. Re: Running into Allocation Stalls during class unloading >>> ?????? (Per Liden) >>> >>> >>> ---------------------------------------------------------------------- >>> >>> Message: 1 >>> Date: Wed, 3 Feb 2021 09:10:59 -0500 >>> From: Mary Sunitha Joseph >>> To: zgc-dev at openjdk.java.net >>> Subject: Re: Running into Allocation Stalls during class unloading >>> Message-ID: >>> ???????? >> rB2yQw at mail.gmail.com> >>> Content-Type: text/plain; charset="UTF-8" >>> >>> Hi Charlie, >>> >>> Thank you for going over our case and for the recommendations. >>> Increasing >>> the heap size or the number of ZGC threads is definitely something >>> we can >>> try out. >>> >>> I'm also trying to understand if the allocation stalls during the >>> time of >>> class unloading is an expected occurrence. It's almost as if at that >>> point >>> ZGC's entire focus is on class unloading and not clearing out the heap >>> which leads to that spike in heap and subsequent allocation stalls. >>> Perhaps >>> class unloading and freeing heap are not concurrent themselves and >>> ZGC is >>> able to do one or the other? >>> >>> Regards, >>> Mary >>> >>> On Wed, Feb 3, 2021 at 6:55 AM >>> wrote: >>> >>>> Send zgc-dev mailing list submissions to >>>> ???????? 
zgc-dev at openjdk.java.net >>>> >>>> To subscribe or unsubscribe via the World Wide Web, visit >>>> https://mail.openjdk.java.net/mailman/listinfo/zgc-dev >>>> or, via email, send a message with subject or body 'help' to >>>> ???????? zgc-dev-request at openjdk.java.net >>>> >>>> You can reach the person managing the list at >>>> ???????? zgc-dev-owner at openjdk.java.net >>>> >>>> When replying, please edit your Subject line so it is more specific >>>> than "Re: Contents of zgc-dev digest..." >>>> >>>> >>>> Today's Topics: >>>> >>>> ??? 1. Running into Allocation Stalls during class unloading >>>> ?????? (Mary Sunitha Joseph) >>>> ??? 2. Re: Running into Allocation Stalls during class unloading >>>> ?????? (charlie hunt) >>>> >>>> >>>> ---------------------------------------------------------------------- >>>> >>>> Message: 1 >>>> Date: Tue, 2 Feb 2021 14:30:40 -0500 >>>> From: Mary Sunitha Joseph >>>> To: zgc-dev at openjdk.java.net >>>> Subject: Running into Allocation Stalls during class unloading >>>> Message-ID: >>>> ???????? >>> MXcA at mail.gmail.com> >>>> Content-Type: text/plain; charset="UTF-8" >>>> >>>> Hi team, >>>> >>>> Our Production application runs on a 320G heap and uses ZGC with large >>>> pages enabled. We have not done any tuning and are using ZGC with >>> defaults. >>>> Since upgrading to JDK 15.0.1 we've started to notice that once a >>>> day the >>>> app experiences allocation stalls (during peak hours) and this happens >>> when >>>> there is a huge drop in the number of classes loaded. We have a >>> bi-monthly >>>> release cycle and can see that the allocation stalls start small a >>> business >>>> day after a? release and slowly increase as the week progresses. >>>> >>>> At the moment the app seems to be doing fine but it could escalate >>> anytime >>>> by the looks of it. There is an increase in the app's response time as >>> well >>>> at the same time and a small spike in heap which seem like side >>>> effects. 
>>>> Any pointers in terms of tuning would be much appreciated. >>>> >>>> The app currently always makes use of at least 200G of heap space >>>> which >>>> leaves a 37% head space for ZGC. >>>> >>>> >>>> Regards >>>> Mary >>>> -- >>>> >>>> Mary Sunitha Joseph (She/her) >>>> >>>> Lead Developer >>>> >>>> Fiix Software >>>> >>>> p: 1 (855) 884-5619 >>>> >>>> e: mary.joseph at fiixsoftware.com >>>> >>>> w: www.fiixsoftware.com >>>> >>>> < >>>> >>> https://www.fiixsoftware.com/foresight/#insights?utm_source=signature&utm_medium=email >>> >>>> >>>> ------------------------------ >>>> >>>> Message: 2 >>>> Date: Tue, 2 Feb 2021 15:02:37 -0600 >>>> From: charlie hunt >>>> To: zgc-dev at openjdk.java.net >>>> Subject: Re: Running into Allocation Stalls during class unloading >>>> Message-ID: >>>> Content-Type: text/plain; charset=utf-8; format=flowed >>>> >>>> Hi Mary, >>>> >>>> Thanks for reaching out. >>>> >>>> Since you are observing allocation stalls, there a couple options to >>>> consider. >>>> >>>> 1.) If you have CPU cycles available, you can increase the number of >>>> concurrent GC threads. You can see the default number ZGC is currently >>>> using by doing:? java -XX:+UseZGC -XX:+PrintFlagsFinal -version | grep >>>> -i concgcthreads.? Increasing the number of concurrent GC threads >>>> should >>>> allow ZGC to do its concurrent work before the Java heap space becomes >>>> exhausted resulting in allocation stalls. But, additional >>>> concurrent GC >>>> threads will use more CPU. >>>> >>>> 2.) Another option is to size the Java heap larger, if you have the >>>> available RAM on the system. By increasing the size of the Java heap, >>>> you also increase the time the concurrent GC threads can do their work >>>> to free space before exhausting Java heap space (which results in >>>> allocation stalls). >>>> >>>> 3.) Another option is profile the application and look for >>>> opportunities >>>> to reduce unnecessary object allocations. 
This will reduce the >>>> speed at >>>> which the Java heap fills that available free space and thus allows >>>> ZGC's concurrent GC threads to keep up and avoid allocation stalls. >>>> >>>> Fwiw, I tend to like the first two options better since I would rather >>>> see folks write their Java application(s) in their most natural >>>> form and >>>> let the JVM figure out how to best run the Java app. >>>> >>>> Also, as a general comment, having 37% head room for ZGC to operate is >>>> not a lot of space. Whether that is enough space largely depends on >>>> the >>>> application, i.e. its allocation rate, object lifetimes, amount live >>>> data in the Java heap, etc., and whether concurrent GC threads can >>>> keep >>>> up with the pace of allocations with the amount of Java heap space >>>> that's available. >>>> >>>> hths, >>>> >>>> charlie >>>> >>>> On 2/2/21 1:30 PM, Mary Sunitha Joseph wrote: >>>>> Hi team, >>>>> >>>>> Our Production application runs on a 320G heap and uses ZGC with >>>>> large >>>>> pages enabled. We have not done any tuning and are using ZGC with >>>> defaults. >>>>> Since upgrading to JDK 15.0.1 we've started to notice that once a day >>> the >>>>> app experiences allocation stalls (during peak hours) and this >>>>> happens >>>> when >>>>> there is a huge drop in the number of classes loaded. We have a >>>> bi-monthly >>>>> release cycle and can see that the allocation stalls start small a >>>> business >>>>> day after a? release and slowly increase as the week progresses. >>>>> >>>>> At the moment the app seems to be doing fine but it could escalate >>>> anytime >>>>> by the looks of it. There is an increase in the app's response >>>>> time as >>>> well >>>>> at the same time and a small spike in heap which seem like side >>> effects. >>>>> Any pointers in terms of tuning would be much appreciated. >>>>> >>>>> The app currently always makes use of at least 200G of heap space >>>>> which >>>>> leaves a 37% head space for ZGC. 
>>>>> >>>>> >>>>> Regards >>>>> Mary >>>> >>>> End of zgc-dev Digest, Vol 36, Issue 1 >>>> ************************************** >>>> >>> >>> -- >>> >>> Mary Sunitha Joseph (She/her) >>> >>> Lead Developer >>> >>> Fiix Software >>> >>> p: 1 (855) 884-5619 >>> >>> e: mary.joseph at fiixsoftware.com >>> >>> w: www.fiixsoftware.com >>> >>> < >>> https://www.fiixsoftware.com/foresight/#insights?utm_source=signature&utm_medium=email >>> >>> >>> ------------------------------ >>> >>> Message: 2 >>> Date: Wed, 3 Feb 2021 08:40:53 -0600 >>> From: charlie hunt >>> To: zgc-dev at openjdk.java.net >>> Subject: Re: Running into Allocation Stalls during class unloading >>> Message-ID: <7c368c52-1f03-1286-012c-7605ea506a90 at oracle.com> >>> Content-Type: text/plain; charset=utf-8; format=flowed >>> >>> Hi Mary, >>> >>> No, allocation stalls where there is concurrent class unloading is not >>> an expected occurrence. >>> >>> What may be a possibility here, and I will try to explain what I am >>> thinking. >>> >>> With the introduction of concurrent class unloading, the elapsed >>> time it >>> takes ZGC to complete a concurrent collection cycle may be slightly >>> longer than when class unloading was a GC pause. If your application >>> happened to be very close to a point where ZGC was just ahead of >>> "losing >>> the race" and exhausting Java heap space, then the additional >>> concurrent >>> class unloading work may be just enough for ZGC to lose that race. One >>> thing to keep in mind here is that concurrent class unloading removes a >>> GC pause, the pause that did class unloading. >>> >>> hths, >>> >>> charlie >>> >>> On 2/3/21 8:10 AM, Mary Sunitha Joseph wrote: >>>> Hi Charlie, >>>> >>>> Thank you for going over our case and for the recommendations. >>>> Increasing >>>> the heap size or the number of ZGC threads is definitely something >>>> we can >>>> try out. 
>>>> >>>> I'm also trying to understand if the allocation stalls during the >>>> time of >>>> class unloading is an expected occurrence. It's almost as if at that >>> point >>>> ZGC's entire focus is on class unloading and not clearing out the heap >>>> which leads to that spike in heap and subsequent allocation stalls. >>> Perhaps >>>> class unloading and freeing heap are not concurrent themselves and >>>> ZGC is >>>> able to do one or the other? >>>> >>>> Regards, >>>> Mary >>>> >>>> On Wed, Feb 3, 2021 at 6:55 AM >>>> wrote: >>>> >>>>> Send zgc-dev mailing list submissions to >>>>> ????????? zgc-dev at openjdk.java.net >>>>> >>>>> To subscribe or unsubscribe via the World Wide Web, visit >>>>> https://mail.openjdk.java.net/mailman/listinfo/zgc-dev >>>>> or, via email, send a message with subject or body 'help' to >>>>> ????????? zgc-dev-request at openjdk.java.net >>>>> >>>>> You can reach the person managing the list at >>>>> ????????? zgc-dev-owner at openjdk.java.net >>>>> >>>>> When replying, please edit your Subject line so it is more specific >>>>> than "Re: Contents of zgc-dev digest..." >>>>> >>>>> >>>>> Today's Topics: >>>>> >>>>> ???? 1. Running into Allocation Stalls during class unloading >>>>> ??????? (Mary Sunitha Joseph) >>>>> ???? 2. Re: Running into Allocation Stalls during class unloading >>>>> ??????? (charlie hunt) >>>>> >>>>> >>>>> ---------------------------------------------------------------------- >>>>> >>>>> >>>>> Message: 1 >>>>> Date: Tue, 2 Feb 2021 14:30:40 -0500 >>>>> From: Mary Sunitha Joseph >>>>> To: zgc-dev at openjdk.java.net >>>>> Subject: Running into Allocation Stalls during class unloading >>>>> Message-ID: >>>>> >>>> MXcA at mail.gmail.com> >>>>> Content-Type: text/plain; charset="UTF-8" >>>>> >>>>> Hi team, >>>>> >>>>> Our Production application runs on a 320G heap and uses ZGC with >>>>> large >>>>> pages enabled. We have not done any tuning and are using ZGC with >>> defaults. 
>>>>> Since upgrading to JDK 15.0.1 we've started to notice that once a day >>> the >>>>> app experiences allocation stalls (during peak hours) and this >>>>> happens >>> when >>>>> there is a huge drop in the number of classes loaded. We have a >>> bi-monthly >>>>> release cycle and can see that the allocation stalls start small a >>> business >>>>> day after a? release and slowly increase as the week progresses. >>>>> >>>>> At the moment the app seems to be doing fine but it could escalate >>> anytime >>>>> by the looks of it. There is an increase in the app's response >>>>> time as >>> well >>>>> at the same time and a small spike in heap which seem like side >>>>> effects. >>>>> Any pointers in terms of tuning would be much appreciated. >>>>> >>>>> The app currently always makes use of at least 200G of heap space >>>>> which >>>>> leaves a 37% head space for ZGC. >>>>> >>>>> >>>>> Regards >>>>> Mary >>>>> -- >>>>> >>>>> Mary Sunitha Joseph (She/her) >>>>> >>>>> Lead Developer >>>>> >>>>> Fiix Software >>>>> >>>>> p: 1 (855) 884-5619 >>>>> >>>>> e: mary.joseph at fiixsoftware.com >>>>> >>>>> w: www.fiixsoftware.com >>>>> >>>>> < >>>>> >>> https://www.fiixsoftware.com/foresight/#insights?utm_source=signature&utm_medium=email >>> >>>>> ------------------------------ >>>>> >>>>> Message: 2 >>>>> Date: Tue, 2 Feb 2021 15:02:37 -0600 >>>>> From: charlie hunt >>>>> To: zgc-dev at openjdk.java.net >>>>> Subject: Re: Running into Allocation Stalls during class unloading >>>>> Message-ID: >>>>> Content-Type: text/plain; charset=utf-8; format=flowed >>>>> >>>>> Hi Mary, >>>>> >>>>> Thanks for reaching out. >>>>> >>>>> Since you are observing allocation stalls, there a couple options to >>>>> consider. >>>>> >>>>> 1.) If you have CPU cycles available, you can increase the number of >>>>> concurrent GC threads. You can see the default number ZGC is >>>>> currently >>>>> using by doing:? java -XX:+UseZGC -XX:+PrintFlagsFinal -version | >>>>> grep >>>>> -i concgcthreads.? 
Increasing the number of concurrent GC threads >>>>> should >>>>> allow ZGC to do its concurrent work before the Java heap space >>>>> becomes >>>>> exhausted resulting in allocation stalls. But, additional >>>>> concurrent GC >>>>> threads will use more CPU. >>>>> >>>>> 2.) Another option is to size the Java heap larger, if you have the >>>>> available RAM on the system. By increasing the size of the Java heap, >>>>> you also increase the time the concurrent GC threads can do their >>>>> work >>>>> to free space before exhausting Java heap space (which results in >>>>> allocation stalls). >>>>> >>>>> 3.) Another option is profile the application and look for >>>>> opportunities >>>>> to reduce unnecessary object allocations. This will reduce the >>>>> speed at >>>>> which the Java heap fills that available free space and thus allows >>>>> ZGC's concurrent GC threads to keep up and avoid allocation stalls. >>>>> >>>>> Fwiw, I tend to like the first two options better since I would >>>>> rather >>>>> see folks write their Java application(s) in their most natural >>>>> form and >>>>> let the JVM figure out how to best run the Java app. >>>>> >>>>> Also, as a general comment, having 37% head room for ZGC to >>>>> operate is >>>>> not a lot of space. Whether that is enough space largely depends >>>>> on the >>>>> application, i.e. its allocation rate, object lifetimes, amount live >>>>> data in the Java heap, etc., and whether concurrent GC threads can >>>>> keep >>>>> up with the pace of allocations with the amount of Java heap space >>>>> that's available. >>>>> >>>>> hths, >>>>> >>>>> charlie >>>>> >>>>> On 2/2/21 1:30 PM, Mary Sunitha Joseph wrote: >>>>>> Hi team, >>>>>> >>>>>> Our Production application runs on a 320G heap and uses ZGC with >>>>>> large >>>>>> pages enabled. We have not done any tuning and are using ZGC with >>>>> defaults. 
>>>>>> Since upgrading to JDK 15.0.1 we've started to notice that once a >>>>>> day >>> the >>>>>> app experiences allocation stalls (during peak hours) and this >>>>>> happens >>>>> when >>>>>> there is a huge drop in the number of classes loaded. We have a >>>>> bi-monthly >>>>>> release cycle and can see that the allocation stalls start small a >>>>> business >>>>>> day after a? release and slowly increase as the week progresses. >>>>>> >>>>>> At the moment the app seems to be doing fine but it could escalate >>>>> anytime >>>>>> by the looks of it. There is an increase in the app's response >>>>>> time as >>>>> well >>>>>> at the same time and a small spike in heap which seem like side >>> effects. >>>>>> Any pointers in terms of tuning would be much appreciated. >>>>>> >>>>>> The app currently always makes use of at least 200G of heap space >>>>>> which >>>>>> leaves a 37% head space for ZGC. >>>>>> >>>>>> >>>>>> Regards >>>>>> Mary >>>>> End of zgc-dev Digest, Vol 36, Issue 1 >>>>> ************************************** >>>>> >>> >>> ------------------------------ >>> >>> Message: 3 >>> Date: Wed, 3 Feb 2021 16:38:20 +0100 >>> From: Per Liden >>> To: charlie hunt , >>> ???????? mary.joseph at fiixsoftware.com >>> Cc: zgc-dev at openjdk.java.net >>> Subject: Re: Running into Allocation Stalls during class unloading >>> Message-ID: <4e5f0987-abc8-f19d-3953-7c6d5c8c82f9 at oracle.com> >>> Content-Type: text/plain; charset=utf-8; format=flowed >>> >>> Hi Mary, >>> >>> Would it be possible for you to share GC logs (preferably generated >>> using -Xlog:gc*)? That would help us understand if what you experience >>> here is a consequence of what Charlie describes. >>> >>> If that's the case, i.e. a prolonged concurrent GC cycle because of >>> lots >>> of classes to unload, it could be that the GC is simply kicking in a >>> bit >>> too late. If so, you might want to have a look at the >>> -XX:+SoftMaxHeapSize option. 
This option tells ZGC's heuristics that it >>> shouldn't aim at using the whole heap (-Xmx), but instead some lower >>> number (-XX:SoftMaxHeapSize). This will increase ZGC's tolerance to >>> variance in GC cycle length and make it more resilient to allocation >>> spikes. >>> >>> cheers, >>> Per >>> >>> On 2/3/21 3:40 PM, charlie hunt wrote: >>>> Hi Mary, >>>> >>>> No, allocation stalls where there is concurrent class unloading is not >>>> an expected occurrence. >>>> >>>> What may be a possibility here, and I will try to explain what I am >>>> thinking. >>>> >>>> With the introduction of concurrent class unloading, the elapsed >>>> time it >>>> takes ZGC to complete a concurrent collection cycle may be slightly >>>> longer than when class unloading was a GC pause. If your application >>>> happened to be very close to a point where ZGC was just ahead of >>>> "losing >>>> the race" and exhausting Java heap space, then the additional >>>> concurrent >>>> class unloading work may be just enough for ZGC to lose that race. One >>>> thing to keep in mind here is that concurrent class unloading >>>> removes a >>>> GC pause, the pause that did class unloading. >>>> >>>> hths, >>>> >>>> charlie >>>> >>>> On 2/3/21 8:10 AM, Mary Sunitha Joseph wrote: >>>>> Hi Charlie, >>>>> >>>>> Thank you for going over our case and for the recommendations. >>> Increasing >>>>> the heap size or the number of ZGC threads is definitely something we >>> can >>>>> try out. >>>>> >>>>> I'm also trying to understand if the allocation stalls during the >>>>> time >>> of >>>>> class unloading is an expected occurrence. It's almost as if at that >>>>> point >>>>> ZGC's entire focus is on class unloading and not clearing out the >>>>> heap >>>>> which leads to that spike in heap and subsequent allocation stalls. >>>>> Perhaps >>>>> class unloading and freeing heap are not concurrent themselves and >>>>> ZGC >>> is >>>>> able to do one or the other? 
>>>>>
>>>>> Regards,
>>>>> Mary
>>>
>>> End of zgc-dev Digest, Vol 36, Issue 2
>>> **************************************
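
For readers following the thread, the headroom figure and the SoftMaxHeapSize suggestion can be made concrete with a small sketch. It recomputes the headroom for the 320G heap with 200G in use that Mary reports, and assembles a candidate set of JVM flags. The class name, the method names, the gc.log file name and the 280G soft limit are all hypothetical and chosen only for illustration; nobody in the thread recommends a specific value. The flags themselves (-XX:+UseZGC, -Xmx, -XX:SoftMaxHeapSize, -Xlog:gc*) are standard HotSpot options.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of the JDK or of the thread above.
final class ZgcHeadroom {

    /** Share of the maximum heap (in percent) left free above the steady-state usage. */
    static double headroomPercent(long maxHeapGb, long usedGb) {
        if (maxHeapGb <= 0 || usedGb < 0 || usedGb > maxHeapGb) {
            throw new IllegalArgumentException("expected 0 <= used <= max and max > 0");
        }
        return 100.0 * (maxHeapGb - usedGb) / maxHeapGb;
    }

    /** Builds a flag list with a soft limit below -Xmx, plus GC logging as Per requested. */
    static List<String> jvmFlags(long maxHeapGb, long softMaxGb) {
        if (softMaxGb <= 0 || softMaxGb > maxHeapGb) {
            throw new IllegalArgumentException("soft limit must be in (0, max heap]");
        }
        return List.of(
                "-XX:+UseZGC",
                "-Xmx" + maxHeapGb + "g",
                // Soft limit for ZGC's heuristics; an illustrative value, not a recommendation.
                "-XX:SoftMaxHeapSize=" + softMaxGb + "g",
                // Full GC logging written to a file (file name is an assumption).
                "-Xlog:gc*:file=gc.log");
    }

    public static void main(String[] args) {
        // 320G heap with 200G in use, the figures from Mary's message.
        System.out.printf(Locale.ROOT, "headroom: %.1f%%%n", headroomPercent(320, 200));
        System.out.println("java " + String.join(" ", jvmFlags(320, 280)) + " ...");
    }
}
```

With the figures from the thread the headroom works out to 37.5%, which matches the roughly 37% Mary mentions. How far below -Xmx the soft limit should sit depends on the allocation rate and on how long a concurrent cycle takes, which is what the requested GC logs would show.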