From m.sundar85 at gmail.com Mon Nov 4 18:56:05 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Mon, 4 Nov 2019 10:56:05 -0800 Subject: ZGC Unable to reclaim memory for long time Message-ID: Hi, I ran into an issue where ZGC is unable to reclaim memory for a few hours/days. It just keeps printing "Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and an Allocation Stall happens on that thread. Here are the metrics, which show that even though there is Garbage it is unable to Reclaim it:
....
[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Live: - 6366M (78%) 6366M (78%) 6366M (78%) - -
[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Garbage: - 1735M (21%) 1735M (21%) 1731M (21%) - -
[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Reclaimed: - - 0M (0%) 4M (0%)
...
[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Live: - 6367M (78%) 6367M (78%) 6367M (78%) - -
[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Garbage: - 1730M (21%) 1730M (21%) 1724M (21%) - -
[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Reclaimed: - - 0M (0%) 6M (0%)
It was in this state for ~8 hours and it is still happening. It says it has ~1.7G (21%) of Garbage but it is not able to Reclaim it; every time it reclaims only 4-6M. Any idea what might be the issue here. TIA Sundar From per.liden at oracle.com Mon Nov 4 20:40:49 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 4 Nov 2019 21:40:49 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: Hi, When a workload produces a uniformly swiss-cheesy heap, i.e. where all parts of the heap have roughly the same amount of garbage, then the GC will face a situation where there are no free lunches and it will have to work hard (compact a lot) to reclaim memory.
Therefore, the GC will tolerate a certain amount of fragmentation/waste, in the hope that more object will die soon, making compaction less expensive (at the expense of using more memory for a while). How many CPU cycles to spend on compaction vs. how much memory you can spare is of course a trade-off. You can use -XX:ZFragmentationLimit to control this. It currently defaults to 25% and your workload seems to stabilize at 21%. If you want more aggressive compaction/reclamation, then set the -XX:ZFragmentationLimit to something below 21. This may or may not be a good trade-off in your case. The alternative is to give the GC a larger heap to work with. cheers, Per On 11/4/19 7:56 PM, Sundara Mohan M wrote: > Hi, > I ran into this issue where ZGC is unable to reclaim memory for few > hours/days. It just keep printing "Exception in thread "RMI TCP > Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and > Allocation Stall happening on that thread. > > > Here is the metrics which shows for some reason even though there is > Garbage but it is unable to Reclaim > > .... > [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] > GC(112126) Live: - 6366M (78%) 6366M (78%) > 6366M (78%) > - - > *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > GC(112126) Garbage: - 1735M (21%) 1735M (21%) > 1731M (21%)* > - - > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) > Reclaimed: - - 0M (0%) > 4M (0%) > ... > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) > Live: - 6367M (78%) 6367M (78%) > 6367M (78%) > - - > *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > GC(135520) Garbage: - 1730M (21%) 1730M (21%) > 1724M (21%)* > - - > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) > Reclaimed: - - 0M (0%) > 6M (0%) > > Here it was in this state for ~8hours and it is still happening. 
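Per's explanation can be made concrete with a sketch. The model below is a simplification for illustration only (hypothetical page count, page size, and numbers; not ZGC's actual code): memory is reclaimed by evacuating pages, and a page is selected only if its garbage fraction exceeds the fragmentation limit, so a heap in which every page is ~21% garbage yields no candidates under the default 25% limit.

```java
// Illustration only: a simplified model of how a region-based collector
// like ZGC chooses relocation (evacuation) candidates. The page count,
// page size, and numbers below are hypothetical; the real logic lives
// inside the JVM and is tuned via -XX:ZFragmentationLimit.
public class FragmentationSketch {

    // A page is worth evacuating only if its garbage fraction exceeds
    // the fragmentation limit (ZGC's default limit is 25%).
    static boolean isCandidate(double garbageFraction, double fragmentationLimit) {
        return garbageFraction > fragmentationLimit;
    }

    // Garbage sitting in candidate pages is roughly what a relocation
    // phase can reclaim; garbage in non-candidate pages is tolerated.
    static long reclaimableMB(double[] garbagePerPage, int pageSizeMB, double limit) {
        double total = 0;
        for (double g : garbagePerPage) {
            if (isCandidate(g, limit)) {
                total += g * pageSizeMB;
            }
        }
        return Math.round(total);
    }

    public static void main(String[] args) {
        // A uniformly "swiss-cheesy" 8G heap: 4096 pages of 2M, each ~21% garbage.
        double[] uniform = new double[4096];
        java.util.Arrays.fill(uniform, 0.21);

        // Default 25% limit: no page qualifies, so nothing is reclaimed
        // even though ~21% of the whole heap is garbage.
        System.out.println(reclaimableMB(uniform, 2, 0.25)); // prints 0

        // A limit below 21% (e.g. -XX:ZFragmentationLimit=15) makes every
        // page a candidate, trading CPU (compaction) for reclaimed memory.
        System.out.println(reclaimableMB(uniform, 2, 0.15)); // roughly 1720
    }
}
```

This is only a mental model: real ZGC derives per-page liveness from marking and weighs relocation cost, but it shows why a heap that is everywhere ~21% garbage reclaims almost nothing under a 25% limit.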
It says > has a Garbage of 21G but it is not able to Reclaim it everytime it reclaims > only 4-6M. > > Any idea what might be the issue here. > > > TIA > Sundar > From m.sundar85 at gmail.com Tue Nov 5 01:27:46 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Mon, 4 Nov 2019 17:27:46 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: HI Per, This explains why it didn't work to reclaim memory, also my heap memory was 8G and 6G was strongly reachable (when i took heap dump). Agreed increasing heap memory will help in this case. Still trying to understand better on ZGC, 1. So shouldn't GC try to be more aggressive and try to put more effort to reclaim without additional settings? 2. Is there a reason why it shouldn't give more CPU to GC threads and reclaim garbage (say after X run of GC it could not reclaim memory)? In this case it would be good to reclaim existing garbage instead of doing Allocation Stall and failing with heap out of memory. Thanks Sundar On Mon, Nov 4, 2019 at 12:40 PM Per Liden wrote: > Hi, > > When a workload produces a uniformly swiss-cheesy heap, i.e. where all > parts of the heap have roughly the same amount of garbage, then the GC > will face a situation where there are no free lunches and it will have > to work hard (compact a lot) to reclaim memory. Therefore, the GC will > tolerate a certain amount of fragmentation/waste, in the hope that more > object will die soon, making compaction less expensive (at the expense > of using more memory for a while). How many CPU cycles to spend on > compaction vs. how much memory you can spare is of course a trade-off. > > You can use -XX:ZFragmentationLimit to control this. It currently > defaults to 25% and your workload seems to stabilize at 21%. If you want > more aggressive compaction/reclamation, then set the > -XX:ZFragmentationLimit to something below 21. This may or may not be a > good trade-off in your case. 
The alternative is to give the GC a larger > heap to work with. > > cheers, > Per > > On 11/4/19 7:56 PM, Sundara Mohan M wrote: > > Hi, > > I ran into this issue where ZGC is unable to reclaim memory for few > > hours/days. It just keep printing "Exception in thread "RMI TCP > > Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and > > Allocation Stall happening on that thread. > > > > > > Here is the metrics which shows for some reason even though there is > > Garbage but it is unable to Reclaim > > > > .... > > [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] > > GC(112126) Live: - 6366M (78%) 6366M > (78%) > > 6366M (78%) > > - - > > *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > > GC(112126) Garbage: - 1735M (21%) 1735M > (21%) > > 1731M (21%)* > > - - > > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > GC(112126) > > Reclaimed: - - 0M (0%) > > 4M (0%) > > ... > > > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > GC(135520) > > Live: - 6367M (78%) 6367M (78%) > > 6367M (78%) > > - - > > *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > > GC(135520) Garbage: - 1730M (21%) 1730M > (21%) > > 1724M (21%)* > > - - > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > GC(135520) > > Reclaimed: - - 0M (0%) > > 6M (0%) > > > > Here it was in this state for ~8hours and it is still happening. It says > > has a Garbage of 21G but it is not able to Reclaim it everytime it > reclaims > > only 4-6M. > > > > Any idea what might be the issue here. > > > > > > TIA > > Sundar > > > From peter_booth at me.com Tue Nov 5 15:48:43 2019 From: peter_booth at me.com (Peter Booth) Date: Tue, 5 Nov 2019 10:48:43 -0500 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: Reading this and similar threads I am struck by the fact that ZGC users are experiencing things that users of Azul?s Zing JVM also go through. 
I remember the amazement at seeing a JVM run without substantive GC pauses and thinking that it was a free lunch. But the price was two parts - ensuring adequate heap, and rewiring brains that are accustomed to seeing cpu and memory as independent resources. The second turns out to be much harder. From experience, I think a lot of pain can be avoided by clearly communicating that an adequate heap is a prerequisite for a healthy JVM. Most java developers have absorbed the notion that large heaps are bad/risky and unlearning takes time. Sent from my iPhone > On Nov 4, 2019, at 8:28 PM, Sundara Mohan M wrote: > > ?HI Per, > This explains why it didn't work to reclaim memory, also my heap memory was > 8G and 6G was strongly reachable (when i took heap dump). Agreed increasing > heap memory will help in this case. > > Still trying to understand better on ZGC, > 1. So shouldn't GC try to be more aggressive and try to put more effort to > reclaim without additional settings? > 2. Is there a reason why it shouldn't give more CPU to GC threads and > reclaim garbage (say after X run of GC it could not reclaim memory)? In > this case it would be good to reclaim existing garbage instead of doing > Allocation Stall and failing with heap out of memory. > > > Thanks > Sundar > >> On Mon, Nov 4, 2019 at 12:40 PM Per Liden wrote: >> >> Hi, >> >> When a workload produces a uniformly swiss-cheesy heap, i.e. where all >> parts of the heap have roughly the same amount of garbage, then the GC >> will face a situation where there are no free lunches and it will have >> to work hard (compact a lot) to reclaim memory. Therefore, the GC will >> tolerate a certain amount of fragmentation/waste, in the hope that more >> object will die soon, making compaction less expensive (at the expense >> of using more memory for a while). How many CPU cycles to spend on >> compaction vs. how much memory you can spare is of course a trade-off. >> >> You can use -XX:ZFragmentationLimit to control this. 
It currently >> defaults to 25% and your workload seems to stabilize at 21%. If you want >> more aggressive compaction/reclamation, then set the >> -XX:ZFragmentationLimit to something below 21. This may or may not be a >> good trade-off in your case. The alternative is to give the GC a larger >> heap to work with. >> >> cheers, >> Per >> >>> On 11/4/19 7:56 PM, Sundara Mohan M wrote: >>> Hi, >>> I ran into this issue where ZGC is unable to reclaim memory for few >>> hours/days. It just keep printing "Exception in thread "RMI TCP >>> Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and >>> Allocation Stall happening on that thread. >>> >>> >>> Here is the metrics which shows for some reason even though there is >>> Garbage but it is unable to Reclaim >>> >>> .... >>> [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] >>> GC(112126) Live: - 6366M (78%) 6366M >> (78%) >>> 6366M (78%) >>> - - >>> *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] >>> GC(112126) Garbage: - 1735M (21%) 1735M >> (21%) >>> 1731M (21%)* >>> - - >>> [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] >> GC(112126) >>> Reclaimed: - - 0M (0%) >>> 4M (0%) >>> ... >>> >>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >> GC(135520) >>> Live: - 6367M (78%) 6367M (78%) >>> 6367M (78%) >>> - - >>> *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >>> GC(135520) Garbage: - 1730M (21%) 1730M >> (21%) >>> 1724M (21%)* >>> - - >>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >> GC(135520) >>> Reclaimed: - - 0M (0%) >>> 6M (0%) >>> >>> Here it was in this state for ~8hours and it is still happening. It says >>> has a Garbage of 21G but it is not able to Reclaim it everytime it >> reclaims >>> only 4-6M. >>> >>> Any idea what might be the issue here. 
>>> >>> >>> TIA >>> Sundar >>> >> From per.liden at oracle.com Wed Nov 6 10:37:32 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 6 Nov 2019 11:37:32 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: Hi, On 11/5/19 2:27 AM, Sundara Mohan M wrote: > HI Per, > This explains why it didn't work to reclaim memory, also my heap memory > was 8G and 6G was strongly reachable (when i took heap dump). Agreed > increasing heap memory will help in this case. > > Still trying to understand better on ZGC, > 1. So shouldn't GC try to be more aggressive and try to put more effort > to reclaim without additional settings? > 2. Is there a reason why it shouldn't give more CPU to GC threads and > reclaim?garbage (say after X run of GC it could not reclaim memory)? In > this case it would be good to reclaim existing garbage instead of doing > Allocation Stall and failing with heap out of memory. The tricky part is knowing/detecting when to be more aggressive, since it tends to become an exercise in trying to predict the future. Reacting when something bad happens (e.g. allocation stall) tends to be too late. However, before thinking too much about heuristics, we might just want to reconsider the ZFragmentationLimit default value, as it is perhaps a bit too generous today. Most apps I've looked at tend to stabilize somewhere between 2-10% fragmentation/waste (i.e. way below 25%), so lowering the default might not hurt most apps, but help some apps. cheers, Per > > > Thanks > Sundar > > On Mon, Nov 4, 2019 at 12:40 PM Per Liden > wrote: > > Hi, > > When a workload produces a uniformly swiss-cheesy heap, i.e. where all > parts of the heap have roughly the same amount of garbage, then the GC > will face a situation where there are no free lunches and it will have > to work hard (compact a lot) to reclaim memory. 
Therefore, the GC will > tolerate a certain amount of fragmentation/waste, in the hope that more > object will die soon, making compaction less expensive (at the expense > of using more memory for a while). How many CPU cycles to spend on > compaction vs. how much memory you can spare is of course a trade-off. > > You can use -XX:ZFragmentationLimit to control this. It currently > defaults to 25% and your workload seems to stabilize at 21%. If you want > more aggressive compaction/reclamation, then set the > -XX:ZFragmentationLimit to something below 21. This may or may not be a > good trade-off in your case. The alternative is to give the GC a larger > heap to work with. > > cheers, > Per > > On 11/4/19 7:56 PM, Sundara Mohan M wrote: > > Hi, > > I ran into this issue where ZGC is unable to reclaim memory for few > > hours/days. It just keep printing "Exception in thread "RMI TCP > > Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and > > Allocation Stall happening on that thread. > > > > Here is the metrics which shows for some reason even though there is > > Garbage but it is unable to Reclaim > > > > ....
> > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Live: - 6366M (78%) 6366M (78%) 6366M (78%) - -
> > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Garbage: - 1735M (21%) 1735M (21%) 1731M (21%) - -
> > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] GC(112126) Reclaimed: - - 0M (0%) 4M (0%)
> > ...
> > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Live: - 6367M (78%) 6367M (78%) 6367M (78%) - -
> > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Garbage: - 1730M (21%) 1730M (21%) 1724M (21%) - -
> > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] GC(135520) Reclaimed: - - 0M (0%) 6M (0%)
> > > > Here it was in this state for ~8hours and it is still happening. It says > > has a Garbage of 21G but it is not able to Reclaim it everytime it reclaims > > only 4-6M. > > > > Any idea what might be the issue here. > > > > TIA > > Sundar From per.liden at oracle.com Wed Nov 6 10:44:38 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 6 Nov 2019 11:44:38 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> On 11/5/19 4:48 PM, Peter Booth wrote: > Reading this and similar threads I am struck by the fact that ZGC users are experiencing things that users of Azul's Zing JVM also go through. I remember the amazement at seeing a JVM run without substantive GC pauses and thinking that it was a free lunch. But the price was two parts - ensuring adequate heap, and rewiring brains that are accustomed to seeing cpu and memory as independent resources. The second turns out to be much harder. > > From experience, I think a lot of pain can be avoided by clearly communicating that an adequate heap is a prerequisite for a healthy JVM. Most java developers have absorbed the notion that large heaps are bad/risky and unlearning takes time. The documentation on the ZGC wiki [1] tries to be clear about this, but I'm sure it could be improved. [1] https://wiki.openjdk.java.net/display/zgc/Main cheers, Per > > Sent from my iPhone > >> On Nov 4, 2019, at 8:28 PM, Sundara Mohan M wrote: >> >> HI Per, >> This explains why it didn't work to reclaim memory, also my heap memory was >> 8G and 6G was strongly reachable (when i took heap dump).
Agreed increasing >> heap memory will help in this case. >> >> Still trying to understand better on ZGC, >> 1. So shouldn't GC try to be more aggressive and try to put more effort to >> reclaim without additional settings? >> 2. Is there a reason why it shouldn't give more CPU to GC threads and >> reclaim garbage (say after X run of GC it could not reclaim memory)? In >> this case it would be good to reclaim existing garbage instead of doing >> Allocation Stall and failing with heap out of memory. >> >> >> Thanks >> Sundar >> >>> On Mon, Nov 4, 2019 at 12:40 PM Per Liden wrote: >>> >>> Hi, >>> >>> When a workload produces a uniformly swiss-cheesy heap, i.e. where all >>> parts of the heap have roughly the same amount of garbage, then the GC >>> will face a situation where there are no free lunches and it will have >>> to work hard (compact a lot) to reclaim memory. Therefore, the GC will >>> tolerate a certain amount of fragmentation/waste, in the hope that more >>> object will die soon, making compaction less expensive (at the expense >>> of using more memory for a while). How many CPU cycles to spend on >>> compaction vs. how much memory you can spare is of course a trade-off. >>> >>> You can use -XX:ZFragmentationLimit to control this. It currently >>> defaults to 25% and your workload seems to stabilize at 21%. If you want >>> more aggressive compaction/reclamation, then set the >>> -XX:ZFragmentationLimit to something below 21. This may or may not be a >>> good trade-off in your case. The alternative is to give the GC a larger >>> heap to work with. >>> >>> cheers, >>> Per >>> >>>> On 11/4/19 7:56 PM, Sundara Mohan M wrote: >>>> Hi, >>>> I ran into this issue where ZGC is unable to reclaim memory for few >>>> hours/days. It just keep printing "Exception in thread "RMI TCP >>>> Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and >>>> Allocation Stall happening on that thread. 
>>>> >>>> >>>> Here is the metrics which shows for some reason even though there is >>>> Garbage but it is unable to Reclaim >>>> >>>> .... >>>> [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] >>>> GC(112126) Live: - 6366M (78%) 6366M >>> (78%) >>>> 6366M (78%) >>>> - - >>>> *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] >>>> GC(112126) Garbage: - 1735M (21%) 1735M >>> (21%) >>>> 1731M (21%)* >>>> - - >>>> [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] >>> GC(112126) >>>> Reclaimed: - - 0M (0%) >>>> 4M (0%) >>>> ... >>>> >>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >>> GC(135520) >>>> Live: - 6367M (78%) 6367M (78%) >>>> 6367M (78%) >>>> - - >>>> *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >>>> GC(135520) Garbage: - 1730M (21%) 1730M >>> (21%) >>>> 1724M (21%)* >>>> - - >>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] >>> GC(135520) >>>> Reclaimed: - - 0M (0%) >>>> 6M (0%) >>>> >>>> Here it was in this state for ~8hours and it is still happening. It says >>>> has a Garbage of 21G but it is not able to Reclaim it everytime it >>> reclaims >>>> only 4-6M. >>>> >>>> Any idea what might be the issue here. >>>> >>>> >>>> TIA >>>> Sundar >>>> >>> > From m.sundar85 at gmail.com Wed Nov 6 20:07:31 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Wed, 6 Nov 2019 12:07:31 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: Message-ID: HI Per, Thanks. Will try changing ZFragmentationLimit value to see if it works. Regards Sundar On Wed, Nov 6, 2019 at 2:38 AM Per Liden wrote: > Hi, > > On 11/5/19 2:27 AM, Sundara Mohan M wrote: > > HI Per, > > This explains why it didn't work to reclaim memory, also my heap memory > > was 8G and 6G was strongly reachable (when i took heap dump). Agreed > > increasing heap memory will help in this case. > > > > Still trying to understand better on ZGC, > > 1. 
So shouldn't GC try to be more aggressive and try to put more effort > > to reclaim without additional settings? > > 2. Is there a reason why it shouldn't give more CPU to GC threads and > > reclaim garbage (say after X run of GC it could not reclaim memory)? In > > this case it would be good to reclaim existing garbage instead of doing > > Allocation Stall and failing with heap out of memory. > > The tricky part is knowing/detecting when to be more aggressive, since > it tends to become an exercise in trying to predict the future. Reacting > when something bad happens (e.g. allocation stall) tends to be too late. > > However, before thinking too much about heuristics, we might just want > to reconsider the ZFragmentationLimit default value, as it is perhaps a > bit too generous today. Most apps I've looked at tend to stabilize > somewhere between 2-10% fragmentation/waste (i.e. way below 25%), so > lowering the default might not hurt most apps, but help some apps. > > cheers, > Per > > > > > > > Thanks > > Sundar > > > > On Mon, Nov 4, 2019 at 12:40 PM Per Liden > > wrote: > > > > Hi, > > > > When a workload produces a uniformly swiss-cheesy heap, i.e. where > all > > parts of the heap have roughly the same amount of garbage, then the > GC > > will face a situation where there are no free lunches and it will > have > > to work hard (compact a lot) to reclaim memory. Therefore, the GC > will > > tolerate a certain amount of fragmentation/waste, in the hope that > more > > object will die soon, making compaction less expensive (at the > expense > > of using more memory for a while). How many CPU cycles to spend on > > compaction vs. how much memory you can spare is of course a > trade-off. > > > > You can use -XX:ZFragmentationLimit to control this. It currently > > defaults to 25% and your workload seems to stabilize at 21%. If you > > want > > more aggressive compaction/reclamation, then set the > > -XX:ZFragmentationLimit to something below 21. 
This may or may not > be a > > good trade-off in your case. The alternative is to give the GC a > larger > > heap to work with. > > > > cheers, > > Per > > > > On 11/4/19 7:56 PM, Sundara Mohan M wrote: > > > Hi, > > > I ran into this issue where ZGC is unable to reclaim memory > > for few > > > hours/days. It just keep printing "Exception in thread "RMI TCP > > > Connection(idle)" java.lang.OutOfMemoryError: Java heap space" > and > > > Allocation Stall happening on that thread. > > > > > > > > > Here is the metrics which shows for some reason even though there > is > > > Garbage but it is unable to Reclaim > > > > > > .... > > > [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] > > > GC(112126) Live: - 6366M (78%) > > 6366M (78%) > > > 6366M (78%) > > > - - > > > *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > > > GC(112126) Garbage: - 1735M (21%) > > 1735M (21%) > > > 1731M (21%)* > > > - - > > > [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > > GC(112126) > > > Reclaimed: - - 0M (0%) > > > 4M (0%) > > > ... > > > > > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > > GC(135520) > > > Live: - 6367M (78%) 6367M (78%) > > > 6367M (78%) > > > - - > > > *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > > > GC(135520) Garbage: - 1730M (21%) > > 1730M (21%) > > > 1724M (21%)* > > > - - > > > [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > > GC(135520) > > > Reclaimed: - - 0M (0%) > > > 6M (0%) > > > > > > Here it was in this state for ~8hours and it is still happening. > > It says > > > has a Garbage of 21G but it is not able to Reclaim it everytime > > it reclaims > > > only 4-6M. > > > > > > Any idea what might be the issue here. 
> > > > > > > > > TIA > > > Sundar > > > > > > From m.sundar85 at gmail.com Wed Nov 6 20:17:54 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Wed, 6 Nov 2019 12:17:54 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> Message-ID: Hi Per As per [1] https://wiki.openjdk.java.net/display/zgc/Main it says it can handle *few hundred megabytes* to multi terabytes*.* So my understanding was if my application is running with 8G before, with ZGC and same heap also it should run without issues. So far that is not the case i have to increase the heap size always to make sure it gets the same latency/RPS. For me this doesn't seem to be true always in my case(heap ranging from 8 - 48 G i need to change to higher value to make sure i am getting same RPS and latency). Again this is my observation and might vary for different workload. Thanks Sundar On Wed, Nov 6, 2019 at 2:46 AM Per Liden wrote: > On 11/5/19 4:48 PM, Peter Booth wrote: > > Reading this and similar threads I am struck by the fact that ZGC users > are experiencing things that users of Azul?s Zing JVM also go through. I > remember the amazement at seeing a JVM run without substantive GC pauses > and thinking that it was a free lunch. But the price was two parts - > ensuring adequate heap, and rewiring brains that are accustomed to seeing > cpu and memory as independent resources. The second turns out to be much > harder. > > > > From experience, I think a lot of pain can be avoided by clearly > communicating that an adequate heap is a prerequisite for a healthy JVM. > Most java developers have absorbed the notion that large heaps are > bad/risky and unlearning takes time. > > The documentation on the ZGC wiki [1] tries to be clear about this, but > I'm sure it could be improved. 
> > [1] https://wiki.openjdk.java.net/display/zgc/Main > > cheers, > Per > > > > > Sent from my iPhone > > > >> On Nov 4, 2019, at 8:28 PM, Sundara Mohan M > wrote: > >> > >> ?HI Per, > >> This explains why it didn't work to reclaim memory, also my heap memory > was > >> 8G and 6G was strongly reachable (when i took heap dump). Agreed > increasing > >> heap memory will help in this case. > >> > >> Still trying to understand better on ZGC, > >> 1. So shouldn't GC try to be more aggressive and try to put more effort > to > >> reclaim without additional settings? > >> 2. Is there a reason why it shouldn't give more CPU to GC threads and > >> reclaim garbage (say after X run of GC it could not reclaim memory)? In > >> this case it would be good to reclaim existing garbage instead of doing > >> Allocation Stall and failing with heap out of memory. > >> > >> > >> Thanks > >> Sundar > >> > >>> On Mon, Nov 4, 2019 at 12:40 PM Per Liden > wrote: > >>> > >>> Hi, > >>> > >>> When a workload produces a uniformly swiss-cheesy heap, i.e. where all > >>> parts of the heap have roughly the same amount of garbage, then the GC > >>> will face a situation where there are no free lunches and it will have > >>> to work hard (compact a lot) to reclaim memory. Therefore, the GC will > >>> tolerate a certain amount of fragmentation/waste, in the hope that more > >>> object will die soon, making compaction less expensive (at the expense > >>> of using more memory for a while). How many CPU cycles to spend on > >>> compaction vs. how much memory you can spare is of course a trade-off. > >>> > >>> You can use -XX:ZFragmentationLimit to control this. It currently > >>> defaults to 25% and your workload seems to stabilize at 21%. If you > want > >>> more aggressive compaction/reclamation, then set the > >>> -XX:ZFragmentationLimit to something below 21. This may or may not be a > >>> good trade-off in your case. The alternative is to give the GC a larger > >>> heap to work with. 
> >>> > >>> cheers, > >>> Per > >>> > >>>> On 11/4/19 7:56 PM, Sundara Mohan M wrote: > >>>> Hi, > >>>> I ran into this issue where ZGC is unable to reclaim memory for > few > >>>> hours/days. It just keep printing "Exception in thread "RMI TCP > >>>> Connection(idle)" java.lang.OutOfMemoryError: Java heap space" and > >>>> Allocation Stall happening on that thread. > >>>> > >>>> > >>>> Here is the metrics which shows for some reason even though there is > >>>> Garbage but it is unable to Reclaim > >>>> > >>>> .... > >>>> [2019-11-04T*08:39:53.986+0000*][1765465.981s][info][gc,heap ] > >>>> GC(112126) Live: - 6366M (78%) 6366M > >>> (78%) > >>>> 6366M (78%) > >>>> - - > >>>> *[2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > >>>> GC(112126) Garbage: - 1735M (21%) 1735M > >>> (21%) > >>>> 1731M (21%)* > >>>> - - > >>>> [2019-11-04T08:39:53.986+0000][1765465.981s][info][gc,heap ] > >>> GC(112126) > >>>> Reclaimed: - - 0M (0%) > >>>> 4M (0%) > >>>> ... > >>>> > >>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > >>> GC(135520) > >>>> Live: - 6367M (78%) 6367M (78%) > >>>> 6367M (78%) > >>>> - - > >>>> *[2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > >>>> GC(135520) Garbage: - 1730M (21%) 1730M > >>> (21%) > >>>> 1724M (21%)* > >>>> - - > >>>> [2019-11-04T16:48:53.742+0000][1794805.738s][info][gc,heap ] > >>> GC(135520) > >>>> Reclaimed: - - 0M (0%) > >>>> 6M (0%) > >>>> > >>>> Here it was in this state for ~8hours and it is still happening. It > says > >>>> has a Garbage of 21G but it is not able to Reclaim it everytime it > >>> reclaims > >>>> only 4-6M. > >>>> > >>>> Any idea what might be the issue here. 
> >>>> > >>>> > >>>> TIA > >>>> Sundar > >>>> > >>> > > > From fw at deneb.enyo.de Wed Nov 6 20:47:26 2019 From: fw at deneb.enyo.de (Florian Weimer) Date: Wed, 06 Nov 2019 21:47:26 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: (Sundara Mohan M.'s message of "Wed, 6 Nov 2019 12:17:54 -0800") References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> Message-ID: <8736f03f6p.fsf@mid.deneb.enyo.de> * Sundara Mohan M.: > Hi Per > As per [1] https://wiki.openjdk.java.net/display/zgc/Main it says it can > handle *few hundred megabytes* to multi terabytes*.* > > So my understanding was if my application is running with 8G before, with > ZGC and same heap also it should run without issues. So far that is not > the case i have to increase the heap size always to make sure it gets the > same latency/RPS. What do you mean with ?before?? Thanks. From m.sundar85 at gmail.com Wed Nov 6 21:04:16 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Wed, 6 Nov 2019 13:04:16 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: <8736f03f6p.fsf@mid.deneb.enyo.de> References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> Message-ID: To be clear "before" i was referring to my previous GC, Parallel/G1/CMS. Here is what i was seeing Instance1 - 8G heap, ParallelGC, 100RPS, 200ms Latency Instance2 - 8G heap, ZGC, 100RPS, 600ms Latency Instance3 - 32G heap, ZGC, 100RPS, 200ms Latency My expectation was Instance2 should give me same result as Instance1 but that is not the case. Instead i had to move to Instance3 setting to get what i want. This is just my observation and it might not be same for all workloads. But on the other hand with ZGC my throughput increased from *(ParallelGC)97%* to* (ZGC)99.7% *and my STW pauses have never crossed 20ms. 
Thanks Sundar On Wed, Nov 6, 2019 at 12:47 PM Florian Weimer wrote: > * Sundara Mohan M.: > > > Hi Per > > As per [1] https://wiki.openjdk.java.net/display/zgc/Main it says it > can > > handle *few hundred megabytes* to multi terabytes*.* > > > > So my understanding was if my application is running with 8G before, with > > ZGC and same heap also it should run without issues. So far that is not > > the case i have to increase the heap size always to make sure it gets the > > same latency/RPS. > > What do you mean with ?before?? Thanks. > From conniall at amazon.com Wed Nov 6 21:15:38 2019 From: conniall at amazon.com (Connaughton, Niall) Date: Wed, 6 Nov 2019 21:15:38 +0000 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> Message-ID: <5C0FB2FF-2768-458E-9C60-7ECFCB8B8DAD@amazon.com> ZGC and other pauseless/low pause collectors are designed to allow your process to continue while the GC is running - hence why your pause times are low. The problem is that if your process is still running, it's still allocating. If you are filling up the heap faster than ZGC can keep up, you will have degraded performance. Assuming your allocations are mostly not long-lived, you give ZGC more time to collect by giving it a bigger heap, which takes longer for you to fill. This is the mindset shift that Peter Booth was referring to earlier. A lot of engineers have developed/been trained into a mindset where smaller heaps are better, particularly by STW GCs like ParallelGC. That approach is not helpful with concurrent collectors, and it's something you can already see with G1GC. In most cases giving G1GC a larger heap will improve performance, but it does depend on your workload and allocation/lifetime pattern. 
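To put rough numbers on this mindset shift: a concurrent collector has to finish its cycle before the application fills the free headroom (roughly heap size minus live set). The sketch below uses assumed figures (a 6 GB live set and a 0.5 GB/s allocation rate, not numbers measured in this thread) purely to illustrate the arithmetic:

```java
// Back-of-the-envelope: how long a concurrent GC cycle has to finish
// before the application exhausts the free heap. All inputs are
// illustrative assumptions, not measurements from this thread.
public class GcHeadroom {
    static double headroomSeconds(double heapGb, double liveGb, double allocGbPerSec) {
        // Time for the mutator to fill the free space (heap - live)
        return (heapGb - liveGb) / allocGbPerSec;
    }

    public static void main(String[] args) {
        double liveSet = 6.0;   // GB of live objects (assumed)
        double allocRate = 0.5; // GB/s allocation rate (assumed)

        // 8 GB heap: the collector has only a few seconds of headroom.
        System.out.printf("8G heap:  %.1fs to complete a cycle%n",
                headroomSeconds(8.0, liveSet, allocRate));
        // 32 GB heap: the same cycle gets 13x more time before a stall.
        System.out.printf("32G heap: %.1fs to complete a cycle%n",
                headroomSeconds(32.0, liveSet, allocRate));
    }
}
```

Under these assumed numbers an 8G heap leaves the collector about 4 seconds per cycle while a 32G heap leaves about 52 seconds, which is why the larger heap behaves so much better with a concurrent collector even though the live set is unchanged.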
So the fact ZGC can handle heaps that are TBs in size doesn't mean that you can expect better performance by moving to ZGC and keeping the heap size the same, even if the heap is only a few GB. Niall ?On 11/6/19, 13:05, "zgc-dev on behalf of Sundara Mohan M" wrote: To be clear "before" i was referring to my previous GC, Parallel/G1/CMS. Here is what i was seeing Instance1 - 8G heap, ParallelGC, 100RPS, 200ms Latency Instance2 - 8G heap, ZGC, 100RPS, 600ms Latency Instance3 - 32G heap, ZGC, 100RPS, 200ms Latency My expectation was Instance2 should give me same result as Instance1 but that is not the case. Instead i had to move to Instance3 setting to get what i want. This is just my observation and it might not be same for all workloads. But on the other hand with ZGC my throughput increased from *(ParallelGC)97%* to* (ZGC)99.7% *and my STW pauses have never crossed 20ms. Thanks Sundar On Wed, Nov 6, 2019 at 12:47 PM Florian Weimer wrote: > * Sundara Mohan M.: > > > Hi Per > > As per [1] https://wiki.openjdk.java.net/display/zgc/Main it says it > can > > handle *few hundred megabytes* to multi terabytes*.* > > > > So my understanding was if my application is running with 8G before, with > > ZGC and same heap also it should run without issues. So far that is not > > the case i have to increase the heap size always to make sure it gets the > > same latency/RPS. > > What do you mean with ?before?? Thanks. > From fw at deneb.enyo.de Wed Nov 6 21:22:42 2019 From: fw at deneb.enyo.de (Florian Weimer) Date: Wed, 06 Nov 2019 22:22:42 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: (Sundara Mohan M.'s message of "Wed, 6 Nov 2019 13:04:16 -0800") References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> Message-ID: <87pni41yzh.fsf@mid.deneb.enyo.de> * Sundara Mohan M.: > To be clear "before" i was referring to my previous GC, Parallel/G1/CMS. 
> > Here is what i was seeing > Instance1 - 8G heap, ParallelGC, 100RPS, 200ms Latency > Instance2 - 8G heap, ZGC, 100RPS, 600ms Latency > Instance3 - 32G heap, ZGC, 100RPS, 200ms Latency Latency is measured end-to-end, including processing time and stalls etc.? Keep in mind that ZGC does not support compressed pointers, so if your workload is heavy on pointers, the VM has to do more work, and the memory requirements are also higher. > My expectation was Instance2 should give me same result as Instance1 but > that is not the case. Instead i had to move to Instance3 setting to get > what i want. It's difficult to beat ParallelGC in terms of overall CPU efficiency. If you do not have CPU cycles to spare and the GC does not have sufficient extra heap to work with beyond the live object set (to some extent, you can trade RAM vs CPU), application performance will suffer. You could also give Shenandoah a try. 8-) From per.liden at oracle.com Wed Nov 6 23:08:44 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 7 Nov 2019 00:08:44 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> Message-ID: <2d0a4a9b-55d2-f7be-bb4c-e9a4387586a0@oracle.com> On 11/6/19 10:04 PM, Sundara Mohan M wrote: [...] > But on the other hand with ZGC my throughput increased from > *(ParallelGC)97%* to*(ZGC)99.7% *and my STW pauses have never crossed 20ms. Could you please paste the last printout of the "Garbage Collection Statistics" table? If you have pauses close to 20ms, it would be interesting to see where that time is spent. I would assume it's accounted to "Subphase: Pause Roots Threads", but seeing the whole table would tell me more.
thanks, Per From m.sundar85 at gmail.com Thu Nov 7 01:28:25 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Wed, 6 Nov 2019 17:28:25 -0800 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: <2d0a4a9b-55d2-f7be-bb4c-e9a4387586a0@oracle.com> References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> <2d0a4a9b-55d2-f7be-bb4c-e9a4387586a0@oracle.com> Message-ID: Unfortunately i don't have that stats. I was running application with this option for gc log -Xlog:gc,gc+init,gc+start,gc+phases,gc+heap,gc+cpu,gc+reloc,gc+ref,gc+marking,gc+metaspace=info. Will see if i can get that issue reproducible and get that data for you. Thanks Sundar On Wed, Nov 6, 2019 at 3:08 PM Per Liden wrote: > On 11/6/19 10:04 PM, Sundara Mohan M wrote: > [...] > > But on the other hand with ZGC my throughput increased from > > *(ParallelGC)97%* to*(ZGC)99.7% *and my STW pauses have never crossed > 20ms. > > Could you please paste the last printout of the "Garbage Collection > Statistics" table? If you have pauses close to 20ms, it would be > interesting to see where that time is spent. I would assume it's > accounted to "Subphase: Pause Roots Threads", but seeing the whole table > would tell the me more. > > thanks, > Per > From per.liden at oracle.com Thu Nov 7 07:25:10 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 7 Nov 2019 08:25:10 +0100 Subject: ZGC Unable to reclaim memory for long time In-Reply-To: References: <84077f64-2c0b-0e9b-d57b-2f2f9aa34f4a@oracle.com> <8736f03f6p.fsf@mid.deneb.enyo.de> <2d0a4a9b-55d2-f7be-bb4c-e9a4387586a0@oracle.com> Message-ID: On 11/7/19 2:28 AM, Sundara Mohan M wrote: > Unfortunately i don't have that stats. I was running application with > this option for gc > log?-Xlog:gc,gc+init,gc+start,gc+phases,gc+heap,gc+cpu,gc+reloc,gc+ref,gc+marking,gc+metaspace=info. > Will see if i can get that issue reproducible and get that data for you. Ok, thanks. 
I'd recommend using -Xlog:gc*, that way you catch pretty much all you need to tell what's going on, without being super verbose. /Per > > > Thanks > Sundar > > On Wed, Nov 6, 2019 at 3:08 PM Per Liden > wrote: > > On 11/6/19 10:04 PM, Sundara Mohan M wrote: > [...] > > But on the other hand with ZGC my throughput increased from > > *(ParallelGC)97%* to*(ZGC)99.7% *and my STW pauses have never > crossed 20ms. > > Could you please paste the last printout of the "Garbage Collection > Statistics" table? If you have pauses close to 20ms, it would be > interesting to see where that time is spent. I would assume it's > accounted to "Subphase: Pause Roots Threads", but seeing the whole > table > would tell the me more. > > thanks, > Per > From m.sundar85 at gmail.com Tue Nov 12 00:42:30 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Mon, 11 Nov 2019 16:42:30 -0800 Subject: Upgrade to JDK13 for ZGC? Message-ID: Hi, I am using ZGC and trying to see if any bug fixes gone in to JDK13 other than "Uncommit memory feature". 1. Was "memory uncommit" is the only feature gone in to JDK13? 2. Is there a way to find all the bug fixed in JDK13 and categorize it by ZGC, so in future i can do it myself. TIA, Sundar From thomas.schatzl at oracle.com Tue Nov 12 10:26:50 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Nov 2019 11:26:50 +0100 Subject: Upgrade to JDK13 for ZGC? In-Reply-To: References: Message-ID: Hi, On 12.11.19 01:42, Sundara Mohan M wrote: > Hi, > I am using ZGC and trying to see if any bug fixes gone in to JDK13 > other than "Uncommit memory feature". > > 1. Was "memory uncommit" is the only feature gone in to JDK13? > 2. Is there a way to find all the bug fixed in JDK13 and categorize it by > ZGC, so in future i can do it myself. 
> > Something like this JBS query: https://bugs.openjdk.java.net/browse/JDK-8225227?jql=fixVersion%20in%20(%2213%22%2C%2013.0.1%2C%2013.0.2)%20AND%20labels%20%3D%20zgc%20order%20by%20lastViewed%20DESC should do the trick, i.e. showing everything with the "zgc" label and fix version of the various jdk 13 releases. Thanks, Thomas From m.sundar85 at gmail.com Tue Nov 12 21:27:55 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Tue, 12 Nov 2019 13:27:55 -0800 Subject: Upgrade to JDK13 for ZGC? In-Reply-To: References: Message-ID: Thank you. Regards Sundar On Tue, Nov 12, 2019 at 2:29 AM Thomas Schatzl wrote: > Hi, > > On 12.11.19 01:42, Sundara Mohan M wrote: > > Hi, > > I am using ZGC and trying to see if any bug fixes gone in to JDK13 > > other than "Uncommit memory feature". > > > > 1. Was "memory uncommit" is the only feature gone in to JDK13? > > 2. Is there a way to find all the bug fixed in JDK13 and categorize it by > > ZGC, so in future i can do it myself. > > > > > > Something like this JBS query: > > > https://bugs.openjdk.java.net/browse/JDK-8225227?jql=fixVersion%20in%20(%2213%22%2C%2013.0.1%2C%2013.0.2)%20AND%20labels%20%3D%20zgc%20order%20by%20lastViewed%20DESC > > should do the trick, i.e. showing everything with the "zgc" label and > fix version of the various jdk 13 releases. > > Thanks, > Thomas > From m.sundar85 at gmail.com Thu Nov 14 18:58:30 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Thu, 14 Nov 2019 10:58:30 -0800 Subject: Why does load average on host increases as Allocation Stall happens? Message-ID: Hi, I have notices Load average on the host increases 5 - 10 times when Allocation Stall happens, trying to understand what causes load average to increase when this happens. 
Looking at the code zPageAllocator.cpp do { // Start asynchronous GC ZCollectedHeap::heap()->collect(GCCause::_z_allocation_stall); // Wait for allocation to complete or fail page = request.wait(); } while (page == gc_marker); Seems request.wait() is internally doing a get call on ZFuture. 1. Will this use this thread to spin on CPU or it is async (mean this thread will go to sleep and can be woken up when it is ready and other process can occupy this CPU)? 2. Since load average increase matches exactly with allocation stall, is there any other operation (like Flushing page) can cause this behavior? Since i haven't enabled "gc,stats" tag in my logging i missed some information there. Will try to get that information when i can reproduce it. TIA, Sundar From fw at deneb.enyo.de Thu Nov 14 19:24:41 2019 From: fw at deneb.enyo.de (Florian Weimer) Date: Thu, 14 Nov 2019 20:24:41 +0100 Subject: Why does load average on host increases as Allocation Stall happens? In-Reply-To: (Sundara Mohan M.'s message of "Thu, 14 Nov 2019 10:58:30 -0800") References: Message-ID: <877e42w9ae.fsf@mid.deneb.enyo.de> * Sundara Mohan M.: > I have notices Load average on the host increases 5 - 10 times > when Allocation Stall happens, trying to understand what causes load > average to increase when this happens. > 2. Since load average increase matches exactly with allocation stall, is > there any other operation (like Flushing page) can cause this behavior? I don't know ZGC internals, but I think it stalls the application when the GC cannot keep up. This is a last resort. Before that happens, more GC threads will try hard to reclaim memory. That work increases system load. An alternative explanation could be that something else consumes CPU resources, taking it away from the GC threads, so that they cannot keep up, and ZGC has to introduce allocation stalls. 
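The waiting side of the quoted stall loop can be illustrated with a toy analogy in plain Java (this is not ZGC's actual ZFuture/C++ implementation, and the names are made up): the stalled thread blocks on a future and is parked by the OS scheduler, consuming no CPU, while it is the concurrent GC work that drives the load.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Toy analogy of a stalled allocation request (NOT ZGC's real code):
// the caller parks on a future; a worker thread later completes it.
public class StallWaitSketch {
    // Parks the calling thread (WAITING state) until the request is
    // satisfied; join() does not spin on the CPU.
    static String awaitAllocation(CompletableFuture<String> request) {
        return request.join();
    }

    public static void main(String[] args) throws InterruptedException {
        CompletableFuture<String> page = new CompletableFuture<>();

        Thread gc = new Thread(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(100); // simulate a GC cycle
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            page.complete("page"); // wakes the stalled thread
        });
        gc.start();

        // Analogous to "page = request.wait()": blocked, not spinning.
        System.out.println("allocation satisfied: " + awaitAllocation(page));
        gc.join();
    }
}
```

In this sketch the waiting thread contributes nothing to CPU usage while parked, which matches the observation that any load-average spike around a stall comes from the GC (or competing application) threads doing runnable work, not from the wait itself.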
From m.sundar85 at gmail.com Fri Nov 15 18:55:38 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Fri, 15 Nov 2019 10:55:38 -0800 Subject: MMU drops suddenly Message-ID: Hi, Have noticed following in gc log [2019-11-13T19:24:13.095+0000][69629.984s][info][gc,mmu ] GC(10952) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, 100ms/90.8% [2019-11-13T20:12:55.339+0000][72552.228s][info][gc,mmu ] GC(11441) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, *100ms/90.8%* [2019-11-13T21:00:53.415+0000][75430.304s][info][gc,mmu ] GC(11927) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, *100ms/70.7%* [2019-11-13T21:52:46.244+0000][78543.133s][info][gc,mmu ] GC(12450) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% [2019-11-13T22:40:35.887+0000][81412.776s][info][gc,mmu ] GC(12946) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% [2019-11-13T23:27:23.807+0000][84220.696s][info][gc,mmu ] GC(13410) MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/0.0%, *100ms/43.0%* Was trying to understand what it means and here is my understanding, This says how much minimum CPU available for mutator thread in last Xms 1. Is this correct? 2. Why is this suddenly dropping from (100ms 90% -> 40%) ? Also other time unit it is 0% does that mean my application doesn't get a chance to run? Also i see it never goes back to higher value. 3. Does this measure indicates something good or bad? 3. If this is bad what should i look further to get more insights? Can someone help me to get better understanding on this. 
TIA, Sundar From per.liden at oracle.com Mon Nov 18 08:43:14 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 18 Nov 2019 09:43:14 +0100 Subject: MMU drops suddenly In-Reply-To: References: Message-ID: Hi, On 11/15/19 7:55 PM, Sundara Mohan M wrote: > Hi, > Have noticed following in gc log > [2019-11-13T19:24:13.095+0000][69629.984s][info][gc,mmu ] GC(10952) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, 100ms/90.8% > [2019-11-13T20:12:55.339+0000][72552.228s][info][gc,mmu ] GC(11441) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, *100ms/90.8%* > [2019-11-13T21:00:53.415+0000][75430.304s][info][gc,mmu ] GC(11927) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, *100ms/70.7%* > [2019-11-13T21:52:46.244+0000][78543.133s][info][gc,mmu ] GC(12450) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% > [2019-11-13T22:40:35.887+0000][81412.776s][info][gc,mmu ] GC(12946) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% > [2019-11-13T23:27:23.807+0000][84220.696s][info][gc,mmu ] GC(13410) > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/0.0%, *100ms/43.0%* > > Was trying to understand what it means and here is my understanding, This > says how much minimum CPU available for mutator thread in last Xms > 1. Is this correct? Not quite. The MMU printout tells you the minimum amount of time Java threads could execute in the different time windows. Note that it's the worst case since the VM started. For example, 10ms/23.7% means there has been at least one 10ms window, where Java threads could only execute for 23.7% of that time (2.37ms). > 2. Why is this suddenly dropping from (100ms 90% -> 40%) ? Also other time > unit it is 0% does that mean my application doesn't get a chance to run? Right, 2ms/0.0% means there was at least one 2ms windows, where the Java threads didn't get a chance to run at all. > Also i see it never goes back to higher value. Correct. 
Since it shows the worst case since the VM started it will never go back to a higher value. > 3. Does this measure indicates something good or bad? In general, 0% is bad, 100% is good. Exactly which time window you're interested in depends on what response time requirements you have. A simplified example to show the principle: Assume a request takes 5ms to process in your application, and you have a response time requirement of 10ms, then 10ms/60% would be good, but 10ms/40% would not be good enough. > 3. If this is bad what should i look further to get more insights? Look at the GC pauses, how long they are and how far apart they are. The GC statistics printed by gc+stats shows you where you're spending time in pauses. If the GC pauses are long, then ZGC is likely starved on CPU. If the GC pauses are close to each other, then ZGC is likely doing back-to-back GCs and needs more heap to work with. cheers, Per From per.liden at oracle.com Mon Nov 18 08:59:16 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 18 Nov 2019 09:59:16 +0100 Subject: Why does load average on host increases as Allocation Stall happens? In-Reply-To: References: Message-ID: On 11/14/19 7:58 PM, Sundara Mohan M wrote: > Hi, > I have notices Load average on the host increases 5 - 10 times when > Allocation Stall happens, trying to understand what causes load average to > increase when this happens. It's impossible to say with certainty without inspecting what's actually going on in the system. Florian's explanations are good. It could just be that your application workload is peaking, which in turn causes the allocation stalls. > > Looking at the code > zPageAllocator.cpp > do { > // Start asynchronous GC > ZCollectedHeap::heap()->collect(GCCause::_z_allocation_stall); > > // Wait for allocation to complete or fail > page = request.wait(); > } while (page == gc_marker); > > Seems request.wait() is internally doing a get call on ZFuture. > > 1. 
Will this use this thread to spin on CPU or it is async (mean this > thread will go to sleep and can be woken up when it is ready and other > process can occupy this CPU)? It's async. /Per > 2. Since load average increase matches exactly with allocation stall, is > there any other operation (like Flushing page) can cause this behavior? > > Since i haven't enabled "gc,stats" tag in my logging i missed some > information there. Will try to get that information when i can reproduce it. > > > TIA, > Sundar > From per.liden at oracle.com Mon Nov 18 12:32:47 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 18 Nov 2019 13:32:47 +0100 Subject: Upgrade to JDK13 for ZGC? In-Reply-To: References: Message-ID: The ZGC wiki also has a high-level change log, which highlights the most interesting user visible enhancements: https://wiki.openjdk.java.net/display/zgc/Main#Main-ChangeLog /Per On 11/12/19 10:27 PM, Sundara Mohan M wrote: > Thank you. > > Regards > Sundar > > On Tue, Nov 12, 2019 at 2:29 AM Thomas Schatzl > wrote: > >> Hi, >> >> On 12.11.19 01:42, Sundara Mohan M wrote: >>> Hi, >>> I am using ZGC and trying to see if any bug fixes gone in to JDK13 >>> other than "Uncommit memory feature". >>> >>> 1. Was "memory uncommit" is the only feature gone in to JDK13? >>> 2. Is there a way to find all the bug fixed in JDK13 and categorize it by >>> ZGC, so in future i can do it myself. >>> >>> >> >> Something like this JBS query: >> >> >> https://bugs.openjdk.java.net/browse/JDK-8225227?jql=fixVersion%20in%20(%2213%22%2C%2013.0.1%2C%2013.0.2)%20AND%20labels%20%3D%20zgc%20order%20by%20lastViewed%20DESC >> >> should do the trick, i.e. showing everything with the "zgc" label and >> fix version of the various jdk 13 releases. >> >> Thanks, >> Thomas >> From m.sundar85 at gmail.com Tue Nov 19 19:28:28 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Tue, 19 Nov 2019 11:28:28 -0800 Subject: Upgrade to JDK13 for ZGC? In-Reply-To: References: Message-ID: Cool, thanks! 
On Mon, Nov 18, 2019 at 4:32 AM Per Liden wrote: > The ZGC wiki also has a high-level change log, which highlights the most > interesting user visible enhancements: > > https://wiki.openjdk.java.net/display/zgc/Main#Main-ChangeLog > > /Per > > On 11/12/19 10:27 PM, Sundara Mohan M wrote: > > Thank you. > > > > Regards > > Sundar > > > > On Tue, Nov 12, 2019 at 2:29 AM Thomas Schatzl < > thomas.schatzl at oracle.com> > > wrote: > > > >> Hi, > >> > >> On 12.11.19 01:42, Sundara Mohan M wrote: > >>> Hi, > >>> I am using ZGC and trying to see if any bug fixes gone in to > JDK13 > >>> other than "Uncommit memory feature". > >>> > >>> 1. Was "memory uncommit" is the only feature gone in to JDK13? > >>> 2. Is there a way to find all the bug fixed in JDK13 and categorize it > by > >>> ZGC, so in future i can do it myself. > >>> > >>> > >> > >> Something like this JBS query: > >> > >> > >> > https://bugs.openjdk.java.net/browse/JDK-8225227?jql=fixVersion%20in%20(%2213%22%2C%2013.0.1%2C%2013.0.2)%20AND%20labels%20%3D%20zgc%20order%20by%20lastViewed%20DESC > >> > >> should do the trick, i.e. showing everything with the "zgc" label and > >> fix version of the various jdk 13 releases. > >> > >> Thanks, > >> Thomas > >> > From m.sundar85 at gmail.com Tue Nov 19 19:29:37 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Tue, 19 Nov 2019 11:29:37 -0800 Subject: Why does load average on host increases as Allocation Stall happens? In-Reply-To: References: Message-ID: Thank you for the clarification. I will try to get more gc log and system information during that time to get more detail. Thanks Sundar On Mon, Nov 18, 2019 at 12:59 AM Per Liden wrote: > On 11/14/19 7:58 PM, Sundara Mohan M wrote: > > Hi, > > I have notices Load average on the host increases 5 - 10 times when > > Allocation Stall happens, trying to understand what causes load average > to > > increase when this happens. 
> > It's impossible to say with certainty without inspecting what's actually > going on in the system. Florian's explanations are good. It could just > be that your application workload is peaking, which in turn causes the > allocation stalls. > > > > > Looking at the code > > zPageAllocator.cpp > > do { > > // Start asynchronous GC > > ZCollectedHeap::heap()->collect(GCCause::_z_allocation_stall); > > > > // Wait for allocation to complete or fail > > page = request.wait(); > > } while (page == gc_marker); > > > > Seems request.wait() is internally doing a get call on ZFuture. > > > > 1. Will this use this thread to spin on CPU or it is async (mean this > > thread will go to sleep and can be woken up when it is ready and other > > process can occupy this CPU)? > > It's async. > > /Per > > > 2. Since load average increase matches exactly with allocation stall, is > > there any other operation (like Flushing page) can cause this behavior? > > > > Since i haven't enabled "gc,stats" tag in my logging i missed some > > information there. Will try to get that information when i can reproduce > it. > > > > > > TIA, > > Sundar > > > From m.sundar85 at gmail.com Thu Nov 21 23:33:53 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Thu, 21 Nov 2019 15:33:53 -0800 Subject: MMU drops suddenly In-Reply-To: References: Message-ID: Got it. Thanks for the explanation. 
Regards Sundar On Mon, Nov 18, 2019 at 12:43 AM Per Liden wrote: > Hi, > > On 11/15/19 7:55 PM, Sundara Mohan M wrote: > > Hi, > > Have noticed following in gc log > > [2019-11-13T19:24:13.095+0000][69629.984s][info][gc,mmu ] GC(10952) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, 100ms/90.8% > > [2019-11-13T20:12:55.339+0000][72552.228s][info][gc,mmu ] GC(11441) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/23.7%, 20ms/61.9%, 50ms/81.7%, > *100ms/90.8%* > > [2019-11-13T21:00:53.415+0000][75430.304s][info][gc,mmu ] GC(11927) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, *100ms/70.7%* > > [2019-11-13T21:52:46.244+0000][78543.133s][info][gc,mmu ] GC(12450) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% > > [2019-11-13T22:40:35.887+0000][81412.776s][info][gc,mmu ] GC(12946) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/44.2%, 100ms/70.7% > > [2019-11-13T23:27:23.807+0000][84220.696s][info][gc,mmu ] GC(13410) > > MMU: 2ms/0.0%, 5ms/0.0%, 10ms/0.0%, 20ms/0.0%, 50ms/0.0%, *100ms/43.0%* > > > > Was trying to understand what it means and here is my understanding, This > > says how much minimum CPU available for mutator thread in last Xms > > 1. Is this correct? > > Not quite. The MMU printout tells you the minimum amount of time Java > threads could execute in the different time windows. Note that it's the > worst case since the VM started. For example, 10ms/23.7% means there has > been at least one 10ms window, where Java threads could only execute for > 23.7% of that time (2.37ms). > > > 2. Why is this suddenly dropping from (100ms 90% -> 40%) ? Also other > time > > unit it is 0% does that mean my application doesn't get a chance to run? > > Right, 2ms/0.0% means there was at least one 2ms windows, where the Java > threads didn't get a chance to run at all. > > > Also i see it never goes back to higher value. > > Correct. 
Since it shows the worst case since the VM started it will > never go back to a higher value. > > > 3. Does this measure indicates something good or bad? > > In general, 0% is bad, 100% is good. Exactly which time window you're > interested in depends on what response time requirements you have. A > simplified example to show the principle: Assume a request takes 5ms to > process in your application, and you have a response time requirement of > 10ms, then 10ms/60% would be good, but 10ms/40% would not be good enough. > > > 3. If this is bad what should i look further to get more insights? > > Look at the GC pauses, how long they are and how far apart they are. The > GC statistics printed by gc+stats shows you where you're spending time > in pauses. If the GC pauses are long, then ZGC is likely starved on CPU. > If the GC pauses are close to each other, then ZGC is likely doing > back-to-back GCs and needs more heap to work with. > > cheers, > Per > From m.sundar85 at gmail.com Thu Nov 21 23:37:54 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Thu, 21 Nov 2019 15:37:54 -0800 Subject: Heap dump is always around 8G on process with 80G heap Message-ID: Hi, I am trying to take a heap dump of java process with 80G heap with ZGC, this is always giving me around 8G dump file. Same application with ParallelGC running with 48G heap i am getting around 30G dump file. I am using following command on both process and verified both process has same no of request processed and Used memory from gc log is similar jcmd GC.heap_dump 1. Why is ZGC heap dump always less compared to process running with ParallelGC? 2. Is there something i am missing? 
Thanks Sundar From per.liden at oracle.com Fri Nov 22 08:31:03 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 22 Nov 2019 09:31:03 +0100 Subject: Heap dump is always around 8G on process with 80G heap In-Reply-To: References: Message-ID: On 11/22/19 12:37 AM, Sundara Mohan M wrote: > Hi, > I am trying to take a heap dump of java process with 80G heap with ZGC, > this is always giving me around 8G dump file. > Same application with ParallelGC running with 48G heap i am getting around > 30G dump file. > I am using following command on both process and verified both process has > same no of request processed and Used memory from gc log is similar > jcmd GC.heap_dump > > 1. Why is ZGC heap dump always less compared to process running with > ParallelGC? > 2. Is there something i am missing? > There are various reasons why a heap dump from one GC is larger or smaller compared to another GC. For example, ZGC only ever dumps reachable objects, while ParallelGC can also dump unreachable objects under some conditions (even though you didn't ask for them). It's hard to tell where the difference comes from in your case, without further inspection/debugging. /Per From m.sundar85 at gmail.com Fri Nov 22 19:39:08 2019 From: m.sundar85 at gmail.com (Sundara Mohan M) Date: Fri, 22 Nov 2019 11:39:08 -0800 Subject: Heap dump is always around 8G on process with 80G heap In-Reply-To: References: Message-ID: Hi Per, "ZGC only ever dumps reachable objects" Does that mean we can never dump unreachable objects in ZGC or there are some options that can be passed to get it? I will try to see if the dump from other GC has unreachable objects which is showing as large file. Thanks Sundar On Fri, Nov 22, 2019 at 12:31 AM Per Liden wrote: > On 11/22/19 12:37 AM, Sundara Mohan M wrote: > > Hi, > > I am trying to take a heap dump of java process with 80G heap with > ZGC, > > this is always giving me around 8G dump file.
> > Same application with ParallelGC running with 48G heap i am getting > around > > 30G dump file. > > I am using following command on both process and verified both process > has > > same no of request processed and Used memory from gc log is similar > > jcmd GC.heap_dump > > > > 1. Why is ZGC heap dump always less compared to process running with > > ParallelGC? > > 2. Is there something i am missing? > > > > There are various reasons why a heap dump from one GC is larger or > smaller compared to another GC. For example, ZGC only ever dumps > reachable objects, while ParallelGC can also dump unreachable objects > under some conditions (even though you didn't ask for them). > > It's hard to tell where the difference comes from in your case, without > further inspection/debugging. > > /Per > From conniall at amazon.com Tue Nov 26 22:45:34 2019 From: conniall at amazon.com (Connaughton, Niall) Date: Tue, 26 Nov 2019 22:45:34 +0000 Subject: Is ZGC still in experimental? In-Reply-To: <7edbce9a-d89c-a16d-20a9-a20c48d51e5b@oracle.com> References: <7edbce9a-d89c-a16d-20a9-a20c48d51e5b@oracle.com> Message-ID: I wanted to double check on this. Is the intention that ZGC in JDK 11 is not intended for production use and will never have backports to bring it up to a production ready level? Put another way - if we want to evaluate using ZGC for a production service, is it an effective pre-requisite to move to a later JDK than 11? Obviously we can test with the JDK11 version, and if we don't happen to see any problems then can make an "at-your-own-risk" call on that. But the longer term plans for addressing issues will affect the appetite for whether to even start out on that version. Thanks, Niall On 10/31/19, 02:00, "zgc-dev on behalf of Per Liden" wrote: I would say that's unlikely at this time, given ZGC's experimental status in 11. /Per On 10/30/19 8:28 PM, Sundara Mohan M wrote: > Hi Per > Will these changes be merged back to JDK11 at any point? > For ex.
uncommit memory feature or C2 related changes will be merged > back to 11? > > Thanks > Sundar > > > On Tue, Oct 22, 2019 at 11:04 AM Sundara Mohan M > wrote: > > Ok, thanks for the update. > > On Tue, Oct 22, 2019 at 1:12 AM Per Liden > wrote: > > Hi, > > No decision has been made, but we're continuously evaluating > where we > stand. The new C2 load barriers (JDK-8230565) was a major milestone > towards making ZGC rock solid. We can hopefully make it > non-experimental > sooner rather than later. > > /Per > > On 10/22/19 12:14 AM, Sundara Mohan M wrote: > > Hi, > > Any idea when ZGC will be moved out of experimental flags? > > Understand it is too early to move it out of experimental but > do we have > > any plan to run it without +UnlockExperimentalVMOptions? > > > > Thanks > > Sundar > > > From per.liden at oracle.com Wed Nov 27 10:05:40 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 27 Nov 2019 11:05:40 +0100 Subject: Is ZGC still in experimental? In-Reply-To: References: <7edbce9a-d89c-a16d-20a9-a20c48d51e5b@oracle.com> Message-ID: Hi, On 11/26/19 11:45 PM, Connaughton, Niall wrote: > I wanted to double check on this. Is the intention that ZGC in JDK 11 is not intended for production use and will never have backports to bring it up to a production ready level? Put another way - if we want to evaluate using ZGC for a production service, is it an effective pre-requisite to move to a later JDK than 11? > > Obviously we can test with the JDK11 version, and if we don't happen to see any problems then can make an "at-your-own-risk" call on that. But the longer term plans for addressing issues will affect the appetite for whether to even start out on that version. Experimental status basically means it's a technical preview giving people a chance to test it and provide feedback, without having to roll their own JDK. 
We have backported bug fixes on a few occasions, but that's typically
only done if we think it's critical for one reason or another, and the
bar is high. We don't intend to make ZGC in 11 non-experimental. In the
new JDK release model, LTS releases typically don't get "new features"
after GA.

ZGC and the supporting infrastructure in Hotspot are moving along at a
fairly rapid pace, and quite a lot of good stuff has gone into each
release since 11. Using the latest JDK is always recommended if you're
using ZGC. ZGC in JDK 11 vs. 13 can, for some workloads, be a quite
noticeable leap in terms of performance, latency, etc.

Hope that helps.

cheers,
/Per

> Thanks,
> Niall
>
> On 10/31/19, 02:00, "zgc-dev on behalf of Per Liden" wrote:
>
>     I would say that's unlikely at this time, given ZGC's experimental
>     status in 11.
>
>     /Per
>
>     On 10/30/19 8:28 PM, Sundara Mohan M wrote:
>     > Hi Per
>     > Will these changes be merged back to JDK11 at any point?
>     > For ex. uncommit memory feature or C2-related changes will be
>     > merged back to 11?
>     >
>     > Thanks
>     > Sundar
>     >
>     > On Tue, Oct 22, 2019 at 11:04 AM Sundara Mohan M wrote:
>     >
>     > Ok, thanks for the update.
>     >
>     > On Tue, Oct 22, 2019 at 1:12 AM Per Liden wrote:
>     >
>     > Hi,
>     >
>     > No decision has been made, but we're continuously evaluating
>     > where we stand. The new C2 load barriers (JDK-8230565) were a
>     > major milestone towards making ZGC rock solid. We can hopefully
>     > make it non-experimental sooner rather than later.
>     >
>     > /Per
>     >
>     > On 10/22/19 12:14 AM, Sundara Mohan M wrote:
>     > > Hi,
>     > > Any idea when ZGC will be moved out of the experimental flags?
>     > > Understand it is too early to move it out of experimental, but
>     > > do we have any plan to run it without +UnlockExperimentalVMOptions?
>     > > Thanks
>     > > Sundar

From conniall at amazon.com  Wed Nov 27 22:18:22 2019
From: conniall at amazon.com (Connaughton, Niall)
Date: Wed, 27 Nov 2019 22:18:22 +0000
Subject: Is ZGC still in experimental?
In-Reply-To:
References: <7edbce9a-d89c-a16d-20a9-a20c48d51e5b@oracle.com>
Message-ID:

Thanks, that helps clarify. For us, running on an LTS release is
preferred, as we don't necessarily want to be in a position of needing to
move to a newer JDK at high frequency. We're aware that ZGC is
experimental and know we're somewhat wading into unknown waters, so it's
a balance we have to think about: the potential benefits of a
fundamentally different GC vs. the potential pitfalls and additional
effort of keeping up to date. The Shenandoah team seem to be putting a
lot of effort into backporting to earlier JDKs - I guess this is down to
a choice they've specifically made.

On 11/27/19, 02:06, "Per Liden" wrote:

    Hi,

    On 11/26/19 11:45 PM, Connaughton, Niall wrote:
    > I wanted to double check on this. Is the intention that ZGC in JDK
    > 11 is not intended for production use and will never have backports
    > to bring it up to a production-ready level? Put another way: if we
    > want to evaluate ZGC for a production service, is it effectively a
    > prerequisite to move to a JDK later than 11?
    >
    > Obviously we can test with the JDK 11 version, and if we don't
    > happen to see any problems we can make an "at-your-own-risk" call
    > on that. But the longer-term plans for addressing issues will
    > affect the appetite for even starting out on that version.

    Experimental status basically means it's a technical preview, giving
    people a chance to test it and provide feedback without having to
    roll their own JDK. We have backported bug fixes on a few occasions,
    but that's typically only done if we think it's critical for one
    reason or another, and the bar is high. We don't intend to make ZGC
    in 11 non-experimental.
    In the new JDK release model, LTS releases typically don't get "new
    features" after GA.

    ZGC and the supporting infrastructure in Hotspot are moving along at
    a fairly rapid pace, and quite a lot of good stuff has gone into
    each release since 11. Using the latest JDK is always recommended if
    you're using ZGC. ZGC in JDK 11 vs. 13 can, for some workloads, be a
    quite noticeable leap in terms of performance, latency, etc.

    Hope that helps.

    cheers,
    /Per
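The heap-dump comparison earlier in the thread can be sketched as a pair of commands. This is a minimal sketch, not the poster's exact invocation: the pid (12345) and output paths are hypothetical placeholders, and the commands are only printed here since no running JVM is assumed. The point is that jmap's `live` option restricts the dump to reachable objects, which is all ZGC's `GC.heap_dump` ever writes, so the two dumps become comparable:

```shell
# Hypothetical target JVM pid:
PID=12345

# jcmd GC.heap_dump writes reachable objects only under ZGC:
ZGC_CMD="jcmd $PID GC.heap_dump /tmp/zgc.hprof"

# For the ParallelGC process, ask jmap for live objects only, so the
# dump is not inflated by unreachable objects awaiting collection:
PARALLEL_CMD="jmap -dump:live,format=b,file=/tmp/parallel.hprof $PID"

# Print the commands rather than running them (no JVM assumed here):
echo "$ZGC_CMD"
echo "$PARALLEL_CMD"
```

With both dumps limited to live objects, any remaining size difference points at actual live-set differences rather than at the collectors' dumping behavior.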