Status of JEP-8204088/JDK-8236073

Thomas Schatzl thomas.schatzl at oracle.com
Fri May 21 08:41:25 UTC 2021


Hi,

On 20.05.21 20:00, Jonathan Joo wrote:
> +cc Man Cao (manc at google.com <mailto:manc at google.com>)
> 
> Hi Thomas,
> 
> I've been thinking more about SoftMaxHeapSize and how we might use it. 
> Our preliminary thoughts have revolved around using GC CPU overhead as a 
> metric to determine a reasonable SoftMaxHeapSize value (assuming 
> SoftMaxHeapSize is dynamic and can change at runtime). Do you think this 
> is viable? For example, setting a predetermined target GC CPU overhead, 
> and using this to either increase or decrease SoftMaxHeapSize accordingly.

Yes.

> 
> Doing this may also have the benefit of removing the need for 
> MinHeapFreeRatio, MaxHeapFreeRatio, and GCTimeRatio flags. Because the 
> heap size will be changed solely based on GC CPU usage, we may not need 
> these separate flags to trigger heap resizing events.

What you suggest is exactly what the GCTimeRatio flag is supposed to do 
(it is just specified differently): size the amount of committed and 
used heap so that the GC CPU overhead (or, in this case, the ratio 
between GC CPU usage and mutator CPU usage) is kept at a certain level.
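For reference, HotSpot translates GCTimeRatio=r into an acceptable GC 
time fraction of 1/(1+r). A minimal sketch of that relation (the class 
and method names here are my own, not HotSpot API):

```java
// Sketch of how GCTimeRatio maps to a target GC CPU overhead.
// HotSpot interprets GCTimeRatio=r as: spend at most 1/(1+r) of
// total time in GC. Names are illustrative only.
public class GcTimeRatioSketch {
    // Maximum acceptable fraction of total time spent in GC.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1.0 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // G1's default GCTimeRatio is 12, i.e. roughly 7.7% GC overhead;
        // the throughput collectors default to 99, i.e. 1%.
        System.out.printf("GCTimeRatio=12 -> %.3f%n", maxGcFraction(12));
        System.out.printf("GCTimeRatio=99 -> %.3f%n", maxGcFraction(99));
    }
}
```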

However, as indicated earlier, the current heap sizing based on 
GCTimeRatio is *broken* (basically since day one) and expands too much, 
even on stable loads. This is exactly what the other CRs I mentioned 
earlier are/were supposed to fix (JDK-8253413 and JDK-8238687 - really, 
please have a look at what they do :). JDK-8247843 is then about 
re-tuning the default GCTimeRatio value; note that I'm not sure that 
specifying it as a ratio is ideal.

I think they also remove, or at least push back, the use of 
MinHeapFreeRatio and MaxHeapFreeRatio. (JDK-8248324 removes a bit more 
of that, I think, but there were thoughts about going further for full 
GC, because otherwise it won't complement what non-full GCs do.)

After that I wanted to add SoftMaxHeapSize as another sizing condition 
for cases that this heuristic does not catch.
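To illustrate the direction you describe, here is a very rough sketch 
(my own, not anything in HotSpot) of a feedback rule that nudges a 
SoftMaxHeapSize target based on observed GC CPU overhead. In a real 
agent the overhead input could be derived from 
GarbageCollectorMXBean.getCollectionTime() samples; the thresholds and 
step sizes below are arbitrary placeholders:

```java
// Hypothetical controller that adjusts a SoftMaxHeapSize target so that
// GC CPU overhead converges toward a goal. Not HotSpot code; just a
// sketch of the feedback idea discussed above.
public class SoftMaxControllerSketch {
    // If GC overhead is well above target, grow the soft limit to give
    // the GC more room; if far below target, shrink it to return memory.
    // Returns the new soft max target in bytes.
    static long adjust(long softMaxBytes, double gcOverhead, double targetOverhead) {
        if (gcOverhead > targetOverhead * 1.1) {
            return (long) (softMaxBytes * 1.2);   // expand by 20%
        } else if (gcOverhead < targetOverhead * 0.5) {
            return (long) (softMaxBytes * 0.9);   // shrink by 10%
        }
        return softMaxBytes;                      // inside the dead zone: no change
    }

    public static void main(String[] args) {
        long soft = 1L << 30; // 1 GiB starting target
        System.out.println(adjust(soft, 0.15, 0.077)); // overhead too high -> grow
        System.out.println(adjust(soft, 0.01, 0.077)); // overhead tiny -> shrink
        System.out.println(adjust(soft, 0.077, 0.077)); // near target -> unchanged
    }
}
```

The dead zone between the two thresholds is there to avoid oscillating 
resizes on a stable load; that kind of hysteresis would matter in 
practice.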

> 
> I'm sure there are a number of factors that go into deciding whether a 
> heap is under or over-provisioned, but I'm wondering if there are any 
> significant ones that need to be considered alongside GC CPU usage. I 
> can also see long pause times as being an indicator that GC may need to 
> run more frequently, etc.

Long pause times are often an indication of a) changing application 
behavior or b) the prediction being way off.

One of the patches I mentioned earlier also improves the latter by 
remodeling young gen sizing. Unfortunately it does not fix the 
cardinality estimations for the remembered sets, which impact young gen 
sizing a lot; that would be JDK-82231731, for which there are 
ideas/prototypes. That more or less required an overhaul of the 
remembered sets, though (JDK-8017163, which is out for review _now_...).

Problem a) is kind of a research question that needs to be addressed at 
some point. There is some experience about the causes and how one could 
detect them, but nothing concrete. It seems that getting problem b) out 
of the way will likely decrease the work to be spent on a) significantly 
anyway...

> (Though I'm not sure whether these will be
> implicitly encompassed as part of GC CPU overhead already.)

Pauses are counted towards GCTimeRatio.

> 
> Let me know what you think - happy to also set up a meeting to discuss 
> this in more detail.

I believe you can say that we are aware of these issues, and there is 
already a sort of "grand plan" in place (if you look at the bugs 
assigned to me) to get at least approximately where you want to be - at 
least if I understand your problem and ideas correctly :). In any case, 
getting there takes time, so help would be very much appreciated.

I'm fine with talking about this in more detail.

Thanks,
   Thomas



More information about the hotspot-gc-dev mailing list