Status of JEP-8204088/JDK-8236073
Thomas Schatzl
thomas.schatzl at oracle.com
Fri May 21 08:41:25 UTC 2021
Hi,
On 20.05.21 20:00, Jonathan Joo wrote:
> +cc Man Cao (manc at google.com)
>
> Hi Thomas,
>
> I've been thinking more about SoftMaxHeapSize and how we might use it.
> Our preliminary thoughts have revolved around using GC CPU overhead as a
> metric to determine a reasonable SoftMaxHeapSize value (assuming
> SoftMaxHeapSize is dynamic and can change at runtime). Do you think this
> is viable? For example, setting a predetermined target GC CPU overhead,
> and using this to either increase or decrease SoftMaxHeapSize accordingly.
Yes.
>
> Doing this may also have the benefit of removing the need for
> MinHeapFreeRatio, MaxHeapFreeRatio, and GCTimeRatio flags. Because the
> heap size will be changed solely based on GC CPU usage, we may not need
> these separate flags to trigger heap resizing events.
What you suggest is exactly what the GCTimeRatio flag is supposed to do
(though it is specified in a different way): size the amount of
committed and used heap so that the gc cpu overhead (or, in this case,
the ratio between gc cpu usage and mutator cpu usage) is kept at a
certain level.
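To make the idea concrete, an overhead-driven controller could look roughly
like this. This is a minimal sketch, not HotSpot code; the class and method
names (`SoftMaxController`, `nextSoftMax`) are made up, and the naive
multiplicative step stands in for whatever damped heuristic a real collector
would use:

```java
// Hypothetical sketch: adjust a soft heap-size target so that observed
// GC CPU overhead tracks a target overhead. Illustrative names only.
final class SoftMaxController {
    private final double targetOverhead;   // e.g. 0.02 == 2% of CPU time in GC
    private final long minHeap, maxHeap;   // hard bounds in bytes

    SoftMaxController(double targetOverhead, long minHeap, long maxHeap) {
        this.targetOverhead = targetOverhead;
        this.minHeap = minHeap;
        this.maxHeap = maxHeap;
    }

    /** Returns a new soft max: grow when GC is too costly, shrink when it is cheap. */
    long nextSoftMax(long currentSoftMax, double observedOverhead) {
        double error = observedOverhead / targetOverhead;   // > 1 means GC too costly
        double step  = Math.max(0.5, Math.min(2.0, error)); // clamp the adjustment
        long next = (long) (currentSoftMax * step);
        return Math.max(minHeap, Math.min(maxHeap, next));
    }
}
```

A real implementation would smooth the observed overhead over several GCs
before reacting; reacting to a single noisy sample makes the heap size
oscillate.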
However, as I indicated earlier, the current heap sizing based on
GCTimeRatio is *broken* (basically since day one) and expands too much,
even on stable loads. This is exactly what the other CRs I mentioned
earlier are/were supposed to fix (JDK-8253413 and JDK-8238687; really,
please have a look at what they do :). JDK-8247843 is then about
re-tuning the default GCTimeRatio value; note that I'm not sure that
specifying it as a ratio is ideal.
I think they also remove, or at least push back, the use of
MinHeapFreeRatio and MaxHeapFreeRatio. (JDK-8248324 removes them a bit
more, I think, but there were thoughts to go further for full gc,
because otherwise full gc sizing won't complement what non-full gcs do.)
After that I wanted to add SoftMaxHeapSize as another sizing condition
for cases that this heuristic does not catch.
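As a rough picture of what "another sizing condition" could mean (again an
illustrative sketch with made-up names, not actual HotSpot code):
SoftMaxHeapSize simply caps whatever target the overhead-based heuristic
would otherwise pick, bounded below by the minimum heap size.

```java
// Hypothetical combination of sizing conditions, for illustration only.
final class HeapTarget {
    static long effectiveTarget(long heuristicTarget, long softMax, long minHeap) {
        // Honor the soft max when it is tighter than the heuristic's choice,
        // but never go below the configured minimum heap size.
        return Math.max(minHeap, Math.min(heuristicTarget, softMax));
    }
}
```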
>
> I'm sure there are a number of factors that go into deciding whether a
> heap is under or over-provisioned, but I'm wondering if there are any
> significant ones that need to be considered alongside GC CPU usage. I
> can also see long pause times as being an indicator that GC may need to
> run more frequently, etc.
Long pause times are often an indication of a) changing application
behavior or b) the prediction being way off.
One of the patches I mentioned earlier also improves the latter by
remodeling young gen sizing. Unfortunately it does not fix cardinality
estimation for the remembered sets, which impacts young gen sizing a
lot; that would be JDK-82231731, for which there are ideas/prototypes.
That work kind of needed an overhaul of the remembered sets though
(JDK-8017163, which is out for review _now_...).
Problem a) is kind of a research question that needs to be addressed at
some point. There is some experience about the causes and how one could
detect them, but nothing concrete. It seems that getting problem b) out
of the way will likely decrease the work to be spent on a) significantly
anyway...
> (Though I'm not sure whether these will be
> implicitly encompassed as part of GC CPU overhead already.)
Pauses are counted towards GCTimeRatio.
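For reference, the relation between the flag and a target overhead is simple
arithmetic; a tiny illustration (the helper class here is made up, not a
HotSpot API):

```java
// GCTimeRatio=N asks the collector to keep GC time at or below
// 1/(1+N) of total time; pause time counts towards that budget.
final class GcTimeRatio {
    static double targetGcOverhead(int gcTimeRatio) {
        return 1.0 / (1.0 + gcTimeRatio);
    }
}
```

With G1's default GCTimeRatio of 12 the target is about 7.7% GC time; the
Parallel collector's default of 99 corresponds to 1%.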
>
> Let me know what you think - happy to also set up a meeting to discuss
> this in more detail.
I believe it's fair to say that we are aware of these issues, and there
is already some "grand plan" sort of in place (if you look at the bugs
assigned to me) to get at least approximately where you want to be - at
least I think so, if I understand your problem and ideas correctly :).
In any case, getting there takes time, so help would be very much
appreciated.
I'm fine with talking about this in more detail.
Thanks,
Thomas
More information about the hotspot-gc-dev mailing list