Status of JEP-8204088/JDK-8236073

Thomas Schatzl thomas.schatzl at oracle.com
Wed Jun 9 09:15:29 UTC 2021


Hi Jonathan,

On 08.06.21 01:16, Jonathan Joo wrote:
> Hi Thomas,
> 
> 
> I took some time to read through the bugs related to GCTimeRatio.
> 
> 
> I think GCTimeRatio *may* work for this purpose, if all of the relevant 
> open issues are addressed. Like you mentioned in your email, I was 
> indeed able to repro the fact that even when GCTimeRatio is set to 
> aggressive levels (i.e. GCTimeRatio=1), too much of the heap is still 
> allocated. So fixing the related bugs may definitely help here, and I'll 
> experiment more with your proposed fixes.

Okay, thanks for giving them a try.

> Furthermore, I'd like to also
> investigate how well SoftMaxHeapSize works at keeping heap usage within 
> the limit - you mentioned in your earlier email that the heap sizing 
> issues have been addressed but I wasn't sure of the exact status of 
> that. I'll patch your changes at 
> https://github.com/tschatzl/jdk/tree/8238687-investigate-memory-uncommit-during-young-gc2 to 
> get a firsthand idea.

Summing it up, the currently available patches are:

JDK-8238687 and JDK-8253413: improve the (re-)sizing policy and act on 
it at any young gc:
https://github.com/tschatzl/jdk/tree/8238687-investigate-memory-uncommit-during-young-gc2
JDK-8248324: removes heap resizing at remark, which used a completely 
different policy anyway. Full gc is still an issue, but "it should not 
happen". Patch attached to CR.
JDK-8236073: implements SoftMaxHeapSize, patch attached to CR.

Not sure if they address all of the heap sizing issues, but in my tests 
with them, heap size is much more stable and follows GCTimeRatio more 
closely.

There may be a need to reconsider the default value of GCTimeRatio after 
these changes, I don't know.

> 
> 
> However, one consideration against GCTimeRatio is that GCTimeRatio 
> relies on GC pause times, whereas ideally we can use total CPU overhead. 
> (The latter would be able to incorporate time spent by concurrent GC 
> worker threads, which may be constantly doing work in the background. As 
> far as I understand, this is not necessarily reflected in pause times.) 

GCTimeRatio was introduced with pure STW (stop-the-world) garbage 
collectors. At that time, getting per-process cpu measurements might have 
been more complicated too, and cpu sharing less common.

So some modification of the meaning of "GCTime" for partially concurrent 
GCs may be appropriate.
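
To make the elapsed-time flavour of "GCTime" concrete, here is a rough 
sketch that relates the collection time reported by the standard 
management beans to JVM uptime. It is not taken from any of the patches; 
the class and method names are made up for illustration. What each 
collector counts towards getCollectionTime() is collector- and 
version-specific, and as far as I know this API does not expose the cpu 
time of concurrent GC worker threads, so a cpu-based ratio would need 
another data source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, for illustration only.
final class GcOverheadSketch {
    // Fraction of an interval attributed to GC, clamped to [0, 1].
    static double overhead(long gcMillis, long intervalMillis) {
        if (intervalMillis <= 0) {
            return 0.0;
        }
        double fraction = (double) gcMillis / intervalMillis;
        return Math.max(0.0, Math.min(1.0, fraction));
    }

    // Sum of the accumulated collection time over all collector beans.
    // A bean reports -1 when the value is undefined; those are skipped.
    static long reportedGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("reported GC time / uptime = %.4f%n",
                overhead(reportedGcMillis(), uptimeMillis));
    }
}
```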

> Thus, I believe there are slight differences there which make CPU 
> overhead a more accurate measurement of "load" than GC pause times (at 
> least, for the use case we anticipate here at Google).

> 
> 
> We already have developed some internal patches which allow us to 
> compute GC CPU overhead, so using this metric to influence 
> SoftMaxHeapSize shouldn't be too much of a problem for us. Given that we 
> have this information:

Fwiw, in my opinion the intention of SoftMaxHeapSize has been more to 
account for external user requirements not caught by the internal gc 
load, not that gc load should guide SoftMaxHeapSize (and override it) 
directly. I.e. as an orthogonal consideration for heap sizing.

Of course, GCTimeRatio (or GCCpuRatio) ultimately also determines some 
kind of heap size goal, and both SoftMaxHeapSize and that goal 
determined by GCTimeRatio need to be consolidated into a single value - 
after all there can be only one actual heap size in the VM :)

Afaik the current thinking is that that ultimate heap size goal should 
be something like

   min(SoftMaxHeapSize, Goal-set-by-GCTimeRatio)
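
As a minimal sketch of that consolidation (not the code in the patches; 
all names are hypothetical, and the final clamping to the minimum and 
maximum heap size is my assumption):

```java
// Hypothetical helper, for illustration only; sizes are in bytes.
final class HeapGoalSketch {
    static long heapGoal(long softMaxHeapSize, long timeRatioGoal,
                         long minHeapSize, long maxHeapSize) {
        // The smaller of the two goals wins.
        long goal = Math.min(softMaxHeapSize, timeRatioGoal);
        // Assumption: the result still has to respect the hard heap limits.
        return Math.max(minHeapSize, Math.min(maxHeapSize, goal));
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        System.out.println(heapGoal(4 * gb, 6 * gb, 1 * gb, 8 * gb) / gb + " GB");
    }
}
```

With a 4 GB soft limit, a GCTimeRatio-derived goal of 6 GB is cut down to 
4 GB, while a goal of 2 GB is used as is.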

>  1.
> 
>     Do you see any benefit to using pause times to determine
>     SoftMaxHeapSize rather than CPU overhead? Is one more viable than
>     the other?

To a large degree I think that pause time has (historically) just been a 
substitute for GC cpu overhead that is more convenient to calculate 
(cross OS and everything) and fairly accurate.

> 
>  2.
> 
>     Do you think there is value in modifying GCTimeRatio to measure CPU
>     overhead rather than pause times?

This is just my opinion, but yes.

> 
>  3.
> 
>     If not, would it be helpful to still introduce this functionality
>     into the JVM, perhaps as a new JVM flag like `GCCpuRatio`? (So as to
>     not collide with GCTimeRatio's existing functionality.)

Although I agreed above, there may be value in adding a new flag anyway: 
GCTimeRatio is fairly clumsy to use (i.e. GCCpuRatio = 1 / (1 + 
GCTimeRatio)). At least we should make it a floating point value....
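
To illustrate the conversion, a small sketch (GCCpuRatio is only the 
name proposed in this thread, not an existing flag; class and method 
names are made up):

```java
// Hypothetical helper, for illustration only.
final class GcRatioSketch {
    // Fraction of time allowed for GC: 1 / (1 + GCTimeRatio).
    static double gcFraction(double gcTimeRatio) {
        return 1.0 / (1.0 + gcTimeRatio);
    }

    // Inverse mapping: the GCTimeRatio that corresponds to a GC fraction.
    static double timeRatioFor(double gcFraction) {
        return (1.0 - gcFraction) / gcFraction;
    }

    public static void main(String[] args) {
        for (double ratio : new double[] {1, 9, 99}) {
            System.out.printf("GCTimeRatio=%.0f -> GC fraction %.4f%n",
                    ratio, gcFraction(ratio));
        }
        System.out.printf("GC fraction 0.075 -> GCTimeRatio %.2f%n", timeRatioFor(0.075));
    }
}
```

GCTimeRatio=1 corresponds to half of the time in GC and GCTimeRatio=99 
to 1%. A target like 7.5% maps to a GCTimeRatio of about 12.33, which an 
integer value cannot express.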

Thanks,
   Thomas



More information about the hotspot-gc-dev mailing list