Further discussion on Adaptable Heap Sizing with G1
Monica Beckwith
Monica.Beckwith at microsoft.com
Tue Nov 12 19:11:58 UTC 2024
Hi everyone,
Thank you all for the valuable and detailed discussion around AHS and heap management for G1. I wanted to share some thoughts that align with Thomas’s comments and clarify the best path forward, especially given the distinctions between AHS (Automatic Heap Sizing) and Google’s Adaptive Heap Sizing (AHS-Google). I’ve included simple diagrams to illustrate the technical flow and interactions of each approach.
1. Consolidating Around SoftMaxHeapSize for Dynamic, Adaptive Sizing
Thomas’s suggestion to prioritize SoftMaxHeapSize as the main dynamic driver aligns with my understanding of an effective AHS model. Using SoftMaxHeapSize in this way allows us to minimize the CPU overhead associated with frequent uncommit/commit cycles, which would be a potential risk with a more rigid setting like ProposedHeapSize. Here’s a basic illustration of how Automatic Heap Sizing (AHS) with SoftMaxHeapSize would work dynamically:
+-----------------------------+
|       External Inputs       |
|-----------------------------|
| - Global Memory Pressure    |
| - GCTimeRatio policy        |
| - Heap tunables via         |
|   command line              |
+-----------------------------+
               |
               v
+-----------------------------+
| Automatic Heap Sizing (AHS) |
+-----------------------------+
               |
               v
+-----------------------------+
|  SoftMaxHeapSize (Dynamic)  |
|  - Guides heap size         |
|  - Shrinks under pressure   |
|  - Uses target heuristics   |
+-----------------------------+
               |
               v
+-----------------------------+
|     JVM Heap Management     |
| - Adjusts committed memory  |
| - Controls expansions &     |
|   contractions smoothly     |
+-----------------------------+
By consolidating around SoftMaxHeapSize as the primary “target” flag, we create a more straightforward, adaptive, and consistent experience.
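To make the manageable-flag mechanism concrete, here is a minimal sketch (Java, purely illustrative) of how an external controller or an in-process agent could adjust SoftMaxHeapSize at runtime without restarting the JVM. The class name and the 512 MB target are invented for the example, and how strongly G1 honors the updated value is exactly what the AHS work would define:

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.lang.management.ManagementFactory;

    public class SoftMaxAdjuster {
        public static void main(String[] args) throws Exception {
            // SoftMaxHeapSize is a manageable flag, so it can be updated while the VM runs.
            HotSpotDiagnosticMXBean hotspot =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            long targetBytes = 512L * 1024 * 1024; // illustrative target, not a recommendation
            hotspot.setVMOption("SoftMaxHeapSize", Long.toString(targetBytes));
            System.out.println("SoftMaxHeapSize is now "
                + hotspot.getVMOption("SoftMaxHeapSize").getValue());
        }
    }

The same update can also be made from outside the process with jcmd <pid> VM.set_flag SoftMaxHeapSize <bytes>.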
2. The AHS-Google Approach and Its Challenges
Google’s current Adaptive Heap Sizing (AHS-Google) approach uses ProposedHeapSize as a fixed committed size target. While this allows for setting a specific target for memory use, it introduces some challenges, particularly with forced uncommit/commit cycles that might ignore dynamic inputs. Here’s how this approach typically functions:
+-----------------------------+
|      AHS-Google Logic       |
|-----------------------------|
| - Periodic GC with target   |
| - Uses ProposedHeapSize as  |
|   "optimal" committed size  |
+-----------------------------+
               |
               v
+-----------------------------+
|  ProposedHeapSize (Fixed)   |
|  - Forced committed memory  |
|  - Overrides dynamic inputs |
|  - Can cause frequent       |
|    uncommit/commit cycles   |
+-----------------------------+
               |
               v
+-----------------------------+
|     JVM Heap Management     |
| - Follows set memory level  |
| - May ignore external       |
|   pressure signals          |
+-----------------------------+
A purely AHS-based approach would allow SoftMaxHeapSize to adjust dynamically in response to real-time signals without forcing committed memory levels. This avoids unnecessary CPU cycles and provides a more adaptive response to environmental changes, such as fluctuating memory demands in containerized and cloud environments.
3. Key Differences Between AHS and AHS-Google
In my understanding:
• AHS (Automatic Heap Sizing): Focuses on finding a reasonable heap size based on external memory pressure and dynamically adjusts according to environmental inputs. This aligns with Thomas’s point that AHS should allow for minimal user intervention and let dynamic factors guide heap behavior.
• AHS-Google: Treats ProposedHeapSize as a fixed input, overriding dynamic adjustments. While this gives more explicit control, it limits adaptability and could introduce inefficiencies, as mentioned earlier.
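For concreteness, the external-controller pattern behind this kind of flag might look roughly like the sketch below (Java, purely illustrative). Google's controller sets ProposedHeapSize, which is not a mainline flag, so the sketch writes SoftMaxHeapSize instead; the cgroup v2 paths and the 20% headroom heuristic are assumptions made for the example, not Google's actual logic:

    import com.sun.management.HotSpotDiagnosticMXBean;
    import java.lang.management.ManagementFactory;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class HeapTargetController implements Runnable {
        private final HotSpotDiagnosticMXBean hotspot =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);

        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    long limit = readBytes("/sys/fs/cgroup/memory.max");     // container memory limit
                    long used  = readBytes("/sys/fs/cgroup/memory.current"); // current container usage
                    long heapCommitted = ManagementFactory.getMemoryMXBean()
                        .getHeapMemoryUsage().getCommitted();
                    long nonHeap = Math.max(used - heapCommitted, 0);        // memory outside the Java heap
                    long budget  = (long) ((limit - nonHeap) * 0.8);         // keep ~20% headroom (made up)
                    long target  = Math.min(Math.max(budget, 0), Runtime.getRuntime().maxMemory());
                    hotspot.setVMOption("SoftMaxHeapSize", Long.toString(target));
                    Thread.sleep(10_000);                                    // re-evaluate every 10 seconds
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } catch (Exception e) {
                    return; // keep the sketch simple: stop on any I/O or flag error
                }
            }
        }

        private static long readBytes(String path) throws Exception {
            String s = Files.readString(Path.of(path)).trim();
            return "max".equals(s) ? Long.MAX_VALUE : Long.parseLong(s);
        }
    }

The point of the sketch is only the shape of the interaction: container signals come in from outside, and a single manageable flag is the channel into the JVM.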
4. Moving Forward with a Balanced, Dynamic AHS for G1
Based on the discussion, I suggest we focus on developing an AHS model that leverages SoftMaxHeapSize as the adaptable target, allowing the JVM to adjust based on real-time memory pressures and CPU usage. Integrating multiple inputs dynamically will create a robust model for managing “noisy neighbor” challenges—a very real need in today’s cloud and container scenarios and one that AHS is well-suited to manage, as highlighted in Erik’s recent JVMLS presentation.
Thank you all again for the insightful conversation and technical contributions. I believe these steps will help us build a technically sound and stable AHS for G1.
Please feel free to correct any misunderstandings or clarify any points where further alignment is needed.
Regards,
Monica
From: hotspot-gc-dev <hotspot-gc-dev-retn at openjdk.org> On Behalf Of Jonathan Joo
Sent: Thursday, October 17, 2024 7:11 PM
To: Thomas Schatzl <thomas.schatzl at oracle.com>
Cc: hotspot-gc-dev at openjdk.org
Subject: Re: Further discussion on Adaptable Heap Sizing with G1
Hi Thomas,
The points you mentioned make sense to me! There are some nuances that I'd like to dig into further to make sure that we are aligned. I think to summarize - I'm not sure exactly how SoftMaxHeapSize is intended to work, whereas we have experimented with ProposedHeapSize at Google already, so I want to bridge my gap in understanding there.
I appreciate you offering to meet and discuss! As far as meeting time - I'm currently in US Pacific time, but flexible in terms of when we meet. (I am generally awake from 9am-1am PT, so I am good to meet any time in that time period -- please let me know what time works best for you.) Tuesday and Thursday of the coming week I have the most availability, but if you have any other dates/times in mind, I can let you know whether that works for me.
Best,
~ Jonathan
On Mon, Oct 14, 2024 at 2:52 AM Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
Hi,
On 11.10.24 09:16, Jonathan Joo wrote:
> Hi Thomas,
>
> I think what this suggestion overlooks is that a SoftMaxHeapSize that
> guides used heap size will automatically guide committed size: i.e. if
> G1 shrinks the used heap, G1 will automatically shrink (and keep) the
> committed size.
>
> So ProposedHeapSize seems to be very similar to SoftMaxHeapSize.
>
>
> If I'm understanding this correctly - both ProposedHeapSize and (the
> proposed version of) SoftMaxHeapSize have similar semantics, but
> actually modify the heap in different ways. SoftMaxHeapSize helps us
> determine when to start a concurrent mark, whereas ProposedHeapSize
> doesn't actually trigger any GC directly, but affects the size of the
> heap after a GC. Is that correct? Would it make sense then to have both
> flags, where one helps set a trigger point for a GC, and one helps us
> determine the heap size we are targeting after the GC? I might also be
> missing some nuances here.
I think SoftMaxHeapSize (or actually either) will result in
approximately the same behavior. The difference is in intrusiveness.
ProposedHeapSize forcefully attempts to decrease the committed heap size
and then the rest of the "heap sizing machinery" follows, while
SoftMaxHeapSize gives a target for the "heap sizing machinery" and
committed heap size follows.
ProposedHeapSize has the following disadvantages (as implemented):
- since it forces committed heap size, I expect that in case you are
close to or above that target, you can get frequent uncommits/attempts to
uncommit which waste CPU cycles.
Hopefully, by giving the heap sizing machinery a goal, it will itself
determine a sustainable committed memory level without too frequent
commits and uncommits.
- for some reason it does not allow less memory to be committed than
proposed (but still larger than MinHeapSize). This can be inefficient
wrt memory usage.
I.e. it basically disables other heap sizing afaict.
- (that's more a nit) the use of "0" as special marker for
SoftMaxHeapSize is unexpected.
This mechanism kind of feels like a very blunt tool to get the desired
effect (a certain committed heap) without caring about other goals. It
may be necessary to pull out the immediate un/commit hammer in some
situations, and imho, let's not give that hammer to users as the first
option to screw themselves.
>
> I.e. if I understand this correctly: allowing a higher GC overhead,
> automatically shrinks the heap.
>
>
> Exactly - in practice, tuning this one parameter up (the target gc cpu
> overhead) correlates with decreasing both the average as well as maximum
> heap usage for a java program.
>
> I noticed the same with the patch attached to the SoftMaxHeapSize CR
> (https://bugs.openjdk.org/browse/JDK-8236073) discounting effects of
> Min/MaxHeapFreeRatio (i.e. if you remove it,
> https://bugs.openjdk.org/browse/JDK-8238686 explains the issue).
> In practice, these two flags prohibit G1 from adjusting the heap unless
> the SoftMaxHeapSize change is very large.
>
>
> So I would prefer to only think of an alternative to SoftMaxHeapSize if
> it has been shown that it does not work.
>
>
> Given that you have a much stronger mental model than I do of how all
> these flags fit together in the context of G1 GC, perhaps it would be
> helpful to schedule some time to chat in person! I think that would help
> clarify things much more quickly than email. To be clear - I have no
reason to believe that SoftMaxHeapSize does not work. On the other hand,
> could we possibly make use of both flags? For example, could
> SoftMaxHeapSize potentially be a good replacement for our periodic GC?
Not sure what periodic GC has to do with SoftMaxHeapSize.
>
> There is the nit that unlike in this implementation of ProposedHeapSize,
> SoftMaxHeapSize will not cause uncommit below MinHeapSize. This is
> another discussion on what to do about this issue - in a comment in
> https://bugs.openjdk.org/browse/JDK-8236073 it is proposed to make
> MinHeapSize manageable.
>
>
> How useful is MinHeapSize in practice? Do we need it, or can we just set
> it to zero to avoid having to deal with it at all?
I think you are mixing AHS (give decent heap sizing in presence of
external memory pressure) and getting "optimal" heap sizing (or iow
"steering heap size" externally).
AHS is mostly about the user not doing/setting any heap sizes; in this
case just having min heap size very low is fine, just as suggested
in the JEP.
SoftMaxHeapSize (and ProposedHeapSize) is about the user setting a
particular goal according to his whims. It is still interesting to set
-Xms==-Xmx for e.g. fast startup or during heavy activity; however if an
external system decides that it is useful to intermittently save memory
up to a certain level, then follow that guidance.
The mechanism to internally follow that guidance can be used by AHS.
>
> I (still) believe that AHS and SoftMaxHeapSize/ProposedHeapSize are
> somewhat orthogonal.
>
> AHS (https://openjdk.org/jeps/8329758) is about finding a reasonable
> heap size, and adjust on external "pressure". SoftMax/ProposedHeapSize
> are manual external tunings.
>
>
> Wdyt?
>
>
> I agree with the general idea - for us, we used a manual external flag
> like ProposedHeapSize because we did not implement any of the AHS logic
> in the JVM. (We had a separate AHS thread reading in container
> information and then doing the calculations, then setting
> ProposedHeapSize as a manageable flag.) The way I see it is that
> SoftMax/ProposedHeapSize is the "output" of AHS, and then
> SoftMax/ProposedHeapSize is the "input" for the JVM, after which the JVM
> uses this input to adjust its behavior accordingly. Does that align with
> how you see things?
As mentioned in the other thread, SoftMaxHeapSize can be used by AHS to
get heap to a certain level (based on memory pressure), but there is
also that external entity that can modify SoftMaxHeapSize to adjust VM
behavior.
So ultimately there will be multiple inputs for target heap size (and
probably I'm forgetting one or the other):
* External memory pressure (AHS) (*)
* CurrentMaxHeapSize
* SoftMaxHeapSize
* CPU usage (existing GCTimeRatio based policy)
* other *HeapSize flags
that need to be merged into some target heap level using some policy.
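As a purely illustrative sketch (invented names, not actual G1 code), such a merge could be as simple as letting the GCTimeRatio-driven sizing propose a level and then applying the most restrictive of the caps, bounded below by the minimum heap size:

    // Illustrative only: one possible way to merge the inputs listed above.
    static long mergedHeapTarget(long gcTimeRatioTarget,   // proposal from the existing CPU/GC-overhead policy
                                 long softMaxHeapSize,     // 0 means "not set"
                                 long currentMaxHeapSize,
                                 long externalPressureCap, // derived from host/container memory pressure
                                 long minHeapSize) {
        long target = gcTimeRatioTarget;
        if (softMaxHeapSize != 0) {
            target = Math.min(target, softMaxHeapSize);
        }
        target = Math.min(target, currentMaxHeapSize);
        target = Math.min(target, externalPressureCap);
        return Math.max(target, minHeapSize);               // never aim below the configured minimum
    }

A real policy would have to weigh these inputs rather than simply clamp, but the shape of the problem is the same.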
After knowing that level, the VM needs to decide on a proper reaction,
which might be anything from just setting the internal IHOP goal, to
(un-)committing memory directly, to doing the appropriate garbage
collection in a "timely" fashion (which is where the regular periodic
gc/marking comes in) or anything in between.
(*) I am aware that the AHS JEP not only includes reacting to external
memory pressure but also the merging of goals from different sources;
some of them are ZGC specific. Some of them are already implemented in
G1. So for this discussion it is imo useful to limit "AHS" in the G1 context
to things that G1 does not do, i.e. "return another goal based on
external memory pressure", "min/max heap size defaults(!)", and "adjust
adaptive sizing".
> If we do indeed implement AHS logic fully within the JVM, then we could
> internally manage the sizing of the heap without exposing a manageable
> flag. That being said, it seems to me that exposing this as a manageable
> flag brings the additional benefit that one could plug in their own AHS
> implementation that calculates target heap sizes with whatever data they
> want (and then passes it into the JVM via the manageable flag).
>
> Again, I wonder if meeting to discuss would be efficient, and then we
> can update the mailing list with the results of our discussion. Let me
> know your thoughts!
It's fine with me to meet to recap and discuss above; please suggest
some time.
Hth,
Thomas