G1 AHS + Request for Feedback and Testing on G1 Heap Resizing Prototype

Monica Beckwith Monica.Beckwith at microsoft.com
Thu May 8 22:47:25 UTC 2025


Hi all,

Thanks to everyone for the ongoing AHS discussions across JDK-8236073, JDK-8238686/JDK-8238687, and the umbrella issue JDK-8353716.

From the Microsoft side, we have been reviewing logs from a range of production-like use cases across the broader Microsoft environment, including first-party Java services (both Azure-hosted and non-Azure) as well as OSS-based deployments (Cassandra, Kafka, etc.). We've also been benchmarking with various flag combinations (G1ReservePercent, GCTimeRatio, periodic GC, etc.) and exploring early models to help gauge expected shrink/grow behavior under service conditions. These observations have shaped our perspective and contributions to upstream design discussions.

Here's where we currently stand:

------------------------------------------------------------------------
1.  SoftMaxHeapSize semantics and placement
------------------------------------------------------------------------

We continue to support the current SoftMax proposal as a **soft upper bound** on heap usage—one that the GC controller respects, but may temporarily exceed if necessary. Our analysis of logs shows that an effective SoftMax, even when static, would help reduce RSS under light traffic without requiring aggressive full GCs.

We also plan to evaluate the controller changes under PR #24211 once they’re merged, and we’d like to keep the option of a `jcmd GC.set_soft_max` interface, consistent with ZGC and future container signals (e.g. memory.high).
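To make the "soft upper bound" wording concrete, here is a minimal sketch of how such a bound could interact with a hard limit. It is an illustrative model only, not the G1 controller or the code in the PR; the class name, method name, and the `required` input (the smallest size the heap can run in at that moment) are all assumptions made for the example.

```java
// Illustrative model only: names and inputs are hypothetical, not G1 internals.
final class SoftMaxModel {
    private SoftMaxModel() {}

    /**
     * Returns the heap size in bytes that a controller would aim for.
     *
     * @param desired  size the controller would choose with no soft bound
     * @param required smallest size the heap can currently run in (assumed input)
     * @param softMax  soft upper bound, as in SoftMaxHeapSize
     * @param hardMax  hard upper bound, as in -Xmx
     */
    static long targetHeapBytes(long desired, long required, long softMax, long hardMax) {
        long target = Math.min(desired, softMax); // respect the soft bound by default
        target = Math.max(target, required);      // exceed it only when necessary
        return Math.min(target, hardMax);         // never exceed the hard limit
    }
}
```

With a 4 GB soft bound and a 16 GB hard limit, this model returns 4 GB while 2 GB is required, and 6 GB once 6 GB is required, which is the "respects, but may temporarily exceed" behavior described above.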

------------------------------------------------------------------------
2.  GCTimeRatio as a feedback driver
------------------------------------------------------------------------

We support the move to a higher default value for `GCTimeRatio` as it aligns well with throughput goals in our measured workloads, including SPECjbb2015, DBs, and Spring-based services. We plan to continue stepped testing across representative service patterns. We'd also support exposing an alias like `-XX:GCCPUPercent` to improve ergonomics for operators.
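For readers less familiar with the flag, `GCTimeRatio=N` expresses a goal of spending roughly 1/(1+N) of total time in GC. The sketch below converts between the ratio and a percentage, which is the kind of mapping a percent-style alias would need. How `-XX:GCCPUPercent` would actually be defined is not settled here; the helper names and the inverse mapping are assumptions for illustration.

```java
// Hypothetical helper; only the 1/(1+N) relationship comes from the flag's documented meaning.
final class GcTimeRatioMath {
    private GcTimeRatioMath() {}

    /** GC time goal as a percentage of total time: 100 / (1 + GCTimeRatio). */
    static double gcPercent(int gcTimeRatio) {
        if (gcTimeRatio < 0) {
            throw new IllegalArgumentException("GCTimeRatio must be non-negative");
        }
        return 100.0 / (1.0 + gcTimeRatio);
    }

    /** Inverse mapping (assumed): the ratio that yields the given GC percentage. */
    static double ratioForPercent(double gcPercent) {
        if (gcPercent <= 0.0 || gcPercent > 100.0) {
            throw new IllegalArgumentException("percent must be in (0, 100]");
        }
        return 100.0 / gcPercent - 1.0;
    }
}
```

For example, a ratio of 12 corresponds to a goal of about 7.7% of time in GC, and a ratio of 24 to 4%, so raising the default tightens the GC time budget.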

------------------------------------------------------------------------
3.  Reserve floor and shrink control
------------------------------------------------------------------------

We strongly recommend retaining `G1ReservePercent` as a configurable minimum, particularly in low-latency scenarios or when allocation bursts are expected immediately after idle phases. We’d also be open to exploring future adaptive variants of the reserve floor as the AHS loop matures.
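As a rough illustration of what a reserve floor means for shrinking, the sketch below computes the smallest capacity that still leaves a given percentage of the heap free, and clamps a proposed shrink target to it. This is a simplified byte-based model under our own assumptions (G1 itself reasons in regions), and the class and method names are hypothetical.

```java
// Simplified, byte-based model of a reserve floor; not G1's region-based accounting.
final class ReserveFloor {
    private ReserveFloor() {}

    /** Smallest capacity such that (capacity - used) is at least reservePercent of capacity. */
    static long minCapacityBytes(long usedBytes, int reservePercent) {
        if (usedBytes < 0 || reservePercent < 0 || reservePercent >= 100) {
            throw new IllegalArgumentException("invalid used size or reserve percent");
        }
        // capacity - used >= capacity * p / 100  =>  capacity >= used * 100 / (100 - p)
        long denom = 100L - reservePercent;
        // Ceiling division; multiplyExact guards against overflow for absurdly large inputs.
        return (Math.multiplyExact(usedBytes, 100L) + denom - 1) / denom;
    }

    /** Clamps a proposed shrink target so the reserve is never given back. */
    static long shrinkTarget(long proposedBytes, long usedBytes, int reservePercent) {
        return Math.max(proposedBytes, minCapacityBytes(usedBytes, reservePercent));
    }
}
```

With 900 bytes used and a 10% reserve, the floor is 1000 bytes, so a proposal to shrink to 800 is raised to 1000 while a proposal of 2000 is left alone.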

------------------------------------------------------------------------
4.  Periodic GC fallback and field heuristics
------------------------------------------------------------------------

Until AHS-driven shrink behavior is well understood and widely adopted, we recommend retaining a periodic GC safety net—especially for services with extended idle phases. As AHS matures, we’ll continue to evaluate whether this fallback remains necessary in production.
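The safety net we have in mind is the kind of check sketched below: trigger a collection only when the heap has been idle for a configured interval and the machine is not busy. This is a sketch under stated assumptions, loosely modelled on our reading of the existing periodic GC flags (`G1PeriodicGCInterval`, `G1PeriodicGCSystemLoadThreshold`); it is not the HotSpot implementation, and the class and parameter names are hypothetical.

```java
// Hypothetical decision helper; not the HotSpot periodic GC code.
final class PeriodicGcCheck {
    private PeriodicGcCheck() {}

    /**
     * @param millisSinceLastGc time since any collection last ran
     * @param intervalMillis    idle interval before a periodic GC; 0 or less disables it
     * @param loadAverage       recent system load average
     * @param loadThreshold     skip the GC above this load; 0 or less disables the check
     */
    static boolean shouldTrigger(long millisSinceLastGc, long intervalMillis,
                                 double loadAverage, double loadThreshold) {
        if (intervalMillis <= 0) {
            return false;                       // periodic GC disabled
        }
        if (millisSinceLastGc < intervalMillis) {
            return false;                       // a GC ran recently enough
        }
        if (loadThreshold > 0.0 && loadAverage > loadThreshold) {
            return false;                       // system is busy, do not add GC work
        }
        return true;
    }
}
```

Keeping the check this cheap is the point: it only matters for services that go quiet long enough that the regular AHS feedback loop never gets a chance to run.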

------------------------------------------------------------------------
5.  Role of externally-supplied limits
------------------------------------------------------------------------

Internally, we’ve discussed how AHS should behave in managed container environments such as AKS. In most cases we expect the JVM to operate within cgroup-defined memory.max and possibly memory.high bounds.

We don’t currently envision supporting non-cgroup (custom/embedded) environments on day one. We also believe that memory.high or RSS-based constraints could eventually serve as complementary signals for guiding heap elasticity, especially for AKS customers.

These use cases are still exploratory, but we hope they can be accommodated within the direction of AHS without adding undue complexity to the core loop.
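As a sketch of what consuming these signals could look like, the snippet below parses a cgroup v2 limit value (the `memory.max` and `memory.high` files hold either a byte count or the literal `max`) and derives a soft heap bound from it. The parsing reflects the cgroup v2 file format; the policy of taking a fixed fraction of `memory.high`, falling back to `memory.max`, is purely an assumption for illustration, as are all names.

```java
import java.util.OptionalLong;

// Hypothetical helper; the fraction-of-limit policy is an assumption, not a proposal.
final class CgroupLimit {
    private CgroupLimit() {}

    /** Parses the content of a cgroup v2 limit file; "max" means no limit is set. */
    static OptionalLong parse(String raw) {
        String s = raw.trim();
        if (s.equals("max")) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseLong(s));
    }

    /** Picks memory.high if set, else memory.max, and scales it; otherwise returns the fallback. */
    static long softMaxBytes(OptionalLong high, OptionalLong max, double fraction, long fallback) {
        OptionalLong limit = high.isPresent() ? high : max;
        return limit.isPresent() ? (long) (limit.getAsLong() * fraction) : fallback;
    }
}
```

In a container these values would typically be read from files such as `/sys/fs/cgroup/memory.high`; the file I/O is left out so the logic stays testable on any platform.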

------------------------------------------------------------------------
6.  Design notes and alignment
------------------------------------------------------------------------

For reference, our current AHS evaluation and alignment write-up (including control flow diagrams and tuning strategy) is here:

    https://github.com/microsoft/openjdk-workstreams/tree/main/G1-AHS

We’ll continue to update that as PRs land and more data becomes available. We welcome any feedback on the write-up or our alignment approach and would be happy to incorporate community input via PRs. We are also open to hosting the write-up within an OpenJDK project repo if that's deemed appropriate.

Thanks again to everyone driving this effort forward—happy to continue refining as the pieces come together.

Best regards,  
Monica


