Integrated: 8353184: ZGC: Simplify and correct tlab_used() tracking
Stefan Johansson
sjohanss at openjdk.org
Tue May 13 07:47:00 UTC 2025
On Wed, 23 Apr 2025 07:58:35 GMT, Stefan Johansson <sjohanss at openjdk.org> wrote:
> Please review this change to improve TLAB handling in ZGC.
>
> **Summary**
> In ZGC the maximum TLAB size is 256k and in many cases we want the TLABs to be this big. But for threads that only allocate a fraction of this, using TLABs of this size causes significant waste. This is normally handled by the shared TLAB sizing heuristic, but a few things in ZGC have prevented this mechanism from working as expected.
>
> The heuristic bases the resizing on several things, and the GC is responsible for providing the amount of memory used for TLABs (`tlab_used()`) and the capacity available for TLABs (`tlab_capacity()`). For the other GCs, capacity is more or less the size of Eden, but since ZGC does not have any generation sizes, there is no given size for Eden. Before this change we returned the heap capacity as the TLAB capacity, since in theory we can use what is left for TLABs. Returning this more or less disables the sizing heuristic, since we only sample the usage when this holds:
>
> ```
> bool update_allocation_history = used > 0.5 * capacity;
> ```
>
> So we need to come up with a better value to return as capacity. We could use the amount of free memory, but this is also an overestimation of what will actually be used. The proposed approach is to use an average over the last 10 values of what was actually used for TLABs as the capacity. This will provide a good estimate of the expected TLAB capacity, and the sizing heuristic will work as expected.
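
To illustrate the idea, here is a minimal standalone sketch of an average over the last N samples, combined with the sampling condition quoted above. `RecentAverage` and the free function `update_allocation_history` are hypothetical names made up for this example, not the HotSpot classes or the code in the changeset.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, not the HotSpot implementation: keeps the last N
// samples in a ring buffer and reports their arithmetic mean.
template <std::size_t N>
class RecentAverage {
  std::array<std::size_t, N> _samples{};
  std::size_t _next = 0;   // slot that the next sample overwrites
  std::size_t _count = 0;  // number of valid samples, at most N

public:
  void add(std::size_t value) {
    _samples[_next] = value;
    _next = (_next + 1) % N;
    if (_count < N) {
      _count++;
    }
  }

  std::size_t average() const {
    if (_count == 0) {
      return 0;
    }
    std::size_t sum = 0;
    for (std::size_t i = 0; i < _count; i++) {
      sum += _samples[i];
    }
    return sum / _count;
  }
};

// Same shape as the sampling condition from the summary.
inline bool update_allocation_history(std::size_t used, std::size_t capacity) {
  return used > 0.5 * capacity;
}
```

With the whole heap as capacity, `used` almost never exceeds half of it, so the history is rarely sampled. With a capacity close to what was recently used for TLABs, the condition holds far more often.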
>
> Another problem in this area is that since ZGC does TLAB retiring concurrently, the used value returned has previously been reset before being used in the sizing heuristic. So to be able to use consistent values, we need to snapshot the usage in the mark start pause for the young generation and use those values for any TLAB retired after this pause.
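
The snapshot idea can be sketched roughly as follows. This is a simplified illustration under stated assumptions: `TlabUsedSnapshot` and its method names are hypothetical, and resetting the live counter in the same step as taking the snapshot is an assumption of this sketch, not something taken from the changeset.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical sketch, not the ZGC code: a live counter that allocation
// paths update, plus a snapshot taken in the mark start pause.
class TlabUsedSnapshot {
  std::atomic<std::size_t> _current{0};   // updated as TLABs are handed out
  std::atomic<std::size_t> _snapshot{0};  // stable value read by the heuristic

public:
  void allocated(std::size_t bytes) {
    _current.fetch_add(bytes, std::memory_order_relaxed);
  }

  // Called in the pause: publish the usage so far and start a new period.
  void mark_start_pause() {
    const std::size_t used = _current.exchange(0, std::memory_order_relaxed);
    _snapshot.store(used, std::memory_order_relaxed);
  }

  // TLABs retired concurrently after the pause all see the same value.
  std::size_t tlab_used() const {
    return _snapshot.load(std::memory_order_relaxed);
  }
};
```

Because readers only look at the snapshot, allocations that happen after the pause do not change the value a concurrently retiring thread feeds into the sizing heuristic.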
>
> How we track the TLAB used value is also changed. Before this change, TLAB used was tracked per-CPU, and the way it was implemented led to some unwanted overhead. We added two additional fields that were tracked for all ages, but only used for Eden. These fields were cleared in the mark start pause, and with many CPUs this actually affects the pause time. The new code tracks the Eden usage in the page allocator instead.
>
> This change also fixes the maximum TLAB size returned from ZGC so that it is in words, not bytes. This will mostly help logging, since the actual sizing is still enforced correctly.
>
> **Testing**
> * Functional testing tier1-tier7
> * Performance testing in A...
This pull request has now been integrated.
Changeset: 526f543a
Author: Stefan Johansson <sjohanss at openjdk.org>
URL: https://git.openjdk.org/jdk/commit/526f543adfeb90341b3b5b18916c1bb7ef725599
Stats: 227 lines in 12 files changed: 180 ins; 41 del; 6 mod
8353184: ZGC: Simplify and correct tlab_used() tracking
Reviewed-by: stefank, aboldtch
-------------
PR: https://git.openjdk.org/jdk/pull/24814