ZGC Allocation stall metrics via MXBean
conniall at amazon.com
Thu Oct 24 17:34:26 UTC 2019
Thanks Per, that's helpful to understand.
On exposing allocation stall information via an MXBean, it would be super nice if it was exposed via a bean that implements NotificationEmitter. We're currently using notifications from the GarbageCollectorMXBean to subscribe to GC events and record data on pause duration, cause, etc, as well as what in-flight operations may have been impacted by the pause. If we could use a similar approach to watch for allocation stalls, instead of polling for stalls via the ThreadMXBean, that would be awesome.
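For reference, the notification-based approach mentioned above looks roughly like the sketch below. It uses only the existing GarbageCollectorMXBean notification API. The class name GcNotificationWatcher and the eventsSeen counter are made up for illustration, and no allocation stall notification exists today, so this shows only the pattern a stall-aware bean could follow. What getDuration() reports depends on the collector and is not necessarily a pause time.

```java
import com.sun.management.GarbageCollectionNotificationInfo;
import com.sun.management.GcInfo;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;
import javax.management.openmbean.CompositeData;

// Hypothetical helper: subscribes to GC notifications instead of polling.
final class GcNotificationWatcher {
    static final AtomicLong eventsSeen = new AtomicLong();

    // Attaches one listener to every collector bean that can emit notifications
    // and returns the number of beans it was attached to.
    static int install() {
        NotificationListener listener = (Notification n, Object handback) -> {
            if (!GarbageCollectionNotificationInfo.GARBAGE_COLLECTION_NOTIFICATION.equals(n.getType())) {
                return;
            }
            GarbageCollectionNotificationInfo info =
                    GarbageCollectionNotificationInfo.from((CompositeData) n.getUserData());
            GcInfo gc = info.getGcInfo();
            eventsSeen.incrementAndGet();
            System.out.printf("%s: action=%s cause=%s duration=%dms%n",
                    info.getGcName(), info.getGcAction(), info.getGcCause(), gc.getDuration());
        };
        int attached = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (bean instanceof NotificationEmitter) {
                ((NotificationEmitter) bean).addNotificationListener(listener, null, null);
                attached++;
            }
        }
        return attached;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("listening on " + install() + " collector bean(s)");
        System.gc();          // request a collection so there is something to observe
        Thread.sleep(500);    // notifications are delivered asynchronously
        System.out.println("events seen: " + eventsSeen.get());
    }
}
```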
On 10/24/19, 00:47, "Per Liden" <per.liden at oracle.com> wrote:
When allocating a small object (object size <= 256K), if the thread
already has a TLAB it will continue to allocate from it without being
stalled. If the TLAB is exhausted, the thread will try to allocate a new
TLAB from a CPU-local ZPage (this can be seen as a "CPU-LAB" for
allocating TLABs), again without being stalled. Only if that CPU-local
ZPage is also exhausted will the thread try to allocate a new ZPage, in
which case it will be stalled if we're currently out of memory.
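To make the implication for a jHiccup-style probe concrete, here is a minimal sketch of a thread that sleeps, allocates a small array, and records how far each iteration overshoots the sleep interval. The class name AllocationHiccupProbe and its parameters are hypothetical. Given the small-object path described above, a probe like this would mostly allocate from its TLAB and so would probably not observe most stalls. The overshoot it measures also includes scheduler jitter and any other pauses.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical probe: sleep, allocate a little, measure the delay beyond the sleep.
final class AllocationHiccupProbe {
    // Written on every iteration so the allocation is not optimized away.
    static volatile Object sink;

    // Returns the worst observed overshoot in nanoseconds (never negative).
    static long maxOvershootNanos(int iterations, long sleepMillis, int allocBytes)
            throws InterruptedException {
        long worst = 0;
        long sleepNanos = TimeUnit.MILLISECONDS.toNanos(sleepMillis);
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            Thread.sleep(sleepMillis);
            sink = new byte[allocBytes];   // small allocation, normally served from the TLAB
            long elapsed = System.nanoTime() - start;
            worst = Math.max(worst, elapsed - sleepNanos);
        }
        return worst;
    }

    public static void main(String[] args) throws InterruptedException {
        long worst = maxOvershootNanos(1000, 1, 1024);
        System.out.println("worst overshoot: " + TimeUnit.NANOSECONDS.toMicros(worst) + " us");
    }
}
```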
The allocation path is slightly different when allocating medium objects
(object size <= 4M). In this case, the first attempt is to allocate the
object into a global/shared medium ZPage. If that page is exhausted, it
will try to allocate a new medium ZPage, and is subject to allocation
stall if we're out of memory.
For large objects (object size > 4M), we always allocate a new large
ZPage, so we'll have an allocation stall if we're out of memory.
In summary, if we're out of memory, a thread might still be able to
allocate objects without being stalled, if the circumstances are right.
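The three paths above can be summarized as a toy decision model. This is only a sketch of the description in this message, not ZGC's actual code: the class ZAllocationModel, its method names, and the boolean inputs are invented for illustration, and the 256K and 4M limits are simply the ones quoted above.

```java
// Toy model of when an allocation has to request a new ZPage,
// which is the only point where it can stall if memory is exhausted.
final class ZAllocationModel {
    enum SizeClass { SMALL, MEDIUM, LARGE }

    static final long SMALL_LIMIT = 256L * 1024;         // 256K, as quoted above
    static final long MEDIUM_LIMIT = 4L * 1024 * 1024;   // 4M, as quoted above

    static SizeClass classify(long bytes) {
        if (bytes <= SMALL_LIMIT) return SizeClass.SMALL;
        if (bytes <= MEDIUM_LIMIT) return SizeClass.MEDIUM;
        return SizeClass.LARGE;
    }

    // True if the allocation must ask for a new ZPage and is therefore
    // subject to an allocation stall when the heap is out of memory.
    static boolean needsNewPage(long bytes,
                                boolean tlabHasRoom,
                                boolean cpuLocalPageHasRoom,
                                boolean sharedMediumPageHasRoom) {
        switch (classify(bytes)) {
            case SMALL:
                // TLAB first, then a new TLAB from the CPU-local page.
                return !tlabHasRoom && !cpuLocalPageHasRoom;
            case MEDIUM:
                // The shared medium page is tried first.
                return !sharedMediumPageHasRoom;
            default:
                // Large objects always get a page of their own.
                return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(64) + " stall-prone: " + needsNewPage(64, true, false, false));
        System.out.println(classify(8L << 20) + " stall-prone: " + needsNewPage(8L << 20, true, true, true));
    }
}
```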
Exposing allocation stall information via an MXBean might be useful. We
certainly have the information, so it's mostly a question about if and
how we want to expose it. Just thinking out loud, one could imagine
adding something to c.s.m.GarbageCollectorMXBean or c.s.m.ThreadMXBean,
or maybe even introduce a c.s.m.ZGarbageCollectorMXBean.
On 10/24/19 6:08 AM, Connaughton, Niall wrote:
> I was going to ask the same question.
> In addition - is there any documentation on how the allocation stalls work? I'm looking to understand things like whether the stall happens to any thread that attempts to allocate a new object, or only threads that need a new TLAB, or some other mechanism. Put another way - if we do something like jHiccup and have a thread constantly sleeping and allocating a small amount, would it detect allocation stalls? Or would it not be stalled until it exhausts its TLAB?
> On 10/22/19, 11:19, "zgc-dev on behalf of Sundara Mohan M" <zgc-dev-bounces at openjdk.java.net on behalf of m.sundar85 at gmail.com> wrote:
> I was trying to get GC metrics via GarbageCollectorMXBean but only see
> CollectionCount and CollectionTime.
> Even though I can get the Allocation Stall event from the gc log, I have to do
> some special setup to get that collected and reported properly.
> Since a ZGC allocation stall is an important event for identifying whether the
> application is having issues, can we expose it via any other MXBean?