Can GC implementations provide a cheap estimation of live set size?

Jaroslav Bachorík jaroslav.bachorik at datadoghq.com
Thu Feb 11 18:09:38 UTC 2021


On Thu, Feb 11, 2021 at 6:55 PM Roman Kennke <rkennke at redhat.com> wrote:
>
> Notice that liveness information is only somewhat reliable right after
> marking. In Shenandoah, this is in the final-mark pause, and then the

Yes, I understand this. What I am looking for is something like a
'last known liveness' value, captured at a well-defined point and
providing an estimate within the bounds of the GC implementation.
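
To make the 'last known liveness' idea concrete, here is a minimal Java
sketch of the shape I have in mind. It is illustration only: the class
and method names (LastKnownLiveness, publish, get) are hypothetical and
are not part of HotSpot or of any existing JMX API. The collector would
publish a value once per cycle at its well-defined point, and readers
would only ever see the most recently published snapshot.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical holder for the most recent liveness estimate; not a HotSpot API. */
final class LastKnownLiveness {
    /** Immutable snapshot: estimated live bytes and the GC cycle that produced them. */
    static final class Snapshot {
        final long liveBytes;
        final long gcId;

        Snapshot(long liveBytes, long gcId) {
            this.liveBytes = liveBytes;
            this.gcId = gcId;
        }
    }

    private final AtomicReference<Snapshot> last = new AtomicReference<>();

    /** Assumed to be called once per cycle at the well-defined point (e.g. end of marking). */
    void publish(long liveBytes, long gcId) {
        last.set(new Snapshot(liveBytes, gcId));
    }

    /** Returns the last published snapshot, or null if no cycle has completed yet. */
    Snapshot get() {
        return last.get();
    }
}

final class LastKnownLivenessDemo {
    public static void main(String[] args) {
        LastKnownLiveness liveness = new LastKnownLiveness();
        liveness.publish(195L * 1024 * 1024, 0); // value taken at mark end of GC(0)
        LastKnownLiveness.Snapshot s = liveness.get();
        System.out.println("GC(" + s.gcId + ") live bytes: " + s.liveBytes);
    }
}
```

A reader gets the estimate together with the id of the cycle that
produced it, so a stale value can be recognized as stale rather than
mistaken for current liveness.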

> program is at a safepoint already. This is where you'd want to emit a
> JMX event or something similar. You can't simply query a counter and
> assume it represents current liveness in the middle of, or outside of,
> a GC cycle. This should be true for all GCs.
>
> For Serial and Parallel I am not sure at all that you can do this.
> AFAIK, they don't count liveness at all.
>
> Roman
>
> > Hi Roman,
> >
> > Thanks for your response. I checked the ZGC implementation and,
> > indeed, it is very easy to get the liveness information just by
> > extending the `ZStatHeap` class to report the last valid value of
> > `_at_mark_end.live`.
> >
> > I am also able to get this info from Shenandoah, although my first
> > attempt still involves a safepointing VM operation since I need to
> > iterate over regions to get the liveness info for each of them and sum
> > it up. I think it is still an acceptable trade-off, though.
> >
> > The next one in the queue is the Serial GC. My assumptions, based on
> > reading the code, are that for the young gen 'live = used' at the end
> > of the DefNewGeneration::collect() method, and for the old gen
> > 'live = used - slack' (slack is the cumulative size of objects
> > treated as alive for the purpose of compaction although they are
> > really dead - see CompactibleSpace::scan_and_forward()). Does this
> > sound reasonable?
> >
> > I will post my findings for Parallel GC and G1 GC later.
> >
> > Cheers,
> >
> > -JB-
> >
> > On Wed, Feb 10, 2021 at 11:34 AM Roman Kennke <rkennke at redhat.com> wrote:
> >>
> >> Hello Jaroslav,
> >>
> >>> In connection with https://bugs.openjdk.java.net/browse/JDK-8258431 I
> >>> am trying to figure out whether providing a cheap estimation of live
> >>> set size is something actually achievable across various GC
> >>> implementations.
> >>>
> >>> What I am looking at is piggy-backing on a concurrent mark task to get
> >>> the summary size of live objects - using the 'straightforward'
> >>> heap-inspection-like approach is prohibitively expensive.
> >>
> >> In Shenandoah, this information is already collected during concurrent
> >> marking. We currently don't print it directly, but we could certainly do
> >> that. I'll look into implementing it. I'll also look into exposing
> >> liveness info via JMX.
> >>
> >> I'm not quite sure about G1: that information would only be collected
> >> during mixed or full collections. I am not sure if G1 prints it, though.
> >>
> >> ZGC prints this under -Xlog:gc+heap:
> >>
> >> [6,502s][info][gc,heap     ] GC(0)                Mark Start          Mark End        Relocate Start      Relocate End           High               Low
> >> [6,502s][info][gc,heap     ] GC(0)  Capacity:      834M (10%)        1076M (13%)        1092M (14%)        1092M (14%)        1092M (14%)         834M (10%)
> >> [6,502s][info][gc,heap     ] GC(0)      Free:     7154M (90%)        6912M (87%)        6916M (87%)        7388M (92%)        7388M (92%)        6896M (86%)
> >> [6,502s][info][gc,heap     ] GC(0)      Used:      834M (10%)        1076M (13%)        1072M (13%)         600M (8%)         1092M (14%)         600M (8%)
> >> [6,502s][info][gc,heap     ] GC(0)      Live:         -               195M (2%)          195M (2%)          195M (2%)             -                  -
> >> [6,502s][info][gc,heap     ] GC(0) Allocated:         -               242M (3%)          270M (3%)          380M (5%)             -                  -
> >> [6,502s][info][gc,heap     ] GC(0)   Garbage:         -               638M (8%)          606M (8%)           24M (0%)             -                  -
> >> [6,502s][info][gc,heap     ] GC(0) Reclaimed:         -                  -                32M (0%)          614M (8%)             -                  -
> >>
> >> I hope that is useful?
> >>
> >> Thanks,
> >> Roman
> >>
> >
>
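
For reference, the Serial GC assumptions from my quoted message above
reduce to simple arithmetic. The Java sketch below only restates them;
the class and parameter names are made up for illustration, and whether
'used' and 'slack' can actually be obtained cheaply at those points is
exactly the open question. The numbers in main() are invented.

```java
/** Illustrative arithmetic only; names are hypothetical, not HotSpot code. */
final class SerialLiveEstimate {
    /** Young gen: assume live == used at the end of a young collection. */
    static long youngLive(long youngUsed) {
        return youngUsed;
    }

    /**
     * Old gen: assume live == used - slack, where slack is the cumulative size
     * of dead objects that compaction treated as alive.
     */
    static long oldLive(long oldUsed, long slack) {
        if (slack < 0 || slack > oldUsed) {
            throw new IllegalArgumentException("slack must be within [0, oldUsed]");
        }
        return oldUsed - slack;
    }

    /** Whole-heap estimate as the sum of both generations. */
    static long totalLive(long youngUsed, long oldUsed, long slack) {
        return youngLive(youngUsed) + oldLive(oldUsed, slack);
    }

    public static void main(String[] args) {
        // Invented example values, in bytes.
        long estimate = totalLive(10_000_000L, 100_000_000L, 20_000_000L);
        System.out.println("estimated live bytes: " + estimate);
    }
}
```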

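Until a proper API exists, the 'Live:' row of the -Xlog:gc+heap output
quoted above could also be scraped from the log. The Java sketch below
is a rough illustration, assuming the row arrives as one unwrapped line
and that each column is either "-" or "<n>M (<p>%)" as in the sample;
the log layout may differ between JDK versions, and the parser class
name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the "Live:" row of ZGC's gc+heap log table. */
final class ZgcLiveLineParser {
    // A column is either "-" (no value) or "<n>M (<p>%)"; group 1 captures <n>.
    private static final Pattern COLUMN = Pattern.compile("-|(\\d+)M\\s+\\(\\d+%\\)");

    /** Returns the "Mark End" value (second column) in megabytes, if present. */
    static OptionalLong markEndLiveMb(String line) {
        int idx = line.indexOf("Live:");
        if (idx < 0) {
            return OptionalLong.empty();
        }
        Matcher m = COLUMN.matcher(line.substring(idx + "Live:".length()));
        List<String> columns = new ArrayList<>();
        while (m.find()) {
            columns.add(m.group(1)); // null when the column is "-"
        }
        if (columns.size() < 2 || columns.get(1) == null) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseLong(columns.get(1)));
    }

    public static void main(String[] args) {
        String line = "[6,502s][info][gc,heap     ] GC(0)      Live:         -"
                + "               195M (2%)          195M (2%)          195M (2%)"
                + "             -                  -";
        System.out.println(markEndLiveMb(line));
    }
}
```

Reading the column by position rather than taking the first number
avoids picking up a later column when "Mark End" itself is "-".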

