Priority redistribution in Old Object Sample event
Thomas Stüfe
thomas.stuefe at gmail.com
Wed Jun 14 07:55:53 UTC 2023
Hi Erik,
I have some small additional questions:
- span is redistributed to its (in time) predecessor sample. Following your
explanation, should the span not be evenly divided between predecessor and
successor?
- I am looking at the code that calculates the span in the first place:
https://github.com/tstuefe/jdk/blob/ba837b4bfa2dea85653d8a8fccd0817a569b4378/src/hotspot/share/jfr/leakprofiler/sampling/objectSampler.cpp#L229-L230
The span of a new sample is calculated as "_total_allocated (as of now) -
sum-of-all-spans-of-samples-in-prio-queue". _total_allocated increases
monotonously, whereas the prio queue only contains *retained* samples. This
means that the new sample's span is the "what had been allocated since the
last sample was taken" + "all spans of samples that had been discarded". Is
that intentional? Would this not mean that with prolonged sampling, the
spans get ever larger?
I may be reading this code wrong.
Thank you!
..Thomas
On Wed, Jun 14, 2023 at 9:37 AM Erik Gahlin <erik.gahlin at oracle.com> wrote:
> The intent of the code is to remove the sample from both the time-based
> list and the priority queue as the object is no longer alive. The weight
> of the object, the allocation span, in the time-based list, should be
> divided to its neighbour(s). The rationale for this is to keep samples
> evenly distributed over time, or to be exact, the amount allocation has
> happened. It will be as if the object was never sampled (as if the TLAB
> size would be larger)
>
> The priority queue exists to find the sampled object with the lowest
> allocation span quickly when the list fills up. It needs to be replaced
> with a sampled object with higher weight/span.
>
> This may, or may not, be what the code does. It was written many years
> ago, and I need to re-read the code to check if that is really the case,
> but before digging deeper, I want to check if we are on the same page when
> it comes to the intent
>
> Erik
>
>
> ------------------------------
> *From:* hotspot-jfr-dev <hotspot-jfr-dev-retn at openjdk.org> on behalf of
> Galder Zamarreno <galder at redhat.com>
> *Sent:* Wednesday, June 14, 2023 6:30 AM
> *To:* hotspot-jfr-dev at openjdk.org <hotspot-jfr-dev at openjdk.org>
> *Subject:* Priority redistribution in Old Object Sample event
>
> Hi,
>
> I've been porting over JFR Old Object Sample event to GraalVM and I had
> the following doubt about the code in HotSpot:
>
> I'm trying to understand the code in [1]:
>
> ```
> // push span onto previous
> if (previous != NULL) {
> _priority_queue->remove(previous);
> previous->add_span(sample->span());
> _priority_queue->push(previous);
> }
> ```
>
> This code is related to old object sampling. When the queue is full,
> samples that have been GC'd are removed from the queue. To do so it takes
> its previous (from the list view) and adds the span of the removed dead
> object onto it. Basically, I'm wondering why this is done this way.
>
> My initial gut feeling when I first saw the code is that it does it so
> that the priority queue does not get too imbalanced (it uses a heap
> underneath) when removing objects, but that only makes sense to me if the
> previous is not the previous object sampled based on time, but the previous
> object sampled based on the order of the priority queue, so that's not what
> seems to happen.
>
> I'm wondering now if the fact that the list view is ordered by insertion
> time is relevant here. Is it maybe trying to give more importance (by
> increasing its span) to surviving objects that are near in time to those
> that have been garbage collected?
>
> Something that's probably related to this is how the span is initially
> calculated. This is done in [2], which you can find below:
>
> ```
> _total_allocated += allocated;
> const size_t span = _total_allocated - _priority_queue->total();
> ```
>
> Again when I first looked at it, I thought: well that's just equivalent to:
>
> ```
> const size_t span = allocated;
> ```
>
> But clearly things are more nuanced than that.
>
> The only documentation I've found so far is in Marcus' blog post [3]:
>
> > If sampled objects are garbage collected, they are removed and the
> priority redistributed to the neighbours. The priority is called span, and
> is currently the size of the allocation, giving more weight to larger
> (therefore more severe) leaks.
>
> Unfortunately what is a neighbour is not explained, nor why the priority
> gets redistributed.
>
> Thanks
> Galder
>
> [1]
> https://github.com/openjdk/jdk17u-dev/blob/master/src/hotspot/share/jfr/leakprofiler/sampling/objectSampler.cpp#L264-L269
> [2]
> https://github.com/openjdk/jdk17u-dev/blob/ef86ea2842b1a204834291d9d6665bfcd7b75fbc/src/hotspot/share/jfr/leakprofiler/sampling/objectSampler.cpp#L212-L213
> [3] https://hirt.se/blog/?p=1055
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-jfr-dev/attachments/20230614/0e638453/attachment-0001.htm>
More information about the hotspot-jfr-dev
mailing list