Priority redistribution in Old Object Sample event

Erik Gahlin erik.gahlin at oracle.com
Wed Jun 14 07:37:19 UTC 2023


The intent of the code is to remove the sample from both the time-based list and the priority queue as the object is no longer alive. The weight of the object, the allocation span, in the time-based list, should be divided to its neighbour(s). The rationale for this is to keep samples evenly distributed over time, or to be exact, the amount allocation has happened. It will be as if the object was never sampled (as if the TLAB size would be larger)

The priority queue exists to find the sampled object with the lowest allocation span quickly when the list fills up. It needs to be replaced with a sampled object with higher weight/span.

This may, or may not, be what the code does. It was written many years ago, and I need to re-read the code to check if that is really the case, but before digging deeper, I want to check if we are on the same page when it comes to the intent

Erik


________________________________
From: hotspot-jfr-dev <hotspot-jfr-dev-retn at openjdk.org> on behalf of Galder Zamarreno <galder at redhat.com>
Sent: Wednesday, June 14, 2023 6:30 AM
To: hotspot-jfr-dev at openjdk.org <hotspot-jfr-dev at openjdk.org>
Subject: Priority redistribution in Old Object Sample event

Hi,

I've been porting over JFR Old Object Sample event to GraalVM and I had the following doubt about the code in HotSpot:

I'm trying to understand the code in [1]:

```
  // push span onto previous
  if (previous != NULL) {
    _priority_queue->remove(previous);
    previous->add_span(sample->span());
    _priority_queue->push(previous);
  }
```

This code is related to old object sampling. When the queue is full, samples that have been GC'd are removed from the queue. To do so it takes its previous (from the list view) and adds the span of the removed dead object onto it. Basically, I'm wondering why this is done this way.

My initial gut feeling when I first saw the code is that it does it so that the priority queue does not get too imbalanced (it uses a heap underneath) when removing objects, but that only makes sense to me if the previous is not the previous object sampled based on time, but the previous object sampled based on the order of the priority queue, so that's not what seems to happen.

I'm wondering now if the fact that the list view is ordered by insertion time is relevant here. Is it maybe trying to give more importance (by increasing its span) to surviving objects that are near in time to those that have been garbage collected?

Something that's probably related to this is how the span is initially calculated. This is done in [2], which you can find below:

```
_total_allocated += allocated;
const size_t span = _total_allocated - _priority_queue->total();
```

Again when I first looked at it, I thought: well that's just equivalent to:

```
const size_t span = allocated;
```

But clearly things are more nuanced than that.

The only documentation I've found so far is in Marcus' blog post [3]:

> If sampled objects are garbage collected, they are removed and the priority redistributed to the neighbours. The priority is called span, and is currently the size of the allocation, giving more weight to larger (therefore more severe) leaks.

Unfortunately what is a neighbour is not explained, nor why the priority gets redistributed.

Thanks
Galder

[1] https://github.com/openjdk/jdk17u-dev/blob/master/src/hotspot/share/jfr/leakprofiler/sampling/objectSampler.cpp#L264-L269
[2] https://github.com/openjdk/jdk17u-dev/blob/ef86ea2842b1a204834291d9d6665bfcd7b75fbc/src/hotspot/share/jfr/leakprofiler/sampling/objectSampler.cpp#L212-L213
[3] https://hirt.se/blog/?p=1055
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-jfr-dev/attachments/20230614/caf132f7/attachment-0001.htm>


More information about the hotspot-jfr-dev mailing list