RFR: 8368015: Shenandoah: fix error in computation of average allocation rate [v2]
Kelvin Nilsen
kdnilsen at openjdk.org
Tue Sep 23 15:41:13 UTC 2025
On Mon, 22 Sep 2025 23:07:16 GMT, William Kemper <wkemper at openjdk.org> wrote:
>> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>>
>> remove state variable previous GC allocated
>
> src/hotspot/share/gc/shenandoah/heuristics/shenandoahAdaptiveHeuristics.cpp line 364:
>
>> 362:
>> 363: double ShenandoahAllocationRate::force_sample(size_t allocated, size_t &unaccounted_bytes_allocated) {
>> 364: const double MinSampleTime = 0.002; // Do not sample if time since last update is less than 2 ms
>
> Would it make sense to "carry over" the `unaccounted_bytes_allocated` if, at the time of reset, the last sample was taken less than `ShenandoahAllocationRate::_interval_sec` (instead of `MinSampleTime`)? I believe this would result in more samples being carried over.
I've done experiments with a variety of different values for MinSampleTime (e.g. 50 ms, 25ms, 2 ms). I find that any value
greater than 1 ms works pretty well in the sense that the measured rate is pretty consistent with the average rate as computed
by the preexisting mechanisms.
The biggest problem I observed was when we tried to measure rate over an interval of a few microseconds, and these intervals
sometimes had zero allocations.
We could make the change you suggest, but that would actually result in fewer samples rather than more. It would also make
the value of bytes_allocated_since_gc_start() less precise in that this quantity would carry up to 100 ms of allocations that
actually occurred before the start of the current GC.
Here's an ASCII art attempt to illustrate how it works in the current implementation.
Here is a timeline with a tick mark to illustrate the desired sample time, every 100 ms:
.1s .2s .3s .4s. .5s .6s .7s .8s. .9s 1s 1.1s 1.2s
|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---
^ ^
Existing | |
Samples: A B C D E
| |
Proposed | |
Samples: A | B C D
| |
| |
Trigger GC at 0.14s Finish GC at 1.03s
In either approach, we sample at 0.1s and this sample contributes to the average allocation rate that causes us to trigger at
time 0.14s. In the current implementation, at time 0.14s, we take a new sample representing the most current allocation rate
as computed over the span between 0.1s and 0.14s. As implemented, if this time delta (0.14s minus 0.10s =0.04s) is less than
2 ms, we'll skip sample B. I think you are proposing that we skip sample B whenever this delta is less than 100 ms
(_interval_sec).
We do not sample during GC. As soon as GC finishes, we begin asking the question, should I start another GC. This resumes the
sampling activity, aligning the periodic samples on the time at which we finished previous GC cycle.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/27398#discussion_r2372740860
More information about the hotspot-gc-dev
mailing list