RFR: 8368015: Shenandoah: fix error in computation of average allocation rate
Kelvin Nilsen
kdnilsen at openjdk.org
Sat Sep 20 00:11:28 UTC 2025
On Fri, 19 Sep 2025 20:43:59 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:
>> We use bytes_allocated_since_gc_start() to compute allocation rates. This leaves a blind spot: the current implementation ignores both the allocations and the elapsed time between the moment we begin GC and the first time we update the allocation rate after GC starts. When that first post-GC sample is taken, we typically find that the sampled allocation count is smaller than the allocations that had accumulated by the time we triggered the current GC cycle.
>>
>> This PR adds tracking for that accounting gap.
>
> src/hotspot/share/gc/shenandoah/heuristics/shenandoahAdaptiveHeuristics.cpp line 374:
>
>> 372: _rate.add(rate);
>> 373: _rate_avg.add(_rate.avg());
>> 374: _last_sample_time = now;
>
> Note: in the original implementation, the first sample() collected following the start of GC typically finds that allocated < _last_sample_value, so we ignore all data associated with that time gap. With Shenandoah, the next sample happens the first time we ask should_start_gc() following the completion of the current GC. With GenShen, the next sample happens after marking, when we calculate the allocation runway in order to size our reserves.
It is possible that the first sample() collected following the start of GC finds allocated > _last_sample_value. If this happens, the above computes a bogus result. It thinks we've only allocated (new_sample_value - _last_sample_value) bytes in total time (now - _last_sample_time). In truth, we've allocated new_sample_value plus the (currently unknown) allocations that occurred after _last_sample_time but before the start of GC.
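To make the arithmetic concrete, here is a small standalone sketch (not HotSpot code); the variable names echo the fields discussed above, but the 600/200/700 MB figures and the 4-second interval are purely illustrative assumptions:

#include <cstddef>
#include <cstdio>

int main() {
  // State captured at the last sample() taken before GC started.
  double last_sample_time  = 10.0;  // seconds
  size_t last_sample_value = 600;   // MB allocated since the previous GC started

  // Between that sample and the start of GC we allocate another 200 MB.
  // bytes_allocated_since_gc_start() is reset when GC begins, so these
  // 200 MB are never attributed to any sampling interval.
  size_t unknown_gap_allocations = 200;  // MB, invisible to the heuristic

  // First sample() after GC starts: 4 seconds after the last sample, with
  // 700 MB already allocated since GC start, so allocated > last_sample_value.
  double now = 14.0;
  size_t new_sample_value = 700;

  // What the existing computation yields: (700 - 600) / (14 - 10) = 25 MB/s.
  double bogus_rate = (double)(new_sample_value - last_sample_value)
                      / (now - last_sample_time);

  // What was actually allocated since the last sample:
  // 700 + 200 = 900 MB in 4 seconds, i.e. 225 MB/s.
  double true_rate = (double)(new_sample_value + unknown_gap_allocations)
                     / (now - last_sample_time);

  printf("bogus rate: %.1f MB/s, true rate: %.1f MB/s\n", bogus_rate, true_rate);
  return 0;
}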
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/27398#discussion_r2364497570