RFR: 8368015: Shenandoah: fix error in computation of average allocation rate [v2]

Wed Sep 24 21:56:47 UTC 2025

On Tue, 23 Sep 2025 15:36:42 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:

>> src/hotspot/share/gc/shenandoah/heuristics/shenandoahAdaptiveHeuristics.cpp line 364:
>> 
>>> 362: 
>>> 363: double ShenandoahAllocationRate::force_sample(size_t allocated, size_t &unaccounted_bytes_allocated) {
>>> 364:   const double MinSampleTime = 0.002;    // Do not sample if time since last update is less than 2 ms
>> 
>> Would it make sense to "carry over" the `unaccounted_bytes_allocated` if, at the time of reset, the last sample was taken less than `ShenandoahAllocationRate::_interval_sec` (instead of `MinSampleTime`)? I believe this would result in more samples being carried over.
>
> I've done experiments with a variety of	different values for MinSampleTime (e.g. 50 ms, 25ms, 2 ms).  I	find that any value
> greater	than 1 ms works	pretty well in the sense that the measured rate	is pretty consistent with the average rate as computed
> by the preexisting mechanisms.
> 
> The biggest problem I observed was when	we tried to measure rate over an interval of a few microseconds, and these intervals
> sometimes had zero allocations.
> 
> We could make the change you suggest, but that would actually result in	fewer samples rather than more.	 It would also make
> the value of bytes_allocated_since_gc_start() less precise in that this quantity	would carry up to 100 ms of allocations	that
> actually occurred before the start of the current GC.
> 
> Here's an ASCII art attempt to illustrate how it works	in the current implementation.
> 
> Here is	a timeline with	a tick mark to illustrate the desired sample time, every	100 ms:
> 
> 
>       	 .1s  	   .2s	     .3s       .4s.     .5s  	   .6s	     .7s       .8s.    .9s  	   1s	      1.1s      1.2s
> |---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---
>       	       ^      	      	      	      	    	      	      	      	      	      	       ^
> Existing       |                                                                                       |
> Samples:  A    B                                                                                       C      	 D         E
>                |                                                                                       |
> Proposed       |                                                                                       |
> Samples:  A    |                                                                                       B      	 C    	   D
>                |                                                                                       |
>                |                                                                                       |
>       	       Trigger GC at 0.14s    	      	      	      	      	      	      	      	       Finish GC at 1.03s
> 
> In either approach, we sample at 0.1s and this sample contributes to the average allocation rate that causes us	to trigger at
> time 0.14s.  In	the current implementation, at time 0.14s, we take a new sample	representing the most current allocation rate
> as computed over the span between 0.1s and 0.14s.  As implemented, if this time	delta (0.14s minus 0.10s =0...

With the change in the PR, I was thinking of "skipped" samples as being "carried over" to the next cycle (per `unaccounted_bytes_allocated = allocated - _last_sample_value;`).

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27398#discussion_r2377167640