RFR: 8368015: Shenandoah: fix error in computation of average allocation rate [v2]

Kelvin Nilsen kdnilsen at openjdk.org
Tue Sep 23 15:41:13 UTC 2025


On Mon, 22 Sep 2025 23:07:16 GMT, William Kemper <wkemper at openjdk.org> wrote:

>> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   remove state variable previous GC allocated
>
> src/hotspot/share/gc/shenandoah/heuristics/shenandoahAdaptiveHeuristics.cpp line 364:
> 
>> 362: 
>> 363: double ShenandoahAllocationRate::force_sample(size_t allocated, size_t &unaccounted_bytes_allocated) {
>> 364:   const double MinSampleTime = 0.002;    // Do not sample if time since last update is less than 2 ms
> 
> Would it make sense to "carry over" the `unaccounted_bytes_allocated` if, at the time of reset, the last sample was taken less than `ShenandoahAllocationRate::_interval_sec` (instead of `MinSampleTime`)? I believe this would result in more samples being carried over.

I've done experiments with a variety of different values for MinSampleTime (e.g. 50 ms, 25 ms, 2 ms).  I find that any value
greater than 1 ms works pretty well, in the sense that the measured rate is pretty consistent with the average rate as computed
by the preexisting mechanisms.

The biggest problem I observed was when we tried to measure the rate over an interval of a few microseconds, and these intervals
sometimes had zero allocations.

We could make the change you suggest, but that would actually result in fewer samples rather than more.  It would also make
the value of bytes_allocated_since_gc_start() less precise, in that this quantity would carry up to 100 ms of allocations that
actually occurred before the start of the current GC.

Here's an ASCII-art attempt to illustrate how it works in the current implementation.

Here is a timeline with a tick mark every 100 ms to illustrate the desired sample times:


         .1s       .2s       .3s       .4s       .5s       .6s       .7s       .8s       .9s       1s        1.1s      1.2s
|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---
              ^                                                                                        ^
Existing      |                                                                                        |
Samples:  A   B                                                                                        C         D         E
              |                                                                                        |
Proposed      |                                                                                        |
Samples:  A   |                                                                                        B         C         D
              |                                                                                        |
              |                                                                                        |
              Trigger GC at 0.14s                                                                      Finish GC at 1.03s

In either approach, we sample at 0.1s, and this sample contributes to the average allocation rate that causes us to trigger at
time 0.14s.  In the current implementation, at time 0.14s, we take a new sample representing the most current allocation rate
as computed over the span between 0.1s and 0.14s.  As implemented, if this time delta (0.14s minus 0.10s = 0.04s) is less than
2 ms, we'll skip sample B.  I think you are proposing that we skip sample B whenever this delta is less than 100 ms
(_interval_sec).

We do not sample during GC.  As soon as GC finishes, we begin asking the question: should we start another GC?  This resumes the
sampling activity, aligning the periodic samples on the time at which the previous GC cycle finished.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27398#discussion_r2372740860


More information about the hotspot-gc-dev mailing list