RFR: 8137022: Concurrent refinement thread adjustment and (de-)activation suboptimal [v3]

Kim Barrett kbarrett at openjdk.org
Thu Sep 29 13:28:24 UTC 2022


On Tue, 27 Sep 2022 22:49:01 GMT, Kim Barrett <kbarrett at openjdk.org> wrote:

>> My reasoning is that the #cards in the bounded thread buffers doesn't necessarily follow a normal distribution, so one can't predict future values using avg + stddev. Taking the extreme example of a single thread buffer: if the population average is ~buffer_capacity, the #cards in the thread buffer can exhibit large jumps between 0 and ~buffer_capacity due to the implicit modulo operation.
>> 
>>> It's not a very accurate prediction, but it's
>> still worth doing.
>> 
>> Which benchmark shows its effect? I hard-coded `size_t predicted_thread_buffer_cards = 0;` in `G1Policy::record_young_collection_end` but can't see much difference. Normally, #cards in global dirty queue should be >> #cards in thread-local buffers.
>
> Here's an example log line from my development machine:
> [241.020s][debug][gc,ergo,refine ] GC(27) GC refinement: goal: 86449 + 8201 / 2.00ms, actual: 100607 / 2.32ms, HCC: 1024 / 0.00ms (exceeded goal)
> Note the cards in thread buffers prediction (8201) is approaching 10% of the goal.
> This is from specjbb2015 with
> `-Xmx40g -XX:MaxGCPauseMillis=100 -XX:G1RSetUpdatingPauseTimePercent=2`
> on a machine with 32 cores.
> 
> specjbb2015 with default pause time and refinement budget probably won't see
> much impact from the cards still in buffers because the goal will be so much
> larger.  OTOH, such a configuration also probably does very little concurrent
> refinement.
> 
> Lest one thinks that configuration is unreasonable or unlikely, part of the
> point of this change is to improve the behavior with a smaller percentage of a
> pause budgeted for refinement. That allows more time in a pause for other
> things, like evacuation. (Even with that more restrictive configuration
> specjbb2015 still doesn't do much concurrent refinement. For example, during
> the mutator phase before that GC there was never more than one refinement
> thread running, and it was only running for about the last 5% of the phase.)
> 
> I'm using the prediction infrastructure to get a moving average over several
> recent samples, to get a number that has some basis. The stddev implicit in
> that infrastructure makes the result a bit higher than the average. I think it
> probably doesn't matter much, as neither the inputs nor the calculations that
> use them are very precise.  But the behavior does seem to be worse (in the
> sense of more frequently blowing the associated budget and by larger amounts)
> if this isn't accounted for to some extent.
> 
> But maybe your point is more about the stddev, and that should not be
> included. I can see that, and could just use the moving average.

I experimented with using the average rather than the prediction. The
difference between the two is not large (estimating just by eye, average
difference is in the neighborhood of 10%). Using the average seems more
appropriate though, so I don't mind changing to it.

-------------

PR: https://git.openjdk.org/jdk/pull/10256


More information about the hotspot-dev mailing list