RFR: 8311883: [Genshen] Adaptive tenuring threshold

Fri Jul 14 02:33:54 UTC 2023

On Wed, 12 Jul 2023 18:49:02 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:

> I've run this through some Extremem workloads. Good news is I see no regressions. On the other hand, I am not yet seeing huge benefit. (I may be seeing a decrease in degenerated cycles, but need to run a few more tests to be sure.)
> 
> One concern is that we are still improperly identifying promotions as mortality. I'm attaching a log with some of my comments preceded by ;; at the start of lines. See line 3421 of the log, for example. In GC(16), we chose tenure age 2. Then we promoted 420 regions in place. This caused us to believe there was high mortality of ages 3-6, but really there was no mortality and only promotions. In GC(17), we should have stayed with tenure age 2.
> 
> I think the way to fix this is to only scan from 1 to the current tenure age when you select a new tenure age. If there is no mortality at the current tenure age, then we can set the new tenure age to 1 + current tenure age. [auto-tenure.out.txt](https://github.com/openjdk/shenandoah/files/12031253/auto-tenure.out.txt)

Thanks for helping me reason through this. My instinct was that temporarily widening the survivor window (or, conversely, raising the tenuring bar) would be harmless, because the next census would see that the oldest cohort of young survivors in this GC did not exhibit sufficient mortality and would quickly lower the bar again before the next collection. However, this does indeed leave that survivor cohort in the young gen for one GC longer, which could be bad from a performance perspective.

My second idea, in leaving that window open (or the bar raised, whichever metaphor you prefer) and looking at all of the cohorts in the young generation every time, was this: if other criteria (e.g. low garbage density) had us keeping a region in the young gen longer, and we were already including that cohort in our census, it would make sense to consider its mortality each time we encountered it.

However, upon further reflection, it might make sense to decouple these effects and ignore cohorts that the algorithm would have "logically" promoted, but which physically were not because of other considerations. Such cohorts would then be "delayed tenure" cohorts, whose tenure isn't readjudicated at each epoch just because they were held back for other considerations. Thinking in terms of this academic tenuring metaphor makes me feel that your proposal is in fact the correct one (and would be considered fairer in an academic setting than my revisionist original algorithm :-)
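
To make the effect of the restricted scan concrete, here is a minimal sketch. It is not the code in the PR: the function names, the census layout (population indexed by age), the 0.1 mortality threshold, and the sample numbers are all assumptions made up for illustration, and the numbers are not taken from the attached log. It shows only which cohorts get flagged as having significant apparent mortality when the scan stops at the current tenure age versus when it covers every age. The rule that turns those flags into a new tenure age is deliberately left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Apparent mortality of one cohort between two consecutive censuses.
// A cohort counted at age (a - 1) in the previous census appears at age a
// in the current one. A population drop caused by promotion is
// indistinguishable from death here, which is the problem discussed above.
double mortality_rate(std::size_t prev_pop, std::size_t cur_pop) {
    if (prev_pop == 0 || cur_pop >= prev_pop) {
        return 0.0;
    }
    return 1.0 - static_cast<double>(cur_pop) / static_cast<double>(prev_pop);
}

// Hypothetical helper: returns the ages in [1, scan_limit] whose cohort shows
// apparent mortality above `threshold`. `prev` and `cur` hold the population
// per age (index == age) from the previous and current census.
std::vector<unsigned> ages_with_apparent_mortality(
        const std::vector<std::size_t>& prev,
        const std::vector<std::size_t>& cur,
        unsigned scan_limit,
        double threshold) {
    std::vector<unsigned> ages;
    const std::size_t n = std::min(prev.size(), cur.size());
    for (unsigned age = 1; age <= scan_limit && age < n; ++age) {
        if (mortality_rate(prev[age - 1], cur[age]) > threshold) {
            ages.push_back(age);
        }
    }
    return ages;
}

// Made-up census data (ages 0..6). With tenure age 2 in the previous cycle,
// the older cohorts were promoted, so ages 3..6 are empty in the current
// census even though nothing died.
const std::vector<std::size_t> kPrevCensus = {1000, 400, 380, 370, 360, 350, 340};
const std::vector<std::size_t> kCurCensus  = {1000, 420, 390,   0,   0,   0,   0};
```

With this data, `ages_with_apparent_mortality(kPrevCensus, kCurCensus, 6, 0.1)` flags ages 1, 3, 4, 5 and 6, where the last four are promotions misread as deaths. Limiting the scan to the current tenure age, `ages_with_apparent_mortality(kPrevCensus, kCurCensus, 2, 0.1)`, flags only age 1, so the promoted cohorts no longer influence the choice.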

I'll make that change and gather performance data. Many thanks for the performance data that you obtained and shared with me, and for the related discussion!

-------------

PR Comment: https://git.openjdk.org/shenandoah/pull/289#issuecomment-1635176970