RFR: 8327388: GenShen: census during marking is partial [v5]

Y. Srinivas Ramakrishna ysr at openjdk.org
Sat Apr 13 02:35:57 UTC 2024


On Sat, 13 Apr 2024 00:24:08 GMT, Y. Srinivas Ramakrishna <ysr at openjdk.org> wrote:

>> The call that clears stale census data before a fresh census begins for a new marking cycle through the younger generation was misplaced. As a result, only a co-terminal suffix of the census was used, and all data from the earlier iterations of an iterative collection process that may run up to 5 times was lost.
>> 
>> We stumbled upon the defect when working on a refactoring task involving separation of generational extensions of Shenandoah from its non-generational version. The (performance) defect has existed since day zero of the adaptive tenuring code in GenShen.
>> 
>> Along with fixing the defect, an assertion has been added to check the "reasonable completeness" of the census, which came in useful to detect a reset call inadvertently left behind in one place.
>> 
>> Some ShenandoahAgeCensus APIs have been narrowed and cleaned up, and their documentation has been clarified.
>> 
>> **Testing**:
>> - [x] GHA 
>> - [x] Code pipeline testing : one intermittent stress dacapo failure, ascribed to an existing bug in coalesce-and-fill
>> - [x] SPECjbb
>> 
>> **Performance**: 
>> - [x] SPECjbb : given the variance across test runs, no significant change is detected under a 2-tailed Mann-Whitney test
>
> Y. Srinivas Ramakrishna has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits:
> 
>  - Merge branch 'master' into clear_census
>  - Merge branch 'master' into clear_census
>  - Merge branch 'master' into clear_census
>  - Merge branch 'master' into clear_census
>  - Remove local_reset of age_census object inadvertently left behind in the
>    GLOBAL gen concurrent mark, which was triggering the newly added
>    reasonableness assert (yay!).
>  - Avoid divide-by-zero.
>  - Merge branch 'master' into clear_census
>  - Fix word and byte size unit confusion in comparison/assert.
>  - Loose verification of "reasonable completeness" of census.
>  - Fix release build.
>  - ... and 7 more: https://git.openjdk.org/shenandoah/compare/9d869ca1...787e37a8
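
To make the defect described in the quoted PR text concrete, here is a toy model of it. It is only a sketch, not the HotSpot code: `ToyAgeCensus`, `run_marking_cycle`, `reasonably_complete`, the bucket count, the 8-byte word size and the 0.5 threshold are all hypothetical names and values chosen for illustration. It shows how resetting the census inside the iterative process keeps only the last iteration's data, and how a loose completeness check (comparing words with words, and guarding against a zero denominator) would notice.

```python
from typing import List, Tuple

NUM_AGE_BUCKETS = 4  # hypothetical; chosen only for this toy model


class ToyAgeCensus:
    """Toy stand-in for an age census: words of live data per object age."""

    def __init__(self) -> None:
        self.words_by_age: List[int] = [0] * NUM_AGE_BUCKETS

    def reset(self) -> None:
        self.words_by_age = [0] * NUM_AGE_BUCKETS

    def record(self, age: int, words: int) -> None:
        # Ages beyond the last bucket are folded into the last bucket.
        self.words_by_age[min(age, NUM_AGE_BUCKETS - 1)] += words

    def total_words(self) -> int:
        return sum(self.words_by_age)


def run_marking_cycle(iterations: List[List[Tuple[int, int]]],
                      reset_inside_loop: bool) -> ToyAgeCensus:
    """Each iteration is a list of (age, size_in_words) pairs.

    reset_inside_loop=True models the misplaced reset: every iteration
    wipes what the earlier iterations recorded.
    """
    census = ToyAgeCensus()
    census.reset()  # intended placement: once, before the cycle starts
    for marked_objects in iterations:
        if reset_inside_loop:
            census.reset()  # misplaced: discards earlier iterations
        for age, words in marked_objects:
            census.record(age, words)
    return census


def reasonably_complete(census_words: int, live_bytes: int,
                        bytes_per_word: int = 8,
                        min_ratio: float = 0.5) -> bool:
    """Loose check that the census covers enough of the live data."""
    live_words = live_bytes // bytes_per_word  # convert bytes to words first
    if live_words == 0:
        return True  # nothing to compare against; avoids dividing by zero
    return census_words / live_words >= min_ratio
```

With three iterations recording 30, 5 and 7 words respectively, the misplaced reset leaves a census of 7 words while the corrected placement yields 42, so only the latter passes the check against 336 live bytes.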

Note: The linux-64/test (tier1) runtime failure is only an infrastructure failure while uploading the test results. All the tests passed, as the test summary shows:

==============================
Test summary
==============================
   TEST                                              TOTAL  PASS  FAIL ERROR   
   jtreg:test/hotspot/jtreg:tier1_runtime              642   642     0     0   
==============================
TEST SUCCESS


So the red x against the linux-64/tier1_runtime can be ignored.

Once the rest of the ongoing GHA tests complete, I'll consider integrating this change.

-------------

PR Comment: https://git.openjdk.org/shenandoah/pull/403#issuecomment-2053034076

