RFR: 8376839: GenShen: Improve performance of evacuations into the old generation [v3]
William Kemper
wkemper at openjdk.org
Thu Feb 5 17:39:16 UTC 2026
On Mon, 2 Feb 2026 17:28:39 GMT, William Kemper <wkemper at openjdk.org> wrote:
>> src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 41:
>>
>>> 39: ShenandoahScanRemembered* const _scanner;
>>> 40:
>>> 41: public:
>>
>> I had not initially appreciated that we are investing in more precise dirtying of cards as part of this PR. Please check my analysis of the tradeoffs here:
>>
>> Option 1 (as currently implemented):
>> 1. This will take longer to do entry_update_card_table() because we have to rescan every copied object. This rescanning may also result in increased contention for cache lines and memory bus with mutator threads during this phase. This also results in redundant dirtying of cards for any card that holds more than one "interesting pointer".
>> 2. The benefit of this option is that our subsequent scan-remembered pass will have less work to do because potentially fewer cards will need to be scanned.
>>
>> Option 2 (blindly dirty the entire range of copied objects):
>> 1. This matches the current implementation. The existing design is based on the idea that it is "overall" more efficient to scan this data once rather than twice. We'll scan the data once when we next scan the remembered set.
>> 2. The scan-once benefit applies only to cards that are dirty. In option 1, we scan data corresponding to dirty cards twice. In option 2, we scan data corresponding to dirty cards once. Data corresponding to clean cards is scanned only once in either option, though the timing of when we scan that data is different.
>>
>> Have we run any experiments to evaluate the tradeoffs of these alternative approaches on various workloads?
>
> I ran these tests on an earlier iteration of the changes here. They showed the reduction in remembered set scan times was greater than the increase in the time to update the card table. I will re-run the tests with the current version of the code here.

The latest round of testing shows that more precise card dirtying approximately doubles the time of this new phase, with no significant reduction in average remembered set scan times.
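
To make the two options compared above concrete, here is a toy sketch of the difference. It is not HotSpot code: `ToyCardTable`, `dirty_precise`, `dirty_range`, the `points_to_young` flag vector, and the 64-word card size are all hypothetical stand-ins chosen for illustration. Option 1 has to read every copied word to find "interesting pointers", while option 2 only touches the card table.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model (assumption): the heap is an array of words and each card
// covers kWordsPerCard consecutive words.
constexpr std::size_t kWordsPerCard = 64;

struct ToyCardTable {
    std::vector<std::uint8_t> dirty;  // one byte per card, 1 == dirty

    explicit ToyCardTable(std::size_t heap_words)
        : dirty((heap_words + kWordsPerCard - 1) / kWordsPerCard, 0) {}

    void mark(std::size_t word_index) {
        dirty[word_index / kWordsPerCard] = 1;
    }

    std::size_t dirty_count() const {
        std::size_t n = 0;
        for (std::uint8_t d : dirty) {
            n += d;
        }
        return n;
    }
};

// Option 1 (precise): rescan the copied range [begin, end) and dirty only
// cards that hold at least one old-to-young pointer. Returns the number of
// words read, which is the extra cost paid in this phase.
std::size_t dirty_precise(ToyCardTable& ct,
                          const std::vector<bool>& points_to_young,
                          std::size_t begin, std::size_t end) {
    std::size_t words_scanned = 0;
    for (std::size_t i = begin; i < end; ++i) {
        ++words_scanned;
        if (points_to_young[i]) {
            ct.mark(i);  // redundant if the card is already dirty
        }
    }
    return words_scanned;
}

// Option 2 (blind): dirty every card overlapping [begin, end) without
// reading the copied data. The remembered set scan sorts it out later.
void dirty_range(ToyCardTable& ct, std::size_t begin, std::size_t end) {
    if (begin >= end) {
        return;
    }
    const std::size_t first = begin / kWordsPerCard;
    const std::size_t last = (end - 1) / kWordsPerCard;
    for (std::size_t card = first; card <= last; ++card) {
        ct.dirty[card] = 1;
    }
}
```

In this model, copying 200 words that contain two interesting pointers on the same card leaves one dirty card under option 1 (after reading all 200 words) and four dirty cards under option 2 (after reading none). Whether the smaller dirty set pays for the rescan is the empirical question the measurements above address.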
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/29511#discussion_r2770346664