RFR: 8376839: GenShen: Improve performance of evacuations into the old generation
William Kemper
wkemper at openjdk.org
Mon Feb 2 17:31:38 UTC 2026
On Sat, 31 Jan 2026 01:22:43 GMT, Kelvin Nilsen <kdnilsen at openjdk.org> wrote:
>> When GenShen evacuates an object into the old generation, it also dirties the card for that object and updates the offsets of the first and last object in the card. In many cases, the same card may be dirtied repeatedly and the object starts updated unnecessarily. We can reduce the total amount of work by moving these operations into a separate phase of the cycle, which allows them to be batched.
>
> src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 41:
>
>> 39: ShenandoahScanRemembered* const _scanner;
>> 40:
>> 41: public:
>
> I had not initially appreciated that we are investing in more precise dirtying of cards as part of this PR. Please check my analysis of the tradeoffs here:
>
> Option 1 (as currently implemented):
> 1. This will take longer to do entry_update_card_table() because we have to rescan every copied object. This rescanning may also result in increased contention for cache lines and memory bus with mutator threads during this phase. This also results in redundant dirtying of cards for any card that holds more than one "interesting pointer".
> 2. The benefit of this option is that our subsequent scan-remembered pass will have less work to do because potentially fewer cards will need to be scanned.
>
> Option 2 (blindly dirty the entire range of copied objects):
> 1. This matches the current implementation. The existing design is based on the idea that it is "overall" more efficient to scan this data once rather than twice. We'll scan the data once when we next scan the remembered set.
> 2. The scan-once benefit applies only to cards that are dirty. In option 1, we scan data corresponding to dirty cards twice. In option 2, we scan data corresponding to dirty cards once. Data corresponding to clean cards is scanned only once in either option, though the timing of when we scan that data is different.
>
> I'm wondering if we've done any experiments to evaluate the tradeoffs of these alternative approaches on various workloads?
I ran these tests on an earlier iteration of the changes here. They showed the reduction in remembered set scan times was greater than the increase in the time to update the card table. I will re-run the tests with the current version of the code here.
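
To make the two options in the quoted comment concrete, here is a minimal toy model of the difference in which cards end up dirty. It is a sketch under stated assumptions, not HotSpot code: `CopiedObject`, `dirty_precisely`, `dirty_blindly`, `count_dirty`, and the 512-byte `kCardSize` are all hypothetical names and values chosen for illustration, and a real heap walk would inspect object fields rather than a precomputed offset list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only. Every name below is hypothetical, not a HotSpot API.
constexpr std::size_t kCardSize = 512;  // assumed bytes covered by one card

struct CopiedObject {
  std::size_t start;                           // byte offset of the object in old space
  std::size_t size;                            // object size in bytes
  std::vector<std::size_t> young_ref_offsets;  // offsets (within the object) of fields
                                               // that point into the young generation
};

// Option 1: rescan each copied object and dirty only the cards that
// actually contain an "interesting" (old-to-young) pointer.
std::vector<bool> dirty_precisely(const std::vector<CopiedObject>& objs,
                                  std::size_t num_cards) {
  std::vector<bool> cards(num_cards, false);
  for (const CopiedObject& o : objs) {
    for (std::size_t off : o.young_ref_offsets) {
      assert(off < o.size);
      const std::size_t card = (o.start + off) / kCardSize;
      assert(card < num_cards);
      cards[card] = true;  // may repeat for a card with several such pointers
    }
  }
  return cards;
}

// Option 2: dirty every card overlapped by a copied object without
// looking at its fields; the later remembered set scan sorts it out.
std::vector<bool> dirty_blindly(const std::vector<CopiedObject>& objs,
                                std::size_t num_cards) {
  std::vector<bool> cards(num_cards, false);
  for (const CopiedObject& o : objs) {
    if (o.size == 0) continue;
    const std::size_t first = o.start / kCardSize;
    const std::size_t last = (o.start + o.size - 1) / kCardSize;
    assert(last < num_cards);
    for (std::size_t c = first; c <= last; ++c) cards[c] = true;
  }
  return cards;
}

// Number of cards the next remembered set scan would have to visit.
std::size_t count_dirty(const std::vector<bool>& cards) {
  std::size_t n = 0;
  for (bool dirty : cards) n += dirty ? 1 : 0;
  return n;
}
```

In this model, Option 1 pays for the inner loop over fields up front and leaves fewer dirty cards for the later scan, while Option 2 does no field inspection now and leaves every overlapped card dirty. Which side wins on real workloads is exactly what the measurements mentioned above are meant to answer.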
> src/hotspot/share/gc/shenandoah/shenandoahScanRemembered.cpp line 210:
>
>> 208: log_debug(gc, remset)("Update remembered set from " PTR_FORMAT ", to " PTR_FORMAT, p2i(start), p2i(end));
>> 209:
>> 210: while (address < end) {
>
> I'm assuming there must be a preparatory pass over all cards to pre-initialize each one, denoting that each card does not hold the start of an object. Then, this loop changes that state only for the cards that do hold the start of an object.
>
> I haven't worked through all the details, so my intuition may be wrong here. But it feels to me like we could skip the preparatory pass by making a small change to how this loop is structured. The following is my "first" impulse for how I would write this loop. I'm not sure it's better, but offer it for your consideration.
>
> next_relevant_object = address
> For each card_index in the range:
>   if the next_relevant_object pertains to this card {
>     set_first_start(card_index, offset_in_card(next_relevant_object))
>     while (next_relevant_object + next_relevant_object->size() < addr_for_card_index(card_index+1)) {
>       next_relevant_object += next_relevant_object->size();
>     }
>     set_last_start(card_index, offset_in_card(next_relevant_object));
>     next_relevant_object += next_relevant_object->size();
>   } else {
>     clear_card_status(card_index); // no objects start in this card's range
>   }
Somewhat surprisingly, we clear the card table for old regions when they are initially allocated: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp#L1558.
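
For reference, the single-pass loop proposed in the quoted comment can be sketched as a small standalone C++ model. This is only an illustration under assumptions: `CardStarts`, `record_object_starts`, and the 512-byte `kCardSize` are hypothetical, objects are represented by a list of sizes laid out contiguously from a card-aligned start, and an extra bound check is added so the walk stops at the last object in the range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only. Every name below is hypothetical, not a HotSpot API.
constexpr std::size_t kCardSize = 512;  // assumed bytes covered by one card

struct CardStarts {
  bool has_object = false;      // does any object start in this card?
  std::size_t first_start = 0;  // offset in card of the first object start
  std::size_t last_start = 0;   // offset in card of the last object start
};

// Objects are laid out back to back from range_start (card aligned);
// sizes[i] is the size in bytes of the i-th object.
std::vector<CardStarts> record_object_starts(std::size_t range_start,
                                             const std::vector<std::size_t>& sizes) {
  assert(range_start % kCardSize == 0);
  std::size_t total = 0;
  for (std::size_t s : sizes) {
    assert(s > 0);
    total += s;
  }
  const std::size_t num_cards = (total + kCardSize - 1) / kCardSize;
  std::vector<CardStarts> cards(num_cards);

  std::size_t idx = 0;             // index of the next relevant object
  std::size_t addr = range_start;  // start address of that object
  for (std::size_t card = 0; card < num_cards; ++card) {
    const std::size_t card_begin = range_start + card * kCardSize;
    const std::size_t card_end = card_begin + kCardSize;
    if (idx < sizes.size() && addr < card_end) {
      // The next relevant object starts in this card.
      cards[card].has_object = true;
      cards[card].first_start = addr - card_begin;
      // Advance while the following object also starts in this card.
      while (idx + 1 < sizes.size() && addr + sizes[idx] < card_end) {
        addr += sizes[idx];
        ++idx;
      }
      cards[card].last_start = addr - card_begin;
      // Step past the last object that starts in this card.
      addr += sizes[idx];
      ++idx;
    } else {
      // No object starts in this card's range; reset it explicitly.
      cards[card] = CardStarts{};
    }
  }
  return cards;
}
```

Because the `else` branch resets every card that holds no object start, this variant would not depend on the cards having been cleared beforehand. Given that old regions already have their cards cleared at allocation, as noted above, that explicit reset may be redundant in the actual code.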
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/29511#discussion_r2755470989
PR Review Comment: https://git.openjdk.org/jdk/pull/29511#discussion_r2755462158