RFR: 8166811: Missing memory fences between memory allocation and refinement
Thomas Schatzl
thomas.schatzl at oracle.com
Tue Nov 8 12:52:29 UTC 2016
Hi Kim,
On Tue, 2016-10-25 at 19:13 -0400, Kim Barrett wrote:
> Please review this change to address missing memory barriers needed
> to ensure ordering between allocation and refinement in G1.
>
> Rather than simply adding the "obvious" barriers, this change
> modifies refinement to not need any additional memory barriers.
>
> First, the heap region type predicate used to decide whether the card
> should be processed has been changed. Previously, !is_young was
> used, but that isn't really the state of interest. Rather, processing
> should only occur if the region is old or humongous, not if young or
> *free*. The free case (and so other cases that should be filtered
> out) can happen if the card is stale, and there are several ways to
> get stale cards here. So added is_old_or_humongous type predicate
> and use it for filtering based on the region's type.
>
> Second, moved to refine_card the card region trimming to the heap
> region's allocated space, and the associated filtering, to be
> co-located with the type-based filtering. An empty trimmed card
> region is another indication of a stale card.
>
> We should filter out cards that are !is_old_or_humongous or when the
> card's region is beyond the allocated space for the heap
> region. Only if the card is old/humongous and covers allocated space
> should we proceed with processing, and then only for the subset of
> the card covering allocated space.
>
> Moved the card cleaning to refine_card. Having the cleaning in the
> iterator seemed misplaced. Placing it in refine_card, after the card
> trimming and type-based filtering also allows the fence needed for
> the cleaning to serve double duty; in addition to ensuring processing
> occurs after card cleaning (the original purpose), it now also
> ensures processing follows the filtering. And this ensures the
> necessary synchronization with allocation; we can't safely examine
> some parts of the heap region object or the memory designated by the
> card until after the card has been trimmed and filtered. Part of
> this involved changing the storeload to a full fence, though for
> supported platforms that makes no difference in the underlying
> implementation.
>
> (This change to card cleaning would benefit from a store_fence
> operation on some platforms, but that operation was phased out, and a
> release_store_fence is stronger (and more expensive) than needed on
> some platforms.)
It would also be beneficial to make the fence conditional on
is_gc_active(), but that may be another change as we previously did the
storeload unconditionally too.
> There is still a situation where processing can fail, namely an
> in-progress humongous allocation that hasn't set the klass yet. We
> continue to handle that as before.
- I am not completely sure about whether this case is handled
correctly. I am mostly concerned that the information used before the
fence may not be the correct ones, but the checks expect them to be
valid.
Probably I am overlooking something critical somewhere.
A: allocates humongous object C, sets region type, issues storestore,
sets top pointers, writes the object, and then sets C.y = x to mark a
card
Refinement: gets card (and assuming we have no further synchronization
around which is not true, e.g. the enqueuing)
592 if (!r->is_old_or_humongous()) {
assume refinement thread has not received the "type" correctly yet, so
must be Free. So the card will be filtered out incorrectly?
That is contradictory to what I said in the other email about the
comment discussion, but I only thoroughly looked at the comment aspect
there. :)
I think at this point in general we can't do anything but !is_young(),
as we can't ignore cards in "Free" regions - they may be for cards for
humongous ones where the thread did not receive top and/or the type
yet?
- assuming this works due to other synchronization, I have another
similar concern with later trimming:
653 } else {
654 // Non-humongous objects are only allocated in the old-gen during
655 // GC, so if region is old then top is stable. Humongous object
656 // allocation sets top last; if top has not yet been set, then
657 // we'll end up with an empty intersection.
658 scan_limit = r->top();
659 }
660 if (scan_limit <= start) {
661 // If the trimmed region is empty, the card must be stale.
662 return false;
663 }
Assume that the current value of top for a humongous object has not
been seen yet by the thread and we end up with an empty intersection.
Now, didn't we potentially just drop a card to a humongous object in
waiting to scan but did not re-enqueue it? (And we did not clear the
card table value either?)
We may do it after the fence though I think.
Maybe I am completely wrong though, what do you think?
- another stale comment:
636 // a card beyond the heap. This is not safe without a perm
637 // gen at the upper end of the heap.
Could everything after "without" be removed in this sentence? We
haven't had a "perm gen" for a long time...
Thanks,
Thomas
More information about the hotspot-dev
mailing list