RFR: 8166607: G1 needs klass_or_null_acquire

Thu Sep 29 22:49:09 UTC 2016

Please review this change to G1 to use klass_or_null_acquire where
needed.

[Reviewing on hotspot-dev; only GC code is involved, but some non-GC
folks have expressed interest.]

G1 BOT uses of klass_or_null have all been changed to instead use
klass_or_null_acquire.  This isn't necessary for the call in the
revised version of oops_on_card_seq_iterate_careful (discussed below),
but there can be (and are) other callers.  The performance impact of
unnecessary barriers in oops_on_card_xxx will hopefully not be too bad
on non-TSO systems; the number reduces to one as the BOT gets filled
in.  Rather than splitting or otherwise conditionalizing, for now we
just make it safe.  [The CollectedHeap block_start &etc API needs some
TLC in the face of collectors that use klass_or_null therein, since
the answer to a call to block_start could change instantaneously.  And
some (all?) of the existing not-GC-specific callers probably should be
using block_start_const rather than block_start.]

Changed oops_on_card_seq_iterate_careful to use different code paths
when the region is humongous or not.

The humongous case is simplified because there can only be the one
(potential) object in the region.  Use klass_or_null_acquire to
determine whether the object has been "published" or allocation is
still in-progress.  The acquire barrier ensures processing the
humongous object is safe.  When the klass is NULL the card marking
must be stale and can be ignored.

In the non-humongous case, we know we're dealing with an old-gen
region.  Because of that, we know we don't need to deal with
in-progress allocations here.  That allows many simplifications.

As a result of the changes in oops_on_card_seq_iterate_careful, we now
almost never fail to process the card.  The only place where that can
occur is a stale card in a humongous region with an in-progress
allocation, where we can just ignore it.  So the only caller,
refine_card, no longer needs to examine the result of the call and
enqueue the card for later reconsideration.

Note that some of the complexity in refine_card / oops_on_card_xxx
arises from the possibility of "stale" marked cards, especially during
concurrent refinement.  These are cards in regions that have been
freed and then reallocated.  At present the only source of stale cards
in the concurrent case seems to be HCC eviction.  (Eager reclaim of
humongous objects flushes relevant remembered sets (possibly
containing stale entries) for reconsideration during that GC, but
in-GC card clipping to scan_top deals with that.)  Doing HCC cleanup
when freeing regions might remove the need for klass_or_null checking
in the humongous case for concurrent refinement, so might be worth
looking into later.

CR:
https://bugs.openjdk.java.net/browse/JDK-8166607

Webrev:
http://cr.openjdk.java.net/~kbarrett/8166607/webrev.01/

Testing:
Nightly test equiv. via rbt. (in progress)
Local specjbb2015 (non-perf).
Local jtreg hotspot_all.