RFR (L): 8060025: Object copy time regressions after JDK-8031323 and JDK-8057536

Thomas Schatzl thomas.schatzl at oracle.com
Wed Dec 17 19:19:17 UTC 2014


Hi,

On Tue, 2014-12-16 at 17:32 -0500, Kim Barrett wrote:
> On Dec 16, 2014, at 6:57 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> > 
> >> src/share/vm/gc_implementation/g1/g1InCSetState.hpp  
> >> 
> >> The name "is_not_in_cset" is unfortunate; I keep wanting to misread
> >> that as meaning !is_in_cset.  I don't have a better suggestion that
> >> doesn't involve a complete renaming of the type, and I don't have a
> >> better type name to suggest.
> >> 
> >> Similarly, G1CollectedHeap::in_cset_state() is also confusing to me; I
> >> keep wanting to misread that as a predicate.
> > 
> > In some different implementation I had region_attr_t as type name…
> 
> region_attr() or region_state() for the G1CollectedHeap accessor certainly fixes the second.
> 
> And a name involving “region” lines up well with all the doc comments in the existing class, e.g.
> 
>   31 // Per-region state during garbage collection.
>   32 struct InCSetState {
> 
> “Per-<em>region</em> state …”
> 
>   56     Humongous    = -1,    // The region is humongous - note that actually any value < 0 would be possible here.
>   [… etc …]
> 
> “The <em>region</em> is …”
> 
> So yes, I like a “region”-based name.  Unless, of course, this ends up being confusing because it’s too
> close to some other name(s) used in G1.  I can’t think of any, but there’s a lot of code I haven’t studied
> yet.

No. Any good idea?

Otherwise I would like to separate this naming discussion from this CR, as I would prefer to have this change
in asap. A few other changes depend on it.

> Regarding is_not_in_cset, what exactly does that state mean?  Specifically, does it mean the region
> is presently unused / inactive?  Or could it be used for something other than young/old/humongous?
> If the former, then perhaps rename that state (and the predicate) accordingly, e.g. Inactive or something
> like that.

The names of the predicates directly derive from the state names.

A region that has NotInCSet state is not inactive in any way, it's just a region that we do not consider
for potential collection/reclamation.

The difference between humongous (candidate) regions (state Humongous) and Young/Old region is only that
the former is only checked for references, but not evacuated in any way.

Young/Old regions in the collection set are evacuation sources.

One option would be to define a collection set that includes both humongous and young/old regions in the
collection set and the young/old regions into some sort of "evacuation set" (in quotes because I do not like
the name; typically I would assume that collection set == evacuation set).

Then we could have the following predicates:

static bool is_in_cset()  const { return _value != NotInCSet; }
static bool is_not_in_cset() const { return !is_in_cset(); } // I guess that this is not true is your point
of contention
static bool is_in_evac_set() const { return _value > NotInCSet; }

instead of

bool is_not_in_cset() const          { return _value == NotInCSet; }
bool is_in_cset_or_humongous() const { return _value != NotInCSet; }
bool is_in_cset() const              { return _value > NotInCSet; }

Alternatively maybe just renaming

  is_not_in_cset

above to

  is_not_in_cset_or_humongous()

may be as good or better, but is a bit on the long side. It would be the
negation of is_in_cset_or_humongous() below then (and could be expressed
as such).

Or we could just remove the is_not_in_cset() predicate and use !
is_in_cset_or_humongous() everywhere it's needed.

> Another possibility might be NotInCSet => NonCSet and is_not_in_cset() => is_non_cset().

Not sure actually. I agree it is somewhat of a mess, but at the moment "cset" (or collection set) only
includes the young/old regions (that we are evacuating from). That's why there is this
"is_in_cset_or_humongous()" predicate.

> >> Below are for merging in_cset_state_t and InCSetState.  I don't have a
> >> strong preference between the two options.
> > 
> > I will use the "b" variant then as I think it is cleaner (even if only by a little).
> 
> Continuing to look at this code, I seem to be developing a preference for the “b” variant too.

Good. :)

> >> src/share/vm/gc_implementation/g1/g1InCSetState.hpp 
> >>  32 struct InCSetState {
> >> 
> >> Use of "struct" instead of "class" makes _value public.
> > 
> > Added explicit visibility specifier.
> 
> Unfortunately, I think the in_cset_state_t typedef needs to be public.  Otherwise,
> how can a client declare the type of a variable that will hold the result of the
> value() function.

It does compile though. I will fix it.

> 
> > Diff webrev:
> > http://cr.openjdk.java.net/~tschatzl/8060025/webrev.1b_to_2b/
> > Full webrev:
> > http://cr.openjdk.java.net/~tschatzl/8060025/webrev.2b/
> 
> A minor point that I failed to notice previously: the constructors are initializing _value by assignment
> in the constructor body, rather than in a member initializer.  In this case it’s likely the same code will
> be generated, but using member initializers is a good habit that every detailed C++ style guide
> recommends (because there are performance costs for some types).
> 
> Also consider adding an is_valid() assertion in the one arg constructor, making it impossible for
> there to ever be a non-valid object, so no need for is_valid checks elsewhere.
> 
> Could also reduce to only one constructor, by replacing the two existing constructors with
> 
>   InCSetState(in_cset_state_t value = NotInCSet) : _value(value) { assert(is_valid(), “Invalid state”); }
> 

Done.

Thomas





More information about the hotspot-gc-dev mailing list