RFR: 8162929: Enqueuing dirty cards into a single DCQS during GC does not scale
Thomas Schatzl
thomas.schatzl at oracle.com
Tue Jul 16 09:06:12 UTC 2019
Hi Kim,
On Mon, 2019-07-15 at 17:37 -0400, Kim Barrett wrote:
> > On Jul 12, 2019, at 8:36 AM, Thomas Schatzl <
> > thomas.schatzl at oracle.com> wrote:
> > On Fri, 2019-07-05 at 00:50 -0400, Kim Barrett wrote:
> > > Please review this refactoring of G1DirtyCardQueue[Set],
> > > splitting
> > > into two sets of classes that each better support the purpose
> > > they
> > > are used for. […]
> > >
> > > CR:
> > > https://bugs.openjdk.java.net/browse/JDK-8162929
> > >
> > > Webrev:
> > > http://cr.openjdk.java.net/~kbarrett/8162929/open.00/
> > >
> >
> > some comments:
> >
> > - pre-existing: I would prefer if PtrQueue::~PtrQueue would be
> > virtual to avoid the risk of object slicing. There is no problem
> > right now in the code afaics though, but we've been bitten by
> > object slicing issues before, and it's imho okay to use a virtual
> > destructor here.
>
> There is no risk of object slicing. ~PtrQueue (and ~PtrQueueSet and
> ~SATBMarkQueueSet) are, by design, protected. The "usual" rule for
> base class destructors is public and virtual (if support for deleting
> through a base class pointer is desired), or protected and
> non-virtual. (See, for example,
> http://www.gotw.ca/publications/mill18.htm,
> Guideline #4.)
I fully agree with that. Thanks.
> Also, making these destructors virtual would require PtrQueue and
> PtrQueueSet to have an accessible operator delete. The presence of a
> virtual destructor induces the creation of a "deleting destructor",
> which will call the operator delete found by lookup in the class.
> That probably means deriving PtrQueue and PtrQueueSet from CHeapObj,
> even though we don't actually want to heap allocate any of these
> objects. (In the original version of this change I made PtrQueueSet
> derive from CHeapObj<mtGC>, and forgot to back that out before
> starting the review. What I was doing there is really more properly
> part of JDK-8221361, and is probably the wrong approach anyway. The
> redirty card queue set should be local to a collection, perhaps part
> of the evacuation info.)
>
> The problem of unnecessarily inducing deleting destructors was also
> discussed here:
>
>
http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-December/007566.html
> RFR (S): 8004845: Catch incorrect usage of new and delete during
> compile time for value objects and stack objects
>
> AbstractGangTask was given a protected non-virtual destructor as part
> of that change. Unfortunately, that destructor was removed by
> JDK-8087323, leaving it with the default (public) destructor.
>
> Regretably, what seems like it should be the most relevant gcc
> warning option (-Wnon-virtual-dtor) is overenthusiastic; -Wnon-
> virtual-dtor warns about a class with virtual functions and an
> accessible non-virtual destructor. That's broken. (Not just my
> opinion; I've seen multiple discussions of this in various
> places.) That warns about a leaf derived class with a (necessarily)
> accessible non-virtual destructor, because the compiler doesn't know
> it's a leaf class. Warning if a class has a problematic base class
> would be fine. But warning that some hypothetical (but nonexistent)
> derived class could be sliced, because the compiler doesn't know that
> the leaf derived class is actually a leaf, is just wrong. (This might
> be improved using the C++11 final class specifier to let the compiler
> recognize a leaf class. But apparently not yet:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58140)
>
> Unfortunately, even if -Wnon-virtual-dtor did the right thing, we'd
> still get warnings from it because PtrQueue befriends VMStructs,
> making its destructor accessible (though unused) from a non-derived
> class. Drat!
>
> The other relevant warning option (-Wdelete-non-virtual-dtor) has
> some similar problems IMO (warning about hypothetical derived classes
> that don't actually exist).
>
> While I was looking at that, I noticed that while PtrQueue is
> non-copyable, someone (likely me) forgot to do the same for
> PtrQueueSet.
>
> So I've made the following changes:
> * Reverted PtrQueueSet derivation from CHeapObj.
> * Made PtrQueueSet non-copyable.
>
> but did not make ~PtrQueue or ~PtrQueueSet virtual.
Thanks a lot for this extensive explanation.
>
> > - I would prefer to add an "is_" prefix to the
> > G1RedirtyCardsQueueSet::empty() member as this reads better.
>
> empty() was only being used for assertions. I've replaced it with
> verify_empty().
Okay.
>
> > - maybe rename the local "QSet" to "LocalPtrQueueSet" to make it
> > more self-describing. At least something a bit more descriptive if
> > possible.
>
> I renamed QSet to LocalQSet.
Thanks.
>
> > - it's a bit unfortunate that on the way from the redirty qset to
> > the global qsets (in G1DirtyCardQueueSet::merge_bufferlists) we can
> > not easily keep a "tail" pointer and count, and then when merging
> > the redirty qset with the dirty qset have to iterate over the
> > BufferNodes to determine it.
>
> The tail pointer and count are now carried all the way through to the
> merge into the dirty card queue set, eliminating the iteration down
> the (possibly very long) list in that merge.
>
> > This might have some minor (seemingly unnecessary) impact on huge
> > loads like we discussed internally when doing perf testing.
> > Probably not worth the effort dealing with here, given that you may
> > simply increase the buffer size there.
> > I will do some cursory tests, with probably no detrimental outcome
> > though.
> >
> > Unfortunately there is no good way to just move the data between
> > them other than the one actually implemented.
>
> I think I found a good way to do that, enabled by the phased usage
> pattern.
>
> > - in G1DirtyCardQueueSet::merge_bufferlists() I am not sure whether
> > that loop used to determine the tail and the count is much
> > favorable
> > (e.g. easiert to read) than this due to endless loop + break
> > "somewhere" within the loop body:
>
> That loop is gone now.
>
> New webrevs:
> full: http://cr.openjdk.java.net/~kbarrett/8162929/open.01/
> incr: http://cr.openjdk.java.net/~kbarrett/8162929/open.01.inc/
>
> Testing:
> mach5 tier1-5
>
looks good to me. The additional delay is also gone now in my
testing, thanks a lot.
Thanks,
Thomas
More information about the hotspot-gc-dev
mailing list