CMS parallel initial mark

Thomas Schatzl thomas.schatzl at oracle.com
Thu Jun 6 19:25:40 UTC 2013


Hi all,

On Thu, 2013-06-06 at 10:06 -0700, Jon Masamitsu wrote:
> On 6/6/13 9:02 AM, Thomas Schatzl wrote:
> > Hi,
> >
> > On Tue, 2013-05-28 at 17:24 -0700, Hiroshi Yamauchi wrote:
> >> Hi,
> >>
> >> I'd like to have the following contributed if it makes sense.
> >>
> >> 1) Here's a patch (against a recent revision of the hsx/hotspot-gc repo):
> >>
> >>    http://cr.openjdk.java.net/~hiroshi/webrevs/cmsparinitmark/webrev.00/
> >>
> >> that implements a parallel version of the initial mark phase of the
> >> CMS collector. It's relatively a straightforward parallelization of
> >> the existing single-threaded code. With the above patch, I see about
> >> ~3-6x speedup in the initial mark pause times.
> >    had a look at this patch in the version at
> > http://cr.openjdk.java.net/~hiroshi/webrevs/cmsparinitmark/webrev.01/
> >
> > Looks good so far, some comments:
> >    - in cmsOopClosures.hpp, MarkRefsIntoClosure and the new
> > Par_MarkRefsIntoClosure could be refactored slightly as they have
> > exactly the same member variables. Not sure how this situation is
> > handled in other code though, and what others (Jon) think.
> Thomas,
> 
> If you don't mind I'd like to keep this changeset to a minimum so
> not do any additional refactoring.  That's a good suggestion but
> since this is the first sizable contribution I'm sponsoring, simpler
> is better for me.

Okay. It would be a tiny additional change though, which has only been
enabled by the addition of the Par_MarkRefsIntoClosure, and of course
depends on whether the old serial initial marking code is kept.

(Maybe it is also possible to remove the underscore in the class name?
It's not exactly customary to have class names with underscores in the
Hotspot code - I think)

> >
> >    - in concurrentMarkSweepGeneration.cpp in
> > CMSCollector::checkpointRootsInitialWork() the new code seems a little
> > strange to me. I.e. roughly it does the following:
> >
> >    [...]
> >    if (CMSParallelInitialMarkEnabled) {
> >      [ set up parallel stuff ]
> >      if (n_workers > 1) {
> >        // do parallel thread
> >      } else {
> >        // do serial work using the current thread
> >      }
> >    } else {
> >      // do serial work
> >    }
> >
> > it does not feel good to have two serial paths here; maybe use the
> > serial/original code if n_workers == 1 and guarantee(n_workers > 1, ) in
> > the first part of the if? Or look if the old serial path could be
> > removed? (or the new one).
> 
> Is the inner serial path one thread executing a single chunk that
> does all the work (i.e., executes the parallel code but by only
> one thread)?

Yes, that is in my opinion a little awkward. It would be nice if we had
only one serial initial marking code path to simplify
debugging/maintenance if there are issues. Removal of one or the other
(if so) probably depends on how different the performance between the
two paths is.

> >   - the comments for at least CMSCollector::
> > initialize_sequential_subtasks_for_young_gen_rescan() and
> > CMSCollector::merge_survivor_plab_arrays() should be reworked - they
> > only mention parallel rescan as entry points.

I recommend fixing the comments though - wrong comments are not exactly
helpful, although it's just an omission here.

> >   - the change in the assert in
> > CMSCollector::merge_survivor_plab_arrays() looks a little odd now. I
> > mean, looking at that assert in isolation poses the question why is this
> > method only called when doing initial mark in parallel? It somehow seems
> > that the same work is done somewhere else just for the initial mark and
> > serial case, which just looks strange to me. That's just an observation
> > from me.
> >
> > assert(_collectorState == FinalMarking ||
> >           (CMSParallelInitialMarkEnabled && _collectorState ==
> > InitialMarking), "Error");
> >
> >   - in CMSCollector::CMSCollector() there is this unconditional setup of
> > a few data structures used for parallel processing if any of the
> > CMSParallel* switches is enabled.
> I'm not sure what code this is.  What lines in the webrev?

I hit the send button too eagerly (was at leaving the office), sorry. I
meant ConcurrentMarkSweepGeneration.cpp:727, and was thinking about
removing on of the serial initial marking paths.

If the old serial version (old code) is kept, this condition needs to be
adapted, as data structures within the if-block started at that line are
only used in the parallel case.

But that was just a thought as I am not sure how far you are with
testing it.

Thanks,
Thomas





More information about the hotspot-gc-dev mailing list