RFR (XL): 8202845: Refactor reference processing for improved parallelism

Tue Jun 12 18:19:35 UTC 2018

Hi Kim,

  thanks for your review.

On Tue, 2018-06-12 at 12:20 -0400, Kim Barrett wrote:
> > On Jun 6, 2018, at 1:13 PM, Thomas Schatzl <thomas.schatzl at oracle.c
> > om> wrote:
> > 
> > I thought marks_oops_alive is an unnecessary quirk of the
> > collectors that found its way into the reference processing
> > interface; its purpose is to indicate whether the gc should perform
> > stealing at the end of a phase, as the complete closures drain the
> > stacks completely by themselves anyway.
> > 
> > Stealing or not should be part of the complete closure (i.e. the
> > gc, and attempted by default by the gc) imho (it "completes" the
> > phase), like G1 does.
> > 
> > I fixed it though.
> 
> marks_oops_alive is another piece (really an entirely different
> mechanism) of the avoidance of stealing and the termination protocol
> when the keep_alive closure won't be marking any previously unmarked
> objects, so (maybe) won't be generating any potentially stealable
> work.  It seems to me to be a better approach than sometimes not
> calling the complete_gc closure (the other mechanism, used by
> process_phase2), since it puts the decision on the specific collector
> that knows how its keep_alive and complete_gc closures interact.  See
> further discussion below.

Since the collector (via the keep_alive closure) generates the
potentially stealable work, it could already find out whether there is
work even without the mark_oops_alive mechanism - in a way that is even
more exact than the all-or-nothing mark_oops_alive flag.

> > New webrevs:
> > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.0_to_1 (diff)
> > http://cr.openjdk.java.net/~tschatzl/8202845/webrev.1 (full)
> > Testing:
> > hs-tier1-5,jdk-tier-1-3 (with +/-ParallelRefProcEnabled)
> > 
> > Thanks for your quick review,
> >  Thomas
> 
> Sorry this one took longer.
> 
> -------------------------------------------------------------------
> -----------
> src/hotspot/share/gc/shared/referenceProcessor.cpp
>  315   if (log_is_enabled(Trace, gc, ref)) {
>  324   if (log_is_enabled(Trace, gc, ref)) {
> 
> s/log_is_enabled/log_develop_is_enabled/
> 
> -------------------------------------------------------------------
> -----------
> src/hotspot/share/gc/shared/referenceProcessor.cpp
>  571     complete_gc.do_void();
> 
> I think this is unavoidable (at least for now), but I'm wondering if
> this is a performance issue for other than G1-young collections.
> 
> In the phases where marks_oops_alive is false (like this one), for
> the other collectors (and G1 concurrent and full GC, I think), there
> won't be anything to steal.

Why? Any imbalance in how fast the work will be processed will need
stealing to keep the threads doing useful work.
Particularly traversing and keeping alive followers is highly dependent
on the object graph referenced by the referent.

> Starting the stealing process (and its associated termination
> protocol) is a waste of time in those cases.

Starting the stealing process is basically free - at most what these do
is looking through the work queues whether they are empty - this is a
normal load of a few variables.

Shenandoah's task terminator is very good at detecting these cases with
very little or no work to be done and not waking up anyone (i.e.
terminating very quickly).

I have been running almost all test runs for all patches sent in the
last month or so with it with no problem, so it will likely be a change
for the first few weeks for jdk12.

> ParallelScavange and ParallelCompaction both avoid creating the
> stealing tasks if marks_oops_alive is false.

This is actually the only case where the flag makes some sense because
parallel gc needs to set the tasks beforehand.

But we may want to look into making parallel gc use workgangs too,
because the current implementation of the task queue in parallel is
very problematic as it is a single point of contention every time a
task ends, and it causes us to use lots of boiler plate code to support
both workgang and what parallel gc uses.

>  CMS avoids calling do_work_stealing when marks_oop_live is
> false.  ParNew doesn't seem to look at that bit though.
> 
> G1 concurrent and full GCs also don't seem to look at that bit, and
> so I think will be negatively affected.  Maybe this is a job for
> another RFE.

Yes, we will probably need to spend more time in this area :)

> Some comment revision might be called for though.  It would be nice
> to not have to re-discover (no pun intended) all this again.
> 
> -------------------------------------------------------------------
> -----------
> src/hotspot/share/gc/shared/referenceProcessor.cpp 
>  457   // Close the reachable set
>  458   complete_gc->do_void();
> 
> Same issues here for PhantomReferences as earlier for
> Soft/Weak/FinalReferences.  And I think the same solution; pay
> attention to the marks_oops_alive bit.

 - fix the task terminator (we need that anyway)
 - let the collector decide whether it has work
 - fix the parallel gc work gang

:)

> 
> -------------------------------------------------------------------
> -----------
> src/hotspot/share/gc/shared/referenceProcessor.cpp 
>  595     : ProcessTask(ref_processor, true /* marks_oops_alive */,
> phase_times) { }
> 
> For PhantomReferences, marks_oops_alive should be false, for the same
> reason as for soft/weak/final.
> 
> -------------------------------------------------------------------
> -----------
> 

Fixed.

http://cr.openjdk.java.net/~tschatzl/8202845/webrev.1_to_2 (diff)
http://cr.openjdk.java.net/~tschatzl/8202845/webrev.2 (full)

Thanks,
  Thomas