RFR: 8151556: Use the PreservedMarks* classes for the G1 preserved mark stacks

Tue Mar 22 11:01:25 UTC 2016

Hi Tony,

  some comments to your questions:

On Thu, 2016-03-10 at 12:56 -0500, Tony Printezis wrote:
> Change to use the newly-introduced PreservedMarks* classes for
> preserving marks in G1:
> 
> http://cr.openjdk.java.net/~tonyp/8151556/webrev.0/
> 
> G1 already had per-worker mark stacks so this change was mostly
> straightforward. A few comments / questions:
> 
> - The main change to the PreservedMarks* classes is to add the 
> par_restore() method so that G1 continues doing that phase in 
> parallel. I changed the PreservedMarks::restore() method to do what 
> G1 was doing (i.e., keep popping entries from the stack until the 
> stack is empty) instead of iterating over the stack, then clearing 
> it. The latter was, for some reason, faster in the serial case (what 
> we’ve had so far in PreservedMarks). However, it doesn’t seem to 
> parallelize well.

Contention in free() when everyone is finished?

Some experiments in this area (with G1) showed that the work
distribution is far from optimal btw, particularly for smaller sets of
marks to restore.

> So, I opted to copy what G1 was doing so that at least there’s no
> regression in G1. I also changed the logic slightly: the parallel
> workers now claim tasks to do (using the SequentialSubTasksDone
> class) instead of only doing the one that corresponds to their worker
> id (what G1 was doing before). This doesn’t seem to penalize this
> phase (at least in the tests I ran) and it’s a bit safer (if one
> worker wakes up late, maybe another one will do the work).

If a worker is later, this is still a problem, because if it is late it
is often very late. And at the end of that parallel phase they need to
synchronize anyway.

> - Changed the max cache size of the stacks to 0 so that the stacks do
> not cache any segments. Given that we don’t expect promotion /
> evacuation failures to happen very frequently, caching segments on
> these stacks is mostly wasted space. I also added asserts to ensure
> that there are no cached segments when the stacks are supposed to be
> empty.

Okay.

> - I wanted to have one method that has the logic to initialize an
> object’s mark word after promotion / evacuation failure. Both ParNew
> / ParallelGC use the RemoveForwardedPointerClosure, but G1 has its
> own closure (as it has to do some extra work). I introduced the
> init_forwarded_mark() method that’s called from both places. I could
> also make the G1 closure a subclass of RemoveForwardedPointerClosure
> and just call the do_object() on the parent. Or drop the idea of
> using the same method. Feedback welcome.

The current method looks fine to me.

> 
> - Now that G1 doesn’t use the OopAndMarkOop and OopAndMarkOopStack
> classes directly I made them private to PreservedMarks.
> 
> - Small re-arranging of the #include declarations in
> preservedMarks.inline.hpp as I had put them before the #ifndef guard
> for some reason (oops! sorry…)
> 
> - As it was straightforward, I also call par_restore() from ParNew
> too. I opted to decide whether to call the serial or parallel cases
> in DefNew based on whether workers() is NULL or not. I can also do
> that with a virtual method on DefNew which is overwritten in ParNew.
> Your call. BTW, I can do this change on a separate CR too if it’s
> considered outside the scope of this one (but it was pretty trivial
> so I think it’s OK piggy-backing it here).

I think the suggestion in the review thread for the extracted parallel
header restoration is best in terms of size of interface, lines of code
and code reuse opportunities. I.e. there does not seem to be a need to
use inheritance here imho.

Thanks,
  Thomas