RFR (M/L): 8247928: Refactor G1ConcurrentMarkThread for mark abort (JDK-8240556)

stefan.johansson at oracle.com stefan.johansson at oracle.com
Fri Aug 21 08:20:59 UTC 2020


Hi Thomas,

On 2020-06-25 10:56, Thomas Schatzl wrote:
> Hi all,
> 
>    can I have reviews for this change that refactors the 
> G1ConcurrentMarkThread class to prepare it better for mark abort 
> (JDK-8240556).
> 
> So the idea in the latter is to abort the concurrent cycle if after 
> concurrent start gc we find that old gen occupancy actually went down 
> below the IHOP again due to e.g. eager reclaim. To do that, G1 needs to 
> scrub any marks on the next bitmap during the gc pause.
> The current, original change just performs this work in the pause, which 
> is very slow, so the idea is to do the bitmap cleaning concurrently 
> instead. The problem is that G1ConcurrentMarkThread is a mess and you 
> can't easily "jump" to the end of the current marking.
> 
> Additionally, the code was worth refactoring without that requirement 
> anyway ;)
> 
> This change refactors the code so that it is much easier to do add a 
> second path through the concurrent cycle.
> 
> Overall, there are two options to do that:
> 1) provide an explicit state machine for the concurrent marking so that 
> you can jump to the end easily.
> 2) provide building blocks that can be easily put together to implement 
> the second path.
> 
> While 1) works (there is a sample POC at 
> http://cr.openjdk.java.net/~tschatzl/8247928/webrev.sm/), I found that 
> it is much more code and less understandable than just building two 
> paths through the marking cycle as in the "abort marking" case we only 
> need to do the very tail end of the regular full marking cycle, so I 
> propose this option for review.
> 
> This is refactoring (almost) only, the additional path should be added 
> with JDK-8240556. Almost because I changed two things:
> - the concurrent mark (control) thread does not do any marking any more 
> for a long time (long long ago it did root scanning/marking), so I 
> removed the _vtime_mark_accum accumulator.
> - the "Concurrent Mark" finish message is now only printed at the end of 
> all marking, not every iteration. First, the restart case is very rare 
> so probably anyone parsing it will not handle this case correctly, and 
> the contents (times) of that finish message are confusing, which means 
> most likely anyone handling it will likely do the wrong thing.
> The "Concurrent Mark Restart for Mark Overflow" remains to indicate a 
> restart.
> 
> Based on JDK-8248221 also out for review.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8247928
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8247928/webrev/
Thanks for taking on this task. As we discussed before, there are pros 
and cons with the different approaches but I agree with you that this 
solution is easy to follow and understand.

Generally this looks good, just a few comments:

src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp
---
  118 class G1ConcPhaseTimer : public 
GCTraceConcTimeImpl<LogLevel::Info, LOG_TAGS(gc, marking)> {
  119   G1ConcurrentMark* _cm;
  120   const char* _t;

The addition of _t is unused and should be removed.
---
  145     concurrent_cycle_start();
  146     full_concurrent_cycle_do();
  147     concurrent_cycle_end();

I think do_concurrent_cycle sounds better than full_concurrent_cycle_do, 
but if you prefer the current name I'm ok with it.
---
  253
  254   Ticks mark_start = Ticks::now();
  255   log_info(gc, marking)("Concurrent Mark (%.3fs)", 
mark_start.seconds());
  256
  ...
  278
  279   Ticks mark_end = Ticks::now();
  280   log_info(gc, marking)("Concurrent Mark (%.3fs, %.3fs) %.3fms",
  281                         mark_start.seconds(), mark_end.seconds(),
  282                         (mark_end - mark_start).seconds() * 1000.0);
  283

As we talked about offline I would like to group this part into a phase, 
to avoid timing in the middle of the main function. Something like this:
http://cr.openjdk.java.net/~sjohanss/8247928/rev1/

What do you and other reviewers think about this?
---
  271     // Phase 6: Remark pause
  272     if (phase_remark(needs_restart)) return;
  273     if (needs_restart) {
  274       log_info(gc, marking)("Concurrent Mark Restart for Mark 
Stack Overflow (iteration #%u)",
  275                             iter++);
  276     }

Not a big fan out out parameters, what do you think about instead 
wrapping _cm->has_overflown() in a function needs_restart()? If you want 
to avoid having to do two calls, we could make the loop run forever 
until needs_restart is false and break (or call the wrapper 
marking_completed() and break if it is true).
---

src/hotspot/share/gc/g1/g1ConcurrentMarkThread.inline.hpp
---
   38   // The G1ConcurrentMarkThread does not do any marking at all.
   39   return _cm->all_task_accum_vtime();

I think we can skip the comment, that is basically the justification to 
remove this, but going forward it isn't really needed.
---

Thanks,
Stefan

> Testing:
> tier1-5, a few local jtreg runs of the gc directory
> 
> Thanks,
>    Thomas



More information about the hotspot-gc-dev mailing list