RFR: 8240556: Abort concurrent mark after effective eager reclamation of humongous objects

Stefan Johansson sjohanss at openjdk.java.net
Fri Sep 18 08:43:02 UTC 2020


On Tue, 15 Sep 2020 12:40:46 GMT, Thomas Schatzl <tschatzl at openjdk.org> wrote:

> Hi all,
> 
>   please review this change that implements concurrent mark abort as proposed by Liang Mao from Alibaba.
> 
> Basically, if at the end of the concurrent start pause we notice that we went below the IHOP threshold (and it has been
> a concurrent start pause caused by humongous allocation), instead of scheduling a mark operation, we schedule a
> "concurrent undo" operation that undoes the changes.  Regarding the removal of the
> test/hotspot/jtreg/gc/stress/jfr/TestStressBigAllocationGCEventsWithG1.java test: it only ever accidentally worked in
> G1. G1 never sent the AllocationRequiringGC events for GCs caused by going over the IHOP threshold for humongous
> allocations that the test actually expects.  That test previously only worked because G1 could not reclaim the
> humongous objects fast enough so crossing the IHOP threshold causes a full concurrent mark. Allocations during that
> concurrent mark do not cause a GC that can reclaim these objects, so ultimately some young GC that sends the
> AllocationRequiringGC event will be sent.  With concurrent undo this is not guaranteed any more, i.e. only in
> environments where concurrent undo is slow (and we'll improve it soon) this test works.  The test is too timing
> dependent, and checking for the wrong thing in this case, so removing it.  Testing: tbd.

Thanks for taking this on Thomas. The change looks good but I have a few comments on the structure.

src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 3064:

> 3062:         concurrent_operation_is_mark_cycle =
> 3063:           should_start_concurrent_mark_operation &&
> 3064:           ((gc_cause() != GCCause::_g1_humongous_allocation) || policy()->need_to_start_conc_mark("Revise"));

What do you think about setting a state in `G1CollectorState` instead of passing this around? Something like
`_cancel_concurrent_mark`.

src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 3074:

> 3072:         double sample_end_time_sec = os::elapsedTime();
> 3073:         double pause_time_ms = (sample_end_time_sec - sample_start_time_sec) * MILLIUNITS;
> 3074:         policy()->record_collection_pause_end(pause_time_ms, concurrent_operation_is_mark_cycle);

Then we can just check the collector state in `record_collection_pause_end` instead of passing it in here.

src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 3121:

> 3119:     // Note: of course, the actual marking work will not start until the safepoint
> 3120:     // itself is released in SuspendibleThreadSet::desynchronize().
> 3121:     start_concurrent_cycle(concurrent_operation_is_mark_cycle /* full_mark */);

Instead of passing the state down here what do you think of something like:
Suggestion:

    if (_collector_state->cancel_conc_start()) {
      cancel_concurrent_cycle();
    } else {
      start_concurrent_cycle();
    }

Where `cancel_concurrent_cycle()` would do:
  root_regions()->cancel_scan();
  MutexLocker x(CGC_lock, Mutex::_no_safepoint_check_flag);
  if (!_cm_thread->in_progress()) {
    _cm_thread->set_canceled();
    CGC_lock->notify();
  }

src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 706:

> 704:     root_regions()->cancel_scan();
> 705:     return;
> 706:   }

No need for this if you agree with my above comments.

src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp line 67:

> 65:   assert(_state == Idle, "cycle in progress");
> 66:   _state = mark_cycle ? StartMark : StartUndo;
> 67: }

I would prefer two different function, one for start and one for undo/cancel.

src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp line 208:

> 206:     result = UndoCycle;
> 207:   }
> 208:

I would prefer if `wait_for_next_cycle()` continued returning bool and not see Terminate as a command. I'm also
thinking that the command and _state is more or less the same thing, or am I missing something? If we want it could we
instead have the command as a member that we set when we either start or undo/cancel the cycle?

src/hotspot/share/gc/g1/g1Policy.cpp line 907:

> 905:       abort_time_to_mixed_tracking();
> 906:     }
> 907:     collector_state()->set_mark_or_rebuild_in_progress(start_concurrent_mark_cycle);

As mentioned, I would prefer to check the `collector_state()` here rather than a passed in value.

src/hotspot/share/gc/g1/g1ConcurrentMarkThread.cpp line 179:

> 177:     }
> 178:
> 179:     concurrent_cycle_end(cmd == MarkCycle && !_cm->has_aborted());

I think it would be more clear if this was called with `false` for the undo-case and with `!_cm->has_aborted()` for the
normal case. But if you want to avoid two calls, I'm fine with this.

-------------

Changes requested by sjohanss (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/177


More information about the hotspot-jfr-dev mailing list