RFR: 8333105: Shenandoah: Results of concurrent mark may be lost for degenerated cycle

William Kemper wkemper at openjdk.org
Thu May 30 00:09:01 UTC 2024


On Wed, 29 May 2024 19:46:15 GMT, Y. Srinivas Ramakrishna <ysr at openjdk.org> wrote:

>> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 147:
>> 
>>> 145:     bool cancelled = check_cancellation_and_abort(ShenandoahDegenPoint::_degenerated_mark);
>>> 146:     assert(cancelled, "GC must have been cancelled between concurrent and final mark");
>>> 147:     return false;
>> 
>> Not your change, but I notice that the semantics of the return value hasn't been explicitly documented anywhere in the class hierarchy. Might make sense to place a line of documentation, say in `ShenandoahGC` which first declares the virtual method, even something simple such as:
>> 
>> 
>> // Returns true if the collection was completed successfully, false otherwise.
>
> It also occurs to me that the pattern here is, roughly of the following form:
> 
> 
>   do_some_uninterruptible_operation();
>   check cancellation status & return if cancelled, else continue to next uninterruptible operation;
>   ...
> 
> 
> In that case, the checks should not be on the state of the progress of the collection, but on whether the cycle has been cancelled, and the asserts should then check the state of the collection; in other words, the dual of the structure of control in this method.
> 
> That said, I do not understand if there's a deeper rationale here in that sometimes we will want to skip a next uninterruptible phase of the collection regardless, because the state machine of the collection explicitly transitions us, under appropriate conditions, into a different phase than the otherwise normal flow, independent of whether a cancellation happened.
> 
> For example, what would be the difference between your current test:
> 
> 
>   if (concurrent_mark_in_progress()) {
> 
> 
> vs, say:
> 
> 
>   if (!evacuation_in_progress()) {
> 
> 
> vs : 
> 
> 
>    if (check_cancellation()) {
>      ... return false;
>    }
>    assert(!concurrent_mark_in_progress(), "Should have completed");
>    assert(evacuation_in_progress(), "Next logical phase of collection");
> 
> 
> Just wondering if a more uniform idiom might make for easier code patterns and easier, more uniform, reasoning about correctness.

This case doesn't quite follow the `check_cancellation` pattern of the others because it is a little bit different. Here, the uninterruptible operation is the final mark safepoint. This safepoint will itself check if the mark has been cancelled. If final mark detects the cancellation, it will do _nothing_, the concurrent mark will still be in progress and we _must_ resume the degenerated cycle from the marking phase. However, if the final mark safepoint does _not_ detect a cancellation, it will initialize the evacuation phase. If the code simply `checks_cancellation` after final mark it cannot tell if the cancellation request was received just _before_ final mark or just _after_ final mark. These two conditions require different resumption points for the degenerated cycle, so we must check the state of the collection cycle here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19434#discussion_r1619610654


More information about the shenandoah-dev mailing list