RFR: 8333105: Shenandoah: Results of concurrent mark may be lost for degenerated cycle [v2]
Y. Srinivas Ramakrishna
ysr at openjdk.org
Thu May 30 18:28:07 UTC 2024
On Thu, 30 May 2024 00:06:23 GMT, William Kemper <wkemper at openjdk.org> wrote:
>> It also occurs to me that the pattern here is roughly of the following form:
>>
>>
>> do_some_uninterruptible_operation();
>> check cancellation status & return if cancelled, else continue to next uninterruptible operation;
>> ...
>>
>>
>> In that case, the checks should not be on the state of the progress of the collection, but on whether the cycle has been cancelled, and the asserts should then check the state of the collection; in other words, the dual of the structure of control in this method.
>>
>> That said, I do not understand if there's a deeper rationale here in that sometimes we will want to skip a next uninterruptible phase of the collection regardless, because the state machine of the collection explicitly transitions us, under appropriate conditions, into a different phase than the otherwise normal flow, independent of whether a cancellation happened.
>>
>> For example, what would be the difference between your current test:
>>
>>
>> if (concurrent_mark_in_progress()) {
>>
>>
>> vs, say:
>>
>>
>> if (!evacuation_in_progress()) {
>>
>>
>> vs :
>>
>>
>> if (check_cancellation()) {
>> ... return false;
>> }
>> assert(!concurrent_mark_in_progress(), "Should have completed");
>> assert(evacuation_in_progress(), "Next logical phase of collection");
>>
>>
>> Just wondering if a more uniform idiom might make for easier code patterns and easier, more uniform, reasoning about correctness.
>
> This case doesn't quite follow the `check_cancellation` pattern of the others because it is a little bit different. Here, the uninterruptible operation is the final mark safepoint. This safepoint will itself check if the mark has been cancelled. If final mark detects the cancellation, it will do _nothing_: the concurrent mark will still be in progress and we _must_ resume the degenerated cycle from the marking phase. However, if the final mark safepoint does _not_ detect a cancellation, it will initialize the evacuation phase. If the code simply calls `check_cancellation` after final mark, it cannot tell if the cancellation request was received just _before_ final mark or just _after_ final mark. These two conditions require different resumption points for the degenerated cycle, so we must check the state of the collection cycle here.
Thanks for the explanation. I do wonder whether there is a one-to-one correspondence between the point at which the cancellation is detected and the resumption point for the degenerated cycle. If that is true, then the resumption point could be identified with the state of the concurrent GC state machine (the degeneration work forms a coupled state machine in the sense of David Harel), which might avoid this more subtle and perhaps slightly more fragile programming idiom. But we can think about this in the fullness of time.
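
To make the before/after distinction concrete, here is a minimal toy model of it. This is only a sketch: `ToyCycle`, `DegenPoint`, `resume_point` and `simulate` are hypothetical names made up for illustration, not Shenandoah code, and the model assumes exactly one of marking or evacuation is active at any time.

```cpp
#include <cassert>

// Hypothetical resumption points for a degenerated cycle (toy names).
enum class DegenPoint { OutsideCycle, Mark, Evacuation };

// Toy stand-in for the collector state; not the real heap state.
struct ToyCycle {
  bool cancelled = false;
  bool mark_in_progress = true;   // concurrent mark has been running
  bool evac_in_progress = false;

  // Models the final mark safepoint: if the cancellation is already
  // visible it does nothing, otherwise it completes marking and
  // initializes the evacuation phase.
  void final_mark() {
    if (cancelled) {
      return;
    }
    mark_in_progress = false;
    evac_in_progress = true;
  }
};

// Derive the resumption point from the cycle state, not from the
// cancellation flag, which looks the same in both orderings.
DegenPoint resume_point(const ToyCycle& c) {
  assert(c.mark_in_progress != c.evac_in_progress);  // toy invariant
  return c.mark_in_progress ? DegenPoint::Mark : DegenPoint::Evacuation;
}

// Run final mark with a cancellation arriving before and/or after it.
DegenPoint simulate(bool cancel_before, bool cancel_after) {
  ToyCycle c;
  if (cancel_before) {
    c.cancelled = true;
  }
  c.final_mark();
  if (cancel_after) {
    c.cancelled = true;
  }
  if (!c.cancelled) {
    return DegenPoint::OutsideCycle;  // no degeneration needed
  }
  return resume_point(c);
}
```

In this model `simulate(true, false)` and `simulate(false, true)` both end with `cancelled == true`, yet they need different resumption points (marking vs. evacuation), which is why a bare cancellation check after final mark is not enough. It also hints at the correspondence I am wondering about: here the resumption point is a pure function of the state-machine state.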
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19434#discussion_r1621250979