[crac] RFR: Wait until G1 GC has finished before creating a snapshot. [v6]

Thu Aug 3 12:28:57 UTC 2023

On Thu, 3 Aug 2023 12:03:44 GMT, Anton Kozlov <akozlov at openjdk.org> wrote:

>> Jan Kratochvil has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Make -active volatile.
>>    - bugreported by Sergey Nazarkin
>
> src/hotspot/share/gc/g1/g1UncommitRegionTask.cpp line 133:
> 
>> 131:     // Nothing more to do, change state and report a summary.
>> 132:     set_active(false);
>> 133:     _active_sem.signal();
> 
> AFAICS the signal() increments the semaphore count. So the count is the number of executions of the uncommit task. That's, another wait() may succeed even when the opeartion is in progress (by decrementing the value accumulated during previous exections).
> 
> And I share concerns about access to the _active. For safe access there should be a critical section with acquire and release. 
> 
> Do we really need to synchronize with the G1UncommitRegionTask? Maybe, can we call `G1CollectedHeap::uncommit_regions` directly (that is being called by the UncommitTask eventually)?

If there are multiple `signal()` calls stored then it will just run multiple times with no effect in the `while` cycle: https://github.com/openjdk/crac/pull/93/files#diff-b7c9d7a6a2eb233cfe201f2b07c4623fda47f1ac06174b66a95b1e46d9ec6450R143
I do not think `_active` is needed to be atomic there, that would be also already fixing existing OpenJDK code. But I will put it there, I agree it cannot harm. atomic is definitely cheaper than a mutex acquire+release protecting a single variable with the same safety so no mutex is needed.
You suggest during CRaC snapshot the whole VM is already frozen so we do not need any synchronization? OK, if it is so I can just call `uncommit_regions`.

-------------

PR Review Comment: https://git.openjdk.org/crac/pull/93#discussion_r1283129815