RFR: 8343460: ZGC: Crash in ZRemembered::scan_page_and_clear_remset [v2]

Stefan Karlsson stefank at openjdk.org
Tue Nov 12 13:33:14 UTC 2024


On Tue, 12 Nov 2024 12:57:27 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:

>> `free_page` may concurrently delete the remset while `scan_page_and_clear_remset` is scanning the page. Move the deletion to after the `_safe_recycle.register_and_clone_if_activated`. The deletion will not occur on the newly cloned page, as it is not old. And the registered page's remset will be deleted by the destructor when the `_safe_recycle` scope quest up the `safe_destroy`.
>> 
>> To be able to push the deletion all the way into `prepare_to_recycle` the unnecessary use of this mechanism had to be removed. `free_pages_alloc_failed` does not need to protect the pages, as they are not yet present in the PageTable. We have simply taken them out of the cache, but failed to commit or map some memory, so we are putting these pages back into the cache. See bed9c260bbc9bd208b03d7eedd4e2cfa151b58f2
>> 
>> The fix works without this last commit. So we must be careful to check that these pages cannot be reached by some other means. The FoundOld bitmap iteration goes through the PageTable so even if an old page was registered, we would not find these pages.
>> 
>> There is a scary lack of a fence between the removal of the page from the PageTable and the lock in `register_and_clone_if_activated`.
>> 
>> The stress test deterministically crashes with this modified code 0756e0056b44ee16bee81256f556c8df981ceaf9 when using these options `-XX:+UseZGC -XX:+UseNewCode -XX:ZCollectionIntervalMinor=0.1 -XX:ZCollectionIntervalMajor=1 -XX:ZFragmentationLimit=0 -XX:-CreateCoredumpOnCrash`, and no longer does with this patch.
>
> Axel Boldt-Christmas has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Add comment about prepare_to_recycle
>  - Revert recycle_page call, still update last_used

Marked as reviewed by stefank (Reviewer).
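
For readers following the ordering argument in the quoted description, here is a minimal standalone sketch of the idea. It is not the HotSpot code: `Page`, `RemSet`, and `SafeRecycle` are hypothetical stand-ins, and only the names `prepare_to_recycle` and `register_and_clone_if_activated` are borrowed from the description, with assumed signatures. It shows the remset deletion happening only after registration, so a page that a scanner may still hold keeps its remset until the deferred-destroy scope ends.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; not the ZGC types.
struct RemSet { std::vector<int> entries{1, 2, 3}; };

struct Page {
  bool old;
  std::unique_ptr<RemSet> remset;
  explicit Page(bool is_old) : old(is_old) {
    if (is_old) remset = std::make_unique<RemSet>();
  }
};

// While active, registered pages are kept alive and a fresh clone is returned.
class SafeRecycle {
  std::mutex _lock;
  bool _active = false;
  std::vector<std::unique_ptr<Page>> _deferred;

public:
  void activate() { std::lock_guard<std::mutex> g(_lock); _active = true; }

  // Ending the scope destroys the registered pages (and their remsets).
  void deactivate() {
    std::lock_guard<std::mutex> g(_lock);
    _active = false;
    _deferred.clear();
  }

  std::unique_ptr<Page> register_and_clone_if_activated(std::unique_ptr<Page> page) {
    std::lock_guard<std::mutex> g(_lock);
    if (!_active) return page;                    // nobody can be scanning
    auto clone = std::make_unique<Page>(false);   // the clone is not old
    _deferred.push_back(std::move(page));         // original stays intact
    return clone;
  }

  std::size_t deferred_count() {
    std::lock_guard<std::mutex> g(_lock);
    return _deferred.size();
  }
};

// Register first, delete the remset afterwards. On a clone the deletion is
// skipped because the clone is not old.
std::unique_ptr<Page> prepare_to_recycle(SafeRecycle& recycle, std::unique_ptr<Page> page) {
  auto result = recycle.register_and_clone_if_activated(std::move(page));
  if (result->old) {
    result->remset.reset();
    result->old = false;
  }
  return result;
}

// Scope active: the original page keeps its remset until deactivate().
bool demo_remset_survives_while_active() {
  SafeRecycle recycle;
  recycle.activate();
  auto page = std::make_unique<Page>(true);
  Page* scanned = page.get();  // pointer a concurrent scanner might hold
  auto recycled = prepare_to_recycle(recycle, std::move(page));
  bool ok = scanned->remset != nullptr && recycled.get() != scanned &&
            recycled->remset == nullptr && recycle.deferred_count() == 1;
  recycle.deactivate();        // `scanned` must not be used after this
  return ok && recycle.deferred_count() == 0;
}

// Scope inactive: the same page comes back and its remset is deleted directly.
bool demo_inactive_deletes_directly() {
  SafeRecycle recycle;
  auto page = std::make_unique<Page>(true);
  Page* original = page.get();
  auto recycled = prepare_to_recycle(recycle, std::move(page));
  return recycled.get() == original && recycled->remset == nullptr && !recycled->old;
}
```

The sketch is single-threaded on purpose. It only demonstrates the ordering, not the missing fence mentioned in the description, which it does not attempt to model.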

-------------

PR Review: https://git.openjdk.org/jdk/pull/21905#pullrequestreview-2429602801


More information about the hotspot-gc-dev mailing list