RFR: 8366223: ZGC: ZPageAllocator::cleanup_failed_commit_multi_partition is broken [v2]
Joel Sikström
jsikstro at openjdk.org
Wed Aug 27 13:58:43 UTC 2025
On Wed, 27 Aug 2025 09:57:25 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:
>> While investigating [JDK-8366147](https://bugs.openjdk.org/browse/JDK-8366147) we also found that ZPageAllocator::cleanup_failed_commit_multi_partition is broken.
>>
>> The implementation is intended to go over each partition's part of the allocation one by one, returning any harvested or committed-and-mapped memory to the cache and returning any physical associations that failed to be committed to our internal free lists.
>>
>> But when deriving which part of the memory is associated with which partition, it uses the wrong variable and ends up working with the wrong memory. As a result, multiple partitions end up working on the same, supposedly mutually exclusive, memory.
>>
>> The fix is to use the correct `partial_vmem` rather than `vmem`, which holds the whole allocation.
>>
>> The new test reproduces this error. The test is tightly coupled to the current ZPageAllocator implementation and its policies. We might want to enhance this test in the future to ensure that we are actually provoking commit failures with harvesting, and that we get notified if this changes. Currently there is no non-intrusive way to do this. The best option might be our JFR events, which contain all the information.
>>
>> * Testing (In progress)
>> * Oracle supported platforms tier1 + ZGC tier1-8
>
> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
>
> Run in driver rather than othervm
Good find and thank you for this! The test is definitely a nice addition.
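For anyone following along, the class of bug described in the quoted text can be illustrated with a minimal, self-contained sketch. This is not the HotSpot code: `Range`, `per_partition_ranges`, `any_overlap` and the `use_partial` flag are hypothetical names made up for the illustration, and only `vmem` and `partial_vmem` are taken from the description above. It assumes each partition owns a consecutive slice of one contiguous allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a contiguous virtual memory range (not the HotSpot type).
struct Range {
  std::size_t start;
  std::size_t size;
  std::size_t end() const { return start + size; }
};

// Walks the partitions in order and records the range each one would clean up.
// `sizes[i]` is the assumed size of partition i's share of `vmem`.
// With use_partial == false this mimics the bug: every partition is handed
// the whole allocation instead of its own slice.
std::vector<Range> per_partition_ranges(Range vmem,
                                        const std::vector<std::size_t>& sizes,
                                        bool use_partial) {
  std::vector<Range> result;
  std::size_t offset = 0;
  for (std::size_t size : sizes) {
    // The shares must fit inside the whole allocation.
    assert(offset + size <= vmem.size);
    // The slice that actually belongs to this partition.
    const Range partial_vmem{vmem.start + offset, size};
    result.push_back(use_partial ? partial_vmem : vmem);
    offset += size;
  }
  return result;
}

// Returns true if any two ranges share at least one address.
bool any_overlap(const std::vector<Range>& ranges) {
  for (std::size_t i = 0; i < ranges.size(); i++) {
    for (std::size_t j = i + 1; j < ranges.size(); j++) {
      if (ranges[i].start < ranges[j].end() && ranges[j].start < ranges[i].end()) {
        return true;
      }
    }
  }
  return false;
}
```

Calling `per_partition_ranges` with `use_partial` set to `false` and more than one non-empty share yields ranges for which `any_overlap` returns `true`, which is the "same supposedly mutually exclusive memory" symptom. With `use_partial` set to `true` the ranges are disjoint.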
-------------
Marked as reviewed by jsikstro (Committer).
PR Review: https://git.openjdk.org/jdk/pull/26953#pullrequestreview-3159917044
More information about the hotspot-gc-dev mailing list