RFR: 8366223: ZGC: ZPageAllocator::cleanup_failed_commit_multi_partition is broken [v2]
Axel Boldt-Christmas
aboldtch at openjdk.org
Thu Aug 28 05:05:51 UTC 2025
On Wed, 27 Aug 2025 09:57:25 GMT, Axel Boldt-Christmas <aboldtch at openjdk.org> wrote:
>> While investigating [JDK-8366147](https://bugs.openjdk.org/browse/JDK-8366147) we also found that ZPageAllocator::cleanup_failed_commit_multi_partition is broken.
>>
>> The implementation is intended to go over each partition's part of the allocation one by one, returning any harvested or committed-and-mapped memory to the cache, and returning any physical associations that failed to be committed back to our internal free lists.
>>
>> But when deriving which part of the memory is associated with which partition, it uses the wrong variable and ends up working with the wrong memory. As a result, multiple partitions end up working on the same memory, even though their parts are supposed to be mutually exclusive.
>>
>> The fix is to use the correct `partial_vmem` rather than `vmem`, which holds the whole allocation.
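
To illustrate the class of bug described above, here is a minimal, self-contained sketch. It is not the HotSpot code: `Range`, `cleanup_ranges`, `overlaps`, and the `use_partial` flag are hypothetical stand-ins invented for this example, and only the names `vmem` and `partial_vmem` come from the description. It assumes each partition owns a consecutive slice of the whole allocation and cleans up the committed prefix of that slice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a virtual memory range (not ZVirtualMemory).
struct Range {
  std::size_t start;
  std::size_t size;

  std::size_t end() const { return start + size; }

  // Removes the first n bytes from this range and returns them.
  Range shrink_from_front(std::size_t n) {
    assert(n <= size);
    const Range front{start, n};
    start += n;
    size -= n;
    return front;
  }

  // Returns the first n bytes without modifying this range.
  Range first_part(std::size_t n) const {
    assert(n <= size);
    return Range{start, n};
  }
};

// Returns true if the two ranges share at least one byte.
bool overlaps(const Range& a, const Range& b) {
  return a.start < b.end() && b.start < a.end();
}

// Computes the range each partition would clean up.
// sizes[i] is the size of partition i's slice of the allocation and
// committed[i] is how much of that slice was committed.
// use_partial == false models the bug: the range is derived from the
// whole allocation (vmem) instead of the partition's slice (partial_vmem).
std::vector<Range> cleanup_ranges(const Range vmem,
                                  const std::vector<std::size_t>& sizes,
                                  const std::vector<std::size_t>& committed,
                                  bool use_partial) {
  std::vector<Range> result;
  Range remaining = vmem;
  for (std::size_t i = 0; i < sizes.size(); i++) {
    const Range partial_vmem = remaining.shrink_from_front(sizes[i]);
    const Range& source = use_partial ? partial_vmem : vmem;
    result.push_back(source.first_part(committed[i]));
  }
  return result;
}
```

With a whole allocation of `Range{0, 100}`, slice sizes `{60, 40}`, and committed sizes `{30, 20}`, the variant deriving from `vmem` hands both partitions a range starting at offset 0, so they overlap. The variant deriving from `partial_vmem` yields ranges starting at 0 and 60, which are disjoint.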
>>
>> The new test reproduces this error. The test is tightly coupled to the current ZPageAllocator implementation and its policies. We might want to enhance this test in the future to ensure that we are actually provoking commit failures with harvesting, and to get notified if this changes. Currently there is no non-intrusive way to do this. The best option might be our JFR events, which contain all the information.
>>
>> * Testing (In progress)
>> * Oracle supported platforms tier1 + ZGC tier1-8
>
> Axel Boldt-Christmas has updated the pull request incrementally with one additional commit since the last revision:
>
> Run in driver rather than othervm
Thanks for the reviews.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26953#issuecomment-3231893652