RFR: 8353564: Fail fatally if os::release_memory or os::uncommit_memory fails [v5]
Thomas Stuefe
stuefe at openjdk.org
Thu Jan 29 08:02:46 UTC 2026
On Wed, 28 Jan 2026 21:24:33 GMT, Robert Toyonaga <duke at openjdk.org> wrote:
>> This PR is a follow up to JDK-8341491. See original discussion: https://github.com/openjdk/jdk/pull/24084#issuecomment-2752513700
>>
>> This PR makes `os::release_memory`, `os::uncommit_memory`, `os::release_memory_special`, and `os::unmap_memory` fail fatally if they encounter an error. These methods require obtaining the NMT lock. Fatally failing in these places would potentially allow for the tightening of NMT virtual memory locking scopes (future work, if this PR is accepted). Already in most cases, the callers fail fatally or assert(false) when these os:: methods fail. Another reason to fatally fail is that if the OS memory operation fails, it can be difficult to know for sure what state the OS left the memory in and recover.
>>
>> `release_memory`/`uncommit_memory`/`release_memory_special`/`unmap_memory` can fail due to ① Bad arguments, or ② The OS encountered an issue out of control of the JVM.
>>
>> ①
>> If there is a JVM bug, it's probably reasonable to fatally fail here. Or the caller could be intentionally passing arguments that may or may not be valid. I don't think there is any code like that currently, and this is probably a bad pattern to be following anyway.
>>
>> ②
>> In platform dependent code:
>> With regard to mmap/munmap, the only errors that aren't due to bad arguments are ENOMEM and ones related to file descriptors (which are not applicable to uncommit or release).
>> On Windows, VirtualFree only fails due to bad arguments.
>> On AIX, shmdt and disclaim64 only fail due to bad arguments. msync could spontaneously fail with EIO: "An I/O error occurred while reading from or writing to the file system."
>> On BSD, it seems like mprotect and madvise fail only due to bad arguments or invalid privileges.
>>
>> In the [original discussion](https://github.com/openjdk/jdk/pull/24084#issuecomment-2752513700), the main question was whether ENOMEM upon os::uncommit_memory was recoverable. This may be possible if we uncommit the middle of a region - splitting it in two. This could exceed the limit of the number of mappings resulting in ENOMEM.
>>
>> If none of the scenarios in ② are recoverable, then perhaps fatally failing is OK.
>>
>> Testing:
>> - Tier 1.
>> - Manual testing to make sure we fatally fail and the correct messages are printed.
>
> Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision:
>
> Don't pass false, rely on default argument. Punctuation. Whitespace.
Some small remarks remain, otherwise this is good.
src/hotspot/os/linux/os_linux.cpp line 3458:
> 3456: if (ep.saved_errno() == ENOMEM) {
> 3457: fatal("Failed to uncommit " RANGEFMT ". It is possible that the process's maximum number of mappings would have been exceeded. Try increasing the limit.", RANGEFMTARGS(addr, size));
> 3458: }
good.
src/hotspot/share/memory/memoryReserver.cpp line 236:
> 234:
> 235: if (reserved.special()) {
> 236: os::release_memory_special(reserved.base(), reserved.size());
Here is another possible simplification.
The point of `os::release_memory_special` was to handle system V shm backed large pages on Linux. We got rid of that mechanism though, so now it just calls os::release_memory.
I created https://bugs.openjdk.org/browse/JDK-8376650 to track that.
test/hotspot/gtest/runtime/test_os.cpp line 667:
> 665: char* p = setup_release_test_memory();
> 666: os::release_memory(p, M); // Release part of the range
> 667: }
This is a good rewrite of the tests. I did not even think of this.
Proposal: do this:
#ifdef ASSERT
#define TEST_RELEASE_RANGE_ERROR(name) TEST_VM_ASSERT_MSG(os, name, ".*bad release")
#else
#define TEST_RELEASE_RANGE_ERROR(name) TEST_VM_FATAL_ERROR_MSG(os, name, ".*Failed to release.*")
#endif
and use TEST_RELEASE_RANGE_ERROR as header, saves duplicating these lines five times.
test/hotspot/gtest/runtime/test_os.cpp line 745:
> 743:
> 744: if (p2 != nullptr) {
> 745: os::release_memory((char*)p2, stripe_len);
All these tests, if os::commit is broken, will crash now instead of giving us a test error in gtest. But I think that is acceptable. These errors are very improbable.
-------------
PR Review: https://git.openjdk.org/jdk/pull/29240#pullrequestreview-3721135695
PR Review Comment: https://git.openjdk.org/jdk/pull/29240#discussion_r2740336278
PR Review Comment: https://git.openjdk.org/jdk/pull/29240#discussion_r2740365722
PR Review Comment: https://git.openjdk.org/jdk/pull/29240#discussion_r2740394411
PR Review Comment: https://git.openjdk.org/jdk/pull/29240#discussion_r2740401554
More information about the hotspot-dev
mailing list