RFR: 8341491: Reserve and commit memory operations should be protected by NMT lock
Robert Toyonaga
duke at openjdk.org
Wed Mar 26 14:19:17 UTC 2025
On Tue, 25 Mar 2025 18:55:56 GMT, Stefan Karlsson <stefank at openjdk.org> wrote:
> Are there any code that we know of that doesn't fit into a synchronization pattern similar to the above?
> I can think of some contrived example where Thread B asks the OS for memory mappings and uses that to ascertain that a pre-determined address has been reserved, and how that could lead to an incorrect booking as you described, but do we really have code like that?
>From what I can tell, it doesn't look like that's happening anywhere, someone else may know better though. Similarly, for uncommit, the base address must be passed over from somewhere else in the JVM so relying on some external synchonization seems reasonable here too. If this problem scenario is not present in the current code and it's not expected it to become a possiblity in the future, then I suppose there's no reason to guard against it. Maybe just a comment explaining the reasoning is good enough (and a warning not to use such patterns).
-----------
> When does a release/uncommit fail? Would that be a JVM bug?
On Windows, VirtualFree also looks like it only fails if an invalid set of arguments are passed. So if os::pd_release fails it's probably a JVM bug. Uncommit uses mmap, which could fail for a larger variety of reasons. Some reasons are out of control of the JVM. For example: "The number of mapped regions would exceed an implementation-defined limit (per process or per system)." See [here](https://github.com/openjdk/jdk/blob/jdk-25%2B15/src/hotspot/share/memory/metaspace/virtualSpaceNode.cpp#L191)
> What state is the memory in when such a failure happens? Do we even know if the memory is still committed if an uncommit fails?
If release/uncommit fails, then it would be hard to know what state the target memory is in. If the arguments are invalid (bad base address), the target region may not even be allocated. Or, in the case of uncommit, if the base address is not aligned, maybe the target committed region does indeed exist but the uncommit still fails. So it would be hard to determine how to readjust the NMT accounting afterward.
> I don't understand why we don't treat that as a fatal error OR make sure that all call-sites handles that error, which they don't do today.
I think release/uncommit failures should be handled by the callers. Currently, uncommit failure is handled by most places in the caller, release failure seems mostly not. If we expect that release/uncommit could sometimes fail for valid reasons, then we cannot fail fatally in the os:: functions. Since, at least for uncommit, we could reasonably fail without it being a JVM bug, I think we shouldn't fatally crash when that happens.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24084#issuecomment-2754589497
More information about the hotspot-dev
mailing list