RFR: 8330144: Revise os::free_memory()
Thomas Stuefe
stuefe at openjdk.org
Tue Jul 9 18:29:17 UTC 2024
On Mon, 8 Jul 2024 17:33:41 GMT, Robert Toyonaga <duke at openjdk.org> wrote:
> ### Summary
> On linux, change `os::free_memory(char *addr, size_t bytes, size_t alignment_hint)` so that it uses `madvise(MADV_DONTNEED)` (similar to the BSD implementation) instead of recommitting over the existing committed memory to discard the existing pages. This function should free the underlying memory without uncommitting. The benefit of this change is that we can get rid of conditional logic dependent on whether we're dealing with huge pages, `madvise` can't fail, and we can also get rid of the "alignment_hint" parameter.
>
> `os::free_memory(char *addr, size_t bytes, size_t alignment_hint)` has also been renamed to `os::free_memory_without_uncommit(char *addr, size_t bytes)` to differentiate it from `os::free_memory()` which reports the size of free memory instead of actually releasing memory.
>
> **Transparent huge pages:**
> `madvise(MADV_DONTNEED)` works with THP. As with small pages, `madvise(MADV_DONTNEED)` results in the memory being freed, RSS decreasing, and the addresses can be re-touched without being explicitly recommitted.
>
> To determine this, I set /sys/kernel/mm/transparent_hugepage/enabled to "always" and allocated a large amount of memory. Then /proc/PID/smaps shows that THP are being used to back that memory. After calling `free_memory_without_uncommit`, RSS decreases, indicating the memory is no longer live. The `os::committed_in_range` function also reports that the memory has been freed (this function should probably be renamed to `live_in_range`). Touching the addresses again afterward is fine as well.
>
> **Explicit huge pages:**
> `madvise(MADV_DONTNEED)` does not result in memory being freed when used on explicit huge pages. However, the pages are not lost either. Additionally, after `madvise(MADV_DONTNEED)`, we can retouch the addresses without any problems. In conclusion, `madvise(MADV_DONTNEED)` has no effect on explicit huge pages. This means the behavior of this function with respect to explicit huge pages remains the same. We can remove the "alignment_hint" parameter.
>
> To determine this, I allocated some huge pages via /proc/sys/vm/nr_hugepages. Successful allocation was confirmed with /proc/meminfo. After calling `free_memory_without_uncommit`, /proc/meminfo shows no change in the number of huge pages in use. Explicit huge pages are not reflected in RSS, so I used the `os::committed_in_range` function instead. After calling `free_memory_without_uncommit`, the `os::committed_in_range` function reports that the memory is still live. Unfortu...
Great, thanks @roberttoyonaga. The main work was the analysis work beforehand.
About naming, I would name the thing "os::disclaim_memory". free_without_uncommit is a mouthful. There is a precedent in the "disclaim" API on AIX, which in a future RFE may be used to implement os::disclaim_memory.
test/hotspot/gtest/runtime/test_os.cpp line 988:
> 986: const size_t size = pages * page_sz;
> 987:
> 988: char *base = os::reserve_memory(size, false, mtTest);
I prefer the `char* base` syntax (star at the type), and it's much more common in hotspot.
test/hotspot/gtest/runtime/test_os.cpp line 1002:
> 1000: size_t committed_size;
> 1001: address committed_start;
> 1002: ASSERT_FALSE(os::committed_in_range((address) base, size, committed_start, committed_size));
Is there a chance of this generating false positives? Do we know whether the madvise effect is immediate or delayed?
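One way to probe this outside of HotSpot is a small standalone Linux program. The sketch below is not HotSpot code and does not use `os::committed_in_range`; the helper names `resident_pages` and `disclaim_demo` are made up for illustration. It assumes a private anonymous small-page mapping, for which the Linux madvise(2) man page describes subsequent accesses after `MADV_DONTNEED` as zero-fill-on-demand. It touches a range, calls `madvise(MADV_DONTNEED)`, asks `mincore` for residency right afterward, and then re-reads the pages.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: counts pages in [base, base + size) that mincore(2)
// reports as resident. Returns -1 if mincore fails.
static long resident_pages(char* base, size_t size, size_t page_sz) {
  std::vector<unsigned char> vec((size + page_sz - 1) / page_sz);
  if (mincore(base, size, vec.data()) != 0) {
    return -1;
  }
  long n = 0;
  for (unsigned char v : vec) {
    n += (v & 1);  // only the least significant bit carries residency
  }
  return n;
}

struct DisclaimResult {
  long resident_before;   // resident pages after touching the range
  long resident_after;    // resident pages right after MADV_DONTNEED
  bool reads_zero_after;  // every page reads back as zero when re-touched
  bool ok;                // mmap and madvise both succeeded
};

// Hypothetical demo: map, touch, disclaim, then observe the range.
static DisclaimResult disclaim_demo(size_t pages) {
  assert(pages > 0);
  DisclaimResult r = { -1, -1, false, false };
  const size_t page_sz = (size_t) sysconf(_SC_PAGESIZE);
  const size_t size = pages * page_sz;

  void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED) {
    return r;
  }
  char* base = (char*) p;

  memset(base, 0xAB, size);  // touch every page so it becomes live
  r.resident_before = resident_pages(base, size, page_sz);

  if (madvise(base, size, MADV_DONTNEED) == 0) {
    // Query residency immediately, with no delay after madvise returns.
    r.resident_after = resident_pages(base, size, page_sz);

    // Re-touch without recommitting; old contents should be gone.
    r.reads_zero_after = true;
    for (size_t i = 0; i < size; i += page_sz) {
      if (base[i] != 0) {
        r.reads_zero_after = false;
      }
    }
    r.ok = true;
  }

  munmap(base, size);
  return r;
}
```

Calling `disclaim_demo(4)` and comparing `resident_before` with `resident_after` shows whether the drop is visible as soon as `madvise` returns on a given kernel. This only covers small anonymous pages; THP and explicit huge pages would need separate runs.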
-------------
Changes requested by stuefe (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/20080#pullrequestreview-2167064443
PR Review Comment: https://git.openjdk.org/jdk/pull/20080#discussion_r1670980361
PR Review Comment: https://git.openjdk.org/jdk/pull/20080#discussion_r1670985051