RFR: JDK-8293114: JVM should trim the native heap [v9]
Aleksey Shipilev
shade at openjdk.org
Thu Jul 13 18:57:05 UTC 2023
On Thu, 13 Jul 2023 18:08:03 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085.
>>
>> ---------------
>>
>> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process.
>>
>> ### Background:
>>
>> Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as a permanent RSS increase. Note that C-heap retention is difficult to observe: since it is freed memory, it won't appear in NMT; it is just a part of RSS.
>>
>> This is, effectively, caching - a performance tradeoff made by glibc. It makes a lot of sense for applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases.
>>
>> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd <pid> System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound JVMs. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim.
>>
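As a rough illustration of the `malloc_trim(3)` call mentioned above, here is a minimal Python sketch that invokes it through `ctypes`. The helper name `trim_native_heap` is hypothetical, and the sketch trims the C-heap of the Python process itself; it only shows the shape of the call, not how the JVM integrates it.

```python
import ctypes
from typing import Optional


def trim_native_heap(pad: int = 0) -> Optional[bool]:
    """Ask glibc to hand free C-heap memory back to the OS.

    Returns True if memory was released, False if nothing was released,
    and None if malloc_trim is not available (e.g. a non-glibc platform).
    """
    try:
        # CDLL(None) exposes the symbols already loaded into this process.
        libc = ctypes.CDLL(None)
        malloc_trim = libc.malloc_trim
    except (OSError, AttributeError, TypeError):
        return None
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    # malloc_trim(3) returns 1 if memory was released, 0 otherwise.
    return malloc_trim(pad) == 1


if __name__ == "__main__":
    print("trim result:", trim_native_heap())
```

The `pad` argument is the amount of free space to leave untrimmed at the top of the heap; `0` asks for the maximum amount to be released.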
>> #### GLIBC internals
>>
>> I took the following information from the glibc source code and from experimenting.
>>
>> ##### Why do we need to trim manually? Does the Glibc not trim on free?
>>
>> Upon `free()`, glibc may return memory to the OS if:
>> - the returned block was mmap'ed
>> - the returned block was not added to tcache or to fastbins
>> - the returned block, possibly merged with its two immediate neighbors if they are free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K). In that case:
>>   a) for the main arena, glibc attempts to lower the brk()
>>   b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap.
>>
>> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" between in-use chunks are not reclaimed.
>>
>> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely.
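
The conditions listed above can be read as a small decision function. The following Python sketch is a simplified model of that description, not glibc code: the `FreedBlock` type, its fields, and `may_return_memory` are hypothetical names, and `free_space_at_top` is an assumption standing in for "the top portion of the heap can be reclaimed".

```python
from dataclasses import dataclass

# 64K, the value quoted in the description above.
FASTBIN_CONSOLIDATION_THRESHOLD = 64 * 1024


@dataclass
class FreedBlock:
    """Hypothetical summary of a block handed to free()."""
    was_mmapped: bool         # block was allocated directly with mmap
    cached: bool              # block was added to tcache or fastbins
    merged_size: int          # size in bytes after merging with free neighbors
    free_space_at_top: bool   # assumption: top of the heap is free to reclaim


def may_return_memory(block: FreedBlock) -> bool:
    """Return True if, under this model, free() may give memory to the OS."""
    if block.was_mmapped:
        return True   # mmap'ed blocks can be unmapped directly
    if block.cached:
        return False  # tcache/fastbin entries stay cached
    if block.merged_size <= FASTBIN_CONSOLIDATION_THRESHOLD:
        return False  # too small to trigger a trim attempt
    # Only the top portion of the heap is reclaimed; holes are not.
    return block.free_space_at_top
```

Under this model, a 4K block freed in the middle of a busy heap never leads to reclamation, which matches the point that automatic trimming is unlikely at typical C-heap allocation granularity.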
>>
>> To increase the ...
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 35 additional commits since the last revision:
>
> - Fix windows build
> - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap
> - Alekseys patch
> - Make test spikes more pronounced
> - Dont query procfs if logging is off
> - rename logtag again
> - When probing for safepoint end, use the smaller of (interval, 250ms)
> - Remove TrimNativeHeap and expand TrimNativeHeapInterval
> - Improve comments for non-supportive platforms
> - Aleksey cosmetics
> - ... and 25 more: https://git.openjdk.org/jdk/compare/d177dba7...e821d518
Want to replace "Native heap trimmer" with "Periodic native heap trimmer" too? It would then be clear that we are suspending only the periodic one; the DCmd command would still be accepted and acted upon. Thinking about it, maybe we should do a follow-up PR and just forward the DCmd request to this thread? If so, we don't need to rename it to "Periodic".
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1634733370