RFR: JDK-8293114: JVM should trim the native heap [v8]

Thomas Stuefe stuefe at openjdk.org
Mon Jul 10 13:53:36 UTC 2023


> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085.
>  
> ---------------
> 
> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process.
> 
> ### Background:
> 
> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as a permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS.
> 
> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases.
> 
> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd <pid> System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound JVMs. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim.
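
A minimal sketch of the manual trim described above, assuming Linux with glibc. The helper names `malloc_spike` and `trim_c_heap` are hypothetical and exist only for illustration; the only glibc-specific call is `malloc_trim(3)`. This is not the code from the pull request.

```c
#include <stdlib.h>
#include <string.h>
#ifdef __GLIBC__
#include <malloc.h>   /* malloc_trim() is a glibc extension */
#endif

/* Hypothetical helper: allocate `count` blocks of `size` bytes, touch them so
 * the pages become resident, then free everything again (a "malloc spike").
 * Returns the number of blocks that were successfully allocated. */
size_t malloc_spike(size_t count, size_t size) {
    void **blocks = calloc(count, sizeof(void *));
    size_t allocated = 0;
    if (blocks == NULL) {
        return 0;
    }
    for (size_t i = 0; i < count; i++) {
        blocks[i] = malloc(size);
        if (blocks[i] != NULL) {
            memset(blocks[i], 0xAB, size);
            allocated++;
        }
    }
    for (size_t i = 0; i < count; i++) {
        free(blocks[i]);   /* free(NULL) is a no-op */
    }
    free(blocks);
    return allocated;
}

/* Hypothetical wrapper: ask glibc to return free C-heap memory to the OS.
 * Returns 1 if memory was released, 0 if nothing could be released,
 * and -1 if the libc is not glibc (assumption: no trim available there). */
int trim_c_heap(void) {
#ifdef __GLIBC__
    return malloc_trim(0);   /* pad = 0: keep no extra free space at the top */
#else
    return -1;
#endif
}
```

Calling `malloc_spike(100000, 1024)` followed by `trim_c_heap()` and comparing the process RSS before and after the trim is one way to observe the retention effect. How much memory comes back depends on the glibc version and its tunables.
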
> 
> #### GLIBC internals
> 
> I took the following information from the glibc source code and from my own experiments.
> 
> ##### Why do we need to trim manually? Does the Glibc not trim on free?
> 
> Upon `free()`, glibc may return memory to the OS if:
> - the returned block was mmap'ed
> - the returned block was not added to tcache or to fastbins
> - the returned block, possibly merged with its two immediate neighbors if they were free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K). In that case:
>   a) for the main arena, glibc attempts to lower the brk()
>   b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap.
> 
> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" of free memory between in-use chunks are not reclaimed.
> 
> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely.
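
The conditions listed above can be read as a small decision procedure. The sketch below is a simplified model of that reading, not glibc code: all type and function names are hypothetical, the order of the checks is an assumption, and the 64K constant is the value quoted in the text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Threshold as quoted in the text (FASTBIN_CONSOLIDATION_THRESHOLD, 64K). */
#define CONSOLIDATION_THRESHOLD ((size_t)64 * 1024)

/* Hypothetical outcome categories for a single free() call. */
typedef enum {
    FREE_KEEPS_MEMORY,   /* block stays cached inside the C-heap */
    FREE_UNMAPS_BLOCK,   /* mmap'ed block is handed back to the OS */
    FREE_MAY_TRIM_TOP    /* glibc attempts to shrink the top of the arena/heap */
} free_outcome;

/* Hypothetical description of the block being freed. */
typedef struct {
    bool   is_mmapped;                /* block was allocated via mmap */
    bool   goes_to_tcache_or_fastbin; /* block ends up in tcache or a fastbin */
    size_t merged_size;               /* size after merging with free neighbors */
} freed_block;

/* Model of the list above: which outcome is possible for this block? */
free_outcome classify_free(freed_block b) {
    if (b.is_mmapped) {
        return FREE_UNMAPS_BLOCK;
    }
    if (b.goes_to_tcache_or_fastbin) {
        return FREE_KEEPS_MEMORY;
    }
    if (b.merged_size > CONSOLIDATION_THRESHOLD) {
        /* Only the top portion can be reclaimed; holes stay in the heap. */
        return FREE_MAY_TRIM_TOP;
    }
    return FREE_KEEPS_MEMORY;
}
```

With typical small allocations, most blocks fall into the `FREE_KEEPS_MEMORY` branch of this model, which matches the observation that automatic reclamation is unlikely.
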
> 
> To increase the chance of auto-reclamation happening, one can do one or more t...

Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 32 additional commits since the last revision:

 - Make test spikes more pronounced
 - Dont query procfs if logging is off
 - rename logtag again
 - When probing for safepoint end, use the smaller of (interval, 250ms)
 - Remove TrimNativeHeap and expand TrimNativeHeapInterval
 - Improve comments for non-supportive platforms
 - Aleksey cosmetics
 - suspend count return 16 bits
 - Fix linker errors
 - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap
 - ... and 22 more: https://git.openjdk.org/jdk/compare/a892b88f...15566761

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/14781/files
  - new: https://git.openjdk.org/jdk/pull/14781/files/aa4dbc0b..15566761

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=06-07

  Stats: 2080 lines in 82 files changed: 1021 ins; 878 del; 181 mod
  Patch: https://git.openjdk.org/jdk/pull/14781.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781

PR: https://git.openjdk.org/jdk/pull/14781


More information about the serviceability-dev mailing list