RFR: JDK-8268893: jcmd to trim the glibc heap

Severin Gehwolf sgehwolf at openjdk.java.net
Thu Jun 17 08:34:11 UTC 2021


On Wed, 16 Jun 2021 12:57:44 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

> The glibc is somewhat notorious for retaining released C Heap memory: calling free(3) returns memory to the glibc, and most libc variants will return at least a portion of it back to the Operating System, but the glibc often does not.
> 
> This depends on the granularity of the allocations and a number of other factors, but we found that many small allocations in particular may cause the process heap segment (hence RSS) to get bloaty. This can cause the VM to not recover from C-heap usage spikes.
> 
> The glibc offers an API, "malloc_trim", which can be used to cause the glibc to return free'd memory back to the Operating System.
> 
> This may cost performance, however, and therefore I hesitate to call malloc_trim automatically. That may be an idea for another day.
> 
> Instead of an automatic trim I propose to add a jcmd which allows manually triggering a libc heap trim. Such a command would have two purposes:
> - when analyzing cases of high memory footprint, it allows distinguishing "real" footprint, e.g. leaks, from cases where the glibc just holds on to memory
> - as a stop-gap measure, it allows relieving pressure in a high-footprint scenario.
> 
> Note that this command also helps with analyzing libc peaks which had nothing to do with the VM - e.g. peaks created by customer code which just happens to share the same process as the VM. Such memory does not even have to show up in NMT.
> 
> I propose to introduce this command for Linux only. Other OSes (apart from maybe AIX) do not seem to have this problem, but Linux is arguably important enough in itself to justify a Linux-specific jcmd.
> 
> If this finds agreement, I will file a CSR.
> 
> =========
> 
> This patch:
> 
> - introduces a new jcmd, "VM.trim_libc_heap", no arguments, which trims the glibc heap on glibc platforms.
> - includes a (rather basic) test
> - the command calls malloc_trim(3), and additionally prints out its effect (changes caused in virt size, rss and swap space)
> - I refactored some code in os_linux.cpp to factor out scanning /proc/self/status to get kernel memory information.
> 
> =========
> 
> Example:
> 
> A program causes a temporary peak in C-heap usage (in this case, triggered via Unsafe.allocateMemory), then frees the memory again right away, so it is not leaky. The peak in RSS was ~8G (even though the user allocation was way smaller - glibc has a lot of overhead). The effects of this peak linger even after returning that memory to the glibc:
> 
> 
> 
> thomas@starfish:~$ jjjcmd AllocCHeap VM.info | grep Resident
> Resident Set Size: 8685896K (peak: 8685896K) (anon: 8648680K, file: 37216K, shmem: 0K)
>                    ^^^^^^^^
> 
> 
> We execute the new trim command via jcmd:
> 
> 
> thomas@starfish:~$ jjjcmd AllocCHeap VM.trim_libc_heap
> 18770:
> Attempting trim...
> Done.
> Virtual size before: 28849744k, after: 28849724k, (-20k)
> RSS before: 8685896k, after: 920740k, (-7765156k)  <<<<
> Swap before: 0k, after: 0k, (0k)
> 
> 
> It prints out the reduction in virtual size, RSS and swap. The virtual size barely decreased (-20k) since no mappings had been unmapped by the glibc. However, the process heap was shrunk heavily by the glibc, resulting in a large drop in RSS (8.5G->900M), freeing >7G of memory:
> 
> 
> thomas@starfish:~$ jjjcmd AllocCHeap VM.info | grep Resident
> Resident Set Size: 920740K (peak: 8686004K) (anon: 883460K, file: 37280K, shmem: 0K)
>                    ^^^^^^^
> 
> 
> When the VM is started with -Xlog:os, this is also logged:
> 
> 
> [139,068s][info][os] malloc_trim:
> [139,068s][info][os] Virtual size before: 28849744k, after: 28849724k, (-20k)
> RSS before: 8685896k, after: 920740k, (-7765156k)
> Swap before: 0k, after: 0k, (0k)
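
For readers who want to see the mechanism outside the VM, the before/after report quoted above can be approximated with a few lines of C. This is only a minimal sketch under stated assumptions, not the code in the patch: the helper names read_status_kb and trim_and_report are hypothetical, and the sketch assumes Linux with glibc, where malloc_trim(3) is declared in <malloc.h> and /proc/self/status exposes VmSize, VmRSS and VmSwap in kB.

```c
#include <malloc.h>   /* malloc_trim (glibc-specific) */
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the value in kB of a "Key:   1234 kB" line
 * in /proc/self/status, or -1 if the key is missing or the file is unreadable. */
static long read_status_kb(const char *key)
{
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL) {
        return -1;
    }

    char line[256];
    size_t keylen = strlen(key);
    long value = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        /* Match "Key:" exactly so that "VmRSS" does not match a longer key. */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            if (sscanf(line + keylen + 1, "%ld", &value) != 1) {
                value = -1;
            }
            break;
        }
    }
    fclose(f);
    return value;
}

/* Hypothetical helper: trim the glibc heap and print before/after figures.
 * Returns the result of malloc_trim: 1 if memory was released, 0 otherwise. */
static int trim_and_report(FILE *out)
{
    long vsize_before = read_status_kb("VmSize");
    long rss_before   = read_status_kb("VmRSS");
    long swap_before  = read_status_kb("VmSwap");

    /* pad = 0: ask glibc to keep no extra free space at the top of the heap. */
    int released = malloc_trim(0);

    long vsize_after = read_status_kb("VmSize");
    long rss_after   = read_status_kb("VmRSS");
    long swap_after  = read_status_kb("VmSwap");

    fprintf(out, "Virtual size before: %ldk, after: %ldk, (%ldk)\n",
            vsize_before, vsize_after, vsize_after - vsize_before);
    fprintf(out, "RSS before: %ldk, after: %ldk, (%ldk)\n",
            rss_before, rss_after, rss_after - rss_before);
    fprintf(out, "Swap before: %ldk, after: %ldk, (%ldk)\n",
            swap_before, swap_after, swap_after - swap_before);
    return released;
}
```

Calling trim_and_report(stdout) after a burst of small malloc/free calls should show the effect on RSS, although the size of the drop depends on the allocation pattern and the glibc version. The same hypothetical read_status_kb helper could also read RssAnon, RssFile and RssShmem on kernels that report them (Linux 4.5 and later), which correspond to the anon/file/shmem breakdown shown in the VM.info output.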

src/hotspot/os/linux/os_linux.hpp line 186:

> 184:     ssize_t rssanon;    // resident set size
> 185:     ssize_t rssfile;    // resident set size
> 186:     ssize_t rssshmem;   // resident set size

Are these comments intentionally the same for all three? Seems weird.

-------------

PR: https://git.openjdk.java.net/jdk/pull/4510

