RFR: JDK-8268893: jcmd to trim the glibc heap [v3]
Thomas Stuefe
stuefe at openjdk.java.net
Fri Jul 9 04:57:35 UTC 2021
On Fri, 25 Jun 2021 06:22:37 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> Proposal to add a Linux+glibc-only jcmd to manually induce malloc_trim(3) in the VM process.
>>
>>
>> The glibc is somewhat notorious for retaining released C Heap memory: calling free(3) returns memory to the glibc, and most libc variants will return at least a portion of it back to the Operating System, but the glibc often does not.
>>
>> This depends on the granularity of the allocations and a number of other factors, but we found that many small allocations in particular may cause the process heap segment (and hence RSS) to become bloated. This can prevent the VM from recovering from C-heap usage spikes.
>>
>> The glibc offers an API, "malloc_trim", which can be used to cause the glibc to return free'd memory back to the Operating System.
>>
>> This may cost performance, however, and therefore I hesitate to call malloc_trim automatically. That may be an idea for another day.
>>
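For readers unfamiliar with the call, here is a minimal sketch of what invoking malloc_trim(3) looks like from C. It is not the code from the patch; the helper name trim_c_heap is hypothetical, and the -1 "unsupported" return value is an assumption made for this illustration only.

```c
#include <stdlib.h>   /* any glibc header pulls in <features.h>, which defines __GLIBC__ */
#if defined(__GLIBC__)
#include <malloc.h>   /* malloc_trim is a glibc extension, not standard C */
#endif

/* Hypothetical helper, not part of the patch.
 * Returns 1 if glibc released memory to the OS, 0 if it did not,
 * and -1 (an assumption of this sketch) on non-glibc platforms. */
int trim_c_heap(void) {
#if defined(__GLIBC__)
    /* pad = 0: ask glibc to keep no extra free space at the top of the heap */
    return malloc_trim(0);
#else
    return -1;
#endif
}
```

A return value of 0 does not indicate an error; it only means glibc found nothing it could give back.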
>> Instead of an automatic trim, I propose adding a jcmd that manually triggers a libc heap trim. Such a command would have two purposes:
>> - when analyzing cases of high memory footprint, it helps distinguish "real" footprint, e.g. leaks, from cases where the glibc just holds on to memory
>> - as a stopgap measure, it can relieve pressure in a high-footprint scenario.
>>
>> Note that this command also helps with analyzing libc peaks which had nothing to do with the VM - e.g. peaks created by customer code which just happens to share the same process as the VM. Such memory does not even have to show up in NMT.
>>
>> I propose to introduce this command for Linux only. Other OSes (apart from maybe AIX) do not seem to have this problem, but Linux is arguably important enough in itself to justify a Linux-specific jcmd.
>>
>> CSR for this command: https://bugs.openjdk.java.net/browse/JDK-8269345
>>
>> Note that an alternative to a Linux-only jcmd would be a command which would trim the C-heap on all platforms, with implementations to be filled out later.
>>
>> =========
>>
>> This patch:
>>
>> - introduces a new jcmd, "VM.trim_libc_heap", no arguments, which trims the glibc heap on glibc platforms.
>> - includes a (rather basic) test
>> - the command calls malloc_trim(3), and additionally prints out its effect (changes caused in virt size, rss and swap space)
>> - I refactored some code in os_linux.cpp to factor out scanning /proc/self/status to get kernel memory information.
>>
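To illustrate the last bullet, scanning /proc/self/status for the three values the command reports could look roughly like the sketch below. This is not the refactored os_linux.cpp code; the struct and function names are hypothetical, and using -1 for "field not found" is an assumption of this sketch. The field names VmSize, VmRSS and VmSwap are the ones the Linux kernel writes into that file.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for this sketch; -1 means "field not found". */
struct mem_info {
    long vmsize_kb;
    long vmrss_kb;
    long vmswap_kb;
};

/* Parse status-formatted text line by line, e.g. "VmRSS:\t  920740 kB". */
void parse_status_text(const char *text, struct mem_info *out) {
    out->vmsize_kb = out->vmrss_kb = out->vmswap_kb = -1;
    const char *line = text;
    while (line != NULL && *line != '\0') {
        long v;
        /* sscanf returns 1 only if the literal prefix matched and a number followed */
        if (sscanf(line, "VmSize: %ld kB", &v) == 1) {
            out->vmsize_kb = v;
        } else if (sscanf(line, "VmRSS: %ld kB", &v) == 1) {
            out->vmrss_kb = v;
        } else if (sscanf(line, "VmSwap: %ld kB", &v) == 1) {
            out->vmswap_kb = v;
        }
        /* advance to the start of the next line, or stop at the end of the text */
        const char *nl = strchr(line, '\n');
        line = (nl != NULL) ? nl + 1 : NULL;
    }
}

/* Read /proc/self/status and fill *out. Returns 0 on success, -1 if the file
 * cannot be opened (e.g. on a non-Linux system). */
int read_self_status(struct mem_info *out) {
    char buf[8192];
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL) {
        return -1;
    }
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    parse_status_text(buf, out);
    return 0;
}
```

Splitting the parsing from the file access keeps the parser testable with canned input; calling read_self_status once before and once after the trim gives the two snapshots needed for a before/after comparison.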
>> =========
>>
>> Example:
>>
>> A program causes a temporary peak in C-heap usage (in this case, triggered via Unsafe.allocateMemory) and frees the memory again right away, so it is not leaky. The peak in RSS was ~8G (even though the user allocation was much smaller; glibc has a lot of overhead). The effects of this peak linger even after the memory has been returned to the glibc:
>>
>>
>>
>> thomas at starfish:~$ jjjcmd AllocCHeap VM.info | grep Resident
>> Resident Set Size: 8685896K (peak: 8685896K) (anon: 8648680K, file: 37216K, shmem: 0K)
>> ^^^^^^^^
>>
>>
>> We execute the new trim command via jcmd:
>>
>>
>> thomas at starfish:~$ jjjcmd AllocCHeap VM.trim_libc_heap
>> 18770:
>> Attempting trim...
>> Done.
>> Virtual size before: 28849744k, after: 28849724k, (-20k)
>> RSS before: 8685896k, after: 920740k, (-7765156k) <<<<
>> Swap before: 0k, after: 0k, (0k)
>>
>>
>> It prints out the change in virtual size, RSS and swap. The virtual size did not decrease noticeably (-20k) since no mappings had been unmapped by the glibc. However, the process heap was shrunk heavily by the glibc, resulting in a large drop in RSS (8.5G->900M), freeing >7G of memory:
>>
>>
>> thomas at starfish:~$ jjjcmd AllocCHeap VM.info | grep Resident
>> Resident Set Size: 920740K (peak: 8686004K) (anon: 883460K, file: 37280K, shmem: 0K)
>> ^^^^^^^
>>
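The before/after lines in the trim output above are plain differences of two snapshots in kB. A minimal sketch of that arithmetic and formatting follows; the helper names are hypothetical and this is not the patch's code.

```c
#include <stdio.h>

/* Change in kB between two snapshots; negative means the value shrank. */
long delta_kb(long before_kb, long after_kb) {
    return after_kb - before_kb;
}

/* Hypothetical helper: writes one report line in the style shown above,
 * e.g. "Swap before: 0k, after: 0k, (0k)". Returns what snprintf returns. */
int format_change(char *buf, size_t len, const char *label,
                  long before_kb, long after_kb) {
    return snprintf(buf, len, "%s before: %ldk, after: %ldk, (%ldk)",
                    label, before_kb, after_kb, delta_kb(before_kb, after_kb));
}
```

With the RSS numbers from the example (8685896k before, 920740k after), delta_kb yields -7765156, which matches the reported drop. How the actual command prints an increase is not shown in the output above, so this sketch simply prints positive values without a sign.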
>>
>> When the VM is started with -Xlog:os, this is also logged:
>>
>>
>> [139,068s][info][os] malloc_trim:
>> [139,068s][info][os] Virtual size before: 28849744k, after: 28849724k, (-20k)
>> RSS before: 8685896k, after: 920740k, (-7765156k)
>> Swap before: 0k, after: 0k, (0k)
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>
> - Volker feedback
> - Merge
> - Feedback Severin; renamed query function
> - start
I renamed the command as agreed upon in the CSR discussion.
-------------
PR: https://git.openjdk.java.net/jdk/pull/4510
More information about the serviceability-dev
mailing list