RFR: JDK-8293114: JVM should trim the native heap [v4]
Aleksey Shipilev
shade at openjdk.org
Thu Jul 6 17:53:08 UTC 2023
On Thu, 6 Jul 2023 15:25:03 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085.
>>
>> ---------------
>>
>> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process.
>>
>> ### Background:
>>
>> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS.
>>
>> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases.
>>
>> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim.
>>
>> #### GLIBC internals
>>
>> The following information I took from the glibc source code and experimenting.
>>
>> ##### Why do we need to trim manually? Does the Glibc not trim on free?
>>
>> Upon `free()`, glibc may return memory to the OS if:
>> - the returned block was mmap'ed
>> - the returned block was not added to tcache or to fastbins
>> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case:
>> a) for the main arena, glibc attempts to lower the brk()
>> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap.
>> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed.
>>
>> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely.
>>
>> To increase the ...
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>
> last cleanups and shade feedback
Another read. I think we would need to tighten up style a bit too: the newline spacing style is different across the change.
src/hotspot/share/runtime/globals.hpp line 1990:
> 1988: "If TrimNativeHeap is enabled: interval, in ms, at which " \
> 1989: "the GC will attempt to trim the native heap.") \
> 1990: range(100, UINT_MAX) \
Still mentions "GC". Should we accept 1ms as valid interval? 100ms might be too big for short-running workloads. Very aggressive trim at 1ms would also be useful for stress-testing trim code.
src/hotspot/share/runtime/trimNative.cpp line 44:
> 42: Monitor* const _lock;
> 43: bool _stop;
> 44: unsigned _suspend_count;
`uint16_t`, maybe?
src/hotspot/share/runtime/trimNative.cpp line 49:
> 47: uint64_t _num_trims_performed;
> 48:
> 49: bool suspended() const {
Style: `is_suspended()`.
src/hotspot/share/runtime/trimNative.cpp line 66:
> 64: }
> 65:
> 66: bool stopped() const {
`is_stopped()`?
src/hotspot/share/runtime/trimNative.cpp line 75:
> 73: SafepointSynchronize::is_at_safepoint() ||
> 74: SafepointSynchronize::is_synchronizing();
> 75: }
Suggestion:
bool at_or_nearing_safepoint() const {
return SafepointSynchronize::is_at_safepoint() ||
SafepointSynchronize::is_synchronizing();
}
src/hotspot/share/runtime/trimNative.cpp line 84:
> 82: run_inner();
> 83: log_info(trim)("NativeTrimmer stop.");
> 84: }
Do we need this logging? We can simplify and just inline `run_inner` here.
src/hotspot/share/runtime/trimNative.cpp line 99:
> 97:
> 98: if (trim_result) {
> 99: _num_trims_performed++;
Simplification: let's just use `Atomic::*` for `_num_trims_performed`, and this `trim_result` dance (which I think is only needed to get the update under lock?) is not needed.
src/hotspot/share/runtime/trimNative.cpp line 149:
> 147: return true;
> 148: } else {
> 149: log_info(trim)("Trim native heap (no details)");
Consistency: `Trim native heap: complete, no details`.
src/hotspot/share/runtime/trimNative.cpp line 177:
> 175: // No need to wakeup trimmer
> 176: }
> 177: log_debug(trim)("NativeTrimmer pause (%s) (%u)", reason, n);
Suggestion:
log_debug(trim)("NativeTrimmer pause for %s (%u suspend requests)", reason, n);
src/hotspot/share/runtime/trimNative.cpp line 190:
> 188: }
> 189: }
> 190: log_debug(trim)("NativeTrimmer unpause (%s) (%u)", reason, n);
Suggestion:
log_debug(trim)("NativeTrimmer unpause for %s (%u suspend requests)", reason, n);
src/hotspot/share/runtime/trimNative.hpp line 27:
> 25: */
> 26:
> 27: #ifndef SHARE_GC_SHARED_TRIMNATIVE_HPP
This is now `SHARE_RUNTIME_TRIMNATIVE_HPP`.
test/hotspot/jtreg/runtime/os/TestTrimNative.java line 243:
> 241: }
> 242:
> 243: if (args[0].equals("RUN")) {
The usual trick is to pull this "internal" main into a class, and then reference it. Like:
public Test {
public void test() {
runWith("...", "Test$Workload");
}
public class Workload {
public static void main(String... args) {}
}
}
test/hotspot/jtreg/runtime/os/TestTrimNative.java line 247:
> 245: System.out.println("Will spike now...");
> 246: for (int i = 0; i < numAllocations; i++) {
> 247: ptrs[i] = Unsafe.getUnsafe().allocateMemory(szAllocations);
Pull `Unsafe.getUnsafe()` into a local or a field?
test/hotspot/jtreg/runtime/os/TestTrimNative.java line 270:
> 268: runTest(Arrays.copyOfRange(args, 1, args.length));
> 269: } else if (args[0].equals("testOffOnNonCompliantPlatforms")) {
> 270: testOffOnNonCompliantPlatforms();
Inline these `test*`? This would give some symmetry against `runTest`.
-------------
PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1516837594
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254621393
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254632878
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254622666
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254665880
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254633258
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254639542
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254653161
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254657016
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254678136
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254678539
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254749047
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254746310
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254743705
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254744410
More information about the serviceability-dev
mailing list