RFR: JDK-8293114: JVM should trim the native heap [v4]

Aleksey Shipilev shade at openjdk.org
Thu Jul 6 17:53:08 UTC 2023


On Thu, 6 Jul 2023 15:25:03 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:

>> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085.
>>  
>> ---------------
>> 
>> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process.
>> 
>> ### Background:
>> 
>> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS.
>> 
>> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases.
>> 
>> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim.
>> 
>> #### GLIBC internals
>> 
>> The following information I took from the glibc source code and experimenting.
>> 
>> ##### Why do we need to trim manually? Does the Glibc not trim on free?
>> 
>> Upon `free()`, glibc may return memory to the OS if:
>> - the returned block was mmap'ed
>> - the returned block was not added to tcache or to fastbins
>> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case:
>>   a) for the main arena, glibc attempts to lower the brk()
>>   b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap.
>> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed.
>> 
>> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely.
>> 
>> To increase the ...
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
> 
>   last cleanups and shade feedback

Another read. I think we would need to tighten up style a bit too: the newline spacing style is different across the change.

src/hotspot/share/runtime/globals.hpp line 1990:

> 1988:           "If TrimNativeHeap is enabled: interval, in ms, at which "        \
> 1989:           "the GC will attempt to trim the native heap.")                   \
> 1990:           range(100, UINT_MAX)                                              \

Still mentions "GC". Should we accept 1ms as valid interval? 100ms might be too big for short-running workloads. Very aggressive trim at 1ms would also be useful for stress-testing trim code.

src/hotspot/share/runtime/trimNative.cpp line 44:

> 42:   Monitor* const _lock;
> 43:   bool _stop;
> 44:   unsigned _suspend_count;

`uint16_t`, maybe?

src/hotspot/share/runtime/trimNative.cpp line 49:

> 47:   uint64_t _num_trims_performed;
> 48: 
> 49:   bool suspended() const {

Style: `is_suspended()`.

src/hotspot/share/runtime/trimNative.cpp line 66:

> 64:   }
> 65: 
> 66:   bool stopped() const {

`is_stopped()`?

src/hotspot/share/runtime/trimNative.cpp line 75:

> 73:         SafepointSynchronize::is_at_safepoint() ||
> 74:         SafepointSynchronize::is_synchronizing();
> 75:   }

Suggestion:

  bool at_or_nearing_safepoint() const {
    return SafepointSynchronize::is_at_safepoint() ||
           SafepointSynchronize::is_synchronizing();
  }

src/hotspot/share/runtime/trimNative.cpp line 84:

> 82:     run_inner();
> 83:     log_info(trim)("NativeTrimmer stop.");
> 84:   }

Do we need this logging? We can simplify and just inline `run_inner` here.

src/hotspot/share/runtime/trimNative.cpp line 99:

> 97: 
> 98:         if (trim_result) {
> 99:           _num_trims_performed++;

Simplification: let's just use `Atomic::*` for `_num_trims_performed`, and this `trim_result` dance (which I think is only needed to get the update under lock?) is not needed.

src/hotspot/share/runtime/trimNative.cpp line 149:

> 147:         return true;
> 148:       } else {
> 149:         log_info(trim)("Trim native heap (no details)");

Consistency: `Trim native heap: complete, no details`.

src/hotspot/share/runtime/trimNative.cpp line 177:

> 175:       // No need to wakeup trimmer
> 176:     }
> 177:     log_debug(trim)("NativeTrimmer pause (%s) (%u)", reason, n);

Suggestion:

    log_debug(trim)("NativeTrimmer pause for %s (%u suspend requests)", reason, n);

src/hotspot/share/runtime/trimNative.cpp line 190:

> 188:       }
> 189:     }
> 190:     log_debug(trim)("NativeTrimmer unpause (%s) (%u)", reason, n);

Suggestion:

    log_debug(trim)("NativeTrimmer unpause for %s (%u suspend requests)", reason, n);

src/hotspot/share/runtime/trimNative.hpp line 27:

> 25:  */
> 26: 
> 27: #ifndef SHARE_GC_SHARED_TRIMNATIVE_HPP

This is now `SHARE_RUNTIME_TRIMNATIVE_HPP`.

test/hotspot/jtreg/runtime/os/TestTrimNative.java line 243:

> 241:         }
> 242: 
> 243:         if (args[0].equals("RUN")) {

The usual trick is to pull this "internal" main into a class, and then reference it. Like:


public Test {
 
  public void test() {
    runWith("...", "Test$Workload");
  }  

  public class Workload {
    public static void main(String... args) {} 
  }
}

test/hotspot/jtreg/runtime/os/TestTrimNative.java line 247:

> 245:             System.out.println("Will spike now...");
> 246:             for (int i = 0; i < numAllocations; i++) {
> 247:                 ptrs[i] = Unsafe.getUnsafe().allocateMemory(szAllocations);

Pull `Unsafe.getUnsafe()` into a local or a field?

test/hotspot/jtreg/runtime/os/TestTrimNative.java line 270:

> 268:             runTest(Arrays.copyOfRange(args, 1, args.length));
> 269:         } else if (args[0].equals("testOffOnNonCompliantPlatforms")) {
> 270:             testOffOnNonCompliantPlatforms();

Inline these `test*`? This would give some symmetry against `runTest`.

-------------

PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1516837594
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254621393
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254632878
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254622666
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254665880
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254633258
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254639542
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254653161
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254657016
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254678136
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254678539
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254749047
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254746310
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254743705
PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254744410


More information about the serviceability-dev mailing list