RFR: JDK-8293114: GC should trim the native heap [v10]

Thomas Stuefe stuefe at openjdk.org
Wed Jul 5 17:42:05 UTC 2023


On Mon, 3 Jul 2023 17:39:59 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

>> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits:
>> 
>>  - wip
>>  - Merge branch 'master' into JDK-8293114-GC-trim-native
>>  - wip
>>  - merge master
>>  - wip
>>  - wip
>>  - rename GCTrimNative TrimNative
>>  - rename NativeTrimmer
>>  - rename
>>  - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp
>>  - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e
>
>> My main concern with this change is increased latency. You wrote "_..concurrent malloc/frees are usually not blocked while trimming if they are satisfied from the local arena.._". Not sure what "_usually_" means here and how many mallocs are satisfied from a local arena. But introducing pauses up to a second seems significant for some applications.
>> 
>> The other question is that I still don't understand if glibc-malloc will ever call `malloc_trim()` automatically (and in that case introduce the latency anyway). The manpage says that `malloc_trim()` "_..is automatically called by free(3) in certain circumstances; see the discussion of `M_TOP_PAD` and `M_TRIM_THRESHOLD` in `mallopt(3)`.._" but you reported that you couldn't observe any cleanup effect when playing around with `M_TRIM_THRESHOLD`. In the end, calling `malloc_trim()` periodically might even help to decrease latency if this prevents seldom, but longer automatic invocations of `malloc_trim()` by glibc itself.
> 
> The trim performed automatically on some free() calls only covers the 'chunk' you were freeing in,
> while the explicit call visits all 'chunks'. @jdksjolen can explain this in more depth.
> 
> I share your concern.
> But as this is an opt-in and the benefits for a certain set of workloads outweigh the risk of latency increases, I'm for this change.

@robehn @zhengyu123 @shipilev @simonis 
Thank you all for your support and input. I dusted off the patch and simplified it:
- removed the adaptive step-down logic, since that was overly involved and in my tests did not work that well
- removed the expedite-trim logic.
- Pauses now stack.
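
As a rough illustration of what "pauses stack" means here: nested pause requests are counted, and trimming only becomes possible again once the outermost pause has ended. The sketch below is not code from the patch; `TrimPauseCounter` and `TrimPauseMark` are hypothetical names, and the real implementation may well differ.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical sketch, not the patch's code: a counter that lets pauses nest.
class TrimPauseCounter {
  std::atomic<int> _depth{0};
public:
  // Each pause() must be matched by exactly one resume().
  void pause() { _depth.fetch_add(1, std::memory_order_acq_rel); }
  void resume() {
    int prev = _depth.fetch_sub(1, std::memory_order_acq_rel);
    assert(prev > 0 && "resume() without matching pause()");
    (void)prev; // avoid an unused-variable warning when asserts are compiled out
  }
  // Trimming is allowed only when no pause is outstanding.
  bool is_paused() const { return _depth.load(std::memory_order_acquire) > 0; }
};

// Hypothetical RAII helper: pauses for the lifetime of the object.
class TrimPauseMark {
  TrimPauseCounter& _counter;
public:
  explicit TrimPauseMark(TrimPauseCounter& c) : _counter(c) { _counter.pause(); }
  ~TrimPauseMark() { _counter.resume(); }
  TrimPauseMark(const TrimPauseMark&) = delete;
  TrimPauseMark& operator=(const TrimPauseMark&) = delete;
};
```

With a scheme like this, two overlapping guarded sections can each place their own `TrimPauseMark` without needing to know about one another.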

So, in very few words, this patch
- adds an optional thread that executes trims at periodic intervals
- allows trimming to be temporarily paused
- guards sections that are vulnerable to concurrent work (GC STW phases) or that do bulk C-heap operations (e.g. monitor bulk deletion, stringtable cleanups, arena cleanups etc.) with pauses.
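
To make the shape of this concrete, here is a minimal standalone sketch of a periodic trimmer that skips its work while paused. It is not the patch's code: `PeriodicTrimmer` and all of its members are hypothetical names, and holding the lock while trimming (so that `pause()` waits for a trim already in flight) is an assumption of this sketch only. The default action calls glibc's `malloc_trim(0)` where available and does nothing elsewhere.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#if defined(__GLIBC__)
#include <malloc.h> // malloc_trim() is a glibc extension
#endif

// Hypothetical sketch, not the patch's code.
class PeriodicTrimmer {
  std::function<void()> _trim;
  std::chrono::milliseconds _interval;
  std::mutex _lock;
  std::condition_variable _cv;
  int _pause_depth = 0; // guarded by _lock; pauses nest
  bool _stop = false;   // guarded by _lock
  std::thread _thread;

  // Caller must hold _lock. Returns true if a trim was executed.
  bool trim_locked() {
    if (_pause_depth > 0) return false;
    _trim();
    return true;
  }

  void run() {
    std::unique_lock<std::mutex> ul(_lock);
    // wait_for() returns false on timeout (time to trim), true once _stop is set.
    while (!_cv.wait_for(ul, _interval, [this] { return _stop; })) {
      trim_locked();
    }
  }

public:
  static void default_trim() {
#if defined(__GLIBC__)
    malloc_trim(0); // ask glibc to release free heap memory back to the OS
#endif
  }

  explicit PeriodicTrimmer(std::chrono::milliseconds interval,
                           std::function<void()> trim = &PeriodicTrimmer::default_trim)
    : _trim(std::move(trim)), _interval(interval) {}

  ~PeriodicTrimmer() { stop(); }

  void start() { _thread = std::thread([this] { run(); }); }

  void stop() {
    {
      std::lock_guard<std::mutex> g(_lock);
      _stop = true;
    }
    _cv.notify_all();
    if (_thread.joinable()) _thread.join();
  }

  void pause() {
    std::lock_guard<std::mutex> g(_lock);
    ++_pause_depth;
  }

  void resume() {
    std::lock_guard<std::mutex> g(_lock);
    assert(_pause_depth > 0 && "resume() without matching pause()");
    --_pause_depth;
  }

  // Trim immediately unless paused; handy for trying the logic without a thread.
  bool trim_now() {
    std::lock_guard<std::mutex> g(_lock);
    return trim_locked();
  }
};
```

A guarded section would then bracket its bulk C-heap work with `pause()` and `resume()`, and a trim interval that elapses in between is simply skipped.
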
I'll do some more benchmarks over the next few days, but honestly I don't expect to see this rising above background noise. If I have time, I will also simulate heavy C-heap activity to give the trim something to do.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1622198358

