RFR: JDK-8293114: GC should trim the native heap [v10]
Thomas Stuefe
stuefe at openjdk.org
Wed Jul 5 17:42:05 UTC 2023
On Mon, 3 Jul 2023 17:39:59 GMT, Robbin Ehn <rehn at openjdk.org> wrote:
>> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits:
>>
>> - wip
>> - Merge branch 'master' into JDK-8293114-GC-trim-native
>> - wip
>> - merge master
>> - wip
>> - wip
>> - rename GCTrimNative TrimNative
>> - rename NativeTrimmer
>> - rename
>> - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp
>> - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e
>
>> My main concern with this change is increased latency. You wrote "_..concurrent malloc/frees are usually not blocked while trimming if they are satisfied from the local arena.._". Not sure what "_usually_" means here and how many mallocs are satisfied from a local arena. But introducing pauses up to a second seems significant for some applications.
>>
>> The other question is that I still don't understand if glibc-malloc will ever call `malloc_trim()` automatically (and in that case introduce the latency anyway). The manpage says that `malloc_trim()` "_..is automatically called by free(3) in certain circumstances; see the discussion of `M_TOP_PAD` and `M_TRIM_THRESHOLD` in `mallopt(3)`.._" but you reported that you couldn't observe any cleanup effect when playing around with `M_TRIM_THRESHOLD`. In the end, calling `malloc_trim()` periodically might even help to decrease latency if this prevents seldom, but longer automatic invocations of `malloc_trim()` by glibc itself.
>
> The trim performed automatically on some free() is one done in the 'chunk' you were freeing in.
> While the explicit call visits all 'chunks'. @jdksjolen can explain this more deeply.
>
> I share your concern.
> But as this is an opt-in and the benefits for a certain set of workloads outweigh the risk of latency increases, I'm for this change.
@robehn @zhengyu123 @shipilev @simonis
Thank you all for your support and input. I dusted off the patch and simplified it:
- removed the adaptive step-down logic, since that was overly involved and in my tests did not work that well
- removed the expedite-trim logic.
- Pauses now stack.
So, in very few words, this patch
- adds an optional thread to execute trims at periodic intervals
- allows trimming to be paused temporarily.
- guards sections that are vulnerable to concurrent work (GC STW phases) or that do bulk C-heap operations (e.g. monitor bulk deletion, stringtable cleanups, arena cleanups, etc.) with pauses.
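To illustrate the shape of the mechanism described above, here is a minimal standalone sketch. It is not the code from the patch: `PeriodicTrimmer`, `PauseMark`, and `trim_native_heap` are hypothetical names, and running the trim while holding the lock is a simplification chosen for this sketch. It shows a background thread that trims at a fixed interval and a pause counter that stacks, so trimming only resumes once every nested pause has ended.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#ifdef __GLIBC__
#include <malloc.h>   // malloc_trim (glibc extension)
#endif

// Default trim action: ask glibc to return free C-heap memory to the OS.
// On non-glibc platforms this sketch does nothing.
inline void trim_native_heap() {
#ifdef __GLIBC__
  malloc_trim(0);
#endif
}

// Hypothetical class for illustration only, not the HotSpot implementation.
class PeriodicTrimmer {
  const std::chrono::milliseconds _interval;
  const std::function<void()> _trim;
  mutable std::mutex _lock;
  std::condition_variable _cv;
  int _pause_depth = 0;       // > 0 means trimming is suspended
  unsigned long _trims = 0;   // number of completed trims
  bool _stop = false;
  std::thread _thread;        // declared last so it starts after the state above is initialized

  void run() {
    std::unique_lock<std::mutex> l(_lock);
    while (!_stop) {
      // Sleep for one interval, waking early only on shutdown.
      _cv.wait_for(l, _interval, [this] { return _stop; });
      if (_stop) break;
      if (_pause_depth > 0) continue;   // skip this period while paused
      _trim();                          // under the lock: pause() waits for an in-flight trim
      ++_trims;
    }
  }

public:
  explicit PeriodicTrimmer(std::chrono::milliseconds interval,
                           std::function<void()> trim = trim_native_heap)
    : _interval(interval), _trim(std::move(trim)), _thread([this] { run(); }) {}

  ~PeriodicTrimmer() {
    { std::lock_guard<std::mutex> g(_lock); _stop = true; }
    _cv.notify_all();
    _thread.join();
  }

  // Pauses stack: each pause() must be matched by one resume().
  void pause()  { std::lock_guard<std::mutex> g(_lock); ++_pause_depth; }
  void resume() {
    std::lock_guard<std::mutex> g(_lock);
    assert(_pause_depth > 0);
    --_pause_depth;
  }

  int pause_depth() const { std::lock_guard<std::mutex> g(_lock); return _pause_depth; }
  unsigned long trim_count() const { std::lock_guard<std::mutex> g(_lock); return _trims; }
};

// RAII guard for a section that should not run concurrently with a trim.
class PauseMark {
  PeriodicTrimmer& _t;
public:
  explicit PauseMark(PeriodicTrimmer& t) : _t(t) { _t.pause(); }
  ~PauseMark() { _t.resume(); }
  PauseMark(const PauseMark&) = delete;
  PauseMark& operator=(const PauseMark&) = delete;
};
```

A guarded section would then just declare `PauseMark pm(trimmer);` at its start. Because the counter stacks, a guarded section that calls into another guarded section does not re-enable trimming when the inner one exits.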
I'll do some more benchmarks over the next few days, but honestly don't expect to see this rising above background noise. If I have time, I will also simulate heavy C-heap activity to give the trim something to do.
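As a rough idea of what such a simulation could look like, the sketch below allocates many small blocks, frees most of them so the C heap is left fragmented, and then triggers an explicit trim. The helper names `churn_c_heap` and `trim_now` are hypothetical, and the block counts and sizes are arbitrary assumptions, not values from the benchmarks. `malloc_trim` is a glibc extension, so the trim is a no-op elsewhere.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>
#ifdef __GLIBC__
#include <malloc.h>   // malloc_trim (glibc extension)
#endif

// Hypothetical helper: allocate `count` blocks of `size` bytes, then free all
// of them except every `keep_every`-th one (keep_every == 0 frees everything).
// The surviving blocks are returned so the caller can release them later.
std::vector<void*> churn_c_heap(std::size_t count, std::size_t size,
                                std::size_t keep_every) {
  std::vector<void*> blocks;
  blocks.reserve(count);
  for (std::size_t i = 0; i < count; i++) {
    void* p = std::malloc(size);
    if (p == nullptr) break;   // stop early if the allocator gives up
    blocks.push_back(p);
  }
  std::vector<void*> survivors;
  for (std::size_t i = 0; i < blocks.size(); i++) {
    if (keep_every != 0 && i % keep_every == 0) {
      survivors.push_back(blocks[i]);   // keep: pins part of the heap in place
    } else {
      std::free(blocks[i]);             // free: leaves a hole for the trim to reclaim
    }
  }
  return survivors;
}

// Returns true if glibc reports that memory was actually returned to the OS.
bool trim_now() {
#ifdef __GLIBC__
  return malloc_trim(0) == 1;
#else
  return false;
#endif
}
```

Comparing the process RSS before and after `trim_now()` would show whether the trim had anything to reclaim.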
-------------
PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1622198358
More information about the hotspot-gc-dev mailing list