hotspot-gc-dev Digest, Vol 168, Issue 38

Rodrigo Bruno rbruno at gsd.inesc-id.pt
Tue Jul 20 13:34:16 UTC 2021


Dear Jonathan, Thomas, and Man,

(regarding Status of JEP-8204088/JDK-8236073)

Ruslan and I have been following this thread and agree with the points
raised by Jonathan and Man. From our perspective, JVM heaps need to be
elastic to adapt to the available memory. Containers and VMs can already
increase/decrease the available memory on the fly but JVM heaps can't adapt
to it.

We believe that having a manageable flag to regulate the maximum memory
(such as CurrentMaxHeapSize) is the way to go. The SoftMaxHeapSize helps
but may fail to prevent OOMs that might happen if the GC expands the heap
too aggressively past the soft limit. Having external components such as a
Java agent to manage the memory limit does not solve the issue as the
memory can still grow beyond SoftMaxHeapSize.

Just for completeness, a prototype of CurrentMaxHeapSize is available at
http://cr.openjdk.java.net/~tschatzl/jelastic/cmx/. The idea is that users
set -Xmx to a very high value (maximum memory available locally) and then
control the current limit via a manageable JVM flag (CurrentMaxHeapSize)
that enforces a hard cap on heap memory usage. IMHO, this new (hard) limit
would not affect the existing GC heap sizing heuristics.
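
To make the intended usage concrete, here is a minimal sketch of how an
external controller could adjust such a limit at run time. It assumes a JVM
built with the prototype patch, so that a manageable flag named
CurrentMaxHeapSize exists. HotSpotDiagnosticMXBean.setVMOption is the
standard JDK API for writing manageable flags; the class and method names
(HeapLimitControl, trySetManageableFlag) are made up for illustration. On a
stock JDK the flag is unknown and the call is rejected, which the sketch
reports as false.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; only setVMOption is an existing JDK API.
class HeapLimitControl {

    // Tries to write a manageable HotSpot flag with a byte value.
    // Returns false if the JVM does not know the flag, the flag is not
    // writeable, or the value is rejected.
    static boolean trySetManageableFlag(String flagName, long bytes) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return false; // this JVM does not expose the HotSpot diagnostic bean
        }
        try {
            bean.setVMOption(flagName, Long.toString(bytes));
            return true;
        } catch (IllegalArgumentException e) {
            return false; // unknown flag, non-manageable flag, or invalid value
        }
    }

    public static void main(String[] args) {
        long twoGiB = 2L * 1024 * 1024 * 1024;
        // Assumption: CurrentMaxHeapSize exists only with the prototype patch.
        boolean applied = trySetManageableFlag("CurrentMaxHeapSize", twoGiB);
        System.out.println("CurrentMaxHeapSize updated: " + applied);
    }
}
```

The same call could be issued remotely over JMX, which is what would let a
container or VM manager drive the limit without restarting the JVM.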

Jonathan and Man, would the CurrentMaxHeapSize be helpful? Thomas, do you
think the proposed patch is feasible in terms of its design and
implementation? We are open to suggestions/feedback about the patch and on
how to make merging it possible.

Best,
rodrigo

On Monday, 21/06/2021 at 13:00, <hotspot-gc-dev-request at
openjdk.java.net> wrote:

> Today's Topics:
>
>    1. Re: RFR: 8268458: Add verification type for evacuation
>       failures (Thomas Schatzl)
>    2. Integrated: 8268952: Automatically update heap sizes in
>       G1MonitoringScope (Thomas Schatzl)
>    3. Re: RFR: 8269077: TestSystemGC uses "require vm.gc.G1" for
>       large pages subtest (Thomas Schatzl)
>    4. Integrated: 8268458: Add verification type for evacuation
>       failures (Thomas Schatzl)
>    5. Re: RFR: 8268952: Automatically update heap sizes in
>       G1MonitoringScope (Thomas Schatzl)
>    6. Re: Status of JEP-8204088/JDK-8236073 (Thomas Schatzl)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 21 Jun 2021 11:12:28 GMT
> From: Thomas Schatzl <tschatzl at openjdk.java.net>
> To: <hotspot-gc-dev at openjdk.java.net>
> Subject: Re: RFR: 8268458: Add verification type for evacuation
>         failures
> Message-ID:
>         <PjXEe8uwc9uV0Kf2Nk_Pc5Ts81e5jFdCrDx9G6RzdY0=.
> 78e83bfe-0814-487d-a599-3494f8d54a61 at github.com>
>
> Content-Type: text/plain; charset=utf-8
>
> On Sat, 19 Jun 2021 05:59:10 GMT, Kim Barrett <kbarrett at openjdk.org>
> wrote:
>
> >> Hi all,
> >>
> >>   can I have reviews for this change that adds a new verification type
> (argument for `-XX:VerifyGCType` for G1) that only enables verification
> after an evacuation failure?
> >>
> >> The reason is that time and time again we have issues with evacuation
> failure, as it is not tested nearly as much as regular collection, and
> reproducing issues is often hampered by the fact that there is no way to
> verify only after an evacuation failure. Enabling it for all young
> collections is possible, but typically does not help much.
> >>
> >> Fwiw, this change requires a small semantics change in how the current
> `VerifyGCType` is compared to the one stored as active (i.e. in
> `G1HeapVerifier::_enabled_verification_types`). Since the situations that
> can be enabled are not distinct any more (any young gc can have an
> evacuation failure), the existing check for a given set bit in
> `G1HeapVerifier::should_verify()` does not work any more.
> >>
> >> This also means that the previous assumption that
> `G1VerifyType::G1VerifyAll` is not the same as all flags enabled can not be
> checked any more. I do not think this is any loss in functionality (see the
> gtests for removed checks).
> >>
> >> The same functionality could also have been implemented by injecting
> all of the young gen type bits into the existing `type` on evacuation
> failure at the cost of remembering that the user selected evacuation
> failures for verification somewhere else. Not sure if that would be simpler.
> >>
> >> Testing: tier1-2 (still running), updated test
> >>
> >> Thanks,
> >>   Thomas
> >
> > Looks good.
> >
> > G1VerifyType seems poorly named.  The name suggests a single value, but
> it's really a selection bitmask.  Perhaps a followup RFE?
>
> Thanks @kimbarrett @walulyai for your reviews.
>
> Integrate
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/4473
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 21 Jun 2021 11:14:31 GMT
> From: Thomas Schatzl <tschatzl at openjdk.java.net>
> To: <hotspot-gc-dev at openjdk.java.net>
> Subject: Integrated: 8268952: Automatically update heap sizes in
>         G1MonitoringScope
> Message-ID:
>         <fSA_-ASTWNVcGC7cu3p6Y1v5yi4W09h0Sqk1KbpuDVY=.
> ec5157ea-f99d-4b08-b177-1df95d782cab at github.com>
>
> Content-Type: text/plain; charset=utf-8
>
> On Fri, 18 Jun 2021 12:06:33 GMT, Thomas Schatzl <tschatzl at openjdk.org>
> wrote:
>
> > Hi,
> >
> >   can I have reviews to factor out the call to
> `G1MonitoringScope::update_sizes()` into the destructor of that class?
> Currently all users seem to call this manually close to the end of the
> scope `G1MonitoringScope` is in.
> >
> > Testing: tier1-5
> >
> > Thanks,
> >   Thomas
>
> This pull request has now been integrated.
>
> Changeset: a58c477c
> Author:    Thomas Schatzl <tschatzl at openjdk.org>
> URL:
> https://git.openjdk.java.net/jdk/commit/a58c477c49ca595c65f7a2fca2512ff2adea99be
> Stats:     18 lines in 4 files changed: 7 ins; 11 del; 0 mod
>
> 8268952: Automatically update heap sizes in G1MonitoringScope
>
> Reviewed-by: kbarrett, iwalulya
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/4529
>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 21 Jun 2021 11:15:31 GMT
> From: Thomas Schatzl <tschatzl at openjdk.java.net>
> To: <hotspot-gc-dev at openjdk.java.net>
> Subject: Re: RFR: 8269077: TestSystemGC uses "require vm.gc.G1" for
>         large pages subtest
> Message-ID:
>         <w7vkMF1CB4GNdzU2UxYsihH1UJkqQ9otEwslM0V4BAk=.
> b08378d1-d9ab-457e-9929-1116ad970de4 at github.com>
>
> Content-Type: text/plain; charset=utf-8
>
> On Mon, 21 Jun 2021 10:17:38 GMT, Stefan Karlsson <stefank at openjdk.org>
> wrote:
>
> > The invocation that runs with large pages is guarded with `@require
> vm.gc.G1` and doesn't explicitly state that G1 should be used.
> >
> > This means two things:
> > 1) We are not running the large pages subtest when other GCs are
> specified
> > 2) Under some circumstances another GC is ergonomically selected and we
> run the test with that GC even though the test was guarded by `@require
> vm.gc.G1`.
> >
> > I propose that we move the subtest to its own run section, without any
> requirement about the used GC.
>
> Lgtm.
>
> -------------
>
> Marked as reviewed by tschatzl (Reviewer).
>
> PR: https://git.openjdk.java.net/jdk/pull/4538
>
>
> ------------------------------
>
> Message: 4
> Date: Mon, 21 Jun 2021 11:15:37 GMT
> From: Thomas Schatzl <tschatzl at openjdk.java.net>
> To: <hotspot-gc-dev at openjdk.java.net>
> Subject: Integrated: 8268458: Add verification type for evacuation
>         failures
> Message-ID:
>         <irqmUxOC0igK_mbCrrDOUK4xhryOVNEng7fu-STo554=.
> b3f3bb60-c523-491e-abcb-0bf27b88e60a at github.com>
>
> Content-Type: text/plain; charset=utf-8
>
> On Fri, 11 Jun 2021 12:23:30 GMT, Thomas Schatzl <tschatzl at openjdk.org>
> wrote:
>
> > Hi all,
> >
> >   can I have reviews for this change that adds a new verification type
> (argument for `-XX:VerifyGCType` for G1) that only enables verification
> after an evacuation failure?
> >
> > The reason is that time and time again we have issues with evacuation
> failure, as it is not tested nearly as much as regular collection, and
> reproducing issues is often hampered by the fact that there is no way to
> verify only after an evacuation failure. Enabling it for all young
> collections is possible, but typically does not help much.
> >
> > Fwiw, this change requires a small semantics change in how the current
> `VerifyGCType` is compared to the one stored as active (i.e. in
> `G1HeapVerifier::_enabled_verification_types`). Since the situations that
> can be enabled are not distinct any more (any young gc can have an
> evacuation failure), the existing check for a given set bit in
> `G1HeapVerifier::should_verify()` does not work any more.
> >
> > This also means that the previous assumption that
> `G1VerifyType::G1VerifyAll` is not the same as all flags enabled can not be
> checked any more. I do not think this is any loss in functionality (see the
> gtests for removed checks).
> >
> > The same functionality could also have been implemented by injecting all
> of the young gen type bits into the existing `type` on evacuation failure
> at the cost of remembering that the user selected evacuation failures for
> verification somewhere else. Not sure if that would be simpler.
> >
> > Testing: tier1-2 (still running), updated test
> >
> > Thanks,
> >   Thomas
>
> This pull request has now been integrated.
>
> Changeset: cd20c019
> Author:    Thomas Schatzl <tschatzl at openjdk.org>
> URL:
> https://git.openjdk.java.net/jdk/commit/cd20c01942dd8559a31e51ef2a595c6eba44b8ad
> Stats:     58 lines in 6 files changed: 46 ins; 5 del; 7 mod
>
> 8268458: Add verification type for evacuation failures
>
> Reviewed-by: kbarrett, iwalulya
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/4473
>
>
> ------------------------------
>
> Message: 5
> Date: Mon, 21 Jun 2021 11:14:30 GMT
> From: Thomas Schatzl <tschatzl at openjdk.java.net>
> To: <hotspot-gc-dev at openjdk.java.net>
> Subject: Re: RFR: 8268952: Automatically update heap sizes in
>         G1MonitoringScope
> Message-ID:
>         <76o4isNx_5_nPB1qE7HLq5Dr3RDqV8llyYosEGADRNU=.
> 19e05f27-e7dd-4c39-ac52-2aca954d72e3 at github.com>
>
> Content-Type: text/plain; charset=utf-8
>
> On Sat, 19 Jun 2021 05:48:38 GMT, Kim Barrett <kbarrett at openjdk.org>
> wrote:
>
> >> Hi,
> >>
> >>   can I have reviews to factor out the call to
> `G1MonitoringScope::update_sizes()` into the destructor of that class?
> Currently all users seem to call this manually close to the end of the
> scope `G1MonitoringScope` is in.
> >>
> >> Testing: tier1-5
> >>
> >> Thanks,
> >>   Thomas
> >
> > Looks good.
>
> Thanks @kimbarrett @walulyai for your reviews.
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/4529
>
>
> ------------------------------
>
> Message: 6
> Date: Mon, 21 Jun 2021 13:40:23 +0200
> From: Thomas Schatzl <thomas.schatzl at oracle.com>
> To: Man Cao <manc at google.com>
> Cc: Jonathan Joo <jonathanjoo at google.com>, hotspot-gc-dev
>         <hotspot-gc-dev at openjdk.java.net>, Java Platform Team
>         <java-platform-team at google.com>
> Subject: Re: Status of JEP-8204088/JDK-8236073
> Message-ID: <35ff0404-9cb0-f9bf-0d37-2c3d7b4a0f67 at oracle.com>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Hi,
>
> On 15.06.21 21:45, Man Cao wrote:
> > Hi all,
> >
> > Thank you for the feedback!
> >
> [...]
> >
> > The problem we are trying to solve for the "Container RAM limit is
> > fixed" case,
> > is actually orthogonal to the relationship between the two flags.
> > Basically, we need a flag that can be adjusted dynamically (i.e.
> > "manageable" in HotSpot).
>
> I think something like GCTimeRatio could be made manageable and it would
> be useful to do so. That came up internally in some discussions already
> a fair amount of time ago afair, but with the current heap sizing broken
> it's just one more (small) todo...
>
> > Then we can make either a JVM feature, or a non-JVM approach such as via
> > an agent,
> > to automatically set either GCTimeRatio/GCCpuPercentage
> > or SoftMaxHeapSize/CurrentMaxHeapSize
> > based on the container RAM usage/limit ratio.
> >
> > Assuming using the GCCpuPercentage and CurrentMaxHeapSize flag names,
> > suppose we have a
> > JVM feature -XX:+StriveToStayWithinContainerRAMLimit, its behavior could
> be:
> > - If container RAM usage/limit ratio is below 90%, nothing needs to be
> > done and just use
> >   the default values for GCCpuPercentage and CurrentMaxHeapSize.
> > - If container RAM usage/limit ratio is 90%-95%, it could start trying
> > to reduce the heap size,
> >   either by increasing GCCpuPercentage, or shrinking CurrentMaxHeapSize.
> > - If container RAM usage/limit ratio is above 95%, it could try even
> > harder by further increasing
> >   GCCpuPercentage or shrinking CurrentMaxHeapSize.
> > In the above cases, there will be a limit on how
> > far -XX:+StriveToStayWithinContainerRAMLimit
> > could increase GCCpuPercentage or shrink CurrentMaxHeapSize.
> > We don't want to cause GC thrashing, as it is better to be killed by the
> > container manager and
> > restart the program, than to be stuck in GC thrashing. We could rely on
> > UseGCOverheadLimit
> > (JDK-8212084) for this purpose as well.
> >
> > I'm not sure if we could make GCCpuPercentage manageable, but
> > CurrentMaxHeapSize will definitely be manageable. If
> > GCCpuPercentage is manageable, then
> -XX:+StriveToStayWithinContainerRAMLimit
> > could be built solely by changing GCCpuPercentage, without relying on
> > setting CurrentMaxHeapSize.
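
The threshold behaviour proposed in the quoted message can be sketched as a
small decision function. This is only an illustration: the 90% and 95%
thresholds come from the proposal, while the shrink steps (5% and 10% of the
current limit), the floor that guards against GC thrashing, and all class
and method names are assumptions invented for this sketch.

```java
// Hypothetical policy sketch; thresholds follow the quoted proposal,
// shrink steps and the floor are illustrative assumptions.
final class ContainerPressurePolicy {

    enum Action { NONE, SHRINK, SHRINK_HARDER }

    // Classifies container RAM usage relative to the container RAM limit.
    static Action decide(long usedBytes, long limitBytes) {
        if (usedBytes < 0 || limitBytes <= 0) {
            throw new IllegalArgumentException("invalid usage or limit");
        }
        // Integer comparison avoids floating-point rounding at the boundaries.
        if (usedBytes * 100 < limitBytes * 90) {
            return Action.NONE;          // below 90%: keep defaults
        }
        if (usedBytes * 100 <= limitBytes * 95) {
            return Action.SHRINK;        // 90%-95%: start reducing the heap
        }
        return Action.SHRINK_HARDER;     // above 95%: reduce more aggressively
    }

    // Computes the next heap limit, never going below floorBytes and never
    // raising the limit above its current value.
    static long nextHeapLimit(long currentLimit, Action action, long floorBytes) {
        long target;
        switch (action) {
            case SHRINK:
                target = currentLimit - currentLimit / 20; // assumed 5% step
                break;
            case SHRINK_HARDER:
                target = currentLimit - currentLimit / 10; // assumed 10% step
                break;
            default:
                target = currentLimit;
        }
        return Math.min(currentLimit, Math.max(target, floorBytes));
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;
        Action action = decide(93 * gib, 100 * gib);
        System.out.println(action + " -> " + nextHeapLimit(4 * gib, action, gib));
    }
}
```

The floor plays the role of the "limit on how far" the feature may go: once
it is reached, the policy stops shrinking and leaves the decision to the
container manager.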
>
> My general opinion about this kind of fairly complex heuristic, which is
> not directly related to how the VM should operate "right now", is to try
> to keep it outside the VM :) This heuristic in particular would need to
> know the container RAM limit (ok, that one the VM knows already) and in
> particular its own total RAM usage, which the VM does not know today.
>
> Both variables seem to be something that is very easily (and probably
> better) determined by some outside entity and any response to that could
> be more flexibly managed there at first glance; similar external agents
> tend to be running anyway already somewhere for container/VM management
> too.
>
> Another reason to not immediately jump on it is that while I can see
> your point, first I would prefer to have the current issues fixed :)
>
> >
> > For getting CPU usage
> >  > The problem is that apart from internal prototypes we never got around
> >  > to add that. There's JDK-8027759 (and one more I think) though, even
> >  > with a very very old patch.
> >  >
> >  > Another issue related to getting cpu usage I remember is support on
> some
> >  > systems, and it may be spotty on others (i.e. granularity wise).
> >  >
> >  > Do you have any experience on that outside of Linux?
> >
> > Great point. I haven't thought of this problem yet.
> > https://github.com/caoman/jdk/tree/G1ThreadsCPUTime
> > contains a patch on how
> > we get the CPU times.
> > I see os::is_thread_cpu_time_supported() could return false on Windows
> > and BSD.
> > We will dig further to see how this could be implemented for these OSes.
>
> Having looked into that a bit, OSX should be fairly easy (thread_info()
> call), but it is probably unsupportable on Windows: its GetThreadTimes()
> seems to be unusable, and it is the only way there.
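
For readers who want to probe platform support from Java code rather than
inside HotSpot, the following minimal sketch uses the standard ThreadMXBean
API. It is not the os::is_thread_cpu_time_supported() function discussed
above, nor the patch linked in the quoted message; the class and method
names (ThreadCpuTimeProbe, currentThreadCpuNanos) are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical probe; only the ThreadMXBean calls are existing JDK APIs.
class ThreadCpuTimeProbe {

    // Returns the CPU time of the current thread in nanoseconds, or -1 if
    // the platform does not support it or measurement is disabled.
    static long currentThreadCpuNanos() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported()
                || !threads.isThreadCpuTimeEnabled()) {
            return -1L;
        }
        return threads.getCurrentThreadCpuTime();
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("other threads supported: "
                + threads.isThreadCpuTimeSupported());
        System.out.println("current thread CPU ns: " + currentThreadCpuNanos());
    }
}
```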
>
> Thomas
>
>
> End of hotspot-gc-dev Digest, Vol 168, Issue 38
> ***********************************************
>


-- 
rodrigo-bruno.github.io