Container-aware heap sizing for OpenJDK
Thomas Schatzl
thomas.schatzl at oracle.com
Fri Sep 16 14:12:28 UTC 2022
Hi Jonathan,
great to hear from you again :)
On 13.09.22 21:52, Jonathan Joo wrote:
> Hello hotspot-dev and hotspot-gc-dev,
>
>
> My name is Jonathan, and I'm working on the Java Platform Team at
> Google. Here, we are working on a project to address Java container
> memory issues, as we noticed that a significant number of Java servers
[...]
>
> Below (under the dotted line) is a more detailed explanation of our
> initial approach. Does this sound like something that may be useful for
> the general OpenJDK community? If so, would some of you be open to
> further discussion? I would also like to better understand what
> container environments look like outside of Google, to see how we could
> modify our approach for the more general case.
Most of these suggestions seem fairly consistent with existing
RFEs (e.g. [1], [2], [3], ...) that have been discussed with you before
(e.g. in [4]) and, iirc, were considered really nice to have.
>
[...]
> * A separate thread runs alongside the JVM, querying:
>
> [...]
I am not convinced that having a thread inside the JVM is really the
best solution. Constantly querying the _environment_ for changes has
traditionally been outside the scope of the JVM.
Doing so also opens quite a big can of worms. Just to reiterate the
concerns raised in other comments:
* How is the container memory limit being determined? Does that
process take into account non-Java processes running in the container as
well? (Ashutosh Mehra)
* If you have two (multiple) JVM processes running inside the
container, how do they coordinate? (Ioi Lam)
* The properties queried and the policies (e.g. which process should
get preference if there are multiple?) are likely fairly specific to the
deployment too, so is there really some one-size-fits-all policy here?
(Volker Simonis)
* I assume that so far you were only talking about G1/Linux support,
and are hoping that the rest of the community will jump in...
So at first glance, I question the advantages of putting the coordinator
inside the JVM. The only one I can come up with on the spot is that you
do not have to deploy something extra.
Using some external process (however it is distributed) seems to be a
much more flexible option (not only in customizability but also in terms
of its release cycle). I would suggest at least separating this
effort from improving the JVM capabilities.
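To make the external-process option a bit more concrete, here is a minimal sketch of the querying part only. It assumes a Linux cgroup v2 container, where the directory for the container's cgroup holds `memory.max` (a byte count, or the literal `max` when unlimited) and `memory.current`. The class and method names are made up for illustration and are not part of any JDK or of the proposal.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;

// Hypothetical helper for an external coordinator process; not part of any JDK.
final class ContainerHeadroom {
    // Parses a cgroup v2 memory limit: a byte count, or "max" meaning "no limit".
    static OptionalLong parseLimit(String raw) {
        String s = raw.trim();
        return s.equals("max") ? OptionalLong.empty() : OptionalLong.of(Long.parseLong(s));
    }

    // Bytes still available below the limit (never negative); empty if unlimited.
    static OptionalLong headroom(OptionalLong limit, long usageBytes) {
        return limit.isPresent()
                ? OptionalLong.of(Math.max(0L, limit.getAsLong() - usageBytes))
                : OptionalLong.empty();
    }

    // Reads memory.max and memory.current from the given cgroup v2 directory
    // (assumed layout; the directory is typically /sys/fs/cgroup in a container).
    static OptionalLong readHeadroom(Path cgroupDir) throws IOException {
        OptionalLong limit = parseLimit(Files.readString(cgroupDir.resolve("memory.max")));
        long usage = Long.parseLong(Files.readString(cgroupDir.resolve("memory.current")).trim());
        return headroom(limit, usage);
    }
}
```

Note that `memory.current` accounts for every process in the cgroup, so the number alone does not answer the questions above about non-Java processes or multiple JVMs; that policy would still have to live in the coordinator.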
> * This thread then uses this information to calculate new values for
>   the two new JVM flags, and continually updates them at runtime.
Since these flags were planned as manageable afair, any process could
already change them as needed.
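As an illustration of what "manageable" buys you, the sketch below changes a flag at runtime through the `HotSpotDiagnosticMXBean`, which is the same mechanism a JMX client in another process would use. The helper class is hypothetical. The flag name is a parameter because the two flags from the proposal do not exist in mainline.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

// Hypothetical helper; only the MXBean calls are real JDK API.
final class ManageableFlags {
    // Sets a HotSpot flag at runtime if it exists and is writeable (manageable).
    // Returns false if the bean is unavailable, the flag is unknown or not
    // writeable, or the JVM rejects the value.
    static boolean setIfManageable(String name, String value) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return false; // not a HotSpot-based JVM
        }
        try {
            VMOption option = bean.getVMOption(name);
            if (!option.isWriteable()) {
                return false;
            }
            bean.setVMOption(name, value);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // unknown flag or invalid value
        }
    }
}
```

For example, on a JDK that has the manageable `SoftMaxHeapSize` flag, `ManageableFlags.setIfManageable("SoftMaxHeapSize", "2147483648")` would request a 2 GiB soft limit; from outside the process, `jinfo -flag` offers a similar route.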
>
> * The `Current maximum heap expansion size` informs the JVM what is
>   the maximum amount we can expand the heap by, while staying within
>   container limits. This is a hard limit, and trying to expand more
>   than this amount results in behavior equivalent to hitting the Xmx
>   limit.
This sounds like [2].
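Read literally, the quoted description amounts to a second, dynamic upper bound next to Xmx. A minimal sketch of that arithmetic, with hypothetical names and no claim that either your patch or [2] implements it this way:

```java
// Hypothetical illustration of a dynamic hard limit on heap expansion.
final class ExpansionLimit {
    // The largest committed heap size currently allowed: the static Xmx, or
    // the current committed size plus the allowed expansion, whichever is lower.
    static long effectiveMaxBytes(long committedBytes, long maxExpansionBytes, long xmxBytes) {
        if (committedBytes < 0 || maxExpansionBytes < 0 || xmxBytes < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        // Guard against overflow when the expansion limit is effectively "unlimited".
        long dynamicLimit = maxExpansionBytes > Long.MAX_VALUE - committedBytes
                ? Long.MAX_VALUE
                : committedBytes + maxExpansionBytes;
        return Math.min(xmxBytes, dynamicLimit);
    }

    // How much of a requested expansion can be granted under that bound.
    static long grantedExpansionBytes(long committedBytes, long requestedBytes,
                                      long maxExpansionBytes, long xmxBytes) {
        long room = effectiveMaxBytes(committedBytes, maxExpansionBytes, xmxBytes) - committedBytes;
        return Math.max(0L, Math.min(requestedBytes, room));
    }
}
```

A request that is granted less than it asked for is then in the same position as one that ran into Xmx: the collector has to make do with the space it has.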
>
> * The `Current target heap size` is a soft target value, which is used
>   to resize the heap (when possible) so as to bring GC CPU overhead
>   toward its target value.
>
See [1]. (As Stefan Karlsson mentioned, this functionality is already
available in ZGC).
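For readers who want the feedback loop spelled out, here is a rough sketch. It approximates GC overhead from the accumulated collection time reported by the `GarbageCollectorMXBean`s, which is elapsed time and not CPU time, so it is only a stand-in for whatever the real measurement would be. The fixed 10% step policy and all names are invented for illustration; they are not G1's or ZGC's heuristics.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical illustration of steering a soft heap target by GC overhead.
final class SoftTargetSketch {
    // Fraction of JVM uptime covered by reported collection time since startup.
    static double gcTimeFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis > 0 ? (double) gcMillis / uptimeMillis : 0.0;
    }

    // One adjustment step: grow the target by 10% when overhead is above the
    // goal (a larger heap means fewer collections), shrink it by 10% when
    // below, and keep the result within [minBytes, maxBytes].
    static long nextTargetBytes(long currentBytes, double overhead, double goal,
                                long minBytes, long maxBytes) {
        long step = currentBytes / 10;
        long next = currentBytes;
        if (overhead > goal) {
            next = currentBytes + step;
        } else if (overhead < goal) {
            next = currentBytes - step;
        }
        return Math.max(minBytes, Math.min(maxBytes, next));
    }
}
```

An external coordinator would call `nextTargetBytes` periodically and push the result into the JVM through a manageable flag; how often to do so and how large the steps should be are exactly the kind of tuning questions raised under the caveats.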
> Caveats:
>
> * Enabling this feature might require tuning of the newly
> introduced default GC CPU overhead target to avoid regressions.
>
> * Time spent doing GC for an application may increase significantly
> (though generally we've seen in practice that even if this is the
> case, end-to-end latency does not increase a noticeable amount)
>
Of the items from the discussion in [4], JDK-8244603 has actually already
been integrated. I had some time to re-baseline the impact of [2] a few
weeks ago, and I was planning to pick up some of that old work in the
near future. However, the impact ranges from nothing to very significant
(-15% throughput for "simple" applications), and applications that
exhibit very "phased" behavior in particular show very bad results. I am
seeing something like -50% in critical-jOPS for SPECjbb2015 due to G1
being more aggressive about giving back memory to other users....
I am aware that nobody is running SPECjbb all the time; I only mention it
to show that this work is not to be underestimated.
Maybe your implementation of similar functionality fares much better in
that area though. I think everyone is now already really curious to see
your changes :)
> * Enabling AHS results in frequent heap resizings, but we have not
> seen evidence of any negative effects as a result of these more
> frequent heap resizings.
See above.
>
> * AHS is not necessarily a replacement for proper JVM tuning, but
> should generally work better than an untuned or improperly tuned
> configuration.
Thanks,
Thomas
[1] https://bugs.openjdk.org/browse/JDK-8236073
[2] https://bugs.openjdk.org/browse/JDK-8204088
[3] https://bugs.openjdk.org/browse/JDK-8238687
[4] https://mail.openjdk.org/pipermail/hotspot-gc-dev/2021-May/035092.html