Elastic JVM improvements [Was: Re: OpenJDK G1 Patch]

Ruslan Synytsky synytskyy at jelastic.com
Fri May 25 15:56:08 UTC 2018


Dear Thomas, thank you for supporting this initiative and for your efforts and
time. Please see my comments inline.

On 24 May 2018 at 12:03, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:

> Hi Rodrigo, Ruslan,
>
>   first, sorry for the late reply. I have been travelling, so a bit
> short on time on thinking about and looking through this.
>
> Thanks for your contribution. I think these ideas are very
> interesting and generally useful additions to the collector and/or
> community.

To make a long story short :). The first thoughts about the elasticity of the
JVM came to our team in January 2011. I reached out to Aleksey Shipilëv to
discuss the idea; I was looking for advice on how to make it happen. We found
no "out-of-the-box" solution at that time. Unfortunately, a Full GC is still
the only way to tell the JVM to give back unused but committed heap. Parallel
GC, which was the default back then, is not friendly to RAM consumption at
all, and G1 was not production ready. So we had a challenging time. A little
bit later we figured out that it is possible to achieve the required behavior
with a javaagent that monitors RAM usage and forces a GC when the application
is not busy. We ran an experiment over several years, analyzing how customers
react to vertical scaling. The result was impressive, better than we expected.
We never had complaints; on the contrary, many customers gave positive
feedback because they simply saved money. Some customers did not even believe
that resizing was possible! Many people still have that perception of
"greedy" Java that never gives RAM back to the OS.

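For illustration only, a stripped-down sketch of that kind of javaagent (not
the actual Jelastic agent; the class name and thresholds below are made up for
the example, and it assumes explicit GC is not disabled via
-XX:+DisableExplicitGC):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

public final class IdleGcAgent {
    // Illustrative thresholds, not the values we use in production.
    private static final double MAX_IDLE_LOAD        = 0.3;          // load average treated as "idle"
    private static final long   MIN_UNUSED_COMMITTED = 256L << 20;   // only act if >256 MB is committed but unused
    private static final long   CHECK_INTERVAL_MS    = 5L * 60 * 1000;

    public static void premain(String agentArgs) {
        Thread t = new Thread(IdleGcAgent::loop, "idle-gc-agent");
        t.setDaemon(true);
        t.start();
    }

    private static void loop() {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        while (true) {
            try {
                Thread.sleep(CHECK_INTERVAL_MS);
            } catch (InterruptedException e) {
                return;
            }
            MemoryUsage heap = mem.getHeapMemoryUsage();
            long unusedCommitted = heap.getCommitted() - heap.getUsed();
            double load = os.getSystemLoadAverage(); // -1 on platforms without a load average
            // When the machine looks idle and a lot of committed heap is unused,
            // ask for a full GC so the collector can shrink the heap and return memory to the OS.
            if (load >= 0 && load < MAX_IDLE_LOAD && unusedCommitted > MIN_UNUSED_COMMITTED) {
                System.gc();
            }
        }
    }
}

Packaged as a normal agent jar (Premain-Class in the manifest) and attached
with -javaagent:idle-gc-agent.jar.
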
Nowadays this topic is even more relevant, as Java and containers are a
perfect couple (cgroups issues aside). More use cases and implementations are
coming, including new garbage collectors. Also, OpenJ9 already provides
-XX:+IdleTuningCompactOnIdle
<https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.vm.80.doc/docs/xxidletuningcompactonidle.html>
and -Xsoftmx
<https://www.ibm.com/support/knowledgecenter/en/SSYKE2_8.0.0/com.ibm.java.vm.80.doc/docs/xsoftmx.html>.
We have not tested them yet, but the idea is clear and, from the description,
it looks similar.


> While they may not be perfect for all use cases, they imho improve the
> collector sufficiently enough. Also, during reviews we may come up with
> smaller improvements that improve its value and catch more use cases.
>
I'm pretty sure that by collaborating we will come up with a better
solution.


>
> I will help you get through the rest of the process.
>
> So the process to get this contribution into mainline would be roughly:
>
> - get OCAs signed. As soon as you show up in the signatories list we
> can actually start accepting patches, i.e. review them and discuss them
> in more detail.
>
Jelastic is listed as a contributor on the OCA page
<http://www.oracle.com/technetwork/community/oca-486395.html>, so we should be
fine now.


> Depending on the patches' size it's probably best if you give me a
> webrev when your names show up there and I can make them publicly
> available.
>
My colleague Rodrigo Bruno will take care of it.


>
> - since these two changes seem to be very interesting for a wider
> public it seems that it would be useful to do JEPs for them. That might
> also improve the understanding and their limitations by pointing them
> out there, and facilitate the discussion.
> This is basically describing the functionality a little more formally
> using the template [0].
>
> I can guide you through this, but in the beginning it might be useful
> to just fill out the description in form of email.
>
OK


>
> - since we will add some command line options, we will later need to go
> through the CSR for each of them. This is basically just letting
> everyone know about them and defining them [1].
>
OK


> Again, I will help you with most of the "paper"work.
>
Thanks


>
> Following are some initial questions and thoughts to the proposals.
> They may be a bit confusing or somewhat unrelated though, please bear
> with me :)


> On Sat, 2018-05-19 at 19:01 +0100, Rodrigo Bruno wrote:
> > Dear OpenJDK community,
> >
> > Jelastic and INESC-ID have developed a patch for OpenJDK that
> > improves the elasticity of the JVM under variable loads. The detailed
> > description of the patch can be found below. We would like to share this
> > patch with the community and push it to the mainline. We believe
> > this work will help the Java community make the JVM even better and
> > improve memory resource usage (save money) in modern cloud
> > environments. A more complete patch description can be found in
> > the paper that will be presented at ISMM 2018.
>
>
> > Elastic JVM Patch Description
> >
> > Elasticity is a key feature of cloud computing. It makes it possible to
> > scale resources according to application workloads in a timely manner.
> > Now we live in the container era. Containers can be scaled vertically on
> > the fly without downtime. This provides much better elasticity and
> > density compared to VMs. However, JVM-based applications are not
> > fully container-ready. The first issue is that the HotSpot JVM doesn’t
> > release unused committed heap memory automatically, and therefore the
> > JVM can’t scale down without an explicit full GC.
> > Secondly, it is not possible to increase the maximum size of the JVM
> > heap at runtime. If your production application has an unpredictable
> > traffic spike, the only way to increase the heap size is to restart the
> > JVM with a new Xmx parameter.
> >
> > To solve these 2 major issues and make JVM more container friendly,
> > we have implemented the following improvements: i) timely reduce the
> > amount of unused committed memory; and ii) dynamically limit how
> > large the used and committed memory can grow. The patch is
> > implemented for the Garbage First collector.
> >
> >
> > Timely Reducing Unused Committed Memory
> >
> > To accomplish this goal, the HotSpot JVM was modified to periodically
> > trigger a full collection. Two full collections should not be
> > separated by more than GCFrequency seconds, a dynamically user-
> > defined variable. The GCFrequency value is ignored, i.e., no full
> > collection is triggered, if:
> >
> > GCFrequency is zero or below;
>
> A time span seems to be different from a "frequency"; this seems to be
> more of an interval (like CMSTriggerInterval). Also I do not completely
> follow that this interval is the minimum time between two *full*
> collections. I would expect that any collection (or gc related pause)
> would reset that time.
>
Can we end up in a situation where small collections happen more often
than MinTimeBetweenGCs? Like in the attached chart, for example (image
attachment: small collections shown in blue, committed RAM in orange).

Then the memory shrinking will not be triggered, as I understand it. Because
the small collections in the blue area do not help, we need a way to reduce
the orange committed RAM. So is a Full GC the only option?


> The paper also calls this "MinTimeBetweenGCs" if I read it correctly,
> which is a somewhat better name.
>
Ok, I added a note to the original patch description:
https://docs.google.com/document/d/1wLH6MyLyOZcrMweDUbvK91SZ3lHOBdZ7drNeHznlwlk/edit
It's open for everyone to comment, just in case.


> > the average load on the host system is above MaxLoadGC. The MaxLoadGC
> > is a dynamically user-defined variable. This check is ignored if
> > MaxLoadGC is zero or below;


> What is the scale for the "load", e.g. ranging from 0.0 to 1.0, and 1.0
> is "full load"? Depending on that this condition makes sense.
>
The logic uses *os::loadavg* and can be found at
https://github.com/jelastic/openjdk/blob/jdk9/jdk9/hotspot/src/share/vm/runtime/vmThread.cpp#L391
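
As far as I understand, os::loadavg reports the same scale as /proc/loadavg
(roughly one unit per runnable task, so about 1.0 per fully busy core), not a
0.0-1.0 fraction. To make the whole trigger condition easier to discuss, here
is my reading of it written as a small Java sketch; the real check is C++ in
vmThread.cpp and the default values below are only illustrative:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class PeriodicGcPolicy {
    long   gcFrequencySecs  = 300;        // GCFrequency (a.k.a. MinTimeBetweenGCs), seconds
    double maxLoadGC        = 0.5;        // MaxLoadGC, same scale as /proc/loadavg
    long   minCommitted     = 512L << 20; // MinCommitted, bytes
    long   maxOverCommitted = 256L << 20; // MaxOverCommitted, bytes
    long   lastFullGcMillis;

    boolean shouldTriggerFullGc(long nowMillis) {
        if (gcFrequencySecs <= 0) return false;                                   // feature disabled
        if (nowMillis - lastFullGcMillis < gcFrequencySecs * 1000L) return false; // interval not elapsed yet
        double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        if (maxLoadGC > 0 && load > maxLoadGC) return false;                      // host is busy, back off
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        if (minCommitted > 0 && heap.getCommitted() <= minCommitted) return false;          // heap already small
        if (maxOverCommitted > 0
                && heap.getCommitted() - heap.getUsed() <= maxOverCommitted) return false;  // too little slack
        return true; // idle enough, and enough unused committed heap to be worth a full GC
    }
}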


>
> The paper does not mention this.
>
> > the committed memory is below MinCommitted bytes. MinCommitted is a
> > dynamically user-defined variable. This check is ignored if
> > MinCommitted is zero or below;
>
> While this is a different concern, have you ever considered using
> MinHeapSize or InitialHeapSize here?
>
Good point. I think we can replace MinCommitted with Xms. We added it just in
case Xms is set to a low number (for example 32m), memory usage grows
significantly over time, and you do not want to bring it back down as low as
Xms but rather keep it at a specific higher level (for example 1g). I believe
this case is very rare, so we can ignore it.


>
> > the difference between the current heap capacity and the current heap
> > usage is below MaxOverCommitted bytes. The MaxOverCommitted is a
> > dynamically user-defined variable. This check is ignored if
> > MaxOverCommitted is zero or below;
> >
> >
> > The previously mentioned concepts are illustrated in the figure
> > below:
> >
> > [...]
> >
> > The figure above depicts an application execution example where all
> > the aforementioned variables come into play. The default value for
> > all introduced variables (GCFrequency, MaxLoadGC, MaxOverCommitted,
> > and, MinCommitted) is zero. In other words, by default, there are no
> > periodic GCs.
> >
> >
> > With these modifications, it is possible to periodically
> > eliminate unused committed memory in HotSpot. This is very important
> > for applications that do not trigger collections very frequently and
> > that might hold high amounts of unused committed memory. One example
> > is web servers, whose caches can time out after some minutes and
> > whose memory might be underutilized (after the caches time out) at
> > night when the number of requests is very low.
>
> If I understood this paragraph correctly, the intent is to uncommit if
> the system is idle (has low load for a certain amount of time).
>
> Also, while it will become obvious with the patch, it will be
> interesting to see how that load is defined. One reason is basically
> that we support more systems than linux (the paper only mentions linux)
> and it may be useful to support more than that platform.
>
The vast majority of use cases are on Linux, I believe. Load is defined via
*os::loadavg*, as mentioned above. Feedback from an expert would be helpful
here.


> I have one other question here, similar functionality could have been
> achieved by some external entity periodically polling the vm for heap
> size (a new jcmd or e.g. some MBean, or improving jmap or some other
> tool) and then forcing a system.gc from outside. Did you ever consider
> this?
>
Doing it via tools like jcmd or an MBean is possible, but it has a hidden
issue. Most likely it would be a cron-based task. If you have hundreds or
thousands of containers on a node, it's better to distribute the check time,
because if you launch the check in all containers at the same time you will
get a huge spike in CPU. It's a much more complex exercise.
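
Just to make the comparison concrete, the external variant would look roughly
like the sketch below, run by cron or a similar scheduler against each
container (the host:port argument is hypothetical, and every target JVM would
have to expose remote JMX - one more thing to configure and one more process
to launch per check):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public final class ExternalGcTrigger {
    public static void main(String[] args) throws Exception {
        // args[0] is a hypothetical "host:port" of a JVM started with remote JMX enabled.
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            MemoryMXBean mem = ManagementFactory.newPlatformMXBeanProxy(
                    conn, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            long committed = mem.getHeapMemoryUsage().getCommitted();
            long used      = mem.getHeapMemoryUsage().getUsed();
            if (committed - used > (256L << 20)) { // illustrative threshold
                mem.gc(); // same effect as System.gc() inside the target JVM
            }
        } finally {
            connector.close();
        }
    }
}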

An alternative option is a javaagent which is automatically attached to each
Java process. Then the check time is distributed naturally, as containers
start at different times, and it's less expensive on CPU because you do not
launch any new process.


>
> The reason is that this idea uses some mechanisms (detecting load of
> system, lots of options) that may be better served and be more flexible
> if mostly implemented outside the VM.
>
At Jelastic we do it in a similar way, as I described above. But it's too
complex for wider adoption, because it's not something easy to use or easy
to enable outside of Jelastic. So many Java users would not be able to
enjoy it.


> Having read Kirk P.'s concern about the mechanism to actually uncommit
> memory being too simplistic, I kind of agree. The alternative, to
> trigger a concurrent cycle plus multiple mixed collections (plus
> uncommit heap at the end of that mixed phase) is a bit harder to
> implement. I would certainly help you with that. :)
>
I do believe the code / implementation can be improved. No religion about
Full GC :). I would prefer to avoid it too if we can come up with another
solution at a reasonable effort.


>
> Also assuming that at that point the VM is idle, doing a full gc would
> not hurt the application.
>
> Also there is Michal's use case of periodically doing global reference
> processing to clean out weak references regularly. This seems to be a
> different use case, but would seem easy to do given that this change
> probably implements something like CMSTriggerInterval for G1.


> Maybe there is some way to marry these two issues somehow.
>
Does CMSTriggerInterval influence the committed memory shrinking?


>
> > -Xmx Dynamic Limit Update
> >
> > To dynamically limit how large the committed memory (i.e. the heap
> > size) can grow, a new dynamically user-defined variable was
> > introduced: CurrentMaxHeapSize. This variable (defined in bytes)
> > limits how large the heap can be expanded. It can be set at launch
> > time and changed at runtime. Regardless of when it is defined, it
> > must always have a value equal to or below MaxHeapSize (Xmx - the
> > launch time option that limits how large the heap can grow). Unlike
> > MaxHeapSize, CurrentMaxHeapSize can be dynamically changed at
> > runtime.
> >
> > For example, to dynamically set 1GB as the new Xmx limit:
> >
> > jinfo -flag CurrentMaxHeapSize=1g <java_pid>
> >
> > Setting CurrentMaxHeapSize at runtime will trigger a full collection
> > if the desired value is below the current heap size. After finishing
> > the full collection, a second test is done to verify whether the desired
> > value is now above the heap size (note that a full collection will
> > try to shrink the heap as much as possible). If the value is still
> > below the current heap size, then an error is reported to the user.
> > Otherwise, the operation is successful.
>
> One alternative here could be to use a marking cycle + mixed gcs to
> reach that new CurrentMaxHeapSize again, which again is a bit more
> complicated to achieve. I can help you implement that if interested.
>
> In some cases you might even get away with just uncommitting empty
> regions and doing nothing else in response to this command.
>
> As Kirk mentioned, as another optimization, triggering a young gc could
> free enough regions too.
>
Ok, I pass this question to Rodrigo Bruno, as he has the required technical
knowledge of the implementation.
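
One side note while we are here: besides jinfo, the limit could presumably
also be flipped from inside the process, assuming the patch registers
CurrentMaxHeapSize as a writeable (manageable) flag, which is what the jinfo
example above already relies on. A sketch:

import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public final class ShrinkHeapLimit {
    public static void main(String[] args) {
        HotSpotDiagnosticMXBean diag =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // setVMOption throws IllegalArgumentException if the flag is unknown
        // or not writeable at runtime, so this only works with the patch applied.
        diag.setVMOption("CurrentMaxHeapSize", String.valueOf(1L << 30)); // 1 GB, as in the jinfo example
        System.out.println(diag.getVMOption("CurrentMaxHeapSize"));
    }
}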


>
> > The limit imposed by the CurrentMaxHeapSize can be disabled if the
> > variable is unset at launch time or if it is set to zero or below at
> > runtime.
> >
> > This feature is important to cope with changes in workload demands
> > and to avoid having to restart JVMs to cope with workload changes.
>
> I have only one question about this here at this time: is this
> CurrentMaxHeapSize a new "hard" heap size (causing OOME in the worst
> case), or could this be temporarily exceeded and any excess memory
> given back asap?
>
For now it's the new hard limit.


> Would it be useful to have G1 more slowly adapt to that new goal
> size?
>
We may have a problem here, as CurrentMaxHeapSize will usually be bound to
container/VM limits. If you resize the container/VM you need to get back a
clear answer - yes or no (not possible to decrease) - so that you can continue
or cancel the resizing action. Adding async behavior can complicate the
logic and code. I would prefer to keep it simple in the first
implementation. We can adjust it later based on feedback from more use
cases.


>
> As you can see I am pretty interested in the changes... :)
>
Good! Just to clarify one more time: at Jelastic we are fine already. The
ultimate goal is to evangelize the elasticity of the JVM. The cloud world is
more dynamic now than it was in the past.


> So overall, if you agree, I will open two JEPs in our bug tracker and
> we can start discussing and filling out the details.
>
Yep, let's move on! Thank you!


>
> Thanks,
>   Thomas
>
> [0] http://openjdk.java.net/jeps/2
> [1] https://wiki.openjdk.java.net/display/csr/Main
>
>


-- 
Ruslan
CEO @ Jelastic <https://jelastic.com/>