<div dir="ltr">Hi all,<div><br></div><div>Apologies for the late response - I had missed the replies due to the way I had set up my inbox filtering 😓. I may have also missed entire emails due to this, so please feel free to re-reply to this email if I have not addressed your questions.</div><div><br></div><div>I'll try to address all the email comments in one go, so please bear with a long email ahead!</div><div><br></div><div>-----</div><div>@ Thomas Schatzl<br></div><div><br></div><div>Long time no talk :) Thanks for your detailed response. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><pre style="white-space:pre-wrap;color:rgb(0,0,0)">Most of these suggestions seem to be fairly consistent with existing
RFEs (e.g. [1], [2], [3], ...) that have been discussed before with you
(e.g. in [4]) and been considered really nice to have iirc.</pre></blockquote><div>Agreed that there is a lot of overlap between AHS and the currently open RFEs. Happy to converge on these now that I have a better understanding of this area from working on it! Notably, [1] and [2] are very similar to the two manageable flags that are part of AHS, so I agree that there is room for collaboration here. For [3], I think that is something that is somewhat orthogonal (in terms of implementation) to AHS, but would be very helpful for AHS, since the more frequently we can uncommit memory, the closer we can get the heap to our target heap size.<br></div><div><br></div><div><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><font face="monospace">I am not convinced that having a thread inside the JVM is really the <br>best solution. Constantly querying the _environment_ for changes seems <br>to be traditionally outside of the scope of the JVM.</font> </blockquote><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><font face="monospace">[...]</font></blockquote><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><span style="font-family:monospace">Using some external process (however it is distributed) seems to be a</span></blockquote><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><font face="monospace">much more flexible option (not only in customizability but also in terms <br>of the release cycle for it). I would suggest to at least separate this <br>effort from improving the JVM capabilities.</font></blockquote><div> </div></div><div>Just to clarify - the AHS thread is not part of the JVM itself. 
It is a separate thread that is kicked off in our Java launcher at the same time as the JVM, but is a completely separate process. Thus the functionality that pulls from the environment and sets the manageable flags is not part of HotSpot/the JVM. The changes to the JVM are not that intrusive and follow somewhat similar logic to <a href="https://github.com/tschatzl/jdk/tree/8238687-investigate-memory-uncommit-during-young-gc2" style="white-space:pre-wrap">https://github.com/tschatzl/jdk/tree/8238687-investigate-memory-uncommit-during-young-gc2</a><font color="#000000"><span style="white-space:pre-wrap">.</span></font></div><div><div><br></div><blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex" class="gmail_quote"><font face="monospace">some applications that exhibit a very "phased" behavior <br>results show very bad behavior.</font></blockquote><div><br></div><div>This is helpful to know - as someone who has never heard of <span style="color:rgb(0,0,0);white-space:pre-wrap">SPECjbb2015, would it be easy for me to try running it with my prototype?</span></div></div><div><br></div><div>-----</div><div>@ Severin:<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">  1. How is AHS enabled? Is it on by default or is it opt-in?<br></blockquote><div><br></div><div>AHS is controlled by a flag in Google's version of the JDK launcher. Right now at Google it is opt-in, but we plan to enable it by default for certain subsets of jobs (namely those already enrolled in a service meant to make tuning more hands-off). If that rollout goes smoothly, we will broaden adoption (but probably still leave it opt-in).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">  2. 
Is the prototype working for all GCs available in OpenJDK or<br>    specific to G1?<br></blockquote><div><br></div><div>This prototype currently only works for G1 GC, and we don't currently have the bandwidth to extend the prototype to other GCs :(</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">  3. Would this be a Linux only feature?<br></blockquote><div><br></div><div>Currently yes - it is dependent on things like cgroups, which I'm not sure are available on other platforms. That being said, if an equivalent feature exists on other platforms, I don't see why it wouldn't work!</div><div><br></div><div>-----</div><div><font face="arial, sans-serif">@ Thomas <span style="color:rgb(31,31,31);white-space:nowrap">Stüfe</span></font></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>Can you describe the adjustment logic in more detail or is there a public prototype?</div></blockquote><div><br></div><div>There isn't currently a public prototype - I'll have to double-check with legal about the level of detail I can include in a public forum and get back to you. (I imagine it should be fine, but a public-facing doc hasn't been written yet :P)</div><div><br></div><div>-----</div><div>@ Fazil<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Any chance to be considered as a JEP in OpenJDK?<br></blockquote><div><br></div><div> I'm not familiar with the process of turning an idea/prototype into a JEP - do you have any suggestions on how to go about doing this? 
Is there some approval process for creating a JEP, or is anyone free to create one?</div><div><br></div><div>-----</div><div>@ Volker<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">1. Is there a public prototype available?<br></blockquote><div><br></div><div>Not yet, unfortunately! </div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">2. For which GCs have you implemented it?<br></blockquote><div><br></div><div>Just G1 GC - I would say the implementation is fairly GC-dependent, so it may be a bit of work to get it to work on other GCs. (That being said, I'm not familiar with the other GCs, so maybe it won't be as bad as I think?)</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">3. On which platforms does it currently work?<br></blockquote><div><br></div><div>Just Linux at the moment.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">4. Do you use information from procfs on Linux (or what else) to get<br>the memory state of the system?<br></blockquote><div><br></div><div>I can provide more details once I get approval from legal to share more specifics publicly! </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">5. How often do you query the system and JVM state and update the<br>settings? Is this (or can it be) correlated to the JVMs allocation<br>rate?<br></blockquote><div><br></div><div>Right now it defaults to once every 5 seconds, but is configurable to run at whatever frequency is appropriate for the server. Correlating it with the JVM's allocation rate is a good idea; that should definitely be doable. 
</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">6. Can you please explain your approach in some more detail (e.g. how<br>does heap shrinking work)? I suppose you have "Current target heap<br>size" <= "Current maximum heap expansion size" <= Xmx. I can<br>understand how your monitoring thread updates "Current maximum heap<br>expansion size" based on the systems memory usage. But it's harder to<br>understand how you set "Current target heap size" based GC CPU<br>overhead and when and how do you trigger heap resizing (both<br>increasing and shrinking) based on these values? And why do you need<br>two new variables? Wouldn't "Current target heap size" be enough"<br>(also see next question for more context)?<br></blockquote><div><br></div><div>Will provide more details regarding these questions when legal approves! But at a high level, the target heap size is a soft target that we try to get the heap to. If, say, we have a large amount of legitimate heap usage that cannot be cleaned up and that exceeds the target heap size, we will allow the heap size to stay above the target indefinitely. </div><div><br></div><div>This heap size target is driven by a GC CPU overhead target: if we are spending more CPU time on GC than that target, we increase the heap size target, and vice versa. </div><div><br></div><div>The second flag (Current maximum heap expansion size) is a hard limit that, no matter what, prevents any allocation that would cause us to hit container OOM. </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">7. What you propose is very similar to the CurrentMaxHeapSize proposed<br>by "JEP draft: Dynamic Max Memory Limit" [1] except that the latter<br>proposal misses the part about how to automatically set and update<br>this value. 
</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">8. We already have "JEP 346: Promptly Return Unused Committed Memory<br>from G1" [2] since JDK 12. Can you explain why this isn't enough? I.e.<br>if the GC already returns as much heap memory as possible back to the<br>system, what else can you do to further improve the situation?</blockquote><div> </div><div>Agreed! When initially sketching out the proposal, I had looked into both of the JEPs you mentioned in 7 and 8. JEP 346 is not sufficient for our use case since it is not available for JDK 11, and IIRC there was trouble backporting it to JDK 11, hence it was not usable from our end, at least for a while.<br></div><div><br></div><div>Furthermore, while periodic GC would help to some extent, it leaves us trying to determine the optimal periodicity of GC to achieve what we want per server. This seems to be a less informed decision than specifically forcing more GC at times of low container free space. We actually have experimented with similar features within Google and saw that they were not sufficient for preventing container OOMs.</div><div><br></div><div><br></div><div>-----</div><div>@ Stefan <br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span style="font-family:monospace">It would probably be good to use the same name for the other GCs.</span><br></blockquote><div><br></div><div>Acknowledged - will change before upstreaming.</div><div><br></div><div>-----</div><div>@ Kirk<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">From your description here, you're using CPU (GC overheads) to help you resize. 
Do you mind elaborating on how this works?<br></blockquote><div><br></div><div>We haven't spent much time focusing on the distribution of heap size between young and old generations, but I agree that this is an area that needs more active investigation for our prototype. Currently our model just reduces and expands the heap in the simplest way when necessary, without modifying the ratio between young and old gen. But we plan to look more into this as we encounter more types of workloads. </div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Might I suggest that a quicker way is to start large and then resize to smaller. The reason for doing this is because small clips the signals you need to look at to know how big things need to be. Starting big should give you a cleaner, unclipped signal to work with.<br></blockquote><div><br></div><div>Thank you for the suggestion -- noted! One issue we've run into is that G1 tends to use as much heap as it is given, so oftentimes, starting large does not give us the right signals as to why we should reduce the container. But I think with AHS, this becomes a viable approach. </div><div><br></div><div>-----</div><div>@ Ioi<br></div><div><pre style="white-space:pre-wrap;color:rgb(0,0,0)"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">- In the simplest case, you have a single JVM process running inside the <br>container. How do you balance its Java heap vs non-Java heap usage?</blockquote><div><br></div><div><font face="arial, sans-serif">We don't bound non-Java heap usage -- we assume that there are no memory leaks and that non-Java heap usage is valid usage. Thus, with AHS enabled, the onus is on the JVM heap to adapt to increases in non-Java heap usage. 
</font></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">- If you have two JVM processes running inside the container, how do <br>they coordinate?</blockquote><div> </div><div><span style="font-family:arial,sans-serif">We haven't yet really tried AHS on multiple JVM processes running inside the same container, but I imagine it should work mostly the same. Assuming that there is indeed enough space in the container for both processes to work with the target GC CPU overhead, then heap usage for both should stay fairly constant, and both AHS threads should prevent each JVM from exceeding a heap usage that would result in container OOMs. </span><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">- If the fluctuation is caused by other processes, can the JVM react <br>quickly (run GC and free up caches) to respond to quick spikes? Do we <br>need to configure the container to allow temporarily over-budget <br>(something like "you can be 100MB over budget for less than 20ms") so <br>the JVM has time to shrink itself?</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">- Conversely, how can a spiky process request the JVM to temporarily <br>give up some memory? </blockquote><div><br></div><div><span style="font-family:arial,sans-serif">The JVM can react pretty quickly - basically, if free space decreases quickly, then the next time the JVM tries to expand the heap due to an allocation, it will fail to expand and thus run GC to free up some space. During this GC it will then do its best to shrink the heap to its target size. Rather than allowing the container to go over-budget, we keep a buffer: we don't allow an expansion if it would cause container usage to exceed 95% of the container limit. 
</span><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">It seems to me that for the more complex scenarios, it's not enough for <br>each individual JVM to make decisions on its own. We may need some sort <br>of intra-process coordination.</blockquote></pre><pre style="white-space:pre-wrap;color:rgb(0,0,0)"><span style="font-family:arial,sans-serif">Agreed that this is not a one-size-fits-all solution for all possible scenarios, especially the more complex ones. </span></pre></div><div><br></div><div> -----</div><div><br></div><div>Again, apologies for the delay in responding to the questions, but I hope I answered everything here. I will be more diligent about monitoring this discussion thread.</div><div><br></div><div>Appreciate all the thoughtful questions and discussions!</div><div><br></div><div>~ Jonathan</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 16, 2022 at 10:41 AM Kirk Pepperdine <<a href="mailto:kirk.pepperdine@gmail.com" target="_blank">kirk.pepperdine@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi Jonathan,<div><br></div><div>Very interesting experiment. This sizing issue is something that is befuddling a significant portion of those responsible for deploying containerized Java applications. Ioi nicely points out that the old goal of "play nice" when configuring memory is in conflict with the new goal of "be greedy". Thus a re-visiting of memory sizing ergonomics is something that I certainly welcome. 
The cloud providers have been interested in better (for some weak definition of better) memory resizing dynamics for quite some time, so this is also a hot-button topic.</div><div><br></div><div>I'm not sure how much I have to add over what others have commented on, but I don't believe we need inter-process communication, at least not in the first instance, nor do we need a watcher thread (again, at least not in the first instance). The one thing that I see here, if I'm reading this correctly, is that there is a focus on total heap size. For generational collectors, like G1, young and tenured play two different roles and thus require different tuning strategies. Tuning young is about controlling the promotion of transients into tenured. The two big things that drive transients into tenured are undersized survivor space and frequent collections (accelerated aging). Thus young sizing should be heavily influenced by allocation rates. This is considerably different than tenured where the driving metric is live set size (LSS). Thus tenured should be LSS + some working space. From this, it follows that max heap will be the sum of the parts. From your description here, you're using CPU (GC overheads) to help you resize. Do you mind elaborating on how this works?<br></div><div><br></div><div>Another side note is that you mention sizing is trial and error where you start small and then make bigger as needed. Might I suggest that a quicker way is to start large and then resize to smaller. The reason for doing this is because small clips the signals you need to look at to know how big things need to be. 
Starting big should give you a cleaner, unclipped signal to work with.</div><div><br></div><div>Kind regards,</div><div>Kirk</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Sep 13, 2022 at 12:17 PM Jonathan Joo <<a href="mailto:jonathanjoo@google.com" target="_blank">jonathanjoo@google.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><span id="m_1185205381368761500m_4612799699648765748m_-6311002785416629527m_868923612161290581m_8562993548920983672m_8455307370397098922m_-3409757497080515672m_3372880435617861444m_-4076470599618768914gmail-docs-internal-guid-7dcbd15e-7fff-350d-ddd7-0a12df3b0610"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Hello hotspot-dev and hotspot-gc-dev,</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">My name is Jonathan, and I'm working on the Java Platform Team at Google. Here, we are working on a project to address Java container memory issues, as we noticed that a significant number of Java servers hit container OOM issues due to people incorrectly tuning their heap size with respect to the container size. Because our containers have other RAM consumers which fluctuate over time, it is often difficult to determine a priori what is an appropriate Xmx to set for a particular server. 
</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">We set about trying to solve this by dynamically adjusting the Java heap/gc behavior based on the container usage information that we pass into the JVM. We have seen promising results so far, reducing container OOMs by a significant amount, and oftentimes also reducing average heap usage (with the tradeoff of more CPU time spent doing GC). </span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Below (under the dotted line) is a more detailed explanation of our initial approach. Does this sound like something that may be useful for the general OpenJDK community? If so, would some of you be open to further discussion? 
I would also like to better understand what container environments look like outside of Google, to see how we could modify our approach for the more general case.</span></p><br><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Thank you!</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;font-size:11pt;white-space:pre-wrap;color:rgb(0,0,0);font-family:Arial"><br></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;font-size:11pt;white-space:pre-wrap;color:rgb(0,0,0);font-family:Arial">Jonathan</span></p><h3 dir="ltr" style="line-height:1.38;margin-top:16pt;margin-bottom:4pt"><span style="font-size:14pt;font-family:Arial;color:rgb(67,67,67);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">------------------------------------------------------------------------</span></h3><h3 dir="ltr" style="line-height:1.38;margin-top:16pt;margin-bottom:4pt"><span style="font-size:14pt;font-family:Arial;color:rgb(67,67,67);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Introduction:</span></h3><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Adaptable Heap Sizing (AHS) is a project internal to Google that is meant to simplify configuration and improve the stability of applications in container environments. 
The key is that in a containerized environment, we have access to container usage and limit information. This can be used as a signal to modify Java heap behavior, helping prevent container OOMs.</span></p><h3 dir="ltr" style="line-height:1.38;margin-top:16pt;margin-bottom:4pt"><span style="font-size:14pt;font-family:Arial;color:rgb(67,67,67);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Problem:</span></h3><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Containers at Google must be properly sized to not only the JVM heap, but other memory consumers as well. These consumers include non-heap Java (e.g. native code allocations), and simultaneously running non-Java processes. 
</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Common antipattern we see here at Google: </span></p></li><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">We have an application running into container OOMs. 
</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">An engineer raises both container memory limit and Xmx by the same amount, since there appears to be insufficient memory.</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The application has reduced container OOMs, but is still prone to them, since G1 continues to use most of Xmx.</span></p></li></ul><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">This results in many jobs being configured with much more RAM than they need, but still running into container OOM issues.</span></p></li></ul><h3 dir="ltr" style="line-height:1.38;margin-top:16pt;margin-bottom:4pt"><span 
style="font-size:14pt;font-family:Arial;color:rgb(67,67,67);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Hypothesis:</span></h3><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">For preventing container OOM: Why can't heap expansions be bounded by the remaining free space in the container?</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">For preventing the `unnecessarily high Xmx` antipattern: Why can't target heap size be set based on GC CPU overhead?</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">From our work on 
Adaptable Heap Sizing, it appears they can!</span></p></li></ul><h3 dir="ltr" style="line-height:1.38;margin-top:16pt;margin-bottom:4pt"><span style="font-size:14pt;font-family:Arial;color:rgb(67,67,67);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Design:</span></h3><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">We add two manageable flags in the JVM</span></p></li><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Current maximum heap expansion size</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span 
style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Current target heap size</span></p></li></ul><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">A separate thread runs alongside the JVM, querying:</span></p></li><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Container memory usage/limits</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">GC CPU overhead metrics from the JVM.</span></p></li></ul><li dir="ltr" 
style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">This thread then uses this information to calculate new values for the two new JVM flags, and continually updates them at runtime.</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The `Current maximum heap expansion size` tells the JVM the maximum amount by which the heap can expand while staying within container limits. 
This is a hard limit, and trying to expand more than this amount results in behavior equivalent to hitting the Xmx limit.</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">The `Current target heap size` is a soft target value, which is used </span><span style="font-size:10.5pt;font-family:Roboto,sans-serif;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">to resize the heap (when possible) so as to bring GC CPU overhead toward its target value.</span><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> </span></p></li></ul><br><h3 dir="ltr" style="line-height:1.38;margin-top:16pt;margin-bottom:4pt"><span style="font-size:14pt;font-family:Arial;color:rgb(67,67,67);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Results:</span></h3><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">At Google, we have found 
that this design works incredibly well in our initial rollout, even for large and complex workloads.</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">After deploying this to dozens of applications:</span></p></li><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Significant memory savings for previously misconfigured jobs (many of which reduced their heap usage by 50% or more)</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Significantly reduced occurrences of container OOM (100% reduction in the vast majority of cases)</span></p></li><li dir="ltr" 
style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">No correctness issues</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">No latency regressions*</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">We plan to deploy AHS across a much wider subset of applications by EOY '22.</span></p></li></ul></ul><br><h3 dir="ltr" style="line-height:1.38;margin-top:16pt;margin-bottom:4pt"><span style="font-size:14pt;font-family:Arial;color:rgb(67,67,67);background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">*Caveats: </span></h3><ul style="margin-top:0px;margin-bottom:0px"><li dir="ltr" 
style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><h3 dir="ltr" style="line-height:1.38;margin-top:16pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-weight:400;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Enabling this feature might require tuning of the newly introduced default GC CPU overhead target to avoid regressions.</span></h3></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Time spent doing GC for an application may increase significantly (though in practice we've generally seen that, even when this is the case, end-to-end latency does not increase by a noticeable amount)</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:10.5pt;font-family:Roboto,sans-serif;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Enabling AHS results in more frequent heap resizings, but we have not seen evidence of any negative effects from them.</span></p></li><li dir="ltr" 
style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">AHS is not necessarily a replacement for proper JVM tuning, but should generally work better than an untuned or improperly tuned configuration.</span></p></li><li dir="ltr" style="list-style-type:disc;font-size:11pt;font-family:Arial;color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt" role="presentation"><span style="font-size:11pt;background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">AHS is not intended for every possible workload, and there could be pathological cases where AHS results in worse behavior.</span></p></li></ul></span></div>
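<div><br></div><div>For illustration only, here is a minimal Java sketch of the kind of calculation the design above describes: one value bounded by the free space left in the container, and one value nudged by GC CPU overhead. The class and method names, the proportional scaling rule, and the 0.5x to 2.0x clamp per update are all assumptions made for this example; they are not the actual AHS implementation, and the inputs would in practice come from container and JVM metrics rather than constants.</div>
<pre>
```java
// Illustrative sketch only; names and formulas are hypothetical, not the AHS code.
final class AhsSketch {

    // Hard bound: how many more bytes the heap may grow by before the
    // container limit is reached. Never negative.
    static long maxHeapExpansionBytes(long containerLimitBytes, long containerUsageBytes) {
        return Math.max(0L, containerLimitBytes - containerUsageBytes);
    }

    // Soft target: scale the current heap size by the ratio of measured GC CPU
    // overhead to the desired overhead (assumed rule). The per-update change is
    // clamped to the range 0.5x to 2.0x to avoid large swings.
    static long targetHeapBytes(long currentHeapBytes, double gcCpuOverhead, double targetOverhead) {
        double ratio = gcCpuOverhead / targetOverhead;
        double clamped = Math.max(0.5, Math.min(2.0, ratio));
        return Math.round(currentHeapBytes * clamped);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // Container limit of 4096 MiB with 3072 MiB in use leaves 1024 MiB to expand into.
        long expansion = maxHeapExpansionBytes(4096 * mib, 3072 * mib);
        // GC overhead at twice the desired level suggests a larger heap target.
        long target = targetHeapBytes(2048 * mib, 0.25, 0.125);
        System.out.println("max expansion (MiB): " + expansion / mib);
        System.out.println("target heap (MiB): " + target / mib);
    }
}
```
</pre>
<div>In a real deployment the two results would be written to the two manageable flags on each iteration of the monitoring thread; how often to update and how aggressively to scale are tuning decisions this sketch does not attempt to settle.</div>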
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr">Kind regards,<br>Kirk Pepperdine<br><br><a href="http://www.kodewerk.com" target="_blank">http://www.kodewerk.com</a><br><a href="http://www.javaperformancetuning.com" target="_blank">http://www.javaperformancetuning.com</a></div>
</blockquote></div>