Additional JEPS 8182070 requirement: avoiding container OOM

Glyn Normington gnormington at pivotal.io
Thu Dec 14 09:12:03 UTC 2017


On Wed, Dec 13, 2017 at 5:34 PM, Bob Vandette <bob.vandette at oracle.com>
wrote:

>
> On Dec 13, 2017, at 12:00 PM, Glyn Normington <gnormington at pivotal.io>
> wrote:
>
> On Wed, Dec 13, 2017 at 4:51 PM, Bob Vandette <bob.vandette at oracle.com>
> wrote:
>
>>
>> On Dec 13, 2017, at 11:01 AM, Glyn Normington <gnormington at pivotal.io>
>> wrote:
>>
>> Hi Bob
>>
>> Thanks very much for that information. Some comments inline. This one
>> could be relevant to the refocussed JEP: the JVM should monitor its
>> approach to the container memory limit and trigger "out of memory"
>> processing at some threshold close to the limit.
>>
>>
>> Memory allocations in hotspot are difficult to track since there are many
>> different allocators
>> and we don’t control them all.  Thread stacks get dynamically allocated
>> upon use, the C++ runtime allocates
>> memory for native code, and third-party JNI libraries could be calling
>> malloc.  These are all in addition
>> to the allocators we do control - Java Heap and internal Memory Arenas
>> for runtime metadata.
>>
>> One possible solution is to at least attempt threshold checking during
>> allocations we
>> do control and to do OOM processing at that time.  This wouldn’t catch
>> every case but could
>> increase the robustness of the Java runtime.  With the change I
>> introduced in JDK-8146115,
>> we now at least have a way of finding out how much free memory is
>> available in the container.
>>
>
> Another solution you might like to mull over is for the JVM to poll the
> memory cgroup value memory.usage_in_bytes (assuming the memory cgroup is
> mounted read-only inside the container) and trigger "out of memory" on some
> threshold.
>
>
> That is exactly what I was referring to in "we now at least have a way of
> finding out how much free memory is available in the container."
>

Ah, thanks for clarifying. That's great.
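
As a rough illustration of the polling idea discussed above, here is a minimal sketch. It is not code from hotspot: the function names, the cgroup directory argument and the threshold fraction are all hypothetical, and it assumes a cgroup v1 memory controller that exposes memory.usage_in_bytes and memory.limit_in_bytes as readable files.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Reads a single integer from a cgroup (v1) control file.
// Returns false if the file is missing or cannot be parsed.
bool read_cgroup_value(const std::string& path, std::int64_t& out) {
    std::ifstream in(path);
    std::int64_t value = 0;
    if (in >> value) {
        out = value;
        return true;
    }
    return false;
}

// True when usage has reached the given fraction of the limit.
// A non-positive limit is treated as "no limit" and never triggers.
bool near_memory_limit(std::int64_t usage, std::int64_t limit, double fraction) {
    assert(fraction > 0.0 && fraction <= 1.0);
    if (limit <= 0) {
        return false;
    }
    return static_cast<double>(usage) >= fraction * static_cast<double>(limit);
}

// One hypothetical polling step. cgroup_dir is an assumption about where
// the memory cgroup is mounted inside the container.
bool should_trigger_oom(const std::string& cgroup_dir, double fraction) {
    std::int64_t usage = 0;
    std::int64_t limit = 0;
    if (!read_cgroup_value(cgroup_dir + "/memory.usage_in_bytes", usage) ||
        !read_cgroup_value(cgroup_dir + "/memory.limit_in_bytes", limit)) {
        return false;  // cgroup not readable: do nothing
    }
    return near_memory_limit(usage, limit, fraction);
}
```

A caller would invoke should_trigger_oom periodically, or at the allocation points the JVM controls, and start "out of memory" processing when it returns true. How often to poll and what threshold to use are left open here.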


>
> I added several cgroup access functions inside hotspot
> (http://hg.openjdk.java.net/jdk/jdk/file/a559b7cd1dea/src/hotspot/os/linux/osContainer_linux.cpp),
> including these two.
>
> /* memory_usage_in_bytes
>  *
>  * Return the amount of used memory for this process.
>  *
>  * return:
>  *    memory usage in bytes or
>  *    -1 for unlimited
>  *    OSCONTAINER_ERROR for not supported
>  */
> jlong OSContainer::memory_usage_in_bytes() {
>   GET_CONTAINER_INFO(jlong, memory, "/memory.usage_in_bytes",
>                      "Memory Usage is: " JLONG_FORMAT, JLONG_FORMAT, memusage);
>   return memusage;
> }
>
> /* memory_max_usage_in_bytes
>  *
>  * Return the maximum amount of used memory for this process.
>  *
>  * return:
>  *    max memory usage in bytes or
>  *    OSCONTAINER_ERROR for not supported
>  */
> jlong OSContainer::memory_max_usage_in_bytes() {
>   GET_CONTAINER_INFO(jlong, memory, "/memory.max_usage_in_bytes",
>                      "Maximum Memory Usage is: " JLONG_FORMAT, JLONG_FORMAT, memmaxusage);
>   return memmaxusage;
> }
>
>
> Bob.
>
>
>
>>
>> Bob.
>>
>>
>> Regards,
>> Glyn
>>
>> On Wed, Dec 13, 2017 at 2:58 PM, Bob Vandette <bob.vandette at oracle.com>
>> wrote:
>>
>>> Hi Glyn,
>>>
>>> The JEP you mention is going to be updated in a week or so to refocus
>>> its goals
>>> on monitoring of the JVM running in containers.  This is due to the fact
>>> that I was
>>> able to split out the work of dynamically configuring the VM based on
>>> container
>>> settings and integrate this work into JDK 10.  This work was integrated
>>> under jira issue
>>> https://bugs.openjdk.java.net/browse/JDK-8146115.
>>>
>>> Although this change does not completely address your requirement, it
>>> should help.
>>> This change causes the JVM to correctly configure the number of threads
>>> and memory
>>> allocations based on the resources configured in the container.  This
>>> change also
>>> introduces a new JVM option (-XX:ActiveProcessorCount=xx) which allows
>>> the user
>>> to specify how many processors the JVM should use.  We will now honor
>>> cpu shares and quotas
>>> in addition to cpusets for determining the number of threads that the
>>> JVM will use.
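
To make the interaction of cpusets, quotas and shares concrete, here is one plausible way the three could be combined into a single processor count. This is a sketch under assumptions, not a transcription of the JDK 10 code: the function name is hypothetical, and treating 1024 shares as one CPU and taking the smallest of the three limits are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// quota_us, period_us: CFS bandwidth settings (quota <= 0 means no quota).
// shares: cpu shares value (<= 0 means not set); 1024 per CPU is assumed.
// cpuset_cpus: number of CPUs in the container's cpuset.
int effective_cpu_count(long quota_us, long period_us, long shares, int cpuset_cpus) {
    assert(cpuset_cpus >= 1);
    int limit = cpuset_cpus;
    if (quota_us > 0 && period_us > 0) {
        // Round a fractional quota up so that 1.5 CPUs counts as 2.
        int by_quota = static_cast<int>(
            std::ceil(static_cast<double>(quota_us) / static_cast<double>(period_us)));
        limit = std::min(limit, by_quota);
    }
    if (shares > 0) {
        int by_shares = static_cast<int>(std::ceil(static_cast<double>(shares) / 1024.0));
        limit = std::min(limit, by_shares);
    }
    return std::max(limit, 1);  // never report fewer than one processor
}
```

With these assumptions, a quota of 150000us per 100000us period on an 8-CPU cpuset gives 2 processors. An explicit -XX:ActiveProcessorCount setting would simply replace the computed value.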
>>>
>>
>> Yes, that does sound like a useful set of changes.
>>
>>
>>>
>>> Prior to this change, I also added
>>> https://bugs.openjdk.java.net/browse/JDK-8186315.
>>> This change allows the user to specify the percentage of container
>>> memory that
>>> should be used by the Java Heap.  This allows the users to reserve
>>> memory outside
>>> of the Java heap for other classes of memory allocations (C-Heap for
>>> example).
>>>
>>> Both of these changes have been integrated into JDK 10 and are available
>>> in the
>>> latest early access release of Linux x64 (http://jdk.java.net/10/).
>>>
>>
>> How interesting! The previous version of our Java memory calculator
>> allowed the heap size to be specified as a proportion of container memory,
>> but we could only guess at a default proportion and users found it quite
>> hard to know what value to use when they needed to increase or decrease
>> heap size.
>>
>> The current version adopts a different approach: we try to calculate JVM
>> memory consumption (apart from heap) from the JVM options and then subtract
>> that value from the container memory to get the maximum heap size. This
>> approach suffers from the difficulty of predicting the memory consumption
>> of the JVM with any accuracy (see my recent thread entitled 'Excessive "GC"
>> memory area size'). The up-side of the new approach is that the user can
>> override the behaviour using standard JVM options rather than proportions.
>>
>>
>>>
>>> I will talk to the Hotspot runtime folks about your requirement
>>>
>>
>> Thanks very much.
>>
>>
>>> but other than these
>>> options, have you considered configuring your containers to use some
>>> swap space
>>> in addition to memory configurations so that temporary spikes in memory
>>> consumption
>>> won’t cause an OOM?
>>>
>>
>> We talked about this but we need to bound the amount of swap space
>> because our environment is a PaaS and we can't allow users to consume more
>> than their fair share of system resources. If we did enable a bounded swap
>> space, I think we'd hit the same issue when the RAM+swap limit was exceeded.
>>
>>
>>> There are also ways of disabling the OOM killer in containers.
>>>
>>
>> We actually tried that. We started the container with the OOM killer
>> disabled and when the container hit OOM, we gathered diagnostics and
>> re-enabled the OOM killer to terminate the container. The main issue with
>> that is the container-level diagnostics (number of committed pages of RAM
>> etc.) don't tell the Java programmer what JVM options need changing to
>> avoid the problem in future.
>>
>> A more sophisticated approach along those lines would be to disable the
>> OOM killer and when the container *nears* its limit somehow trigger the JVM
>> to do "out of memory" processing. Then our JVMTI agent would be driven and
>> produce plenty of diagnostics suitable for a Java programmer. This is a bit
>> speculative though as it requires fairly deep changes to the container
>> configuration as well as solving the problem of how to trigger the JVM to
>> do its "out of memory" processing. Maybe a more practical solution there is
>> for the JVM itself to monitor its approach to the container memory limit
>> and trigger "out of memory" processing at some threshold close to the
>> limit. This could be relevant to the JEP.
>>
>>
>>> If we get a failed memory allocation, we will try to deliver a Java
>>> OutOfMemory
>>> exception which would at least allow you to figure out what went wrong.
>>> I haven’t looked
>>> into this yet but it is something worth considering.
>>>
>>
>> Yeah, OutOfMemory exceptions provide some useful information, although we
>> found it was insufficiently detailed which is why we wrote a JVMTI agent to
>> dump out detailed diagnostics.
>>
>>
>>>
>>> Bob.
>>>
>>>
>>>
>>> On Dec 13, 2017, at 4:13 AM, Glyn Normington <gnormington at pivotal.io>
>>> wrote:
>>>
>>> I wonder if someone involved in JEPS 8182070 (Container aware Java) would
>>> care to comment on the additional requirement described below?
>>>
>>> On Mon, Nov 13, 2017 at 9:30 AM, Glyn Normington <gnormington at pivotal.io>
>>> wrote:
>>>
>>> I would like to mention an additional requirement for JEPS 8182070 (
>>> http://openjdk.java.net/jeps/8182070): avoid the JVM hitting container
>>> OOM by strictly bounding the amount of (physical) memory that the JVM
>>> consumes. This may be implicit in the document, but I think it should be
>>> made an explicit goal.
>>>
>>> If a Java application hits container OOM, no detailed diagnostics, such as
>>> those associated with an OutOfMemoryError or a JVMTI resource exhaustion
>>> event, are presented to the user, so the user finds it very difficult to
>>> know how to fix the problem.
>>>
>>> The Cloud Foundry OSS project has done quite a bit of work on this
>>> problem
>>> and provides a couple of utilities which help when running a JVM in a
>>> container:
>>>
>>> * Java memory calculator ([1], [2]) to determine JVM memory settings,
>>> * jvmkill JVMTI agent ([3]) to report detailed diagnostics on a resource
>>> exhaustion event.
>>>
>>> Regards,
>>> Glyn
>>>
>>> [1] https://github.com/cloudfoundry/java-buildpack-memory-calculator
>>>
>>> [2] Design doc:
>>> https://docs.google.com/document/d/1vlXBiwRIjwiVcbvUGYMrxx2Aw1RVAtxq3iuZ3UK2vXA/edit#heading=h.uy41ishpv9zc
>>>
>>> [3] https://github.com/cloudfoundry/jvmkill
>>>
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Glyn
>>>
>>>
>>>
>>
>>
>> --
>> Regards,
>> Glyn
>>
>>
>>
>
>
> --
> Regards,
> Glyn
>
>
>


-- 
Regards,
Glyn


More information about the hotspot-dev mailing list