JEP [DRAFT]: Container aware Java

Baesken, Matthias matthias.baesken at sap.com
Tue Jul 18 14:11:27 UTC 2017


Hi Bob,  
   VMware lets you install additional tools; if they are installed, you can query a number of metrics
 from the guests. Some info can be found here:

https://www.vmware.com/support/developer/guest-sdk/

Running in the guest, you would load libvmGuestLib.so / vmGuestLib.dll and then have access to a number of functions for querying metrics.
Of course it is all optional: when you run in a guest without the additional tools/guestlib installed, you get nothing (or not much, at least).


For AIX / IBM PowerVM virtualization there is something similar available via the perfstat API.

https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/com.ibm.aix.prftools/idprftools_perfstat.htm

(libperfstat.a is the related library)


Best regards, Matthias


-----Original Message-----
From: Bob Vandette [mailto:bob.vandette at oracle.com] 
Sent: Tuesday, 18 July 2017 15:43
To: Volker Simonis <volker.simonis at gmail.com>
Cc: hotspot-dev Source Developers <hotspot-dev at openjdk.java.net>; core-libs-dev <core-libs-dev at openjdk.java.net>; Baesken, Matthias <matthias.baesken at sap.com>
Subject: Re: JEP [DRAFT]: Container aware Java 


> On Jul 17, 2017, at 11:55 AM, Volker Simonis <volker.simonis at gmail.com> wrote:
> 
> On Fri, Jul 14, 2017 at 6:54 PM, Bob Vandette <bob.vandette at oracle.com> wrote:
>> 
>>> On Jul 14, 2017, at 12:15 PM, Volker Simonis <volker.simonis at gmail.com> wrote:
>>> 
>>> Hi Bob,
>>> 
>>> thanks for starting this JEP. I think it is long overdue and will be
>>> very helpful for many people.
>>> 
>> Thanks for taking a look at the JEP.
>> 
>>> That said, I don't see any mention of the targeted Java release. As
>>> JEPs usually go into the development version first, I suppose the work
>>> for this JEP will be done in JDK10. However, taking into account the
>>> current release cycles, the usefulness of this JEP for the Java
>>> community will significantly depend on its availability in Java 9
>>> (at least) and Java 8 (hopefully?). Any thoughts on this?
>> 
>> I am certainly going to do my best to try to get this feature into JDK10 and
>> I understand your concern about supporting Java 8 and 9.  The decision to
>> backport this feature will have to be done based on the usual criteria (complexity,
>> usefulness, resource availability and compatibility).  It may be possible to
>> add some subset of this functionality in a backport but we’ll have to wait until
>> the design is further along.
>> 
>>> 
>>> You also mention that this JEP is only about Docker support on
>>> Linux/x86_64. I'd argue that we should address at least all
>>> Docker supported Linux platforms. I'm pretty sure our colleagues from
>>> IBM would like to support this on Linux/ppc64 and Linux/s390 as well.
>> 
>> Right now I don’t expect to have to write CPU specific logic.
>> Therefore supporting ppc64, s390 or even aarch64 should boil down to testing.
>> 
>>> 
>>> Moreover, the suggested API and the new JVM_xxxxxx functions should be
>>> at least designed in such a way that other virtualisation environment
>>> such as VMWare/VirtualBox/Xen/KVM can take advantage of it. I'm also
>>> not sure if a completely new API solely for Docker containers is
>>> making sense and is worth the effort? I understand that containers and
>>> virtualisation are two different things, but they both impose similar
>>> restrictions to the VM (i.e. virtualisation may present a huge number
>>> of CPUs to the OS/VM which in reality are mapped to a single, physical
>>> CPU) so maybe we should design the new API in such a way that it can
>>> be used to tackle both problems.
>> 
>> I will try to keep that in mind.  I will have to look into the feasibility of accessing
>> host configuration from within a container or virtual machine.  For containers,
>> it seems to me that would break the intent of isolation.
> 
> Did you want to say "for virtualisation, it seems to me that would
> break the intent of isolation." here?

Well the comment applies to both technologies but cgroups/containers
were specifically designed to isolate or limit processes from accessing host 
resources.

> 
> So you're right that you cannot easily query the host/guest
> information in a virtualised environment (for example VMware)
> because that would break isolation. But as far as I know, every
> virtualisation solution has its own library/API which can be used to
> query the host (if there is any) and get information about the host
> system and the resources allocated for the guest. We've done some
> basic virtualisation detection for common systems like VMware or Xen
> (cc'ed Matthias, who may know more details).

If you can point me at some API docs, I’d appreciate it.

Bob.

> 
>> I know that host information
>> does leak into  containers through the proc file system but I suspect this was done
>> for simplicity and compatibility reasons.
>> 
>>> 
>>> Finally, you mention in section D. that all the information should be
>>> logged to the error crash logs. That's good but maybe you could add
>>> that all the detection and all the findings should also be made
>>> available through the new Unified JVM Logging facility (JEP 158)?
>> 
>> That’s a good idea.  I will add that to the JEP.
>> 
>> Thanks,
>> Bob.
>> 
>>> 
>>> Regards,
>>> Volker
>>> 
>>> 
>>> On Fri, Jul 14, 2017 at 4:22 PM, Bob Vandette <bob.vandette at oracle.com> wrote:
>>>> I’d like to propose the following JEP that will enhance the Java runtime to be more container aware.
>>>> This will allow the Java runtime to perform better ergonomics and provide better
>>>> execution time statistics when running in a container.
>>>> 
>>>> 
>>>> JEP Issue:
>>>> 
>>>> https://bugs.openjdk.java.net/browse/JDK-8182070
>>>> 
>>>> Here’s a Text dump of the JEP contents for your convenience:
>>>> 
>>>> 
>>>> Summary
>>>> -------
>>>> 
>>>> Container aware Java runtime
>>>> 
>>>> Goals
>>>> -----
>>>> 
>>>> Enhance the JVM and Core libraries to detect running in a container and adapt to the system resources available to it.  This JEP will only support Docker on Linux-x64 although the design should be flexible enough to allow support for other platforms and container technologies.  The initial focus will be on Linux low level container technology such as cgroups so that we will be able to easily support other container technologies running on Linux in addition to Docker.
>>>> 
>>>> Non-Goals
>>>> ---------
>>>> 
>>>> It is not a goal of this JEP to support any platform other than Docker container technology running on Linux x64.
>>>> 
>>>> Success Metrics
>>>> ---------------
>>>> 
>>>> Success will be measured by the improved efficiency of running multiple Java containers on a host system with out-of-the-box options.
>>>> 
>>>> Motivation
>>>> ----------
>>>> 
>>>> Container technology is becoming more and more prevalent in cloud-based applications.  This technology provides process isolation and allows the platform vendor to specify limits and alter the behavior of a process running inside a container in ways that the Java runtime is not aware of.  As a result, the Java runtime may attempt to use more system resources than are available to it, causing performance degradation or even termination.
>>>> 
>>>> 
>>>> Description
>>>> -----------
>>>> 
>>>> This enhancement will be made up of the following work items:
>>>> 
>>>> A. Detecting if Java is running in a container.
>>>> 
>>>> The Java runtime, as well as any tests that we might write for this feature, will need to be able to detect that the current Java process is running in a container. I propose that we add a new JVM_ native function that returns a boolean true value if we are running inside a container.
>>>> 
>>>> JVM_InContainer will return true if the currently running process is running in a container, otherwise false will be returned.
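
A minimal sketch of one way such a check could work on Linux, assuming a Docker-managed process shows "docker" in the cgroup paths listed in /proc/self/cgroup. The class and method names are hypothetical and this heuristic is not the detection logic proposed by the JEP, which is still to be designed:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not a JDK API.
final class ContainerHeuristic {
    // Each line of /proc/self/cgroup (cgroup v1) has the form
    // "hierarchy-ID:controller-list:cgroup-path".
    // Assumption: Docker places processes under a path containing "docker".
    static boolean looksLikeDocker(List<String> cgroupLines) {
        for (String line : cgroupLines) {
            String[] parts = line.split(":", 3);
            if (parts.length == 3 && parts[2].contains("docker")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        Path p = Paths.get("/proc/self/cgroup");
        List<String> lines = Files.exists(p)
                ? Files.readAllLines(p)
                : Collections.<String>emptyList();
        System.out.println("in container (heuristic): " + looksLikeDocker(lines));
    }
}
```

A path-based check like this only recognises Docker and can be fooled by custom cgroup layouts, so it is a starting point for tests rather than a robust answer.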
>>>> 
>>>> B. Exposing container resource limits and configuration.
>>>> 
>>>> There are several configuration options and limits that can be imposed upon a running container.  Not all of these
>>>> are important to a running Java process.  We clearly want to be able to detect how many CPUs have been allocated to our process along with the maximum amount of memory that we will be allocated but there are other options that we might want to base runtime decisions on.
>>>> 
>>>> In addition, since containers typically impose limits on system resources, they also provide the ability to easily access the amount of consumption of these resources.  I intend to provide this information in addition to the configuration data.
>>>> 
>>>> I propose adding a new jdk.internal.Platform class that will allow access to this information.  Since some of this information is needed during the startup of the VM, I propose that much of the implementation of the methods in the Platform class be done in the VM and exposed as JVM_xxxxxx functions.  In hotspot, the JVM_xxxxxx function will be implemented via the os.hpp interface.
>>>> 
>>>> Here are the categories of configuration and consumption statistics that will be made available (The exact API is TBD):
>>>> 
>>>>   isContainerized
>>>>   Memory Limit
>>>>   Total Memory Limit
>>>>   Soft Memory Limit
>>>>   Max Memory Usage
>>>>   Current Memory Usage
>>>>   Maximum Kernel Memory
>>>>   CPU Shares
>>>>   CPU Period
>>>>   CPU Quota
>>>>   Number of CPUs
>>>>   CPU Sets
>>>>   CPU Set Memory Nodes
>>>>   CPU Usage
>>>>   CPU Usage Per CPU
>>>>   Block I/O Weight
>>>>   Block I/O Device Weight
>>>>   Device I/O Read Rate
>>>>   Device I/O Write Rate
>>>>   OOM Kill Enabled
>>>>   OOM Score Adjustment
>>>>   Memory Swappiness
>>>>   Shared Memory Size
>>>> 
>>>> TODO:
>>>> 1. Need to specify the exact arguments and return format for these accessor functions.
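
Since the exact API is still TBD, the following is only a sketch of how a few of these values might be read, assuming a cgroup v1 hierarchy mounted under /sys/fs/cgroup with the standard file names (memory.limit_in_bytes, memory.usage_in_bytes, cpu.shares, cpu.cfs_period_us, cpu.cfs_quota_us). The class and its method names are hypothetical and are not the proposed jdk.internal.Platform API:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.OptionalLong;

// Hypothetical reader for illustration; not the API proposed in the JEP.
final class CgroupV1Reader {
    private final Path root;

    CgroupV1Reader(Path root) {
        this.root = root;
    }

    // Parses the content of a single-value cgroup file.
    static OptionalLong parseValue(String text) {
        try {
            return OptionalLong.of(Long.parseLong(text.trim()));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    // Returns empty if the file is missing, unreadable, or not numeric.
    OptionalLong readLong(String controller, String file) {
        Path p = root.resolve(controller).resolve(file);
        try {
            return parseValue(new String(Files.readAllBytes(p), StandardCharsets.UTF_8));
        } catch (IOException e) {
            return OptionalLong.empty();
        }
    }

    OptionalLong memoryLimit() { return readLong("memory", "memory.limit_in_bytes"); }
    OptionalLong memoryUsage() { return readLong("memory", "memory.usage_in_bytes"); }
    OptionalLong cpuShares()   { return readLong("cpu", "cpu.shares"); }
    OptionalLong cpuPeriod()   { return readLong("cpu", "cpu.cfs_period_us"); }
    OptionalLong cpuQuota()    { return readLong("cpu", "cpu.cfs_quota_us"); }

    public static void main(String[] args) {
        // Assumption: the usual cgroup v1 mount point.
        CgroupV1Reader r = new CgroupV1Reader(Paths.get("/sys/fs/cgroup"));
        System.out.println("memory limit: " + r.memoryLimit());
        System.out.println("memory usage: " + r.memoryUsage());
        System.out.println("cpu quota:    " + r.cpuQuota());
        System.out.println("cpu period:   " + r.cpuPeriod());
    }
}
```

Returning an empty OptionalLong for missing files keeps the caller's fallback to host values explicit, which matters on systems where a controller is not mounted.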
>>>> 
>>>> C. Adjusting Java runtime configuration based on limits.
>>>> 
>>>> Java startup normally queries the operating system in order to set up runtime defaults for things such as the number of GC threads and default memory limits.  When running in a container, the operating system functions used provide information about the host and do not include the container's configuration and limits.  The VM and core libraries will be modified as part of this JEP to first determine if the current running process is running in a container. It will then cause the runtime to use the container values rather than the general operating system functions for configuring and managing the Java process.  There have been a few attempts to correct some of these issues in the VM but they are not complete. The CPU detection in the VM currently only handles a container that limits CPU usage via CPU sets.  If the Docker --cpus or --cpu-period along with --cpu-quota options are specified, they currently have no effect on the VM's configuration.
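
To observe the mismatch described here, one could print what the runtime currently believes it has and compare that with the limits passed to the container. This is a small sketch using only the standard Runtime API; the class name is hypothetical, and the values printed depend on the JDK version and the container settings:

```java
// Hypothetical probe for illustration; uses only java.lang.Runtime.
final class RuntimeView {
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        return "availableProcessors=" + rt.availableProcessors()
                + " maxMemory=" + rt.maxMemory();
    }

    public static void main(String[] args) {
        // Run once on the host and once inside a container started with
        // CPU and memory limits, then compare the two lines of output.
        System.out.println(describe());
    }
}
```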
>>>> 
>>>> The experimental memory detection that has been implemented only impacts the Heap selection and does not apply to the os::physical_memory or os::available_memory low level functions.  This leaves other parts of the VM and
>>>> core libraries to believe there is more memory available than there actually is.
>>>> 
>>>> The NUMA support available in the VM is also not correct when running in a container.  The number of available memory nodes and enabled nodes as reported by the libnuma library does not take into account the impact of the Docker --cpuset-mems option which restricts which memory nodes the container can use.  Inside the container, the file /proc/{pid}/status does report the correct Cpus_allowed and Mems_allowed but libnuma doesn't respect this.  This has been verified via the numactl utility.
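
As an illustration of where the correct values can be found, here is a sketch that extracts the list-form fields from the per-process status file. It assumes the fields meant above are Cpus_allowed_list and Mems_allowed_list in /proc/self/status; the class name is hypothetical:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the JEP.
final class ProcStatusAffinity {
    // Picks the "Cpus_allowed_list" and "Mems_allowed_list" entries out of
    // the "Key:<tab>value" lines of /proc/<pid>/status.
    static Map<String, String> allowedLists(List<String> statusLines) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String line : statusLines) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String key = line.substring(0, colon);
            if (key.equals("Cpus_allowed_list") || key.equals("Mems_allowed_list")) {
                out.put(key, line.substring(colon + 1).trim());
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path p = Paths.get("/proc/self/status");
        List<String> lines = Files.exists(p)
                ? Files.readAllLines(p)
                : Collections.<String>emptyList();
        System.out.println(allowedLists(lines));
    }
}
```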
>>>> 
>>>> To correct these shortcomings and make this support more robust, here's a list of the current cgroup subsystems that will be examined in order to update the internal VM and core library configuration.
>>>> 
>>>> **Number of CPUs**
>>>> 
>>>> Use a combination of number_of_cpus() and cpu_sets() in order to determine how many processors are available to the process and adjust the JVM's os::active_processor_count appropriately.  The number_of_cpus() will be calculated based on the cpu_quota() and cpu_period() using this formula: number_of_cpus() = cpu_quota() / cpu_period().   Since it's not currently possible to understand the relative weight of the running container against all other containers, altering the cpu_shares of a running container will have no effect on Java's configuration.
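
The calculation above could be sketched as follows. Two things here are assumptions rather than part of the JEP text: the quotient is rounded up (so a quota of 1.5 CPUs counts as 2), and the result is the smaller of the quota-based count and the cpuset size, never less than 1. The class and method names are hypothetical:

```java
// Hypothetical helper for illustration; not the VM's implementation.
final class ContainerCpuCount {
    // Counts the CPUs in a cpuset list such as "0-3,8".
    static int countCpuSet(String list) {
        int count = 0;
        for (String part : list.trim().split(",")) {
            if (part.isEmpty()) {
                continue;
            }
            String[] range = part.split("-");
            int lo = Integer.parseInt(range[0].trim());
            int hi = Integer.parseInt(range[range.length - 1].trim());
            count += hi - lo + 1;
        }
        return count;
    }

    // quotaUs and periodUs correspond to cpu_quota() and cpu_period();
    // a non-positive quota (cgroup v1 uses -1) means no quota is set.
    static int activeProcessorCount(long quotaUs, long periodUs, String cpuSet) {
        int fromSet = countCpuSet(cpuSet);
        if (quotaUs <= 0 || periodUs <= 0) {
            return Math.max(1, fromSet);
        }
        long fromQuota = (quotaUs + periodUs - 1) / periodUs; // round up (assumption)
        return (int) Math.max(1, Math.min(fromSet, fromQuota));
    }

    public static void main(String[] args) {
        // 150 ms of CPU time per 100 ms period on an 8-CPU cpuset.
        System.out.println(activeProcessorCount(150_000, 100_000, "0-7"));
    }
}
```

Whether fractional quotas should round up or down is a design choice the JEP leaves open; rounding up avoids reporting zero processors for quotas below one CPU.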
>>>> 
>>>> **Total available memory**
>>>> 
>>>> Use the memory_limit() value from the cgroup file system to initialize the os::physical_memory() value in the VM.  This value will propagate to all other parts of the Java runtime.
>>>> 
>>>> We might also consider examining the soft_memory_limit and total_memory_limit in addition to the memory_limit during the ergonomics startup processing in order to fine-tune some of the other VM settings.
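
One detail worth handling when using memory_limit() this way: with cgroup v1, a container without a memory limit reports a very large number rather than the host's memory size. A sketch of the selection logic, assuming any limit at or above the host's physical memory is treated as "no limit" (the class and method names are hypothetical):

```java
// Hypothetical helper for illustration; not the VM's implementation.
final class ContainerMemoryLimit {
    // hostBytes: physical memory reported by the operating system.
    // cgroupLimitBytes: value read from the cgroup memory limit.
    static long effectivePhysicalMemory(long hostBytes, long cgroupLimitBytes) {
        // Assumption: a non-positive limit, or one at or above host memory,
        // means the container is effectively unlimited.
        if (cgroupLimitBytes <= 0 || cgroupLimitBytes >= hostBytes) {
            return hostBytes;
        }
        return cgroupLimitBytes;
    }

    public static void main(String[] args) {
        long host = 8L * 1024 * 1024 * 1024;   // 8 GiB host
        long limit = 512L * 1024 * 1024;       // 512 MiB container limit
        System.out.println(effectivePhysicalMemory(host, limit));
    }
}
```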
>>>> 
>>>> **CPU Memory Nodes**
>>>> 
>>>> Use cpu_set_memory_nodes() to configure the os::numa support.
>>>> 
>>>> **Memory usage**
>>>> 
>>>> Use memory_usage_in_bytes() for providing os::available_memory() by subtracting the usage from the total available memory allocated to the container.
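
The subtraction described above, as a sketch. The clamp to zero is an assumption added here to cover the case where the usage reading momentarily exceeds the limit; the class and method names are hypothetical:

```java
// Hypothetical helper for illustration; not the VM's implementation.
final class ContainerAvailableMemory {
    // limitBytes: memory allocated to the container.
    // usageBytes: current usage as reported by memory_usage_in_bytes().
    static long availableMemory(long limitBytes, long usageBytes) {
        return Math.max(0L, limitBytes - usageBytes);
    }

    public static void main(String[] args) {
        long limit = 512L * 1024 * 1024;
        long usage = 200L * 1024 * 1024;
        System.out.println(availableMemory(limit, usage) + " bytes available");
    }
}
```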
>>>> 
>>>> D. Adding container configuration to error crash logs.
>>>> 
>>>> As a troubleshooting aid, we will dump any available container statistics to the hotspot error log.
>>>> 
>>>> E. Adding a startup flag to enable/disable this support.
>>>> 
>>>> Add a -XX:+UseContainerSupport VM option that will be used to enable this support.  The default will be off until this feature is proven.
>>>> 
>>>> Alternatives
>>>> ------------
>>>> 
>>>> There are a few existing RFEs filed that could be used to enhance the current experimental implementation rather than taking the JEP route.
>>>> 
>>>> Testing
>>>> -------
>>>> 
>>>> Docker/container specific tests should be added in order to validate the functionality being provided with this JEP.
>>>> 
>>>> Risks and Assumptions
>>>> ---------------------
>>>> 
>>>> Docker is currently based on cgroups v1. Cgroups v2 is also available but is incomplete and not yet supported by Docker. It's possible that v2 could replace v1 in an incompatible way rendering this work unusable until it is upgraded.
>>>> 
>>>> Dependencies
>>>> ------------
>>>> 
>>>> None at this time.
>>>> 
>> 