RFR: 8226575: OperatingSystemMXBean should be made container aware

Bob Vandette bob.vandette at oracle.com
Thu Dec 5 20:50:49 UTC 2019


In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html

Shouldn’t you keep the IOException catch clauses in case the file is not found?


> On Dec 5, 2019, at 2:31 PM, Daniil Titov <daniil.x.titov at oracle.com> wrote:
> 
> Hi Mandy and Bob,
> 
> Please review a new version of the webrev that addresses  the most of these issues:
> 
> 1) The interface and spec [3] were updated to use default methods. CSR [3] was re-approved.
> 
> 2) Security-sensitive operations in j.i.p.cgroupv1.Metrics and in j.i.p.cgroupv1. SubSystem
>   were wrapped with doPrivileged
> 
> 3) getCpuLoad () method was optimized to fallback to  getSystemCpuLoad0  if the cpuset is identical to the host's one. 
>   It uses sysconf(_SC_NPROCESSORS_CONF) to retrieve the number of CPUs configured on the host . Testing with
>   different  --cpuset-cpus settings inside a Docker container proved  that it always returns the same number of  hosts configured
>   CPUs regardless of --cpuset-cpus settings while the same settings affect getEffectiveCpuSetCpus and getCpuSetCpus() metrics.
> 
>    In addition, getCpuLoad () method  now returns -1 in the cases when quotas are active but cpu usage and cpu period metrics are not available and
>   in the case when  for some reason it fails to retrieve a valid CPU load for one of CPUs while iteration over them 

Shouldn't you do the same for getCpuLoad

149                     int[] cpuSet = containerMetrics.getEffectiveCpuSetCpus();
150                     if (cpuSet != null && cpuSet.length > 0) {

If cpuSet.length == 0?


> 
>>> CheckOperatingSystemMXBean.java
>>>    System.out.println(String.format(...)) can simply be replaced with System.out.format.
> I had to leave it unchanged since replacing  it with System.out.format results in the tests instability as  it makes the trace output
> occasionally Intervene  here (the trace message sometimes is printed inside this message)  and tests cannot find the expected
> pattern in the output.
> 
>>> It may worth considering adding Metrics::getSwapLimit and 
>>> Metrics::getSwapUsage and move the computation to the implementation of 
>>> Metrics.  Bob may have an opinion.
> 
> There was no any new input regarding this so I decided to leave it unchanged.

Sorry, I didn’t respond to this.  Since the calculation required for getFreeSwapSpaceSize requires retries
due to the access of multiple changing values, I think it’s best to leave things as they are so the caller of
these methods understands the limitations of the API.

Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
online documentation so it’s probably best to be consistent.

> 
>>> Also it seems correct for the memory related methods to check if 
>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).  
>>> BTW what does it mean if limit == 0?
> 
> Per Docker docs the minimum allowed value for  memory limit (--memory option) is 4 megabytes.
> And if memory limit is unset the return value is -1.  Thus, in my understanding the value 0 is only possible
> if something went wrong while retrieving this metric.

That is true but shouldn’t you return -1 in that case?

I originally thought it was ok to fall back to the host data for 0 values but I think its better to return unavailable (-1)
I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0.  This would be more consistent.

You should only fall back to the original logic (host values) if container values are set to unlimited.

Bob.

> 
> Testing: Mach5 tier1-tier6 tests (that include open/test/hotspot/jtreg/containers/docker  and : jdk_management  tests) passed. 
> 
> [1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/ 
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575    
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428
> 
> Thank you,
> Daniil
> 
> On 12/4/19, 4:09 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:
> 
> 
> 
>    On 12/3/19 9:40 PM, Daniil Titov wrote:
>> 
>>>> Under what circumstance that limit or memLimit is < 0?
>> The memory limit metrics is not available if JVM runs on Linux host ( not in a docker container) or if a docker container was started without
>> specifying a memory limit ( without '--memory='  Docker option) . In latter there is no limit on how much memory the container can use and
>> it can use as much memory as the host's OS allows.
>> 
> 
>    OK.  Please add a comment to the code.
> 
>    It may worth considering adding Metrics::getSwapLimit and 
>    Metrics::getSwapUsage and move the computation to the implementation of 
>    Metrics.  Bob may have an opinion.
> 
>    Also it seems correct for the memory related methods to check if 
>    (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).  
>    BTW what does it mean if limit == 0?
> 
>>>> Is it worth  specifying this case?
>> I believe yes, since it covers the cases when JVM runs  on a Linux host or a docker container was started without memory limitation.
>> 
> 
>    I was wondering if the javadoc should specify that.
>>>> It fallbacks to return the system's total swap space size - this is not really what it should report.
>> For the case when JVM runs on a Linux host it is exactly what we want. The only problematic case is if JVM runs inside a docker container without a memory limit set.
>> However, I am not sure how we could differentiate these 2 cases.
> 
>    As this is the case when the limit is not set in the container, it 
>    returns the system metrics which sounds appropriate.
> 
>> 
>>>> Similarly, getFreeMemorySize and getTotalMemorySize and getCpuLoad.
>> For  getTotalMemorySize I think we are good here. If limit is not set then all memory the host's OS have is available.
>> For getFreeMemorySize the problematic case is if is the memory limit is set but the memory usage for some reason is not available (containerMetrics.getMemoryUsage() returns 0).
> 
>    Will zero memory usage happen?
> 
>> Probably in this case we should just return -1 as currently getFreePhysicalMemorySize0() does if it cannot retrieve a valid result.
>> 
> 
>> For getCpuLoad() the problematic case if CPU quotas are active but CpuPeriod,  CpuNumPeriods , or getCpuUsage are unavailable or if a valid  CPU load for some CPU was
>> not retrieved, or if all retrieved CPU load values happen to be zeros. Probably we should just  return -1 in these cases rather then falling back to getSystemCpuLoad0()
>> 
> 
>    returning -1 sounds right.
>>>> src/jdk.management/windows/classes/com/sun/management/internal/OperatingSystemImpl.java
>>>>    There is no strong need to make the deprecated methods as default methods.  If they were default methods, they only need to be implemented once as opposed to in all OS-specific implementations.
>> I could make these methods defaults if you feel it is a better approach here.
>> 
>> 
> 
>    It's not strictly needed but I can go either way.
> 
> 
>>>> CheckOperatingSystemMXBean.java
>>>>     System.out.println(String.format(...)) can simply be replaced with System.out.format.
>> I will include this change in the next webrev, thank you!
>> 
>> 
> 
>    thanks
>    Mandy
> 
> 
> 



More information about the serviceability-dev mailing list