RFR: 8226575: OperatingSystemMXBean should be made container aware

Daniil Titov daniil.x.titov at oracle.com
Fri Dec 6 01:03:21 UTC 2019


Hi Mandy and Bob,

Thank you for your comments. Please review a new version of the fix [1] that makes 
OperatingSystemImpl methods return -1 if one of the metric has value 0.

As Mandy recommended I also updated the Javadoc for OperatingSystemMXBean 
indicating that methods could return -1 if the information is not available. 
There were no changes in CSR [3] yet, I plan to proceed with them after the fix is
reviewed.

> In http://cr.openjdk.java.net/~dtitov/8226575/webrev.03/src/java.base/linux/classes/jdk/internal/platform/cgroupv1/SubSystem.java.sdiff.html
> Shouldn’t you keep the IOException catch clauses in case the file is not found?

There is no need in keeping IOException catch  in these 2 places where it used to be (getLongValueMatchingLine and getLongEntry methods). 
As I understand IOException catch was required only because File.lines() and File. readAllLines() can throw IOException.
Now these calls are performed inside  AccessController.doPrivileged(PrivilegedExceptionAction) that wraps
 all checked exceptions in  PrivilegedActionException  that we are catching now instead of IOException.

Here is the sampe of the stacktrace:
java.security.PrivilegedActionException: java.io.FileNotFoundException
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:558)
	at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValueMatchingLine(SubSystem.java:113)
	at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:390)
	at jdk.management/com.sun.management.internal.OperatingSystemImpl.getTotalMemorySize(OperatingSystemImpl.java:109)
	at CheckOperatingSystemMXBean.main(CheckOperatingSystemMXBean.java:36)
Caused by: java.io.FileNotFoundException
	at java.base/jdk.internal.platform.cgroupv1.SubSystem.lambda$getLongValueMatchingLine$1(SubSystem.java:116)
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:554)


In getStringValue method the whole code block is now executed inside AccessController.doPrivileged() so we still need either catch 
IOException inside this code block or convert this  block  to  PrivilegedExceptionAction and then put AccessController.doPrivileged  
call inside new try/catch Block to catch PrivilegedExceptionAction. The former approach looked more preferable.

Testing: Mach5 tier1-tier3 and open/test/hotspot/jtreg/containers/docker tests passed. Tier4-tier6 tests are still running.

[1] Webrev:  http://cr.openjdk.java.net/~dtitov/8226575/webrev.04/ 
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8226575     
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8228428 

Thanks,
Daniil

On 12/5/19, 12:59 PM, "Mandy Chung" <mandy.chung at oracle.com> wrote:

    
    
    On 12/5/19 12:50 PM, Bob Vandette wrote:
    >
    >>>> It may worth considering adding Metrics::getSwapLimit and
    >>>> Metrics::getSwapUsage and move the computation to the implementation of
    >>>> Metrics.  Bob may have an opinion.
    >> There was no any new input regarding this so I decided to leave it unchanged.
    > Sorry, I didn’t respond to this.  Since the calculation required for getFreeSwapSpaceSize requires retries
    > due to the access of multiple changing values, I think it’s best to leave things as they are so the caller of
    > these methods understands the limitations of the API.
    
    OK with me.
    > Also, the fact that swap size metrics include memory sizes is fully documented in both the cgroup and docker
    > online documentation so it’s probably best to be consistent.
    >
    >>>> Also it seems correct for the memory related methods to check if
    >>>> (containerMetrics != null && containerMetrics.getMemoryLimit() >= 0).
    >>>> BTW what does it mean if limit == 0?
    >> Per Docker docs the minimum allowed value for  memory limit (--memory option) is 4 megabytes.
    >> And if memory limit is unset the return value is -1.  Thus, in my understanding the value 0 is only possible
    >> if something went wrong while retrieving this metric.
    > That is true but shouldn’t you return -1 in that case?
    >
    > I originally thought it was ok to fall back to the host data for 0 values but I think its better to return unavailable (-1)
    > I think you might want to change all >= 0 to > 0 and return -1 if any of the values are 0.  This would be more consistent.
    
    +1
    
    The javadoc should be changed and returns -1 when it's unavailable and 
    the CSR should also be updated to reflect this.    I'm sure Joe can 
    re-approve the CSR quickly when the fix is reviewed and approved.
    
    > You should only fall back to the original logic (host values) if container values are set to unlimited.
    >
    +1
    
    Mandy
    




More information about the serviceability-dev mailing list