RFR(XS): 8194232: Container memory not properly recognized.

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Thu Jan 4 17:28:22 UTC 2018


Hi, Since this is shared code, someone from Oracle needs to sponsor this 
(for the time being) so they can check on our continuous integration and 
nightly test results.

I'll help Bob sponsor this.
thanks,
Coleen

On 1/4/18 12:13 PM, Lindenmaier, Goetz wrote:
> Hi Bob and Karen,
>
> Do I understand correctly that I should push this myself once Bob's tests are done?  That's fine.
>
> We have our own testing. I ran the patch with our nightly build through all our nightly tests, which include all HotSpot jtreg tests.
> But the results are sometimes different when you run the tests.
>
> Best regards
>    Götz
>
>> On 04.01.2018 at 18:05, Karen Kinnear <karen.kinnear at oracle.com> wrote:
>>
>> Code looks good - I am a hotspot runtime Reviewer.
>> So once Bob’s testing is done, you are all set.
>>
>> thanks so much for making and testing this change,
>> Karen
>>
>>> On Jan 4, 2018, at 10:38 AM, Lindenmaier, Goetz <goetz.lindenmaier at sap.com> wrote:
>>>
>>> That's great, thanks a lot!
>>>
>>> Best,
>>> Goetz.
>>>
>>>> -----Original Message-----
>>>> From: Bob Vandette [mailto:bob.vandette at oracle.com]
>>>> Sent: Thursday, 4 January 2018 16:36
>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
>>>> Cc: Doerr, Martin <martin.doerr at sap.com>; hotspot-runtime-
>>>> dev at openjdk.java.net
>>>> Subject: Re: RFR(XS): 8194232: Container memory not properly recognized.
>>>>
>>>> Sure I’ll take care of it.
>>>>
>>>> Bob.
>>>>
>>>>
>>>>> On Jan 4, 2018, at 10:34 AM, Lindenmaier, Goetz
>>>> <goetz.lindenmaier at sap.com> wrote:
>>>>> Hi Bob,
>>>>>
>>>>> neither Martin nor I can push hotspot changes. We don't have
>>>>> access to jprt.
>>>>> Could you sponsor this? I would like it to get pushed to 10.
>>>>>
>>>>> Best regards,
>>>>> Goetz.
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Bob Vandette [mailto:bob.vandette at oracle.com]
>>>>>> Sent: Thursday, 4 January 2018 16:27
>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
>>>>>> Cc: Doerr, Martin <martin.doerr at sap.com>; hotspot-runtime-
>>>>>> dev at openjdk.java.net
>>>>>> Subject: Re: RFR(XS): 8194232: Container memory not properly
>>>> recognized.
>>>>>> The new webrev looks good.
>>>>>>
>>>>>> I see that Martin is a JDK reviewer so unless you need a hotspot reviewer
>>>>>> you should be good to go.
>>>>>>
>>>>>> Bob.
>>>>>>
>>>>>>
>>>>>>> On Jan 4, 2018, at 4:09 AM, Lindenmaier, Goetz
>>>>>> <goetz.lindenmaier at sap.com> wrote:
>>>>>>> Hi Bob,
>>>>>>>
>>>>>>> A new webrev using your version:
>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8194232-
>>>>>> ppcle_unlimited/webrev.02/
>>>>>>> It fixes the issue I saw, and traces the output you describe below.
>>>>>>> Actually, in my case gc/arguments/TestAggressiveHeap.java failed
>>>>>>> and this also passes now. The machine has cgroups but no docker
>>>>>>> which is why the test failed.
>>>>>>> The five tests below don't run on ppc because docker.support is
>>>>>>> false.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Goetz
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: Bob Vandette [mailto:bob.vandette at oracle.com]
>>>>>>>> Sent: Wednesday, 3 January 2018 20:12
>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; Doerr, Martin
>>>>>>>> <martin.doerr at sap.com>; hotspot-runtime-dev at openjdk.java.net
>>>>>>>> Subject: Re: RFR(XS): 8194232: Container memory not properly
>>>>>> recognized.
>>>>>>>> I built and ran a VM with the changes below (with jong -> jlong type
>>>> fixed)
>>>>>>>> and it passed
>>>>>>>> the jtreg container tests.
>>>>>>>>
>>>>>>>> Passed: runtime/containers/docker/DockerBasicTest.java
>>>>>>>> Passed: runtime/containers/docker/TestCPUAwareness.java
>>>>>>>> Passed: runtime/containers/docker/TestCPUSets.java
>>>>>>>> Passed: runtime/containers/docker/TestMemoryAwareness.java
>>>>>>>> Passed: runtime/containers/docker/TestMisc.java
>>>>>>>>
>>>>>>>> If you run this command on your OS and look in the logs, you should
>>>> see
>>>>>> the
>>>>>>>> Unlimited message:
>>>>>>>>
>>>>>>>> ./java -Xlog:os+containers=trace -version
>>>>>>>>
>>>>>>>> …..
>>>>>>>> [0.001s][trace][os,container] Memory Limit is: 9223372036854771712
>>>>>> <<——
>>>>>>>> This number will be different for you.
>>>>>>>> [0.001s][trace][os,container] Memory Limit is: Unlimited
>>>>>>>> [0.001s][debug][os,container] container memory unlimited, using host
>>>>>> value
>>>>>>>> …..
>>>>>>>>
>>>>>>>> Bob.
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Jan 3, 2018, at 1:54 PM, Bob Vandette
>>>> <bob.vandette at oracle.com>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> On Jan 3, 2018, at 2:43 AM, Lindenmaier, Goetz
>>>>>>>> <goetz.lindenmaier at sap.com> wrote:
>>>>>>>>>> Hi Bob,
>>>>>>>>>>
>>>>>>>>>> The value is read from the file system as jlong.
>>>>>>>>>> I think making the reading of the value as jlong or julong
>>>>>>>>>> depend on the kernel version is not a good idea.
>>>>>>>>> I agree that we shouldn't make the solution based on the running
>>>> kernel
>>>>>>>> version.
>>>>>>>>>> So should I test for
>>>>>>>>>> memlimit < 0 || memlimit >  LONG_MAX / os::vm_page_size() *
>>>>>>>> os::vm_page_size()
>>>>>>>>>> and get rid of the constant altogether?
>>>>>>>>> I was thinking that we could change the type read from the file
>>>> system
>>>>>> to a
>>>>>>>> julong and then just
>>>>>>>>> use memlimit >  LONG_MAX / os::vm_page_size() *
>>>>>> os::vm_page_size().
>>>>>>>> You’ll need to
>>>>>>>>> add a (jlong) cast but since we’re checking for > LONG_MAX, there
>>>> will
>>>>>> be
>>>>>>>> no precision
>>>>>>>>> lost.
>>>>>>>>>
>>>>>>>>> Something like this:
>>>>>>>>>
>>>>>>>>> --- a/src/hotspot/os/linux/osContainer_linux.cpp
>>>>>>>>> +++ b/src/hotspot/os/linux/osContainer_linux.cpp
>>>>>>>>> @@ -31,16 +31,12 @@
>>>>>>>>> #include "logging/log.hpp"
>>>>>>>>> #include "osContainer_linux.hpp"
>>>>>>>>>
>>>>>>>>> -/*
>>>>>>>>> - * Warning: Some linux distros use 0x7FFFFFFFFFFFF000
>>>>>>>>> - * and others use 0x7FFFFFFFFFFFFFFF for unlimited.
>>>>>>>>> - */
>>>>>>>>> -#define UNLIMITED_MEM CONST64(0x7FFFFFFFFFFFF000)
>>>>>>>>>
>>>>>>>>> #define PER_CPU_SHARES 1024
>>>>>>>>>
>>>>>>>>> bool  OSContainer::_is_initialized   = false;
>>>>>>>>> bool  OSContainer::_is_containerized = false;
>>>>>>>>> +julong _unlimited_memory;
>>>>>>>>>
>>>>>>>>> class CgroupSubsystem: CHeapObj<mtInternal> {
>>>>>>>>> friend class OSContainer;
>>>>>>>>> @@ -217,6 +213,8 @@
>>>>>>>>> _is_initialized = true;
>>>>>>>>> _is_containerized = false;
>>>>>>>>>
>>>>>>>>> +  _unlimited_memory = (LONG_MAX / os::vm_page_size()) *
>>>>>>>> os::vm_page_size();
>>>>>>>>> +
>>>>>>>>> log_trace(os, container)("OSContainer::init: Initializing Container
>>>>>>>> Support");
>>>>>>>>> if (!UseContainerSupport) {
>>>>>>>>> log_trace(os, container)("Container Support not enabled");
>>>>>>>>> @@ -419,37 +417,37 @@
>>>>>>>>> *    OSCONTAINER_ERROR for not supported
>>>>>>>>> */
>>>>>>>>> jlong OSContainer::memory_limit_in_bytes() {
>>>>>>>>> -  GET_CONTAINER_INFO(jlong, memory, "/memory.limit_in_bytes",
>>>>>>>>> -                     "Memory Limit is: " JLONG_FORMAT, JLONG_FORMAT,
>>>>>>>> memlimit);
>>>>>>>>> +  GET_CONTAINER_INFO(julong, memory,
>>>> "/memory.limit_in_bytes",
>>>>>>>>> +                     "Memory Limit is: " JULONG_FORMAT, JULONG_FORMAT,
>>>>>>>> memlimit);
>>>>>>>>> -  if (memlimit >= UNLIMITED_MEM) {
>>>>>>>>> +  if (memlimit >= _unlimited_memory) {
>>>>>>>>> log_trace(os, container)("Memory Limit is: Unlimited");
>>>>>>>>> return (jlong)-1;
>>>>>>>>> }
>>>>>>>>> else {
>>>>>>>>> -    return memlimit;
>>>>>>>>> +    return (jong)memlimit;
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> jlong OSContainer::memory_and_swap_limit_in_bytes() {
>>>>>>>>> -  GET_CONTAINER_INFO(jlong, memory,
>>>>>>>> "/memory.memsw.limit_in_bytes",
>>>>>>>>> -                     "Memory and Swap Limit is: " JLONG_FORMAT,
>>>>>>>> JLONG_FORMAT, memswlimit);
>>>>>>>>> -  if (memswlimit >= UNLIMITED_MEM) {
>>>>>>>>> +  GET_CONTAINER_INFO(julong, memory,
>>>>>>>> "/memory.memsw.limit_in_bytes",
>>>>>>>>> +                     "Memory and Swap Limit is: " JULONG_FORMAT,
>>>>>>>> JULONG_FORMAT, memswlimit);
>>>>>>>>> +  if (memswlimit >= _unlimited_memory) {
>>>>>>>>> log_trace(os, container)("Memory and Swap Limit is: Unlimited");
>>>>>>>>> return (jlong)-1;
>>>>>>>>> } else {
>>>>>>>>> -    return memswlimit;
>>>>>>>>> +    return (jlong)memswlimit;
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> jlong OSContainer::memory_soft_limit_in_bytes() {
>>>>>>>>> -  GET_CONTAINER_INFO(jlong, memory,
>>>>>> "/memory.soft_limit_in_bytes",
>>>>>>>>> -                     "Memory Soft Limit is: " JLONG_FORMAT,
>>>> JLONG_FORMAT,
>>>>>>>> memsoftlimit);
>>>>>>>>> -  if (memsoftlimit >= UNLIMITED_MEM) {
>>>>>>>>> +  GET_CONTAINER_INFO(julong, memory,
>>>>>>>> "/memory.soft_limit_in_bytes",
>>>>>>>>> +                     "Memory Soft Limit is: " JULONG_FORMAT,
>>>>>> JULONG_FORMAT,
>>>>>>>> memsoftlimit);
>>>>>>>>> +  if (memsoftlimit >= _unlimited_memory) {
>>>>>>>>> log_trace(os, container)("Memory Soft Limit is: Unlimited");
>>>>>>>>> return (jlong)-1;
>>>>>>>>> } else {
>>>>>>>>> -    return memsoftlimit;
>>>>>>>>> +    return (jlong)memsoftlimit;
>>>>>>>>> }
>>>>>>>>> }
>>>>>>>>> Bob.
>>>>>>>>>
>>>>>>>>>> Instead, I would implement a method
>>>>>> is_unlimited_memory(memlimit).
>>>>>>>>>> I don't want to make the change too big, as I need to get it fixed in
>>>> 10.
>>>>>>>>>> Best regards,
>>>>>>>>>> Goetz.
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Bob Vandette [mailto:bob.vandette at oracle.com]
>>>>>>>>>>> Sent: Tuesday, 2 January 2018 17:10
>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>
>>>>>>>>>>> Cc: Doerr, Martin <martin.doerr at sap.com>; hotspot-runtime-
>>>>>>>>>>> dev at openjdk.java.net
>>>>>>>>>>> Subject: Re: RFR(XS): 8194232: Container memory not properly
>>>>>>>> recognized.
>>>>>>>>>>> I just read that unlimited may be one of these possible values
>>>>>> depending
>>>>>>>> on
>>>>>>>>>>> Kernel version.
>>>>>>>>>>>
>>>>>>>>>>> LONG_MAX (Linux Kernel Version < 3.12)
>>>>>>>>>>> ULONG_MAX (3.12 <= Linux Kernel Version < 3.19)
>>>>>>>>>>> LONG_MAX / pagesize * pagesize (Linux Kernel Version >= 3.19)
>>>>>>>>>>>
>>>>>>>>>>> Assuming os::vm_page_size() is initialized before the container init, we
>>>>>>>> should
>>>>>>>>>>> calculate
>>>>>>>>>>> the unlimited value based on this and take the ULONG_MAX
>>>> situation
>>>>>>>> into
>>>>>>>>>>> consideration.
>>>>>>>>>>>
>>>>>>>>>>> Bob.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> On Dec 27, 2017, at 6:24 AM, Lindenmaier, Goetz
>>>>>>>>>>> <goetz.lindenmaier at sap.com> wrote:
>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, I also figured that the constant is one system page less than
>>>> max
>>>>>>>> long.
>>>>>>>>>>>> If the system pages are > 64K, this constant will be too small,
>>>> again.
>>>>>>>>>>>> But I'm not sure whether adding 8 zeros is the right thing to do in
>>>> ramp
>>>>>>>> down
>>>>>>>>>>> phase.
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Goetz.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: Doerr, Martin
>>>>>>>>>>>>> Sent: Wednesday, 27 December 2017 12:10
>>>>>>>>>>>>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>;
>>>> hotspot-
>>>>>>>> runtime-
>>>>>>>>>>>>> dev at openjdk.java.net
>>>>>>>>>>>>> Subject: RE: RFR(XS): 8194232: Container memory not properly
>>>>>>>>>>> recognized.
>>>>>>>>>>>>> Hi Götz,
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks for fixing it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I wonder if 4 zeroes will be sufficient on all linux distros in the
>>>> long
>>>>>>>> run.
>>>>>>>>>>> Does
>>>>>>>>>>>>> anything speak against using e.g. 8 zeroes?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: hotspot-runtime-dev [mailto:hotspot-runtime-dev-
>>>>>>>>>>>>> bounces at openjdk.java.net] On Behalf Of Lindenmaier, Goetz
>>>>>>>>>>>>> Sent: Wednesday, 27 December 2017 11:38
>>>>>>>>>>>>> To: hotspot-runtime-dev at openjdk.java.net
>>>>>>>>>>>>> Subject: RFR(XS): 8194232: Container memory not properly
>>>>>>>> recognized.
>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please review and sponsor this tiny fix. It needs to go to jdk10.
>>>>>>>>>>>>> http://cr.openjdk.java.net/~goetz/wr17/8194232-
>>>>>>>>>>>>> ppcle_unlimited/webrev.01/
>>>>>>>>>>>>>
>>>>>>>>>>>>> TestAggressiveHeap.java fails because the container recognition
>>>>>>>>>>>>> misinterprets the available memory size. On SLES 12.1 ppc64le,
>>>>>>>>>>>>> GET_CONTAINER_INFO() sets memlimit to 0x7FFFFFFFFFFF0000.
>>>>>> This
>>>>>>>>>>>>> is compared to UNLIMITED_MEM == 0x7FFFFFFFFFFFF000,
>>>> making
>>>>>>>>>>>>> the VM believe memory is _not_ unlimited.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Goetz.