RFR: 8197408: Bad pointer comparison and small cleanup in os_linux.cpp

Robbin Ehn robbin.ehn at oracle.com
Thu Feb 15 12:36:45 UTC 2018


Hi David,

On 2018-02-15 12:13, David Holmes wrote:
> Hi Robbin,
> 
> On 15/02/2018 7:59 PM, Robbin Ehn wrote:
>> Hi,
>>
>> I notice there is something strange about container test.
>> They are not in any test group and mach5 don't run them when specified via path.
> 
> That's because we don't have containerized test environments generally available and the tests have:

In the actual tests we only care about cgroups as far as I can see?
So (most/all?) of the tests should work perfectly fine outside docker with just cgroups.
And since cgroups are default ON (as in the scheduler will put all task into an unlimited group) ~all Linuxes are _default_ containerized.
Plus in mach5 we run in mesos which uses cgroups, so test would work for that reason also!

mach5:
jib > [0.001s][trace][os,container] OSContainer::init: Initializing Container Support
jib > [0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/system.slice/mesos-slave.service/memory.limit_in_bytes
jib > [0.001s][trace][os,container] Memory Limit is: 9223372036854775807
jib > [0.001s][trace][os,container] Memory Limit is: Unlimited
jib > [0.001s][trace][os,container] Path to /cpu.shares is /sys/fs/cgroup/cpu,cpuacct/system.slice/mesos-slave.service/cpu.shares
jib > [0.001s][trace][os,container] CPU Shares is: 1024
...

We are default running in containerized environment.

/Robbin


> 
>   * @requires docker.support
> 
> I've had the chance to look at this more closely and the refactoring seems okay to me.
> 
> Thanks,
> David
> -----
> 
>>
>> *sigh*
>>
>> So I have only manually verified this.
>>
>> Bob, can you have a look at changeset?
>>
>> /Robbin
>>
>> On 2018-02-15 10:37, Robbin Ehn wrote:
>>> Hi Thomas, thanks for having a look.
>>>
>>> On 2018-02-14 15:06, Thomas Stüfe wrote:
>>>> Hi Robin,
>>>>
>>>> Had a short look. Note that I cannot open the issue. Link is wrong, and the link in the source.patch is wrong too. So, I just looked at the webrev.
>>>>
>>>
>>> Sorry bug is:
>>> https://bugs.openjdk.java.net/browse/JDK-8197408
>>>
>>> Inc here:
>>> http://cr.openjdk.java.net/~rehn/8197408/v2/inc/webrev/
>>> Full here:
>>> http://cr.openjdk.java.net/~rehn/8197408/v2/full/webrev/
>>>
>>> Comments on remarks below:
>>>
>>>> Remarks:
>>>>
>>>> -> This is a matter of taste but I prefer pointers to references for output variables. Makes the intent clearer at the calling site, and that way you could get rid of the /* output */ comment.
>>>
>>> We had a debate about this, my opinion was that avoiding the NULL check makes it worth using a ref, but I see I'm pretty lonely in this camp so fixed.
>>>
>>>>
>>>> -> in available_memory_container():
>>>> Could we also add "using host value" to the logging in error case as you do in physical_memory_container()?
>>>
>>> Fixed, and tried to improve the logs.
>>>
>>>>
>>>> -> (Not part of your patch) Can OSContainer::memory_limit_in_bytes() actually ever return "OSCONTAINER_ERROR"? I may be wrong here but:
>>>>
>>>> OSCONTAINER_ERROR = -2. /memory.limit_in_bytes gets returned as julong, so we have (julong)(-2). That gets compared with julong _unlimited_memory which is basically LONG_MAX, so signed long max, which should be smaller than (julong)(-2), or? So OSContainer::memory_limit_in_bytes() should always return -1 for both errors and the unlimited case.
>>>>
>>>
>>> Manually tested this, it seems to work. The -2 gets promoted to unsigned in comparison.
>>> We go from (jlong)-2 to (julong)ULONG_MAX-1 back to (jlong)-2.
>>> Not obviously that it will always work. I will not touch that in this changeset.
>>>
>>>> -> Can we:
>>>> -  st->print("container_type: %s\n", p != NULL ? p : "failed");
>>>> +  st->print_cr("container_type: %s", p != NULL ? p : "failed");
>>>> ?
>>>
>>> All of these use print + \n, you want me to just change this one or ?
>>> I'll rather leave them alone.
>>>
>>>>
>>>> -> char * OSContainer::container_type() - you free() that value but it is not strdup()ed.
>>>
>>> Thanks for seeing this, copy-paste :)
>>>
>>>>
>>>> (Btw I do not like that some functions return strduped values, some do not. If OSContainer::container_type() wants to return a static string, its return type should be at least const char*, not char*. They even added an explicit cast to nonconst char* .)
>>>
>>> Changed to const.
>>>
>>> Thanks, Robbin
>>>
>>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>>
>>>> On Wed, Feb 14, 2018 at 2:18 PM, Robbin Ehn <robbin.ehn at oracle.com <mailto:robbin.ehn at oracle.com>> wrote:
>>>>
>>>>     Ping!
>>>>
>>>>     /Robbin
>>>>
>>>>
>>>>     On 2018-02-08 13:35, Robbin Ehn wrote:
>>>>
>>>>         Hi David,
>>>>
>>>>         On 02/08/2018 01:19 PM, David Holmes wrote:
>>>>
>>>>             On 8/02/2018 10:08 PM, Robbin Ehn wrote:
>>>>
>>>>                 Hi David,
>>>>
>>>>                 On 02/08/2018 12:43 PM, David Holmes wrote:
>>>>
>>>>                     Hi Robbin,
>>>>
>>>>                     On 8/02/2018 7:03 PM, Robbin Ehn wrote:
>>>>
>>>>                         Hi all,
>>>>
>>>>                         There is a bad pointer comparison in os_linux.cpp while looking at that
>>>>
>>>>
>>>>                     You seem to be missing the fact that OSContainer::cpu_cpuset_memory_nodes() can return a pointer or an error code.
>>>>
>>>>
>>>>                 This is not true for macro:
>>>>                 GET_CONTAINER_INFO_CPTR
>>>>
>>>>                 As far I can see?
>>>>
>>>>
>>>>             Sorry - you're right. I misread current code and misremembered what happened at the initial code review - where I'm sure this "pointer versus error code" issue was also flagged.
>>>>
>>>>                 It returns:
>>>>                 if (err != 0)
>>>>                     return (return_type) NULL;
>>>>                 or:
>>>>                 return os::strdup(mems);
>>>>
>>>>                 If you know a method that returns an integer in a char*, it's broken should be fixed.
>>>>
>>>>
>>>>                         I saw some if statement were missing bracket, a lot of extra scopes and complexity in the scoping.
>>>>
>>>>
>>>>                     You'd better check all this with Bob Vandette as its his container support code.
>>>>
>>>>
>>>>                 Not sure what you mean. It passes container tests.
>>>>
>>>>
>>>>             In what environment did you run the container tests? Most of the code you've been refactoring deals with various errors and misconfigurations that can occur.
>>>>
>>>>             I'm sure Bob will want a chance to check the refactoring still does as he intended.
>>>>
>>>>
>>>>         On mach5 all platforms with the container test (+ hotspot_tier1) and locally.
>>>>         I said before, regarding containers, all Linux have cgroups configured so this logic always thinks we are in a container and does this logging and calculation.
>>>>
>>>>         But no there is no tests for miss-configuration that I found.
>>>>
>>>>         Added Bob!
>>>>
>>>>         Thanks, Robbin
>>>>
>>>>
>>>>
>>>>             Cheers,
>>>>             David
>>>>             -----
>>>>
>>>>
>>>>                 Thanks, Robbin
>>>>
>>>>
>>>>                     Cheers,
>>>>                     David
>>>>
>>>>                         Webrev:
>>>> http://cr.openjdk.java.net/~rehn/8197408/webrev/ <http://cr.openjdk.java.net/~rehn/8197408/webrev/>
>>>>                         Bug:
>>>> http://cr.openjdk.java.net/~rehn/8197408/webrev/ <http://cr.openjdk.java.net/~rehn/8197408/webrev/>
>>>>
>>>>                         Thanks, Robbin
>>>>
>>>>


More information about the hotspot-runtime-dev mailing list