RFR: 8197408: Bad pointer comparison and small cleanup in os_linux.cpp

David Holmes david.holmes at oracle.com
Thu Feb 15 13:07:50 UTC 2018


On 15/02/2018 10:36 PM, Robbin Ehn wrote:
> Hi David,
> 
> On 2018-02-15 12:13, David Holmes wrote:
>> Hi Robbin,
>>
>> On 15/02/2018 7:59 PM, Robbin Ehn wrote:
>>> Hi,
>>>
>>> I notice there is something strange about container test.
>>> They are not in any test group and mach5 don't run them when 
>>> specified via path.
>>
>> That's because we don't have containerized test environments generally 
>> available and the tests have:
> 
> In the actual tests we only care about cgroups as far as I can see?

Everything I see in runtime/containers/docker is utilising Docker not 
(just) cgroups.

> So (most/all?) of the tests should work perfectly fine outside docker 
> with just cgroups.

Wouldn't you need root access to configure the cgroups outside of Docker??

David
----

> And since cgroups are default ON (as in the scheduler will put all task 
> into an unlimited group) ~all Linuxes are _default_ containerized.
> Plus in mach5 we run in mesos which uses cgroups, so test would work for 
> that reason also!
> 
> mach5:
> jib > [0.001s][trace][os,container] OSContainer::init: Initializing 
> Container Support
> jib > [0.001s][trace][os,container] Path to /memory.limit_in_bytes is 
> /sys/fs/cgroup/memory/system.slice/mesos-slave.service/memory.limit_in_bytes 
> 
> jib > [0.001s][trace][os,container] Memory Limit is: 9223372036854775807
> jib > [0.001s][trace][os,container] Memory Limit is: Unlimited
> jib > [0.001s][trace][os,container] Path to /cpu.shares is 
> /sys/fs/cgroup/cpu,cpuacct/system.slice/mesos-slave.service/cpu.shares
> jib > [0.001s][trace][os,container] CPU Shares is: 1024
> ...
> 
> We are default running in containerized environment.
> 
> /Robbin
> 
> 
>>
>>   * @requires docker.support
>>
>> I've had the chance to look at this more closely and the refactoring 
>> seems okay to me.
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>> *sigh*
>>>
>>> So I have only manually verified this.
>>>
>>> Bob, can you have a look at changeset?
>>>
>>> /Robbin
>>>
>>> On 2018-02-15 10:37, Robbin Ehn wrote:
>>>> Hi Thomas, thanks for having a look.
>>>>
>>>> On 2018-02-14 15:06, Thomas Stüfe wrote:
>>>>> Hi Robin,
>>>>>
>>>>> Had a short look. Note that I cannot open the issue. Link is wrong, 
>>>>> and the link in the source.patch is wrong too. So, I just looked at 
>>>>> the webrev.
>>>>>
>>>>
>>>> Sorry bug is:
>>>> https://bugs.openjdk.java.net/browse/JDK-8197408
>>>>
>>>> Inc here:
>>>> http://cr.openjdk.java.net/~rehn/8197408/v2/inc/webrev/
>>>> Full here:
>>>> http://cr.openjdk.java.net/~rehn/8197408/v2/full/webrev/
>>>>
>>>> Comments on remarks below:
>>>>
>>>>> Remarks:
>>>>>
>>>>> -> This is a matter of taste but I prefer pointers to references 
>>>>> for output variables. Makes the intent clearer at the calling site, 
>>>>> and that way you could get rid of the /* output */ comment.
>>>>
>>>> We had a debate about this, my opinion was that avoiding the NULL 
>>>> check makes it worth using a ref, but I see I'm pretty lonely in 
>>>> this camp so fixed.
>>>>
>>>>>
>>>>> -> in available_memory_container():
>>>>> Could we also add "using host value" to the logging in error case 
>>>>> as you do in physical_memory_container()?
>>>>
>>>> Fixed, and tried to improve the logs.
>>>>
>>>>>
>>>>> -> (Not part of your patch) Can 
>>>>> OSContainer::memory_limit_in_bytes() actually ever return 
>>>>> "OSCONTAINER_ERROR"? I may be wrong here but:
>>>>>
>>>>> OSCONTAINER_ERROR = -2. /memory.limit_in_bytes gets returned as 
>>>>> julong, so we have (julong)(-2). That gets compared with julong 
>>>>> _unlimited_memory which is basically LONG_MAX, so signed long max, 
>>>>> which should be smaller than (julong)(-2), or? So 
>>>>> OSContainer::memory_limit_in_bytes() should always return -1 for 
>>>>> both errors and the unlimited case.
>>>>>
>>>>
>>>> Manually tested this, it seems to work. The -2 gets promoted to 
>>>> unsigned in comparison.
>>>> We go from (jlong)-2 to (julong)ULONG_MAX-1 back to (jlong)-2.
>>>> Not obviously that it will always work. I will not touch that in 
>>>> this changeset.
>>>>
>>>>> -> Can we:
>>>>> -  st->print("container_type: %s\n", p != NULL ? p : "failed");
>>>>> +  st->print_cr("container_type: %s", p != NULL ? p : "failed");
>>>>> ?
>>>>
>>>> All of these use print + \n, you want me to just change this one or ?
>>>> I'll rather leave them alone.
>>>>
>>>>>
>>>>> -> char * OSContainer::container_type() - you free() that value but 
>>>>> it is not strdup()ed.
>>>>
>>>> Thanks for seeing this, copy-paste :)
>>>>
>>>>>
>>>>> (Btw I do not like that some functions return strduped values, some 
>>>>> do not. If OSContainer::container_type() wants to return a static 
>>>>> string, its return type should be at least const char*, not char*. 
>>>>> They even added an explicit cast to nonconst char* .)
>>>>
>>>> Changed to const.
>>>>
>>>> Thanks, Robbin
>>>>
>>>>>
>>>>> Kind Regards, Thomas
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 14, 2018 at 2:18 PM, Robbin Ehn <robbin.ehn at oracle.com 
>>>>> <mailto:robbin.ehn at oracle.com>> wrote:
>>>>>
>>>>>     Ping!
>>>>>
>>>>>     /Robbin
>>>>>
>>>>>
>>>>>     On 2018-02-08 13:35, Robbin Ehn wrote:
>>>>>
>>>>>         Hi David,
>>>>>
>>>>>         On 02/08/2018 01:19 PM, David Holmes wrote:
>>>>>
>>>>>             On 8/02/2018 10:08 PM, Robbin Ehn wrote:
>>>>>
>>>>>                 Hi David,
>>>>>
>>>>>                 On 02/08/2018 12:43 PM, David Holmes wrote:
>>>>>
>>>>>                     Hi Robbin,
>>>>>
>>>>>                     On 8/02/2018 7:03 PM, Robbin Ehn wrote:
>>>>>
>>>>>                         Hi all,
>>>>>
>>>>>                         There is a bad pointer comparison in 
>>>>> os_linux.cpp while looking at that
>>>>>
>>>>>
>>>>>                     You seem to be missing the fact that 
>>>>> OSContainer::cpu_cpuset_memory_nodes() can return a pointer or an 
>>>>> error code.
>>>>>
>>>>>
>>>>>                 This is not true for macro:
>>>>>                 GET_CONTAINER_INFO_CPTR
>>>>>
>>>>>                 As far I can see?
>>>>>
>>>>>
>>>>>             Sorry - you're right. I misread current code and 
>>>>> misremembered what happened at the initial code review - where I'm 
>>>>> sure this "pointer versus error code" issue was also flagged.
>>>>>
>>>>>                 It returns:
>>>>>                 if (err != 0)
>>>>>                     return (return_type) NULL;
>>>>>                 or:
>>>>>                 return os::strdup(mems);
>>>>>
>>>>>                 If you know a method that returns an integer in a 
>>>>> char*, it's broken should be fixed.
>>>>>
>>>>>
>>>>>                         I saw some if statement were missing 
>>>>> bracket, a lot of extra scopes and complexity in the scoping.
>>>>>
>>>>>
>>>>>                     You'd better check all this with Bob Vandette 
>>>>> as its his container support code.
>>>>>
>>>>>
>>>>>                 Not sure what you mean. It passes container tests.
>>>>>
>>>>>
>>>>>             In what environment did you run the container tests? 
>>>>> Most of the code you've been refactoring deals with various errors 
>>>>> and misconfigurations that can occur.
>>>>>
>>>>>             I'm sure Bob will want a chance to check the 
>>>>> refactoring still does as he intended.
>>>>>
>>>>>
>>>>>         On mach5 all platforms with the container test (+ 
>>>>> hotspot_tier1) and locally.
>>>>>         I said before, regarding containers, all Linux have cgroups 
>>>>> configured so this logic always thinks we are in a container and 
>>>>> does this logging and calculation.
>>>>>
>>>>>         But no there is no tests for miss-configuration that I found.
>>>>>
>>>>>         Added Bob!
>>>>>
>>>>>         Thanks, Robbin
>>>>>
>>>>>
>>>>>
>>>>>             Cheers,
>>>>>             David
>>>>>             -----
>>>>>
>>>>>
>>>>>                 Thanks, Robbin
>>>>>
>>>>>
>>>>>                     Cheers,
>>>>>                     David
>>>>>
>>>>>                         Webrev:
>>>>> http://cr.openjdk.java.net/~rehn/8197408/webrev/ 
>>>>> <http://cr.openjdk.java.net/~rehn/8197408/webrev/>
>>>>>                         Bug:
>>>>> http://cr.openjdk.java.net/~rehn/8197408/webrev/ 
>>>>> <http://cr.openjdk.java.net/~rehn/8197408/webrev/>
>>>>>
>>>>>                         Thanks, Robbin
>>>>>
>>>>>


More information about the hotspot-runtime-dev mailing list