RFR: 8197408: Bad pointer comparison and small cleanup in os_linux.cpp
Robbin Ehn
robbin.ehn at oracle.com
Thu Feb 15 12:36:45 UTC 2018
Hi David,
On 2018-02-15 12:13, David Holmes wrote:
> Hi Robbin,
>
> On 15/02/2018 7:59 PM, Robbin Ehn wrote:
>> Hi,
>>
>> I notice there is something strange about container test.
>> They are not in any test group and mach5 don't run them when specified via path.
>
> That's because we don't have containerized test environments generally available and the tests have:
In the actual tests we only care about cgroups as far as I can see?
So (most/all?) of the tests should work perfectly fine outside docker with just cgroups.
And since cgroups are default ON (as in the scheduler will put all task into an unlimited group) ~all Linuxes are _default_ containerized.
Plus in mach5 we run in mesos which uses cgroups, so test would work for that reason also!
mach5:
jib > [0.001s][trace][os,container] OSContainer::init: Initializing Container Support
jib > [0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/system.slice/mesos-slave.service/memory.limit_in_bytes
jib > [0.001s][trace][os,container] Memory Limit is: 9223372036854775807
jib > [0.001s][trace][os,container] Memory Limit is: Unlimited
jib > [0.001s][trace][os,container] Path to /cpu.shares is /sys/fs/cgroup/cpu,cpuacct/system.slice/mesos-slave.service/cpu.shares
jib > [0.001s][trace][os,container] CPU Shares is: 1024
...
We are default running in containerized environment.
/Robbin
>
> * @requires docker.support
>
> I've had the chance to look at this more closely and the refactoring seems okay to me.
>
> Thanks,
> David
> -----
>
>>
>> *sigh*
>>
>> So I have only manually verified this.
>>
>> Bob, can you have a look at changeset?
>>
>> /Robbin
>>
>> On 2018-02-15 10:37, Robbin Ehn wrote:
>>> Hi Thomas, thanks for having a look.
>>>
>>> On 2018-02-14 15:06, Thomas Stüfe wrote:
>>>> Hi Robin,
>>>>
>>>> Had a short look. Note that I cannot open the issue. Link is wrong, and the link in the source.patch is wrong too. So, I just looked at the webrev.
>>>>
>>>
>>> Sorry bug is:
>>> https://bugs.openjdk.java.net/browse/JDK-8197408
>>>
>>> Inc here:
>>> http://cr.openjdk.java.net/~rehn/8197408/v2/inc/webrev/
>>> Full here:
>>> http://cr.openjdk.java.net/~rehn/8197408/v2/full/webrev/
>>>
>>> Comments on remarks below:
>>>
>>>> Remarks:
>>>>
>>>> -> This is a matter of taste but I prefer pointers to references for output variables. Makes the intent clearer at the calling site, and that way you could get rid of the /* output */ comment.
>>>
>>> We had a debate about this, my opinion was that avoiding the NULL check makes it worth using a ref, but I see I'm pretty lonely in this camp so fixed.
>>>
>>>>
>>>> -> in available_memory_container():
>>>> Could we also add "using host value" to the logging in error case as you do in physical_memory_container()?
>>>
>>> Fixed, and tried to improve the logs.
>>>
>>>>
>>>> -> (Not part of your patch) Can OSContainer::memory_limit_in_bytes() actually ever return "OSCONTAINER_ERROR"? I may be wrong here but:
>>>>
>>>> OSCONTAINER_ERROR = -2. /memory.limit_in_bytes gets returned as julong, so we have (julong)(-2). That gets compared with julong _unlimited_memory which is basically LONG_MAX, so signed long max, which should be smaller than (julong)(-2), or? So OSContainer::memory_limit_in_bytes() should always return -1 for both errors and the unlimited case.
>>>>
>>>
>>> Manually tested this, it seems to work. The -2 gets promoted to unsigned in comparison.
>>> We go from (jlong)-2 to (julong)ULONG_MAX-1 back to (jlong)-2.
>>> Not obviously that it will always work. I will not touch that in this changeset.
>>>
>>>> -> Can we:
>>>> - st->print("container_type: %s\n", p != NULL ? p : "failed");
>>>> + st->print_cr("container_type: %s", p != NULL ? p : "failed");
>>>> ?
>>>
>>> All of these use print + \n, you want me to just change this one or ?
>>> I'll rather leave them alone.
>>>
>>>>
>>>> -> char * OSContainer::container_type() - you free() that value but it is not strdup()ed.
>>>
>>> Thanks for seeing this, copy-paste :)
>>>
>>>>
>>>> (Btw I do not like that some functions return strduped values, some do not. If OSContainer::container_type() wants to return a static string, its return type should be at least const char*, not char*. They even added an explicit cast to nonconst char* .)
>>>
>>> Changed to const.
>>>
>>> Thanks, Robbin
>>>
>>>>
>>>> Kind Regards, Thomas
>>>>
>>>>
>>>>
>>>> On Wed, Feb 14, 2018 at 2:18 PM, Robbin Ehn <robbin.ehn at oracle.com <mailto:robbin.ehn at oracle.com>> wrote:
>>>>
>>>> Ping!
>>>>
>>>> /Robbin
>>>>
>>>>
>>>> On 2018-02-08 13:35, Robbin Ehn wrote:
>>>>
>>>> Hi David,
>>>>
>>>> On 02/08/2018 01:19 PM, David Holmes wrote:
>>>>
>>>> On 8/02/2018 10:08 PM, Robbin Ehn wrote:
>>>>
>>>> Hi David,
>>>>
>>>> On 02/08/2018 12:43 PM, David Holmes wrote:
>>>>
>>>> Hi Robbin,
>>>>
>>>> On 8/02/2018 7:03 PM, Robbin Ehn wrote:
>>>>
>>>> Hi all,
>>>>
>>>> There is a bad pointer comparison in os_linux.cpp while looking at that
>>>>
>>>>
>>>> You seem to be missing the fact that OSContainer::cpu_cpuset_memory_nodes() can return a pointer or an error code.
>>>>
>>>>
>>>> This is not true for macro:
>>>> GET_CONTAINER_INFO_CPTR
>>>>
>>>> As far I can see?
>>>>
>>>>
>>>> Sorry - you're right. I misread current code and misremembered what happened at the initial code review - where I'm sure this "pointer versus error code" issue was also flagged.
>>>>
>>>> It returns:
>>>> if (err != 0)
>>>> return (return_type) NULL;
>>>> or:
>>>> return os::strdup(mems);
>>>>
>>>> If you know a method that returns an integer in a char*, it's broken should be fixed.
>>>>
>>>>
>>>> I saw some if statement were missing bracket, a lot of extra scopes and complexity in the scoping.
>>>>
>>>>
>>>> You'd better check all this with Bob Vandette as its his container support code.
>>>>
>>>>
>>>> Not sure what you mean. It passes container tests.
>>>>
>>>>
>>>> In what environment did you run the container tests? Most of the code you've been refactoring deals with various errors and misconfigurations that can occur.
>>>>
>>>> I'm sure Bob will want a chance to check the refactoring still does as he intended.
>>>>
>>>>
>>>> On mach5 all platforms with the container test (+ hotspot_tier1) and locally.
>>>> I said before, regarding containers, all Linux have cgroups configured so this logic always thinks we are in a container and does this logging and calculation.
>>>>
>>>> But no there is no tests for miss-configuration that I found.
>>>>
>>>> Added Bob!
>>>>
>>>> Thanks, Robbin
>>>>
>>>>
>>>>
>>>> Cheers,
>>>> David
>>>> -----
>>>>
>>>>
>>>> Thanks, Robbin
>>>>
>>>>
>>>> Cheers,
>>>> David
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~rehn/8197408/webrev/ <http://cr.openjdk.java.net/~rehn/8197408/webrev/>
>>>> Bug:
>>>> http://cr.openjdk.java.net/~rehn/8197408/webrev/ <http://cr.openjdk.java.net/~rehn/8197408/webrev/>
>>>>
>>>> Thanks, Robbin
>>>>
>>>>
More information about the hotspot-runtime-dev
mailing list