RFR: 8197408: Bad pointer comparison and small cleanup in os_linux.cpp
David Holmes
david.holmes at oracle.com
Thu Feb 15 13:07:50 UTC 2018
On 15/02/2018 10:36 PM, Robbin Ehn wrote:
> Hi David,
>
> On 2018-02-15 12:13, David Holmes wrote:
>> Hi Robbin,
>>
>> On 15/02/2018 7:59 PM, Robbin Ehn wrote:
>>> Hi,
>>>
>>> I notice there is something strange about container test.
>>> They are not in any test group and mach5 don't run them when
>>> specified via path.
>>
>> That's because we don't have containerized test environments generally
>> available and the tests have:
>
> In the actual tests we only care about cgroups as far as I can see?
Everything I see in runtime/containers/docker is utilising Docker not
(just) cgroups.
> So (most/all?) of the tests should work perfectly fine outside docker
> with just cgroups.
Wouldn't you need root access to configure the cgroups outside of Docker??
David
----
> And since cgroups are default ON (as in the scheduler will put all task
> into an unlimited group) ~all Linuxes are _default_ containerized.
> Plus in mach5 we run in mesos which uses cgroups, so test would work for
> that reason also!
>
> mach5:
> jib > [0.001s][trace][os,container] OSContainer::init: Initializing
> Container Support
> jib > [0.001s][trace][os,container] Path to /memory.limit_in_bytes is
> /sys/fs/cgroup/memory/system.slice/mesos-slave.service/memory.limit_in_bytes
>
> jib > [0.001s][trace][os,container] Memory Limit is: 9223372036854775807
> jib > [0.001s][trace][os,container] Memory Limit is: Unlimited
> jib > [0.001s][trace][os,container] Path to /cpu.shares is
> /sys/fs/cgroup/cpu,cpuacct/system.slice/mesos-slave.service/cpu.shares
> jib > [0.001s][trace][os,container] CPU Shares is: 1024
> ...
>
> We are default running in containerized environment.
>
> /Robbin
>
>
>>
>> * @requires docker.support
>>
>> I've had the chance to look at this more closely and the refactoring
>> seems okay to me.
>>
>> Thanks,
>> David
>> -----
>>
>>>
>>> *sigh*
>>>
>>> So I have only manually verified this.
>>>
>>> Bob, can you have a look at changeset?
>>>
>>> /Robbin
>>>
>>> On 2018-02-15 10:37, Robbin Ehn wrote:
>>>> Hi Thomas, thanks for having a look.
>>>>
>>>> On 2018-02-14 15:06, Thomas Stüfe wrote:
>>>>> Hi Robin,
>>>>>
>>>>> Had a short look. Note that I cannot open the issue. Link is wrong,
>>>>> and the link in the source.patch is wrong too. So, I just looked at
>>>>> the webrev.
>>>>>
>>>>
>>>> Sorry bug is:
>>>> https://bugs.openjdk.java.net/browse/JDK-8197408
>>>>
>>>> Inc here:
>>>> http://cr.openjdk.java.net/~rehn/8197408/v2/inc/webrev/
>>>> Full here:
>>>> http://cr.openjdk.java.net/~rehn/8197408/v2/full/webrev/
>>>>
>>>> Comments on remarks below:
>>>>
>>>>> Remarks:
>>>>>
>>>>> -> This is a matter of taste but I prefer pointers to references
>>>>> for output variables. Makes the intent clearer at the calling site,
>>>>> and that way you could get rid of the /* output */ comment.
>>>>
>>>> We had a debate about this, my opinion was that avoiding the NULL
>>>> check makes it worth using a ref, but I see I'm pretty lonely in
>>>> this camp so fixed.
>>>>
>>>>>
>>>>> -> in available_memory_container():
>>>>> Could we also add "using host value" to the logging in error case
>>>>> as you do in physical_memory_container()?
>>>>
>>>> Fixed, and tried to improve the logs.
>>>>
>>>>>
>>>>> -> (Not part of your patch) Can
>>>>> OSContainer::memory_limit_in_bytes() actually ever return
>>>>> "OSCONTAINER_ERROR"? I may be wrong here but:
>>>>>
>>>>> OSCONTAINER_ERROR = -2. /memory.limit_in_bytes gets returned as
>>>>> julong, so we have (julong)(-2). That gets compared with julong
>>>>> _unlimited_memory which is basically LONG_MAX, so signed long max,
>>>>> which should be smaller than (julong)(-2), or? So
>>>>> OSContainer::memory_limit_in_bytes() should always return -1 for
>>>>> both errors and the unlimited case.
>>>>>
>>>>
>>>> Manually tested this, it seems to work. The -2 gets promoted to
>>>> unsigned in comparison.
>>>> We go from (jlong)-2 to (julong)ULONG_MAX-1 back to (jlong)-2.
>>>> Not obviously that it will always work. I will not touch that in
>>>> this changeset.
>>>>
>>>>> -> Can we:
>>>>> - st->print("container_type: %s\n", p != NULL ? p : "failed");
>>>>> + st->print_cr("container_type: %s", p != NULL ? p : "failed");
>>>>> ?
>>>>
>>>> All of these use print + \n, you want me to just change this one or ?
>>>> I'll rather leave them alone.
>>>>
>>>>>
>>>>> -> char * OSContainer::container_type() - you free() that value but
>>>>> it is not strdup()ed.
>>>>
>>>> Thanks for seeing this, copy-paste :)
>>>>
>>>>>
>>>>> (Btw I do not like that some functions return strduped values, some
>>>>> do not. If OSContainer::container_type() wants to return a static
>>>>> string, its return type should be at least const char*, not char*.
>>>>> They even added an explicit cast to nonconst char* .)
>>>>
>>>> Changed to const.
>>>>
>>>> Thanks, Robbin
>>>>
>>>>>
>>>>> Kind Regards, Thomas
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 14, 2018 at 2:18 PM, Robbin Ehn <robbin.ehn at oracle.com
>>>>> <mailto:robbin.ehn at oracle.com>> wrote:
>>>>>
>>>>> Ping!
>>>>>
>>>>> /Robbin
>>>>>
>>>>>
>>>>> On 2018-02-08 13:35, Robbin Ehn wrote:
>>>>>
>>>>> Hi David,
>>>>>
>>>>> On 02/08/2018 01:19 PM, David Holmes wrote:
>>>>>
>>>>> On 8/02/2018 10:08 PM, Robbin Ehn wrote:
>>>>>
>>>>> Hi David,
>>>>>
>>>>> On 02/08/2018 12:43 PM, David Holmes wrote:
>>>>>
>>>>> Hi Robbin,
>>>>>
>>>>> On 8/02/2018 7:03 PM, Robbin Ehn wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> There is a bad pointer comparison in
>>>>> os_linux.cpp while looking at that
>>>>>
>>>>>
>>>>> You seem to be missing the fact that
>>>>> OSContainer::cpu_cpuset_memory_nodes() can return a pointer or an
>>>>> error code.
>>>>>
>>>>>
>>>>> This is not true for macro:
>>>>> GET_CONTAINER_INFO_CPTR
>>>>>
>>>>> As far I can see?
>>>>>
>>>>>
>>>>> Sorry - you're right. I misread current code and
>>>>> misremembered what happened at the initial code review - where I'm
>>>>> sure this "pointer versus error code" issue was also flagged.
>>>>>
>>>>> It returns:
>>>>> if (err != 0)
>>>>> return (return_type) NULL;
>>>>> or:
>>>>> return os::strdup(mems);
>>>>>
>>>>> If you know a method that returns an integer in a
>>>>> char*, it's broken should be fixed.
>>>>>
>>>>>
>>>>> I saw some if statement were missing
>>>>> bracket, a lot of extra scopes and complexity in the scoping.
>>>>>
>>>>>
>>>>> You'd better check all this with Bob Vandette
>>>>> as its his container support code.
>>>>>
>>>>>
>>>>> Not sure what you mean. It passes container tests.
>>>>>
>>>>>
>>>>> In what environment did you run the container tests?
>>>>> Most of the code you've been refactoring deals with various errors
>>>>> and misconfigurations that can occur.
>>>>>
>>>>> I'm sure Bob will want a chance to check the
>>>>> refactoring still does as he intended.
>>>>>
>>>>>
>>>>> On mach5 all platforms with the container test (+
>>>>> hotspot_tier1) and locally.
>>>>> I said before, regarding containers, all Linux have cgroups
>>>>> configured so this logic always thinks we are in a container and
>>>>> does this logging and calculation.
>>>>>
>>>>> But no there is no tests for miss-configuration that I found.
>>>>>
>>>>> Added Bob!
>>>>>
>>>>> Thanks, Robbin
>>>>>
>>>>>
>>>>>
>>>>> Cheers,
>>>>> David
>>>>> -----
>>>>>
>>>>>
>>>>> Thanks, Robbin
>>>>>
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~rehn/8197408/webrev/
>>>>> <http://cr.openjdk.java.net/~rehn/8197408/webrev/>
>>>>> Bug:
>>>>> http://cr.openjdk.java.net/~rehn/8197408/webrev/
>>>>> <http://cr.openjdk.java.net/~rehn/8197408/webrev/>
>>>>>
>>>>> Thanks, Robbin
>>>>>
>>>>>
More information about the hotspot-runtime-dev
mailing list