RFR: 8146115 - Improve docker container detection and resource configuration usage

Robbin Ehn robbin.ehn at oracle.com
Tue Oct 3 10:45:10 UTC 2017


Hi David, I think we are seeing this issue from completely opposite sides. (This RFE could be pushed as a bug from my POV.)

On 10/03/2017 10:42 AM, David Holmes wrote:
> On 3/10/2017 6:00 PM, Robbin Ehn wrote:
>> Hi David,
>>
>> On 10/03/2017 12:46 AM, David Holmes wrote:
>>> Hi Robbin,
>>>
>>> I have some views on this :)
>>>
>>> On 3/10/2017 6:20 AM, Robbin Ehn wrote:
>>>> Hi Bob,
>>>>
>>>> As I said in your presentation for RT.
>>>> If the kernel is configured with cgroups, these values should always be read (otherwise we get wrong values).
>>>> E.g. Fedora has had cgroups on by default for several years (I believe most distros have them on).
>>>>
>>>> - No option is needed at all: right now we have wrong values and your fix will provide right ones, so why would you ever want to turn that off?
>>>
>>> It's not that you would want to turn that off (necessarily) but just because cgroups capability exists it doesn't mean they have actually been enabled and configured - 
>>> in which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in - as was the experimental cgroup support we added. Once it becomes 
>>> clearer how this needs to be used we can adjust the defaults. For now this is enabling technology only.
>>
>> If cgroups are mounted they are on, and the only way to know the configuration (such as no limits) is to actually read the cgroup filesystem.
>> Therefore the flag makes no sense.
> 
> No that is exactly why it is opt-in! Why should we have to waste startup time reading a bunch of cgroup values just to determine that cgroups are not actually being used!

If you have a cgroup-enabled kernel they _are_ being used, no escaping that.
cgroups are not a simple yes or no: which resources are controlled depends on how you configured your kernel.
To find out which resources are controlled and what limits are set, we need to read them.

I'd rather waste startup time (0.103292989 vs 0.103577139 seconds) and get the values correct, so our heuristics work fine out-of-the-box. (And if you must, make it opt-out.)

Also, I notice that we don't read the NUMA values, so the phys mem method does a poor job. The correct approach would be to check at least the cgroup and NUMA bindings.
We also have the option UseCGroupMemoryLimitForHeap, which should be removed.

> 
>>>
>>>> - A "container" log target would make little sense since almost all Linux distributions run with cgroups on.
>>>
>>> Again the capability is present but may not be enabled/configured.
>>
>> The capability is on if cgroups are mounted, and the only way to know the configuration is to read the cgroup filesystem.
>>
>>>
>>>> - For cpuset, the process's affinity mask already reflects the cgroup setting, so you don't need to look into cgroups for that.
>>>>    If you did, you would miss any process-specific affinity mask. So _cpu_count() should already be returning the right number of CPUs.
>>>
>>> While the process affinity mask reflects cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. And if shares/quotas are enforced and 
>>> someone sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of CPUs to be hardwired via a flag. So it's 
>>> better IMHO to read everything from the cgroups if configured to use cgroups.
>>
>> I'm not talking about shares and quotas; they should be read, of course. But cpuset should be checked the way _cpu_count already does it.
>>
>> Here is the bug:
>>
>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . ForEver | grep proc
>> [0.002s][debug][os] Initial active processor count set to 4
>> ^C
>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc
>> [0.003s][debug][os] Initial active processor count set to 32
>> ^C
>>
>> _cpu_count already does the right thing.
> 
> But how do you then combine that information with the use of shares and/or quotas?

That I don't know; a wild, naive guess would be:
active count ~ MIN(OSContainer::pd_active_processor_count(), cpuset); :)

I assume everything we need to know is in: https://www.kernel.org/doc/Documentation/cgroup-v1/cpusets.txt

Thanks, Robbin

> 
> David
> -----
> 
>> Thanks, Robbin
>>
>>
>>>
>>> Cheers,
>>> David
>>>
>>>>
>>>> Thanks for trying to fix this!
>>>>
>>>> /Robbin
>>>>
>>>> On 09/22/2017 04:27 PM, Bob Vandette wrote:
>>>>> Please review these changes that improve docker container detection and the
>>>>> automatic configuration of the number of active CPUs and of total and free memory
>>>>> based on the container's resource limitation settings and metric data files.
>>>>>
>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ <http://cr.openjdk.java.net/~bobv/8146115/webrev.00/>
>>>>>
>>>>> These changes are enabled with -XX:+UseContainerSupport.
>>>>>
>>>>> You can enable logging for this support via -Xlog:os+container=trace.
>>>>>
>>>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares
>>>>> may not satisfy every user's needs, I’ve added an additional flag to allow the
>>>>> number of CPUs to be overridden.  This flag is named -XX:ActiveProcessorCount=xx.
>>>>>
>>>>>
>>>>> Bob.
>>>>>
>>>>>
>>>>>


More information about the hotspot-dev mailing list