RFR: 8146115 - Improve docker container detection and resource configuration usage

Tue Oct 3 12:39:38 UTC 2017

On 10/03/2017 02:25 PM, Bob Vandette wrote:
> After talking to a number of folks and getting feedback, my current thinking is to enable the support by default.

Great.

> 
> I still want to include the flag for at least one Java release in case the new behavior causes a regression.
> I’m trying to make the detection robust so that it falls back to the current behavior if
> cgroups are not configured as expected, but I’d like to have a way of forcing the issue.  JDK 10 is not
> supposed to be a long-term support release, which makes it a good target for this new behavior.
> 
> I agree with David that once we commit to cgroups, we should extract all VM configuration data from that
> source.  There’s more information available for cpusets than just processor affinity that we might want to
> consider when calculating the number of processors to assume for the VM.  There’s exclusivity and
> effective cpu data available in addition to the cpuset string.

cgroups only contain limits, not the real hard limits.
You must consider the affinity mask. Those of us with NUMA nodes do:

[rehn@rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -cp . ForEver | grep proc
[0.001s][debug][os] Initial active processor count set to 16
[rehn@rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc
[0.001s][debug][os] Initial active processor count set to 32

all the time when benchmarking, and the count must be 16; otherwise the flag is really bad for us.
So the flag actually breaks the little NUMA support we have now.
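
To illustrate the point, here is a minimal Python sketch (not the HotSpot code) of a processor count that honors both sources: it starts from the size of the affinity mask and caps it by the CFS quota when one is set. The function name effective_cpu_count and its parameters are made up for illustration. The quota semantics assumed are those of cgroup v1's cpu.cfs_quota_us and cpu.cfs_period_us, where a quota of -1 means no limit. Rounding the quota up is an illustrative choice, not something the webrev is known to do.

```python
import math


def effective_cpu_count(affinity_cpus, quota_us=-1, period_us=100000):
    """Return the CPU count implied by an affinity mask and a CFS quota.

    affinity_cpus: iterable of CPU ids the process may run on, for example
                   the result of os.sched_getaffinity(0) on Linux.
    quota_us, period_us: values in the style of cpu.cfs_quota_us and
                   cpu.cfs_period_us; a quota of -1 means "no limit".
    """
    # Start from the affinity mask, which already reflects cpusets,
    # taskset and numactl bindings.
    count = len(set(affinity_cpus))
    if quota_us > 0 and period_us > 0:
        # A quota of 1.5 periods is rounded up to 2 CPUs (illustrative choice).
        quota_cpus = math.ceil(quota_us / period_us)
        count = min(count, quota_cpus)
    # Never report fewer than one processor.
    return max(count, 1)
```

With a 16-CPU mask and no quota this gives 16, so a binding such as the numactl one above would be preserved; a quota can only lower the number, never raise it back to the machine total.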

Thanks, Robbin

> 
> Bob.
> 
> 
>> On Oct 3, 2017, at 4:00 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>
>> Hi David,
>>
>> On 10/03/2017 12:46 AM, David Holmes wrote:
>>> Hi Robbin,
>>> I have some views on this :)
>>> On 3/10/2017 6:20 AM, Robbin Ehn wrote:
>>>> Hi Bob,
>>>>
>>>> As I said in your presentation for RT.
>>>> If the kernel is configured with cgroups, this should always be read (otherwise we get wrong values).
>>>> E.g. Fedora has had cgroups on by default for several years (I believe most distros have them on).
>>>>
>>>> - No option is needed at all: right now we have wrong values and your fix will provide the right ones, so why would you ever want to turn that off?
>>> It's not that you would want to turn that off (necessarily), but just because the cgroups capability exists doesn't mean cgroups have actually been enabled and configured - in which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in - as was the experimental cgroup support we added. Once it becomes clearer how this needs to be used we can adjust the defaults. For now this is enabling technology only.
>>
>> If cgroups are mounted they are on, and the only way to know the configuration (such as no limits) is to actually read the cgroup filesystem.
>> Therefore the flag makes no sense.
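
As a rough sketch of what "cgroups are mounted" can mean in practice, the Python helper below scans text in /proc/mounts format for cgroup v1 mounts and records where a few controllers live. The helper name cgroup_v1_mounts and the choice of controllers are assumptions made for illustration; it only looks at cgroup v1 and is not what the webrev does.

```python
def cgroup_v1_mounts(mounts_text):
    """Map cgroup v1 controllers to their mount points.

    mounts_text: contents in /proc/mounts format, one mount per line:
                 device mountpoint fstype options dump pass
    """
    controllers = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Only cgroup v1 mounts are of interest here.
        if len(fields) < 4 or fields[2] != "cgroup":
            continue
        mount_point = fields[1]
        options = fields[3].split(",")
        # In cgroup v1 the controller names appear among the mount options.
        for name in ("cpu", "cpuacct", "cpuset", "memory"):
            if name in options:
                controllers[name] = mount_point
    return controllers
```

On a live Linux system one would pass in the contents of /proc/mounts. An empty result means no such controller is mounted; a non-empty one only says where to look, and the limit files under that mount point still have to be read to learn whether any limit is set.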
>>
>>>> - The log target "container" would make little sense, since almost all Linux systems run with cgroups on.
>>> Again the capability is present but may not be enabled/configured.
>>
>> The capability is on if cgroups are mounted, and the only way to know the configuration is to read the cgroup filesystem.
>>
>>>> - For cpuset, the process's affinity mask already reflects the cgroup setting, so you don't need to look into cgroups for that.
>>>>     If you do, you would miss any process-specific affinity mask. So _cpu_count() should already be returning the right number of CPUs.
>>> While the process affinity mask reflects cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. And if shares/quotas are enforced and someone sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of CPUs to be hardwired via a flag. So it's better IMHO to read everything from the cgroups if configured to use cgroups.
>>
>> I'm not talking about shares and quotas; they should be read, of course, but the cpuset should be checked as in _cpu_count.
>>
>> Here is the bug:
>>
>> [rehn@rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . ForEver | grep proc
>> [0.002s][debug][os] Initial active processor count set to 4
>> ^C
>> [rehn@rehn-ws dev]$ taskset --cpu-list 0-2,6 java -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc
>> [0.003s][debug][os] Initial active processor count set to 32
>> ^C
>>
>> _cpu_count already does the right thing.
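
For readers unfamiliar with the list syntax, the CPU list 0-2,6 passed to taskset above names four CPUs, which is why 4 is the expected count. A small Python sketch of expanding such a list follows; the function name parse_cpu_list is hypothetical, and it handles only plain ids and ranges, not stride forms.

```python
def parse_cpu_list(cpu_list):
    """Expand a CPU list such as "0-2,6" into a sorted list of CPU ids."""
    cpus = set()
    for part in cpu_list.strip().split(","):
        if not part:
            # Tolerate empty input and stray commas.
            continue
        if "-" in part:
            # A range "first-last" is inclusive on both ends.
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)
```

Here parse_cpu_list("0-2,6") yields CPUs 0, 1, 2 and 6, matching the count of 4 reported without the flag.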
>>
>> Thanks, Robbin
>>
>>
>>> Cheers,
>>> David
>>>>
>>>> Thanks for trying to fix this!
>>>>
>>>> /Robbin
>>>>
>>>> On 09/22/2017 04:27 PM, Bob Vandette wrote:
>>>>> Please review these changes that improve docker container detection and the
>>>>> automatic configuration of the number of active CPUs and total and free memory
>>>>> based on the container's resource limit settings and metric data files.
>>>>>
>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/
>>>>>
>>>>> These changes are enabled with -XX:+UseContainerSupport.
>>>>>
>>>>> You can enable logging for this support via -Xlog:os+container=trace.
>>>>>
>>>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares
>>>>> may not satisfy every user's needs, I’ve added an additional flag to allow the
>>>>> number of CPUs to be overridden.  This flag is named -XX:ActiveProcessorCount=xx.
>>>>>
>>>>>
>>>>> Bob.
>>>>>
>>>>>
>>>>>
>