RFR: 8146115 - Improve docker container detection and resource configuration usage
Bob Vandette
bob.vandette at oracle.com
Tue Oct 3 14:41:38 UTC 2017
> On Oct 3, 2017, at 8:39 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>
> On 10/03/2017 02:25 PM, Bob Vandette wrote:
>> After talking to a number of folks and getting feedback, my current thinking is to enable the support by default.
>
> Great.
>
>> I still want to include the flag for at least one Java release in case the new behavior causes a regression.
>> I’m trying to make the detection robust so that it falls back to the current behavior if cgroups are not
>> configured as expected, but I’d like to have a way of forcing the issue. JDK 10 is not
>> supposed to be a long-term support release, which makes it a good target for this new behavior.
>> I agree with David that once we commit to cgroups, we should extract all VM configuration data from that
>> source. There’s more information available for cpusets than just processor affinity that we might want to
>> consider when calculating the number of processors to assume for the VM. There’s exclusivity and
>> effective cpu data available in addition to the cpuset string.
>
> cgroups only contain the configured limits, not the effective hard limits.
> You must also consider the affinity mask. Those of us with NUMA nodes do:
>
> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -cp . ForEver | grep proc
> [0.001s][debug][os] Initial active processor count set to 16
> [rehn at rehn-ws dev]$ numactl --cpunodebind=1 --membind=1 java -Xlog:os=debug -XX:+UseContainerSupport -cp . ForEver | grep proc
> [0.001s][debug][os] Initial active processor count set to 32
>
> We benchmark like this all the time, and that count must be 16; otherwise the flag is really bad for us.
> So the flag actually breaks the little NUMA support we have now.
Thanks for sharing those results. I’ll look into this.
I suspect this is because I am not yet examining the memory node files in the cgroup file
system.
Bob.
>
> Thanks, Robbin
>
>> Bob.
>>> On Oct 3, 2017, at 4:00 AM, Robbin Ehn <robbin.ehn at oracle.com> wrote:
>>>
>>> Hi David,
>>>
>>> On 10/03/2017 12:46 AM, David Holmes wrote:
>>>> Hi Robbin,
>>>> I have some views on this :)
>>>> On 3/10/2017 6:20 AM, Robbin Ehn wrote:
>>>>> Hi Bob,
>>>>>
>>>>> As I said in your presentation for RT.
>>>>> If the kernel is configured with cgroups, they should always be read (otherwise we get wrong values).
>>>>> E.g. Fedora has had cgroups on by default for several years (I believe most distros have them on).
>>>>>
>>>>> - No option is needed at all: right now we have wrong values and your fix will provide the right ones, so why would you ever want to turn that off?
>>>> It's not that you would (necessarily) want to turn it off, but just because the cgroups capability exists doesn't mean it has actually been enabled and configured - in which case reading all the cgroup info is unnecessary startup overhead. So for now this is opt-in, as was the experimental cgroup support we added. Once it becomes clearer how this needs to be used, we can adjust the defaults. For now this is enabling technology only.
>>>
>>> If cgroups are mounted they are on, and the only way to know the configuration (such as no limits) is to actually read the cgroup filesystem.
>>> Therefore the flag makes no sense.
>>>
>>>>> - A container log target would make little sense, since almost all Linux distributions run with cgroups on.
>>>> Again the capability is present but may not be enabled/configured.
>>>
>>> The capability is on if cgroups are mounted, and the only way to know the configuration is to read the cgroup filesystem.
>>>
>>>>> - For cpusets, the process's affinity mask already reflects the cgroup settings, so you don't need to look into cgroups for that.
>>>>> If you do, you will miss any process-specific affinity mask. So _cpu_count() should already be returning the right number of CPUs.
>>>> While the process affinity mask reflects cpusets (and we already use it for that reason), it doesn't reflect shares and quotas. And if shares/quotas are enforced and someone sets a custom affinity mask, what is it all supposed to mean? That's one of the main reasons to allow the number of CPUs to be hardwired via a flag. So it's better IMHO to read everything from the cgroups if configured to use cgroups.
>>>
>>> I'm not talking about shares and quotas, they should be read of course, but the cpuset should be checked via the affinity mask, as in _cpu_count.
>>>
>>> Here is the bug:
>>>
>>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -Xlog:os=debug -cp . ForEver | grep proc
>>> [0.002s][debug][os] Initial active processor count set to 4
>>> ^C
>>> [rehn at rehn-ws dev]$ taskset --cpu-list 0-2,6 java -XX:+UseContainerSupport -Xlog:os=debug -cp . ForEver | grep proc
>>> [0.003s][debug][os] Initial active processor count set to 32
>>> ^C
>>>
>>> _cpu_count already does the right thing.
>>>
>>> Thanks, Robbin
>>>
>>>
>>>> Cheers,
>>>> David
>>>>>
>>>>> Thanks for trying to fix this!
>>>>>
>>>>> /Robbin
>>>>>
>>>>> On 09/22/2017 04:27 PM, Bob Vandette wrote:
>>>>>> Please review these changes, which improve docker container detection and the
>>>>>> automatic configuration of the number of active CPUs and of total and free memory
>>>>>> based on the container's resource limitation settings and metric data files.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~bobv/8146115/webrev.00/ <http://cr.openjdk.java.net/~bobv/8146115/webrev.00/>
>>>>>>
>>>>>> These changes are enabled with -XX:+UseContainerSupport.
>>>>>>
>>>>>> You can enable logging for this support via -Xlog:os+container=trace.
>>>>>>
>>>>>> Since the dynamic selection of CPUs based on cpusets, quotas and shares
>>>>>> may not satisfy every user's needs, I’ve added an additional flag that allows the
>>>>>> number of CPUs to be overridden. This flag is named -XX:ActiveProcessorCount=xx.
>>>>>>
>>>>>>
>>>>>> Bob.
>>>>>>
>>>>>>
>>>>>>