[Containers] Reasoning for cpu shares limits

Severin Gehwolf sgehwolf at redhat.com
Fri Jan 4 18:09:44 UTC 2019


Hi,

Having come across this Cloud Foundry issue[1], I wonder why the cgroup
cpu shares value is being used in the JVM as a heuristic for the number
of available processors.

From the docker-run man page:

---------------------------------------------------------
       --cpu-shares=0
          CPU shares (relative weight)

       By default, all containers get the same proportion of CPU cycles. This proportion can be modified by changing the container's CPU share weighting relative to the weighting of all other running
       containers.

       To modify the proportion from the default of 1024, use the --cpu-shares flag to set the weighting to 2 or higher.

       The proportion will only apply when CPU-intensive processes are running.  When tasks in one container are idle, other containers can use the left-over CPU time. The actual amount of CPU time will
       vary depending on the number of containers running on the system.

       For example, consider three containers, one has a cpu-share of 1024 and two others have a cpu-share setting of 512. When processes in all three containers attempt to use 100% of CPU, the first
       container would receive 50% of the total CPU time. If you add a fourth container with a cpu-share of 1024, the first container only gets 33% of the CPU. The remaining containers receive 16.5%, 16.5%
       and 33% of the CPU.

       On a multi-core system, the shares of CPU time are distributed over all CPU cores. Even if a container is limited to less than 100% of CPU time, it can use 100% of each individual CPU core.

       For example, consider a system with more than three cores. If you start one container {C0} with -c=512 running one process, and another container {C1} with -c=1024 running two processes, this can
       result in the following division of CPU shares:

              PID    container    CPU CPU share
              100    {C0}     0   100% of CPU0
              101    {C1}     1   100% of CPU1
              102    {C1}     2   100% of CPU2

---------------------------------------------------------

So the cpu shares value (unlike --cpu-quota) is a relative weight.
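
To make the man page arithmetic concrete, here is a minimal sketch. The
class and method names are made up for illustration and are not part of
the JDK or docker. It assumes all containers are CPU-bound at the same
time, so each one's portion is its shares divided by the sum of all
shares.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not JDK or docker code.
final class ShareProportions {

    // Fraction of total CPU time each container receives when every
    // container is trying to use as much CPU as it can.
    static double[] proportions(int... shares) {
        long total = 0;
        for (int s : shares) {
            total += s;
        }
        double[] result = new double[shares.length];
        for (int i = 0; i < shares.length; i++) {
            result[i] = (double) shares[i] / total;
        }
        return result;
    }

    public static void main(String[] args) {
        // Three containers from the man page example.
        System.out.println(Arrays.toString(proportions(1024, 512, 512)));
        // Same three plus a fourth container with 1024 shares.
        System.out.println(Arrays.toString(proportions(1024, 512, 512, 1024)));
    }
}
```

The first container's portion drops from one half to one third once the
fourth container is added, although its own --cpu-shares value never
changed.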

For example, these three cpu-shares settings are equivalent, since each
gives the containers the same 2:2:1:1 relative weights (C1-C4 are
containers; '-c' is shorthand for '--cpu-shares'):

A[i]
-------------
C1 => -c=122
C2 => -c=122
C3 => -c=61
C4 => -c=61

B[ii]
-------------
C1 => -c=1026
C2 => -c=1026
C3 => -c=513
C4 => -c=513

C[iii]
-------------
C1 => -c=2048
C2 => -c=2048
C3 => -c=1024
C4 => -c=1024

For A, the container CPU heuristics make the JVM use 1 CPU in each of
C1-C4. For B and C, they make the JVM use 2 CPUs in C1 and C2 and 1 CPU
in C3 and C4, which seems rather inconsistent and arbitrary. The reason
is that 1024 seems to have been given a questionable meaning in [2]. I
wonder why?
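
The inconsistency can be reproduced with a small sketch of the
shares/1024 rule. This is not the HotSpot code: the class name, the
method name and the host size of 8 CPUs are assumptions for
illustration, and rounding up is assumed because it matches the results
in the logs [i]-[iii].

```java
// Hypothetical sketch of the shares/1024 heuristic; not HotSpot code.
final class SharesHeuristic {

    // The constant that JDK-8146115 treats as "one CPU worth" of shares.
    static final int PER_CPU_SHARES = 1024;

    // Assumption: shares/1024 is rounded up, then limited to the range
    // [1, hostCpus].
    static int cpusFromShares(int shares, int hostCpus) {
        int count = (shares + PER_CPU_SHARES - 1) / PER_CPU_SHARES;
        return Math.max(1, Math.min(count, hostCpus));
    }

    public static void main(String[] args) {
        int hostCpus = 8; // assumed host size, for illustration only
        String[] names = {"A", "B", "C"};
        int[][] settings = {
            {122, 122, 61, 61},
            {1026, 1026, 513, 513},
            {2048, 2048, 1024, 1024},
        };
        for (int i = 0; i < settings.length; i++) {
            StringBuilder line = new StringBuilder(names[i] + ":");
            for (int shares : settings[i]) {
                line.append(" -c=").append(shares)
                    .append(" => ").append(cpusFromShares(shares, hostCpus));
            }
            System.out.println(line);
        }
    }
}
```

All three settings describe the same 2:2:1:1 weighting, yet only B and
C yield 2 CPUs for C1 and C2.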

The JVM cannot reasonably determine from the relative weight given by
--cpu-shares how many CPUs it should use. Since it is a relative weight,
it is something for the container runtime to take into account. It
appears to me that the container detection code should probably fall
back to the host CPU count and only take CPU quotas into account.
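
As a rough sketch of what I have in mind (not a patch; the class name,
method name and parameters are hypothetical), the processor count would
be derived from the CPU quota alone and fall back to the host CPU count
when no quota is set. It assumes the cgroup v1 convention that
cpu.cfs_quota_us is -1 when there is no limit.

```java
// Hypothetical sketch of a quota-only heuristic; not HotSpot code.
final class QuotaOnlyCpuCount {

    // quotaMicros and periodMicros correspond to cpu.cfs_quota_us and
    // cpu.cfs_period_us. A non-positive quota means "no quota set".
    static int activeProcessors(long quotaMicros, long periodMicros, int hostCpus) {
        if (quotaMicros <= 0 || periodMicros <= 0) {
            // No usable quota: ignore cpu shares and use the host count.
            return hostCpus;
        }
        // Round up so that, for example, a quota of 1.5 CPUs gives 2.
        long byQuota = (quotaMicros + periodMicros - 1) / periodMicros;
        return (int) Math.max(1, Math.min(byQuota, hostCpus));
    }

    public static void main(String[] args) {
        int hostCpus = 8; // assumed host size, for illustration only
        System.out.println(activeProcessors(-1, 100_000, hostCpus));
        System.out.println(activeProcessors(150_000, 100_000, hostCpus));
    }
}
```

With this approach the --cpu-shares value no longer influences the
result, so settings A, B and C above would all behave the same.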

Am I missing something obvious here? All I could find was this in
JDK-8146115:
"""
If cpu_shares has been setup for the container, the number_of_cpus()
will be calculated based on cpu_shares()/1024. 1024 is the default and
standard unit for calculating relative cpu 
"""

"1024 is the default and standard unit for calculating relative cpu"
seems like a wrong assumption to me. Thoughts?

Thanks,
Severin

[1]    https://github.com/cloudfoundry/java-buildpack/issues/650#issuecomment-441777166
[2]    http://hg.openjdk.java.net/jdk/jdk/rev/7f22774a5f42#l4.43
[i]*   http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c122.out.log
       http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c61.out.log
[ii]*  http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1026.out.log
       http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c513.out.log
[iii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c2048.out.log
       http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1024.out.log

* Files produced with:

$ for i in 1026 513 2048 1024 122 61; do sudo docker run -ti -c=$i --rm fedora28-jdks:v1 /jdk-head/bin/java -showversion -Xlog:os+container=trace RuntimeProc > container-resources-cpu/c${i}.out.log; done
$ sudo docker run -ti --rm fedora28-jdks:v1 cat RuntimeProc.java
public class RuntimeProc {
	public static void main(String[] args) {
		int availProc = Runtime.getRuntime().availableProcessors();
		System.out.println(">>> Available processors: " + availProc + " <<<<");
	}
}




