[Containers] Reasoning for cpu shares limits

David Holmes david.holmes at oracle.com
Fri Jan 4 22:27:15 UTC 2019


Hi Severin,

On 5/01/2019 4:09 am, Severin Gehwolf wrote:
> Hi,
> 
> Having come across this cloud foundry issue[1], I wonder why the cgroup
> cpu shares' value is being used in the JVM as a heuristic for available
> processors.

See also:

https://bugs.openjdk.java.net/browse/JDK-8197589

There's quite a bit of history on this, and it may be spread across a 
number of bugs and review threads. Hopefully Bob can provide a neat 
summary :)

Cheers,
David

> From the docker-run man page:
> 
> ---------------------------------------------------------
>         --cpu-shares=0
>            CPU shares (relative weight)
> 
>         By default, all containers get the same proportion of CPU cycles. This proportion can be modified by changing the container's CPU share weighting relative to the weighting of all other running
>         containers.
> 
>         To modify the proportion from the default of 1024, use the --cpu-shares flag to set the weighting to 2 or higher.
> 
>         The proportion will only apply when CPU-intensive processes are running.  When tasks in one container are idle, other containers can use the left-over CPU time. The actual amount of CPU time will
>         vary depending on the number of containers running on the system.
> 
>         For example, consider three containers, one has a cpu-share of 1024 and two others have a cpu-share setting of 512. When processes in all three containers attempt to use 100% of CPU, the first
>         container would receive 50% of the total CPU time. If you add a fourth container with a cpu-share of 1024, the first container only gets 33% of the CPU. The remaining containers receive 16.5%, 16.5%
>         and 33% of the CPU.
> 
>         On a multi-core system, the shares of CPU time are distributed over all CPU cores. Even if a container is limited to less than 100% of CPU time, it can use 100% of each individual CPU core.
> 
>         For example, consider a system with more than three cores. If you start one container {C0} with -c=512 running one process, and another container {C1} with -c=1024 running two processes, this can
>         result in the following division of CPU shares:
> 
>                PID    container    CPU    CPU share
>                100    {C0}         0      100% of CPU0
>                101    {C1}         1      100% of CPU1
>                102    {C1}         2      100% of CPU2
> 
> ---------------------------------------------------------
> 
> So the cpu shares value (unlike --cpu-quota) is a relative weight.
> 
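
To make the "relative weight" point concrete, here is a minimal sketch of
the arithmetic the man page describes for fully CPU-bound containers. The
class and method names are made up for illustration; this is not how
Docker or the kernel scheduler is implemented, only the proportion each
share value works out to.

```java
import java.util.Arrays;

public class ShareFractions {
    // Fraction of total CPU time each container would receive when every
    // container is CPU-bound: its share divided by the sum of all shares.
    static double[] fractions(int... shares) {
        double total = Arrays.stream(shares).sum();
        return Arrays.stream(shares).mapToDouble(s -> s / total).toArray();
    }

    public static void main(String[] args) {
        // Three containers from the man page example: 1024, 512, 512.
        System.out.println(Arrays.toString(fractions(1024, 512, 512)));
        // Same set after adding a fourth container with 1024 shares.
        System.out.println(Arrays.toString(fractions(1024, 512, 512, 1024)));
    }
}
```

With this arithmetic the first container drops from one half to one third
once the fourth container is added, and each 512-share container gets one
sixth. Only the ratios between the values matter, not their magnitude.
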
> For example, the following three cpu-shares settings are equivalent,
> since each uses the same 2:2:1:1 ratio (C1-C4 are containers; '-c' is
> a short-cut for '--cpu-shares'):
> 
> A[i]
> -------------
> C1 => -c=122
> C2 => -c=122
> C3 => -c=61
> C4 => -c=61
> 
> B[ii]
> -------------
> C1 => -c=1026
> C2 => -c=1026
> C3 => -c=513
> C4 => -c=513
> 
> C[iii]
> -------------
> C1 => -c=2048
> C2 => -c=2048
> C3 => -c=1024
> C4 => -c=1024
> 
> For A, the container CPU heuristics make the JVM use 1 CPU for each of
> C1-C4. For B and C, they make the JVM use 2 CPUs for C1 and C2 and
> 1 CPU for C3 and C4, which seems rather inconsistent and arbitrary.
> This happens because 1024 seems to have been given a questionable
> meaning in [2]. I wonder why?
> 
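
The results reported for A, B and C above can be reproduced with a small
sketch of the shares/1024 heuristic. This is an assumption-laden
illustration, not the HotSpot source: rounding up is inferred from the
reported outcome (1026 gives 2 CPUs, 513 gives 1), the cap at the host
CPU count is assumed, and the names cpusFromShares and hostCpus are
hypothetical.

```java
public class SharesHeuristic {
    // Assumed unit: 1024 shares are treated as "one CPU".
    static final int PER_CPU_SHARES = 1024;

    // Hypothetical helper: derive a CPU count from a cpu-shares value,
    // rounding up and never exceeding the host CPU count (assumed).
    static int cpusFromShares(int shares, int hostCpus) {
        int byShares = (int) Math.ceil(shares / (double) PER_CPU_SHARES);
        return Math.max(1, Math.min(byShares, hostCpus));
    }

    public static void main(String[] args) {
        int hostCpus = 8; // assumed host size for the illustration
        int[][] settings = {
            {122, 122, 61, 61},        // A
            {1026, 1026, 513, 513},    // B
            {2048, 2048, 1024, 1024},  // C
        };
        for (int[] setting : settings) {
            StringBuilder line = new StringBuilder();
            for (int shares : setting) {
                line.append("-c=").append(shares).append(" => ")
                    .append(cpusFromShares(shares, hostCpus)).append(" CPU(s)  ");
            }
            System.out.println(line.toString().trim());
        }
    }
}
```

All three settings describe the same 2:2:1:1 weighting, yet under these
assumptions A yields 1 CPU for every container while B and C yield 2 CPUs
for C1 and C2.
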
> The JVM cannot reasonably determine from the relative weight given by
> --cpu-shares how many CPUs it should use. Since it's a relative weight,
> that's something for the container runtime to take into account. It
> appears to me that the container detection code should probably fall
> back to the host CPU count and only take CPU quotas into account.
> 
> Am I missing something obvious here? All I could find was this in
> JDK-8146115:
> """
> If cpu_shares has been setup for the container, the number_of_cpus()
> will be calculated based on cpu_shares()/1024. 1024 is the default and
> standard unit for calculating relative cpu
> """
> 
> "1024 is the default and standard unit for calculating relative cpu"
> seems like a wrong assumption to me. Thoughts?
> 
> Thanks,
> Severin
> 
> [1]    https://github.com/cloudfoundry/java-buildpack/issues/650#issuecomment-441777166
> [2]    http://hg.openjdk.java.net/jdk/jdk/rev/7f22774a5f42#l4.43
> [i]*   http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c122.out.log
>         http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c61.out.log
> [ii]*  http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1026.out.log
>         http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c513.out.log
> [iii]* http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c2048.out.log
>         http://cr.openjdk.java.net/~sgehwolf/container-resources-cpu/c1024.out.log
> 
> * Files produced with:
> 
> $ for i in 1026 513 2048 1024 122 61; do sudo docker run -ti -c=$i --rm fedora28-jdks:v1 /jdk-head/bin/java -showversion -Xlog:os+container=trace RuntimeProc > container-resources-cpu/c${i}.out.log; done
> $ sudo docker run -ti --rm fedora28-jdks:v1 cat RuntimeProc.java
> public class RuntimeProc {
> 	public static void main(String[] args) {
> 		int availProc = Runtime.getRuntime().availableProcessors();
> 		System.out.println(">>> Available processors: " + availProc + " <<<<");
> 	}
> }
> 
> 
> 