[Containers] Reasoning for cpu shares limits

Severin Gehwolf sgehwolf at redhat.com
Mon Jan 7 19:08:13 UTC 2019


On Mon, 2019-01-07 at 10:24 -0500, Bob Vandette wrote:
> > Effectively, after JDK-8197589, the cpu shares value being ignored by
> > the JVM is what's happening. That's what I'm seeing for JVM containers
> > on k8s anyway.
> 
> cpu-shares are only ignored if there is no cpu-quota set. 

You mean cpu-shares are only ignored if there is a cpu-quota set too,
right?
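
Just so we're talking about the same thing, here is my reading of the
behaviour after JDK-8197589, boiled down to a small standalone sketch
(my own approximation, not the actual osContainer_linux.cpp code;
prefer_quota stands in for the PreferContainerQuotaForCPUCount switch):

    #include <algorithm>
    #include <cmath>
    #include <cstdio>

    // Rough model of how the JVM derives a CPU count from the cgroup
    // values after JDK-8197589, as I understand it.
    static const int PER_CPU_SHARES = 1024; // k8s: 1 CPU request == 1024 shares

    int derived_cpu_count(long quota, long period, long shares,
                          int host_cpus, bool prefer_quota) {
      int quota_count = 0;
      int share_count = 0;
      if (quota > 0 && period > 0) {
        quota_count = (int) std::ceil(quota / (double) period);
      }
      if (shares > 0) {
        share_count = (int) std::ceil(shares / (double) PER_CPU_SHARES);
      }
      int limit;
      if (quota_count != 0 && share_count != 0) {
        // Both set (the common k8s case): quota wins by default, i.e.
        // the shares value is effectively ignored.
        limit = prefer_quota ? quota_count : std::min(quota_count, share_count);
      } else if (quota_count != 0) {
        limit = quota_count;
      } else if (share_count != 0) {
        limit = share_count; // shares only matter when there is no quota
      } else {
        limit = host_cpus;
      }
      return std::min(limit, host_cpus);
    }

    int main() {
      // requests.cpu=500m, limits.cpu=2 on an 8-CPU node:
      // shares=512, quota=200000, period=100000 -> quota wins -> 2
      std::printf("%d\n", derived_cpu_count(200000, 100000, 512, 8, true));
      // requests.cpu=500m only, no limits.cpu: ceil(512/1024) -> 1
      std::printf("%d\n", derived_cpu_count(-1, 100000, 512, 8, true));
      return 0;
    }

If that reading is right, shares only come into play when no quota is
set at all, which is what I'm trying to confirm.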

> I have no way of knowing if it
> is common to have cpu requests without cpu limits, but it is possible.

Fair enough.

> Here’s more detail on what cpu requests and limits mean to k8s.
> 
> Pod scheduling is based on requests. A Pod is scheduled to run on a Node only if the Node has
> enough CPU resources available to satisfy the Pod CPU request.
> 
> https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-that-is-too-big-for-your-nodes

Yes, thanks.

> > 
> > > 
> > > https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu
> > > 
> > > 	• The spec.containers[].resources.requests.cpu is converted to its core value, which is potentially fractional, and multiplied by 1024. The greater of this number or 2 is used as the value of the --cpu-shares flag in the docker run command.
> > > 	• The spec.containers[].resources.limits.cpu is converted to its millicore value and multiplied by 100. The resulting value is the total amount of CPU time that a container can use every 100ms. A container cannot use more than its share of CPU time during this interval.
> > > 
> > > There are a few options that can be used if our default behavior doesn’t work for you.
> > > 
> > > 1. Use quotas in addition to or instead of shares.
> > > 2. Specify -XX:ActiveProcessorCount=value
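
(For concreteness, assuming I'm reading those conversions right, the
arithmetic works out to e.g.:

    requests.cpu = 500m  ->  0.5  * 1024 =    512  --cpu-shares
    requests.cpu = 2     ->  2    * 1024 =   2048  --cpu-shares
    limits.cpu   = 1500m ->  1500 * 100  = 150000us of CPU time per
                                           100000us (100ms) period

so whenever both are set, the quota is the value that actually caps the
container; the shares are only a relative weight for the scheduler.)
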
> > 
> > OK. So it's modelled after how Kubernetes does things. What I'm
> > questioning is whether the spec.containers[].resources.requests.cpu
> > setting of Kubernetes should have any bearing on the number of CPUs the
> > *JVM* thinks are available to it, though. It's still just a relative
> > weight a JVM-based container would get. What if k8s decides to use a
> > different magic number? Should this be hard-coded in the JVM? Should
> > this be used in the JVM at all?
> > 
> > Taking the Kubernetes case, it'll usually set CPU shares *and* CPU
> > quota. The latter is very likely the higher value, as k8s models
> > spec.containers[].resources.requests.cpu as a sort of minimal CPU value
> > and spec.containers[].resources.limits.cpu as a maximum, hard limit. In
> > that respect, modelling the CPU shares value on the k8s convention
> > *within the JVM* seems arbitrary, as it won't be used anyway. Quotas
> > take precedence. Perhaps that's why JDK-8197589 was done after
> > JDK-8146115?
> > 
> > I'd argue that:
> > 
> > A) Modelling this after the k8s case and enforcing a CPU limit
> >   (within the JVM) based on a relative weight is still wrong. The
> >   common case for k8s is both settings, shares and quota, being
> >   present. After JDK-8197589, there is even a preference to use
> >   quota over CPU shares. I'd argue the PreferContainerQuotaForCPUCount
> >   JVM switch wouldn't be needed if CPU shares had no effect on the
> >   internal JVM settings to begin with.
> > B) It needlessly breaks other frameworks which don't use this
> >   convention. Cloudfoundry is a case in point.
> > C) At the very least, the reason for this decision needs to be
> >   documented in the code. Specifically "#define PER_CPU_SHARES 1024" in
> >   src/hotspot/os/linux/osContainer_linux.cpp.
> 
> I agree that PER_CPU_SHARES should have a comment documenting its
> meaning and origin.

Sounds good.
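
For PER_CPU_SHARES I was thinking of something along these lines (just
a sketch of the kind of comment I mean, not a proposed patch):

    // PER_CPU_SHARES is the cpu.shares value that container orchestrators
    // conventionally use to represent one full CPU. Kubernetes, for
    // example, converts spec.containers[].resources.requests.cpu to
    // --cpu-shares by multiplying the (possibly fractional) core value by
    // 1024, so "2" becomes 2048 shares and "500m" becomes 512. The JVM
    // divides the observed cpu.shares by this constant (rounding up) to
    // turn that relative weight back into a CPU count. Note that shares
    // are only a relative weight on the host, so treating them as an
    // absolute CPU count is a heuristic based on this convention.
    #define PER_CPU_SHARES 1024

That would at least make the origin of the magic number obvious to the
next reader.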

> I’d like to get some feedback from the docker, k8s and CloudFoundry
> communities before changing this algorithm once again.  Churning it is
> almost as bad as the current situation since developers may be adapting
> to the new behavior.

This seems reasonable. I'll see what I can do.

Thanks,
Severin


