[RFC containers] 8281181 JDK's interpretation of CPU Shares causes underutilization

Severin Gehwolf sgehwolf at redhat.com
Mon Feb 7 18:36:25 UTC 2022


On Sun, 2022-02-06 at 20:16 -0800, Ioi Lam wrote:
> Case (4) is the cause for the bug in JDK-8279484
> 
> Kubernetes set the cpu.cfs_quota_us to 0 (no limit) and cpu.shares to 2. 
> This means:
> 
> - This container is guaranteed a minimum amount of CPU resources
> - If no other containers are executing, this container can use as
>    much CPU as available on the host
> - If other containers are executing, the amount of CPU available
>    to this container is (2 / (sum of cpu.shares of all active
>    containers))
> 
> 
> The fundamental problem with the current JVM implementation is that it 
> treats "CPU request" as a maximum value, the opposite of what Kubernetes 
> does. Because of this, in case (4), the JVM artificially limits itself 
> to a single CPU. This leads to CPU underutilization.

I agree with your analysis. The key point is that in such a setup
Kubernetes sets the CPU shares value to 2. It's a very specific case,
though.
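
For illustration, a minimal sketch of how a shares value of 2 ends up
as a single CPU. It assumes the heuristic divides cpu.shares by 1024
and rounds up, then caps the result at the host CPU count. The
function name and parameters are hypothetical, and any special-casing
of default values is ignored; this is not the actual HotSpot code.

```python
import math

# Assumption: 1024 shares are taken to mean "one CPU's worth".
PER_CPU_SHARES = 1024


def cpus_from_shares(shares, host_cpus):
    """Hypothetical helper: CPU count when cpu.shares is read as a limit."""
    if shares <= 0:
        # No usable shares value: fall back to the host CPU count.
        return host_cpus
    # Round up so that any positive shares value yields at least one CPU.
    return min(host_cpus, math.ceil(shares / PER_CPU_SHARES))
```

Under these assumptions, cpus_from_shares(2, 32) gives 1 regardless of
how idle the 32-CPU host is.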

In contrast to Kubernetes, the JVM doesn't have insight into what other
containers are doing (or how they are configured). It would, perhaps,
be good to know what Kubernetes does for containers when the
environment (i.e. other containers) changes. Do they get restarted?
Restarted with different values for cpu shares?

Either way, what are our options to fix this? Does it need fixing?

 * Should we stop taking CPU shares into account as a means of
   limiting CPU? It would be a significant change to how previous
   JDKs worked. Maybe that wouldn't be such a bad idea :)
 * How likely is CPU underutilization to happen in practice?
   Considering the container is not the only container on the node,
   then according to your formula, it'll get one CPU or less anyway.
   Underutilization would, thus, only happen when it's an idle node
   with no other containers running. That would suggest doing nothing
   and letting the user override it as they see fit.
 * Something else I'm missing?
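
To make the second point concrete, a rough sketch of the quoted
formula, (own shares / sum of cpu.shares of all active containers),
scaled by the number of host CPUs. It assumes every other container
listed is fully CPU-bound; the function name and the example share
values are hypothetical.

```python
def cpus_under_contention(own_shares, other_shares, host_cpus):
    """Hypothetical helper: CPUs' worth of time under full contention.

    other_shares lists the cpu.shares values of the other active
    containers, all assumed to be competing for CPU at the same time.
    """
    total = own_shares + sum(other_shares)
    return host_cpus * own_shares / total


# Idle node: nothing else is running, so the whole host is available.
idle = cpus_under_contention(2, [], 16)

# Busy node: two other containers with 1024 and 2048 shares.
busy = cpus_under_contention(2, [1024, 2048], 16)
```

With these example values the idle case yields all 16 CPUs, while the
busy case yields well under one CPU.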

Thanks,
Severin


