RFR: 8367319: Add os interfaces to get machine and container values separately [v2]

Tue Oct 7 09:47:04 UTC 2025

On Tue, 7 Oct 2025 01:33:34 GMT, David Holmes <dholmes at openjdk.org> wrote:

> That is a mis-characterization of the API. `active_processor_count()` tells you how many logical processors are available to the JVM process. That can be very different to the "physical" (**) number of processors due to partitioning at various levels (e.g. virtualization, containerization), as well as direct restrictions through API's like `taskset`.
> 
> (**) "physical" actually has no meaning these days. There is some value you can obtain through the operating system that provides the maximum number of processors that the operating system can see (and thus make available to the JVM).

Agreed, I conflated the two here. What I should have written is what you said: the number of logical processors available for the JVM to execute on. That is also the value the new `machine_active_processor_count()` returns.

By contrast, the current container-reported value treats cpu quota and logical processors as the same thing, even though a quota only restricts cpu time, not the number of cores we can run on. With a quota of one cpu, we might still execute on two cores for 50% of the time each, but `os::active_processor_count()` still only reports "1".
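
To make the conflation concrete, here is a minimal sketch of how a quota can end up being reported as a processor count. It is not the HotSpot code: `quota_based_cpu_count` and its parameters are hypothetical names, and rounding the quota/period ratio up is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper, not HotSpot code: derive a "processor count" from a
// CFS quota and period (both in microseconds), capped by the number of
// logical processors the OS lets the process run on.
int quota_based_cpu_count(long quota_us, long period_us, int os_cpu_count) {
  assert(period_us > 0 && os_cpu_count > 0);
  if (quota_us <= 0) {
    return os_cpu_count;  // no quota configured: only the OS view applies
  }
  // Assumption: the ratio is rounded up to a whole number of cpus.
  int from_quota = static_cast<int>(
      std::ceil(static_cast<double>(quota_us) / static_cast<double>(period_us)));
  return std::min(from_quota, os_cpu_count);
}
```

With a quota equal to one period and two schedulable cores, this returns 1, even though both cores remain usable for half of each period.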

> What is a "machine" here? Historically we have misused "physical" to mean what does a bare-metal OS report on a bare-metal piece of hardware. But that became inaccurate decades ago once virtualization/hypervisors arrived. So we've adjusted API's (e.g. MXBeans) to report whatever the "operating system" reports. The problem there is some things the operating system reports take into account the presence of containers, and others do not. This has always been a problem with these container environments - they should be invisible to software but they are not.

Machine in this context means the values the operating system reports, which may already be limited depending on the configuration. All of this is of course Linux-only, as we don't support containers on any other platform. In many container deployments the cgroup limits differ from the OS view, but in a fully virtualized environment they can coincide. In that situation none of this makes a difference anyway, and both functions would report the same value.

> For a long time this was an impossible question to answer accurately - we could query whether cgroups were configured on a system but we couldn't ask if the JVM process was running under any cgroup constraints - has that changed?

Our container detection still isn't perfect, but it has improved:
1. We first check whether all cgroup controllers are mounted read-only, which is the default for many container runtimes.
2. If not, we examine the JVM's cgroup path to see if there are any memory/cpu limits present (covers JVMs started in restricted systemd slices).

These heuristics can miss more exotic setups but are pretty accurate for most use cases today.
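
The two steps can be sketched roughly as follows. This is a simplified illustration rather than the actual detection code: the structs, field names and `looks_containerized` are hypothetical, and the inputs are assumed to have been read from the OS already.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, pre-parsed view of one mounted cgroup controller.
struct CgroupController {
  std::string name;
  bool mounted_read_only;
};

// Hypothetical summary of the limits found along the JVM's own cgroup path.
struct CgroupLimits {
  bool has_memory_limit;
  bool has_cpu_limit;
};

bool looks_containerized(const std::vector<CgroupController>& controllers,
                         const CgroupLimits& limits) {
  // Step 1: every controller mounted read-only suggests a container runtime.
  bool all_read_only = !controllers.empty();
  for (const CgroupController& c : controllers) {
    all_read_only = all_read_only && c.mounted_read_only;
  }
  if (all_read_only) {
    return true;
  }
  // Step 2: otherwise, any memory or cpu limit on the JVM's cgroup path counts
  // (for example a restricted systemd slice).
  return limits.has_memory_limit || limits.has_cpu_limit;
}
```

A setup with writable controllers and no limits on the JVM's path falls through both checks, which is the kind of case the heuristics can miss.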

> I would like to get a better idea of what kinds of "machine" information we need to query and how it will be used. I mean, how does it help to know a "machine" has 256 processors if the various software layers only make 16 available to you?

Like I mentioned above when explaining `os::active_processor_count()`, it can be very relevant to know that the underlying machine (virtualized or not) has 16 cpus available, even though the JVM's cgroup quota restricts us to an average of 2 cpus. In latency-sensitive workloads we might burst onto all 16 cores for a short interval and still stay within the 2-cpu quota. At the moment, the JVM has no way to know that those extra 14 cores even exist, so it cannot make that optimization.
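
The burst case is simple arithmetic once you assume the usual CFS bandwidth model, where the quota is a budget of cpu time per period shared by all threads in the cgroup. The sketch below uses a hypothetical helper name, and the 100 ms period in the usage note is an assumed value for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical illustration: how long (wall-clock, per period) `cores` fully
// busy cores can run before the cgroup's cpu-time budget is used up.
double max_burst_wall_time_us(long quota_us, long period_us, int cores) {
  assert(quota_us > 0 && period_us > 0 && cores > 0);
  // The budget is consumed `cores` times faster than wall-clock time passes.
  double spread = static_cast<double>(quota_us) / cores;
  // A burst cannot last longer than the period itself.
  return std::min(spread, static_cast<double>(period_us));
}
```

For a 2-cpu quota with a 100 ms period (200000 us of cpu time per period), `max_burst_wall_time_us(200000, 100000, 16)` gives 12500, i.e. all 16 cores can run for 12.5 ms of every period. A JVM that only sees "2 cpus" has no reason to size a short parallel phase for 16 threads.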

The same argument applies to memory. GC heuristics may want to look at overall memory pressure on the machine, not just the container's limit. Imagine several containers on one host, where one is consuming most of its limit while the others are relatively idle. From the host perspective the system is still comfortable, so the GC could be less aggressive than it would need to be if every container were close to its limit. Without access to the host-level numbers we can't distinguish these scenarios.

Admittedly these functions are niche and will only matter for very specialised, performance-critical tasks. Still, the information is already available from the operating system, and the JVM should not hide or overwrite it for those users who can benefit from it.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27646#issuecomment-3376037169