RFR: 8367319: Add os interfaces to get machine and container values separately [v2]

Fri Oct 24 06:16:04 UTC 2025

On Mon, 6 Oct 2025 14:48:29 GMT, Casper Norrbin <cnorrbin at openjdk.org> wrote:

>> Hi everyone,
>> 
>> The current `os::` layer on Linux hides whether the JVM is running inside a container or not. When running inside a container, we replace machine values with container values where applicable, without telling the user of these methods. For most use cases this is fine: users only care about the returned value. But for other use cases, where the value originated matters. Two examples:
>> 
>> - A user might need the physical cpu count of the machine, but `os::active_processor_count()` only returns the limited container value, which also represents something slightly different.
>> - A user might want the container memory limit and the physical RAM size, but `os::physical_memory()` only gives one number.
>> 
>> To solve this, every function that mixed container/machine values now has two explicit variants, prefixed with `machine_` and `container_`. These use the bool return + out-parameter interface, with the container functions only working on Linux. The original methods remain and continue to return the same mixed values.
>> 
>> In addition, container-specific accessors for the memory soft limit and the memory throttle limit have been added, as these values matter when running in a containerized environment.
>> 
>> `OSContainer::active_processor_count()` has also been changed to return `double` instead of `int`. The previous implementation rounded the quota/period ratio up to produce an integer for `os::active_processor_count()`. Now, when the value is requested directly from the new container API it makes more sense to preserve this fraction rather than rounding it up. We can thus keep the exact value for those that want it, then round it up to keep the same behavior in `os::active_processor_count()`.
>> 
>> Testing:
>> - Oracle tiers 1-5
>> - Container tests on cgroup v1 and v2 hosts.
>
> Casper Norrbin has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fixed print type

This discussion is running off the rails I think - my fault, so let me try and drag it back on. I earlier made a fairly simple statement:

> It might be clearer to just define specific methods for the things that are currently missing that you need to query, rather than trying to generalize the existing poorly-defined API. If you insist that three APIs are better then I would like to see clear specifications for what each of the method variants actually returns.

Looking again, and ignoring some earlier comments by others that might have misled me, what is being proposed is much closer to that than I initially thought. What I wanted to avoid was having 3 variants of a given method when the original os:: form of that method isn't even well-defined today. But I am okay with a set of "machine/system" methods to ask queries that make sense for the "machine", and a set of container ones that make sense for containers. (And the implementation of the os:: method can delegate to whichever provides the "right" answer.)

But to clarify my issue with the container notion of available processors ... you state:

>  It exposes what number of processors, on average, the container is limited to run on. 

Yes - but if I am asking how many processors are available I am doing so because I want to know how many threads it makes sense to create, so I need to know how many threads could potentially be executing in parallel at the same time. Using my example above that could be 50% of the physical cores or 100% depending on how the container decides to implement quotas. That fits in, I think, with the example Casper gave above:

> In latency-sensitive workloads we might burst onto all 16 cores for a short interval and still stay within the 2-cpu quota. At the moment, the JVM has no way to know that those extra 14 cores even exist, so it cannot make that optimization.

Indeed - but if you create 16 threads to take advantage of that and the container only actually gives you 2 cores for 100% of the time, then you have totally messed up. This notion of "average processors" does not help you size accordingly, nor does knowing the number of "machine" processors, if you don't know how the container operates. That is one assumption about the container implementation I was referring to. The other is that the container never defines cpusets; otherwise you have no way that I am aware of to actually determine that there are 16 cores on the machine**. Maybe these are safe assumptions to make, but it would be nice if the underlying container subsystem could give you those answers directly based on how it does things (then you wouldn't need to query the "machine").
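
To put rough numbers on the sizing problem, here is a minimal sketch. It assumes the container enforces its limit purely through a quota/period pair (as with CFS bandwidth control) and that every thread is CPU-bound. `average_cpus` and `runnable_fraction` are hypothetical helpers for illustration, not HotSpot or `OSContainer` functions.

```cpp
#include <cassert>

// Hypothetical helper: average CPU entitlement implied by a quota and a
// period, both in microseconds (assumed quota-based throttling).
static double average_cpus(long quota_us, long period_us) {
  assert(quota_us > 0 && period_us > 0);
  return static_cast<double>(quota_us) / static_cast<double>(period_us);
}

// Hypothetical helper: fraction of each period during which `threads`
// CPU-bound threads can all run at once before the quota is used up.
// Capped at 1.0, since a thread cannot run for more than the whole period.
static double runnable_fraction(double avg_cpus, int threads) {
  assert(threads > 0);
  double fraction = avg_cpus / threads;
  return fraction < 1.0 ? fraction : 1.0;
}
```

With a quota of 200000us per 100000us period, `average_cpus` gives 2.0. Under these assumptions, 16 busy threads could all run for only `runnable_fraction(2.0, 16)`, i.e. 0.125 of each period, and would then be throttled, while 2 threads could run for the whole period. The same "2 cpus" therefore describes very different parallelism, and the average alone does not say which one you get.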

** If you assume no hot-swapping of CPUs then `sysconf(_SC_NPROCESSORS_ONLN)` could give you that answer.
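
A minimal Linux-only sketch of the two queries involved, to make the distinction concrete. It assumes glibc, where `sysconf(_SC_NPROCESSORS_ONLN)` reports the online CPUs rather than the affinity mask (other C libraries may behave differently), and the wrapper names are hypothetical, not existing `os::` methods.

```cpp
#include <sched.h>   // sched_getaffinity, cpu_set_t, CPU_ZERO, CPU_COUNT
#include <unistd.h>  // sysconf, _SC_NPROCESSORS_ONLN

// Hypothetical wrapper: CPUs currently online as seen by the C library.
// Assumed (glibc) not to be narrowed by a cpuset; returns -1 on error.
static long online_cpus() {
  return sysconf(_SC_NPROCESSORS_ONLN);
}

// Hypothetical wrapper: CPUs the calling process may be scheduled on.
// This is the count that shrinks when a cpuset or affinity mask applies.
// Returns -1 if the query fails (e.g. the fixed-size mask is too small).
static int schedulable_cpus() {
  cpu_set_t mask;
  CPU_ZERO(&mask);
  if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
    return -1;
  }
  return CPU_COUNT(&mask);
}
```

If the two values differ, a cpuset or affinity mask is in effect, and only the first value hints at the machine size. Neither value says anything about how a quota is enforced.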

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27646#issuecomment-3441241899