RFR: 8365606: Container code should not be using jlong/julong [v2]
Thomas Fitzsimmons
fitzsim at openjdk.org
Mon Oct 27 19:58:09 UTC 2025
On Fri, 24 Oct 2025 09:45:25 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:
>> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values.
>>
>> It gets rid of the trifold-use of return value: 1.) error, 2) unlimited values 3) actual numbers/values/limits. Instead, all container related values are now being read from the interface files as `uint64_t` and afterwards interpreted in the way that make sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size a value `value_unlimited` is being returned.
>>
>> All error cases have been changed to returning `false` in the API functions (and no value is being set in the passed in reference for the value). The effect of this is that all container related functions now return a `bool` and require a reference to be passed in for the `value` that is being asked for.
>>
>> All usages of the API have been changed to use the revised API. There is no more usages for `OSCONTAINER_ERROR` (`-2) in HotSpot code.
>>
>> While working on this, I've noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned-up as this patch was already getting large.
>>
>> Testing (looking good):
>> - [x] GHA
>> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below.
>> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command.
>>
>> Thoughts? Opinions?
>
> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>
> - Merge branch 'master' into jdk-8365606-jlong-julong-refactor
> - Fix print_container_info output
> - whitespace clean-ups and other small fixes
> - Fix log format in container macro and scanf format
> - Fix duplicate include in osContainer_linux
> - 8365606: Container code should not be using jlong/julong
_I am not a reviewer._ This looks good to me, overall. I commented with some items to consider.
src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 627:
> 625: *
> 626: * If quotas have not been specified, return the
> 627: * number of active processors in the system.
This paragraph uses the "return" language that you adjusted in the next paragraph. It should probably also refer to the reference argument instead.
src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 629:
> 627: * number of active processors in the system.
> 628: *
> 629: * If quotas have been specified, the resulting number
Tiny nit, but "the resulting number" => "the number", since you say "the result reference" on the next line.
src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 653:
> 651: cpu_count = os::Linux::active_processor_count();
> 652: if (!CgroupUtil::processor_count(contrl->controller(), cpu_count, result)) {
> 653: return false;
`value` will be returned unchanged from its passed-in value here. I wonder if it would be safer to explicitly set it to `0` when returning `false`. Also, could `value` be given an unsigned type, like `uint64_t`?
src/hotspot/os/linux/cgroupSubsystem_linux.cpp line 667:
> 665: * Return the limit of available memory for this process in the provided
> 666: * physical_memory_size_type reference. If there was no limit value set in the underlying
> 667: * interface files value_unlimited is returned.
I think quote `value_unlimited` here to hint that it is a constant defined elsewhere.
Can the limit ever be `0`, and if not, should there be a new assert for `> 0` like for `cpu_count`?
src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 167:
> 165: /* memory_and_swap_limit_in_bytes
> 166: *
> 167: * Determine the memory and swap limit metric. Returns a positive limit value or
"Returns" language should probably be updated here too.
src/hotspot/os/linux/cgroupV1Subsystem_linux.cpp line 470:
> 468: // cast to int since the read value might be negative
> 469: // and we want to avoid logging -1 as a large unsigned value.
> 470: int quota_int = static_cast<int>(quota);
It seems like quota is either a positive number or disabled. I wonder if `result` can be treated as a `uint64_t`, and this log message special-cased to detect `-1` read from `/cpu.cfs_quota_us` as disabled. I guess the calling code would need another way to differentiate "disabled" from other values... maybe with `0`? Just a thought to maybe simplify the type logic here. Likewise for `period` and `shares`.
-------------
Marked as reviewed by fitzsim (Author).
PR Review: https://git.openjdk.org/jdk/pull/27743#pullrequestreview-3383739312
PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2465851974
PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2465858316
PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2465872342
PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2465907552
PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2466836781
PR Review Comment: https://git.openjdk.org/jdk/pull/27743#discussion_r2466892281
More information about the hotspot-dev
mailing list