RFR: 8365606: Container code should not be using jlong/julong

Andrew Haley aph at openjdk.org
Fri Oct 24 09:39:02 UTC 2025


On Fri, 10 Oct 2025 13:09:48 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:

> Please review this revised version of getting rid of `jlong` and `julong` in internal HotSpot code. The single remaining usage is using `os::elapsed_counter()` which I think is still ok. This refactoring is for the container detection code to (mostly) do away with negative return values.
> 
> It gets rid of the threefold use of the return value: 1) error, 2) unlimited values, 3) actual numbers/values/limits. Instead, all container-related values are now read from the interface files as `uint64_t` and afterwards interpreted in the way that makes sense for the API implementations. For example, `cpu` values will essentially be treated as `int`s as before, potentially returning a negative value `-1` for unlimited. For memory sizes the type `physical_memory_size_type` has been chosen. When there is no limit for a specific memory size, the value `value_unlimited` is returned.
> 
> All error cases have been changed to return `false` in the API functions (and no value is set in the passed-in reference). As a result, all container-related functions now return a `bool` and require a reference to be passed in for the `value` being asked for.
> 
> All usages of the API have been changed to use the revised API. There are no more usages of `OSCONTAINER_ERROR` (`-2`) in HotSpot code.
> 
> While working on this, I noticed that there are still some calls deep in the cgroup subsystem code to query "machine" info (e.g. `os::Linux::active_processor_count()`). I've filed [JDK-8369503](https://bugs.openjdk.org/browse/JDK-8369503) to get this cleaned up, as this patch was already getting large.
> 
> Testing (looking good):
> - [x] GHA
> - [x] All container tests (including problem listed ones) on Linux x86_64 with cg v1 and cg v2. See [this comment](https://github.com/openjdk/jdk/pull/27743#issuecomment-3390060127) below.
> - [x] Some ad-hoc manual testing in containers using JFR (`jdk.SwapSpace` event) and `VM.info` diagnostic command.
> 
> Thoughts? Opinions?
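
For reference, the `bool`-plus-reference convention described in the quoted text can be sketched roughly as below. This is only an illustration, not code from the patch: `parse_memory_limit` is a hypothetical helper, and the definitions given here for `physical_memory_size_type` and `value_unlimited` are assumptions standing in for the real HotSpot ones. It relies on the fact that a cgroup v2 limit file such as `memory.max` contains either a decimal number or the literal `max`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Assumed stand-ins for the HotSpot names; the real definitions may differ.
using physical_memory_size_type = uint64_t;
constexpr physical_memory_size_type value_unlimited =
    std::numeric_limits<physical_memory_size_type>::max();

// Hypothetical helper: interprets the text of a cgroup v2 limit file.
// Returns false on error and leaves `value` untouched in that case.
bool parse_memory_limit(const std::string& text, physical_memory_size_type& value) {
  if (text == "max") {            // cgroup v2 spelling of "no limit"
    value = value_unlimited;
    return true;
  }
  if (text.empty()) {
    return false;
  }
  const uint64_t max = std::numeric_limits<uint64_t>::max();
  uint64_t parsed = 0;
  for (char c : text) {
    if (c < '0' || c > '9') {
      return false;               // not a decimal number
    }
    uint64_t digit = static_cast<uint64_t>(c - '0');
    if (parsed > (max - digit) / 10) {
      return false;               // would overflow uint64_t
    }
    parsed = parsed * 10 + digit;
  }
  value = parsed;
  return true;
}
```

A caller checks the `bool` first and only then reads the out-parameter; errors, "unlimited", and real limits no longer share one signed return value.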

I'm not surprised about the lack of reviews because it's long and the result is rather ugly. (Sorry, but I had to say it.)

This is less jarring to read:


struct Result {
  const int _value;
  const bool _ok;
  Result(int value, bool ok) : _value(value), _ok(ok) {}
};


...


Result CgroupSubsystem::active_processor_count() {
  ...
  cpu_count = os::Linux::active_processor_count();
  if (!CgroupUtil::processor_count(contrl->controller(), cpu_count, value)) {
    return Result(value, false);
  }
  assert(value > 0 && value <= cpu_count, "must be");
  // Update cached metric to avoid re-reading container settings too often
  cpu_limit->set_value(value, OSCONTAINER_CACHE_TIMEOUT);

  return Result(value, true);
}


called as:


  auto [result, ok] = cgroup_subsystem->active_processor_count();
  if (ok) ...
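
A self-contained version of the same idea, as a rough sketch rather than a proposed patch: the `active_processor_count` below is a free function with made-up parameters (`host_cpus`, `container_limit`) standing in for the cgroup query, and `processors_or_default` is a hypothetical caller. Structured bindings require C++17.

```cpp
#include <cassert>

// Same shape as the Result struct above: a value plus a success flag.
struct Result {
  const int _value;
  const bool _ok;
  Result(int value, bool ok) : _value(value), _ok(ok) {}
};

// Hypothetical stand-in for the cgroup query. A non-positive
// container_limit models a failed read of the container settings.
Result active_processor_count(int host_cpus, int container_limit) {
  if (container_limit <= 0) {
    return Result(0, false);
  }
  // The container limit can never exceed what the host offers.
  int value = container_limit < host_cpus ? container_limit : host_cpus;
  return Result(value, true);
}

// Hypothetical caller: unpack value and flag in one declaration and
// fall back to the host count when the query failed.
int processors_or_default(int host_cpus, int container_limit) {
  auto [count, ok] = active_processor_count(host_cpus, container_limit);
  return ok ? count : host_cpus;
}
```

The binding works because both data members of `Result` are public; the names on the left of `=` are bound in declaration order, so `count` is `_value` and `ok` is `_ok`.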

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27743#issuecomment-3442136772

