RFR: 8322420: [Linux] cgroup v2: Limits in parent nested control groups are not detected [v2]

Thomas Stuefe stuefe at openjdk.org
Wed Aug 28 07:41:21 UTC 2024


On Tue, 27 Aug 2024 19:05:36 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:

>> src/hotspot/os/linux/cgroupUtil_linux.cpp line 67:
>> 
>>> 65:     *last_slash = '\0'; // strip path
>>> 66:     // update to shortened path and try again
>>> 67:     mem->set_subsystem_path(cg_path);
>> 
>> Does this work with zero-length strings? If all that was left was "/" and we just stripped that away?
>
> There is also the loop condition `(last_slash = strrchr(cg_path, '/')) != cg_path` which guards for this case. I.e. for `/foo` we stop and break the loop. Does that make sense?

yes

>> src/hotspot/os/linux/cgroupUtil_linux.cpp line 71:
>> 
>>> 69:     path_iterated = true;
>>> 70:     if (limit > 0) {
>>> 71:       log_trace(os, container)("Adjusted controller path for memory to: %s", mem->subsystem_path());
>> 
>> Please trace out the found limit as well
>
> Do we really need to mention it again? From the PR description here is how it looks like when this code runs:
> 
> 
> [0.001s][trace][os,container] Adjusting controller path for memory: /sys/fs/cgroup/memory/user.slice/user-cg.slice/user-cg-cpu.slice/run-r634adce2617145ea9660623c335cb3db.scope
> [0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/user.slice/user-cg.slice/user-cg-cpu.slice/run-r634adce2617145ea9660623c335cb3db.scope/memory.limit_in_bytes
> [0.001s][trace][os,container] Memory Limit is: 9223372036854771712
> [0.001s][debug][os,container] container memory limit ignored: 9223372036854771712, using host value 67163885568
> [0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/user.slice/user-cg.slice/user-cg-cpu.slice/memory.limit_in_bytes
> [0.001s][trace][os,container] Memory Limit is: 9223372036854771712
> [0.001s][debug][os,container] container memory limit ignored: 9223372036854771712, using host value 67163885568
> [0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/user.slice/user-cg.slice/memory.limit_in_bytes
> [0.001s][trace][os,container] Memory Limit is: 4294967296
> [0.001s][trace][os,container] Adjusted controller path for memory to: /sys/fs/cgroup/memory/user.slice/user-cg.slice
> 
> 
> I.e. `read_memory_limit_in_bytes()` does that already. Thoughts?

okay

>> src/hotspot/os/linux/os_linux.hpp line 34:
>> 
>>> 32: class os::Linux {
>>> 33:   friend class CgroupSubsystem;
>>> 34:   friend class CgroupUtil;
>> 
>> why?
>
> Because `CgroupUtil::adjust_controller(CgroupCpuController* cpu)` uses `os::Linux::active_processor_count()` for the host cpu value which is protected.

Okay. I dislike friend a bit since it undermines C++ scope protection. Why not make os::Linux::active_processor_count public then? If its needed from outside the Linux namespace, it should be public.

Side note, I assume we cannot just use `os::active_processor_count()` because it handles the containerized case, and what we really want here is the real physical count?

Not for this RFE, but a bit cleaner may be if we had something like os::active_processor_count() - taking care of handling the ActiveProcessorCount switch and reporting container limits if containerized - and something like "os::active_physical_processor_count()" , which would report the real physical number.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20646#discussion_r1734151459
PR Review Comment: https://git.openjdk.org/jdk/pull/20646#discussion_r1734151740
PR Review Comment: https://git.openjdk.org/jdk/pull/20646#discussion_r1734149696


More information about the hotspot-runtime-dev mailing list