RFR: 8370572: Cgroups hierarchical memory limit is not honored after JDK-8322420 [v2]
Aleksey Shipilev
shade at openjdk.org
Mon Oct 27 18:05:03 UTC 2025
On Mon, 27 Oct 2025 17:59:23 GMT, Aleksey Shipilev <shade at openjdk.org> wrote:
>> See the bug for more discussion.
>>
>> We are seeing customer regressions in 21.0.9, notably on ECS Fargate. We root-caused it to [JDK-8322420](https://bugs.openjdk.org/browse/JDK-8322420). That patch removed the handling of `hierarchical_memory_limit`; see [this hunk](https://github.com/openjdk/jdk/commit/55a7cf14453b6cd1de91362927b2fa63cba400a1#diff-8910f554ed4a7bc465e01679328b3e9bd64ceaa6c85f00f0c575670e748ebba9L118-L131).
>>
>> But at least cgroupv1 still needs that handling in some conditions, notably in ECS. There is a way to reproduce it with local Docker as well. The key is to set up a host cgroup that is not visible to the container, so that the only way for the container to learn the memory limits is to look at the `hierarchical_*` values that the kernel computes itself.
>>
>> Unfortunately, it is not easy to revert the offending hunks from 21.0.9, as there were follow-up refactoring backports. So, to make it work, this PR reinstates the hunks using the new cgroups support code. It also makes the code (subjectively) easier to read, and is in the spirit of past refactorings.
>>
>> We are planning to pick this patch up to 21.0.9, at least into Corretto downstream as soon as possible to unbreak users. Therefore, the patch is also kept as crisp as possible.
>>
>> I tried to come up with a regression test for it, but could not: local reproducers require amending _host_ configuration, which requires superuser privileges, among other hassles.
>>
>> Additional testing:
>> - [x] Reproducer with local Docker now passes
>> - [ ] Reproducer with ECS Fargate now passes
>> - [x] Linux x86_64 server fastdebug, `containers/` passes on cgroupsv1 host
>> - [x] Linux x86_64 server fastdebug, `containers/` passes on cgroupsv2 host
>
> Aleksey Shipilev has updated the pull request incrementally with two additional commits since the last revision:
>
> - Also no need to touch the other getter
> - Whitespace
Logs from the local reproducer (see the bug for details): it runs with a 1G limit on the parent slice and a heap size of 25% of container memory, so the expected heap size is 256M.
Current mainline (broken):
[0.001s][trace][os,container] OSContainer::init: Initializing Container Support
[0.001s][debug][os,container] Detected optional pids controller entry in /proc/cgroups
[0.001s][debug][os,container] Detected cgroups hybrid or legacy hierarchy, using cgroups v1 controllers
[0.001s][debug][os,container] OSContainer::init: is_containerized() = true because all controllers are mounted read-only (container case)
...
[0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
[0.001s][trace][os,container] Memory Limit is: 9223372036854771712
[0.001s][debug][os,container] container memory limit ignored: 9223372036854771712, upper bound is 264567476224
...
[0.141s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
[0.141s][trace][os,container] Memory Limit is: 9223372036854771712
[0.141s][debug][os,container] container memory limit ignored: 9223372036854771712, upper bound is 264567476224
[0.141s][info ][gc,init ] Memory: 246G
...
[0.141s][info ][gc,init ] Heap Min Capacity: 32M
[0.141s][info ][gc,init ] Heap Initial Capacity: 63104M
[0.141s][info ][gc,init ] Heap Max Capacity: 63104M
...
This fix:
[0.001s][trace][os,container] OSContainer::init: Initializing Container Support
[0.001s][debug][os,container] Detected optional pids controller entry in /proc/cgroups
[0.001s][debug][os,container] Detected cgroups hybrid or legacy hierarchy, using cgroups v1 controllers
[0.001s][debug][os,container] OSContainer::init: is_containerized() = true because all controllers are mounted read-only (container case)
...
[0.001s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
[0.001s][trace][os,container] Memory Limit is: 9223372036854771712
[0.001s][trace][os,container] Path to /memory.use_hierarchy is /sys/fs/cgroup/memory/memory.use_hierarchy
[0.001s][trace][os,container] Use Hierarchy is: 1
[0.001s][trace][os,container] Path to /memory.stat is /sys/fs/cgroup/memory/memory.stat
[0.001s][trace][os,container] Hierarchical Memory Limit is: 1073741824
...
[0.040s][trace][os,container] Path to /memory.limit_in_bytes is /sys/fs/cgroup/memory/memory.limit_in_bytes
[0.040s][trace][os,container] Memory Limit is: 9223372036854771712
[0.040s][trace][os,container] Path to /memory.use_hierarchy is /sys/fs/cgroup/memory/memory.use_hierarchy
[0.041s][trace][os,container] Use Hierarchy is: 1
[0.041s][trace][os,container] Path to /memory.stat is /sys/fs/cgroup/memory/memory.stat
[0.041s][trace][os,container] Hierarchical Memory Limit is: 1073741824
...
[0.041s][info ][gc,init ] Memory: 1024M
[0.041s][info ][gc,init ] Heap Initial Capacity: 256M
[0.041s][info ][gc,init ] Heap Max Capacity: 256M
...
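For readers following the trace lines, here is a rough sketch of the lookup order they suggest. It is not the HotSpot code. The helper names are hypothetical, and the rule "treat any limit at or above host memory as unlimited" is an assumption inferred from the "container memory limit ignored ... upper bound is" message.

```python
# Value seen in the logs for memory.limit_in_bytes when no limit is set:
# 2**63 - 4096, i.e. LONG_MAX rounded down to a 4 KiB page.
UNLIMITED_V1 = 9223372036854771712


def parse_hierarchical_limit(memory_stat_text):
    """Return the hierarchical_memory_limit value from memory.stat text, or None."""
    for line in memory_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "hierarchical_memory_limit":
            return int(parts[1])
    return None


def effective_memory_limit(limit_in_bytes, use_hierarchy, memory_stat_text, host_memory):
    """Hypothetical helper: pick the limit in the order the trace log reads files.

    1. memory.limit_in_bytes, if it is below host memory.
    2. Otherwise, if memory.use_hierarchy is 1, hierarchical_memory_limit
       from memory.stat, if it is below host memory.
    3. Otherwise fall back to host memory (no container limit).
    """
    if limit_in_bytes < host_memory:
        return limit_in_bytes
    if use_hierarchy == 1:
        hierarchical = parse_hierarchical_limit(memory_stat_text)
        if hierarchical is not None and hierarchical < host_memory:
            return hierarchical
    return host_memory


# Values taken from the logs above; the memory.stat text is a made-up excerpt.
host_memory = 264567476224
memory_stat = "cache 0\nrss 0\nhierarchical_memory_limit 1073741824\n"

limit = effective_memory_limit(UNLIMITED_V1, 1, memory_stat, host_memory)
heap = limit * 25 // 100  # 25% of container memory
print(limit // (1024 * 1024), "M limit,", heap // (1024 * 1024), "M heap")
```

With the logged values this yields a 1024M limit and a 256M heap, matching the `gc,init` lines in the fixed run. Skipping step 2 leaves the host memory size as the only bound, which is the broken behavior shown earlier.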
-------------
PR Comment: https://git.openjdk.org/jdk/pull/28006#issuecomment-3452627144