RFR: 8302744: Refactor Hotspot container detection code [v2]
Thomas Stuefe
stuefe at openjdk.org
Wed May 22 14:24:13 UTC 2024
On Tue, 21 May 2024 13:58:22 GMT, Severin Gehwolf <sgehwolf at openjdk.org> wrote:
>> Please review this container detection code refactoring in hotspot. The main point of this is to
>>
>> - get rid of the `GET_CONTAINER_INFO` macros which hide too many things under the hood
>> - prevent refactoring of the code (since `GET_CONTAINER_INFO` macros short-return and are therefore not portable; at least not without some risk)
>> - make the code easier to understand
>> - allow for better testing via `gtest`
>> - separate multi-line parsing from single line parsing for clarity.
>>
>> Testing:
>> - [x] GHA
>> - [x] `gtest:cgroupTest` tests
>> - [x] Container tests on Linux with cgroup v1 (legacy) and cgroup v2. All pass.
>>
>> Thoughts?
>
> Severin Gehwolf has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision:
>
> - Fix TestMemoryAwareness for cgroup v2
>
> It only reads the swap value, thus doesn't print:
> "Memory and Swap Limit is" but rather prints:
> "Swap Limit is". This is fine, since for cg v2
> the memory limit and the swap limits are in different
> files.
> - Fix whitespace
> - Handle cpu quota for cgroups v1 specially
> - Try a different approach.
>
> Hybrid MACRO usage and locked down reading primitives.
> - Add key-value read function
> - Add basic testing for read functions
> - Add generic julong/char* read functions
> - Merge branch 'master' into jdk-8302744-cleanup-getcontainer-info
> - 8302744: Refactor Hotspot container detection code
@jerboaa
A lot better!
If you move the sscanf up to `read_(number|numerical_tuple_value)()`, things become even simpler and you can remove the templatization altogether.
Also, at the moment, when you call read_string, three different buffers are involved, and three buffer copies. And one of the buffers is malloc'ed, which is not ideal for a high frequency call. And it forces you to deal with freeing that buffer.
You have:
- `read_number` -> `read_from_file`
- `read_numerical_tuple_value` -> `read_from_file`
- `read_string` -> `read_from_file`
You can move the sscanf up to the callers:
- Let `read_from_file()` only read the first line from a file and return that. No sscanf. Let it take an output buffer (e.g. `char output[1024]`) as argument.
- Let `read_number()` call `read_from_file` with a local buffer, then call sscanf with its `JULONG_FORMAT`
- Let `read_numerical_tuple_value()` call `read_from_file` with a local buffer, then call sscanf with the TupleFormat
- Let `read_string` call `read_from_file` with a local buffer. No need for sscanf. It can just return the strduped buffer.
No need for templatization of `read_from_file` then.
An extension of that idea would be to not use strdup in `read_string` at all, but to require a caller-provided buffer. Then, we don't have to copy data around, and you could just remove `read_string` altogether, since it adds nothing of value. Callers can call `read_from_file` directly.
Removing the strdup from the read_string path would mean you can also remove all the `os::free` calls in limit_from_str. No need to think about who owns which buffers.
You would introduce local buffers in those functions that call read_string. In those that work on caller-provided buffers, return either the caller-provided buffer or NULL to indicate an error (which is the typical C-style pattern), e.g.:
char* CgroupV2Subsystem::pids_max_val(char buffer[1024]) {
CONTAINER_READ_STRING_CHECKED(_unified, "/pids.max", "Maximum number of tasks", buffer);
return buffer;
}
jlong CgroupV2Subsystem::pids_max() {
char buffer[1024]
char * pidsmax_str = pids_max_val(buffer);
return limit_from_str(buffer);
}
Yet another way to avoid malloc would be using ResourceAreas. But that introduces dependencies to initialization, and I hesitate having cgroups handling depend on VM initialization.
-------------
PR Review: https://git.openjdk.org/jdk/pull/19060#pullrequestreview-2071407841
More information about the hotspot-runtime-dev
mailing list