[PING2!] RFR: 8230305: Cgroups v2: Container awareness
Bob Vandette
bob.vandette at oracle.com
Thu Jan 16 20:18:54 UTC 2020
> On Jan 16, 2020, at 3:08 PM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
>
> Hi Bob, David,
>
> On Thu, 2020-01-16 at 09:13 -0500, Bob Vandette wrote:
>>> On Jan 15, 2020, at 5:13 PM, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>> Hi Bob, Severin,
>>>
>>> On 15/01/2020 6:04 am, Bob Vandette wrote:
>>>> Cgroup V2 is about to go mainstream this year for popular distros such as Oracle Linux 8, Redhat Linux 8 and Fedora
>>>> so this fix it’s important to get into JDK 15 so we can start shaking out this support.
>>>> Please take a look and help get this change reviewed.
>>>
>>> I've taken a look and the overall structure and approach seems fine. I can't attest to the details of cgroups v1 versus v2 but trust the exports.
>>>
>>> Reviewed.
>>>
>>> Please ensure all copyrights updated to 2020.
>>>
>>> Bob: I assume you have tested this on our systems?
>>
>> I’m in the process of setting up a cgroup v2 system to test with but I’ll grab the patch and verify the
>> container tests on my cgroup v1 setup today.
>
> Final webrev I intend to push tomorrow for this:
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8230305/07/webrev/
>
> This passed jdk/submit and testing on cgroups v1 hotspot docker tests.
> On cgroups v2 this fails the following test since it also tests the
> OperatingSystemMXBean's memory awareness which uses core-libs Metrics
> code not updated with this webrev (passes all others):
>
> hotspot/jtreg/containers/docker/TestMemoryAwareness.java
>
> It's caused by JDK-8226575 which went in during the time I've started
> this and intented push time. This test failure will be fixed with JDK-
> 8231111 once I get another review from a Reviewer. I doubt many people
> will notice though since cgroupv2 enabled systems aren't that
> widespread just yet.
>
> Please let me know if there are any objections.
I ran your changes on a local cgroupv1 enabled system and didn’t see any failures
specific to your changes.
I saw one test failure on my system but it was pre-existing. The TestDockerMemoryMetrics.testMemoryFailCount
test gets killed by OOM Killer on my system. This needs more investigation. There was an existing
bug filed for this issue which is resolved but the problem persists on my system.
https://bugs.openjdk.java.net/browse/JDK-8224506 <https://bugs.openjdk.java.net/browse/JDK-8224506>
I’m good with the Metrics problem since that code is not cgroupv2 aware yet.
Bob.
>
> Thanks,
> Severin
>
>
>> Bob.
>>
>>
>>> Thanks,
>>> David
>>>
>>>> Thanks,
>>>> Bob Vandette
>>>>> On Nov 29, 2019, at 4:04 AM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
>>>>>
>>>>> On Fri, 2019-11-15 at 17:56 +0100, Severin Gehwolf wrote:
>>>>>> On Fri, 2019-11-08 at 15:21 +0100, Severin Gehwolf wrote:
>>>>>>> Hi Bob,
>>>>>>>
>>>>>>> On Wed, 2019-11-06 at 10:47 +0100, Severin Gehwolf wrote:
>>>>>>>> On Tue, 2019-11-05 at 16:54 -0500, Bob Vandette wrote:
>>>>>>>>> Severin,
>>>>>>>>>
>>>>>>>>> Thanks for taking on this cgroup v2 improvement.
>>>>>>>>>
>>>>>>>>> In general I like the implementation and the refactoring. The CachedMetric class is nice.
>>>>>>>>> We can add any metric we want to cache in a more general way.
>>>>>>>>>
>>>>>>>>> Is this the latest version of the webrev?
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/cgroupsv2-hotspot/03/webrev/src/hotspot/os/linux/cgroupV2Subsystem_linux.cpp.html
>>>>>>>>>
>>>>>>>>> It looks like you need to add the caching support for active_processor_count (JDK-8227006).
>>>>>>> [...]
>>>>>>>> I'll do a proper rebase ASAP.
>>>>>>>
>>>>>>> Latest webrev:
>>>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/cgroupsv2-hotspot/05/webrev/
>>>>>>>
>>>>>>>>> I’m not sure it’s worth providing different strings for Unlimited versus Max or Scaled shares.
>>>>>>>>> I’d just try to be compatible with the cgroupv2 output so you don’t have to change the test.
>>>>>>>>
>>>>>>>> OK. Will do.
>>>>>>>
>>>>>>> Unfortunately, there is no way of NOT changing TestCPUAwareness.java as
>>>>>>> it expects CPU Shares to be written to the cgroup filesystem verbatim.
>>>>>>> That's no longer the case for cgroups v2 (at least for crun). Either
>>>>>>> way, most test changes are gone now.
>>>>>>>
>>>>>>>>> I wonder if it’s worth trying to synthesize memory_max_usage_in_bytes() by keeping the highest
>>>>>>>>> value ever returned by the API.
>>>>>>>>
>>>>>>>> Interesting idea. I'll ponder this a bit and get back to you.
>>>>>>>
>>>>>>> This has been implemented. I'm not sure this is correct, though. It
>>>>>>> merely piggy-backs on calls to memory_usage_in_bytes() and keeps the
>>>>>>> high watermark value of that.
>>>>>>>
>>>>>>> Testing passed on F31 with cgroups v2 controllers properly configured
>>>>>>> (podman) and hybrid (legacy hierarchy) with docker/podman.
>>>>>>>
>>>>>>> Thoughts?
>>>>>>
>>>>>> Ping?
>>>>>
>>>>> Anyone willing to review this? It would be nice to make some progress.
>>>>>
>>>>> Thanks,
>>>>> Severin
>>>>>
>>>>>> Metrics work proposed for RFR here:
>>>>>> http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-November/063464.html
>>>>>>
>>>>>> Thanks,
>>>>>> Severin
>
More information about the hotspot-dev
mailing list