RFR: 8230848: OSContainer: Refactor container detection code

Severin Gehwolf sgehwolf at redhat.com
Tue Oct 1 19:03:12 UTC 2019


On Tue, 2019-10-01 at 14:36 -0400, Bob Vandette wrote:
> > On Oct 1, 2019, at 1:54 PM, Severin Gehwolf <sgehwolf at redhat.com> wrote:
> > 
> > Hi Bob,
> > 
> > Thanks for looking at this! I appreciate that.
> > 
> > On Tue, 2019-10-01 at 10:43 -0400, Bob Vandette wrote:
> > > Severin,
> > > 
> > > I’d prefer if you just added cgroupv2 support with these types of changes all at the same time.
> > > Having to look at two patches is just more work for nothing.
> > 
> > OK. I guess that's reviewer preference :) The reason why I did it that
> > way is to keep the patch sizes smaller. Personally, I tend to find
> > smaller, incremental patches easier to review than one gigantic patch.
> > Do you want me to fold these three bugs/patches together and produce
> > one big patch?
> 
> Others may disagree but I’d prefer one big patch since they are very tightly related.
> 
> > no-op refactoring:
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8230848/03/webrev/
> > 
> > added cgroups v2:
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8230305/05/webrev/
> > 
> > add Metrics.java support for cgroups v2:
> > <no patch as of yet>
> > 
> > > In terms of your factoring approach, I don’t see why you need to move so much logic out of os_linux.cpp.
> > 
> > I do find the code easier to follow if the exposed interfaces of
> > os_container are minimal and kept more broad. To some extent the
> > osContainer interface is a leaky abstraction. Example: after the
> > refactoring we'd generally have (in pseudo code):
> > 
> > julong available_memory() {
> >  if (isContainerized()) {
> >    return container_available_memory();
> >  }
> >  return system_available_memory();
> > }
> > 
> > over having the osContainer impl in osLinux directly:
> > 
> > julong available_memory() {
> >  if (isContainerized()) {
> >     mem_limit = container_mem_limit();
> >     if (error) {
> >        trace_container(...)
> >     }
> >     mem_usage = container_memory_usage();
> >     if (error) {
> >        trace_container(...)
> >     }
> >     if (mem_limit > 0 && mem_usage > 0) {
> >        return mem_limit > mem_usage ? mem_limit - mem_usage : 0;
> >     }
> >  }
> >  return system_available_memory();
> > }
> > 
> > Perhaps that's just me. We should perhaps do a similar refactoring for
> > physical memory…
> 
> I didn’t want to drag a lot of the logic from os_linux.cpp into the container implementation
> and wanted to try to keep a separation of  pure os configuration and container configuration.
> I also envisioned cases were there might be non cgroup containers that we’d need to support.
> There’s already Kata containers but that one looks just like a pure os execution.
> 
> > > The OSContainer interface should be used independent of which cgroup implementation is active.
> > 
> > Yes, agreed. I guess the disagreement is in the granularity of the
> > interface functions.
> > 
> > > os_linux should use the existing OSContainer interface.
> > 
> > OK. If the OSContainer interface is set in stone I will change it back.
> > 
> > My concern was that it exposes too many functions for no good reason
> > not needed by os_linux (the only consumer). Most functions are used by
> > print_container_info() anyway, which could be kept more general.
> > Perhaps I'm missing something?
> 
> The original intent was to provide a container metrics API that would be used by the VM startup,
> JFR events and Management MBeans to provide more visibility into a container.
> I’m still hopeful that we’ll eventually get the last two.  I’ve had discussions with the serviceability
> team and there’s an interested party in doing the JFR events.
> 
> > > OSContainer_linux should include the logic which selects cgroupv1 versus cgroupv2 and
> > > calls through a common cgroupSubsystem_linux class.  This cgroupSubsystem_linux class
> > > should provide common functions that both v1 and v2 use. 
> > 
> > Yes. If you look at the cgroups v2 patch, that's what is being
> > proposed. CgroupSubsystemFactory::create() is being called from
> > OsContainer_linux. The factory does the selection and returns the
> > abstract cgroupSubsystem class (either v1 or v2):
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8230305/05/webrev/
> > 
> > > I’d then expect to see cgroupV1Subsystem_linux.?pp and cgroupV2Subsystem_linux.?pp files which 
> > > extends cgroupSubsystem and contain version specific functions.
> > 
> > Sure, I'd be happy to move things into version specific files if that's
> > desired. Currently, I've kept them in cgroupSubsystem_linux.{c,h}pp. It
> > shouldn't be hard to separeate them out. It makes sense to do it
> > actually.
> > 
> > Thanks again for you input!
> 
> No, problem and thanks for doing this work.

OK. I'll see to get this into a shape you're happy with and come back
to you.

Thanks,
Severin



More information about the hotspot-dev mailing list