[RFC 8285277] - How should the JVM handle container memory limits

Thu Apr 28 10:28:59 UTC 2022

On Thu, 2022-04-28 at 09:50 +0200, Thomas Stüfe wrote:
> > > > 
> > > > - If we want to efficiently use all the memory we paid for in a
> > > > container, we need to tune the VM parameters manually
> > > > 
> > > > - But tuning is very difficult and error-prone. With the 100 MB heap
> > > > setting, my app could easily be killed if it uses too much memory in
> > > > malloc, native buffers, or even the JIT. Currently we don't fail early
> > > > (no check of -Xms, -Xmx, etc, against available memory), and don't fail
> > > > predictably.
> > > +1 to doing some more sanity checks on JVM startup when some of those
> > > settings exceed physical memory.
> > 
> > I think this makes sense. We need to decide what the behavior should be:
> > 
> > - what flags to check
> > - print warnings?
> > - abort the VM?
> > - override the settings to conform to available memory?
> > - always do this, or only when running inside a container? If the
> > latter, we will have the problem of JDK-8261242

Always do it. There should be no special casing for the container case.

> > 
> Please only a warning!

I tend to agree.

> There may be valid reasons for starting with a massively overextended
> (uncommitted) heap, e.g. sparse databases.
> 
> Maybe this is just a documentation problem? -Xmx only reserves address
> space. In default overcommit mode, you can happily run with -Xmx10000G on a
> modern Linux box. -Xms is subject to the overcommit heuristics of the
> underlying system, which by default are somewhat larger than mem+swap but
> not endlessly so.
> 
> If you want an early error when running with too large a heap, you can
> get that today by switching off overcommit heuristics
> (vm.overcommit_memory=2), setting vm.overcommit_ratio to something like 100
> (if you are the only big process on that container), and starting the VM
> with -Xmx == -Xms. Am I overlooking something?
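
To make the arithmetic behind that suggestion concrete, here is a rough sketch of the commit limit the kernel enforces with vm.overcommit_memory=2: roughly RAM * overcommit_ratio / 100 + swap, ignoring huge pages. The class and method names are hypothetical, the sizes are made-up inputs rather than values read from /proc, and the JVM commits more than just the heap, so treat the result as an upper bound on what -Xms could get.

```java
// Illustrative only: approximates the Linux CommitLimit in strict
// overcommit mode (vm.overcommit_memory=2). Huge pages are ignored.
class CommitLimitSketch {

    /** RAM * overcommit_ratio / 100 + swap, all sizes in bytes. */
    static long commitLimitBytes(long ramBytes, long swapBytes, int overcommitRatio) {
        return Math.multiplyExact(ramBytes, (long) overcommitRatio) / 100 + swapBytes;
    }

    /**
     * True if an initial heap of xmsBytes still fits under the limit,
     * given what other processes have already committed.
     */
    static boolean initialHeapFits(long xmsBytes, long commitLimit, long alreadyCommitted) {
        return xmsBytes <= commitLimit - alreadyCommitted;
    }

    public static void main(String[] args) {
        long GiB = 1024L * 1024 * 1024;
        // Assumed machine: 8 GiB RAM, no swap, overcommit_ratio=100.
        long limit = commitLimitBytes(8 * GiB, 0, 100);
        System.out.println("commit limit: " + limit / GiB + " GiB");
        // Assumed 5 GiB already committed elsewhere; try -Xms4g.
        System.out.println("-Xms4g fits: " + initialHeapFits(4 * GiB, limit, 5 * GiB));
    }
}
```

With the kernel default ratio of 50 and no swap, the same 8 GiB machine would only allow about 4 GiB of committed memory in total, which is why the ratio needs raising for this approach to be usable.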

Good points. I do think, though, that those massive overcommit cases
are the outliers as compared to the average JVM process. Especially in
the container world, where many people get some memory limit via an
orchestration framework and are blissfully unaware that - as a result -
the -Xmx/-Xms settings actually exceed that limit.

A warning (and accepting the warning) seems fine in those overcommit
cases and might actually raise awareness in other cases, IMO.
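
For illustration, the warning-only behaviour could look roughly like the sketch below. This is not HotSpot code: the class name, method name, and message wording are all hypothetical, and the memory limit is passed in as a plain number rather than detected. It only shows the proposed policy: compare -Xms and -Xmx against available memory, warn, and carry on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a warn-but-continue startup check.
class HeapSanityCheck {

    /** Returns one warning per heap setting that exceeds the limit; never aborts. */
    static List<String> heapWarnings(long xmsBytes, long xmxBytes, long memoryLimitBytes) {
        List<String> warnings = new ArrayList<>();
        if (xmsBytes > memoryLimitBytes) {
            warnings.add("-Xms (" + xmsBytes + " bytes) exceeds available memory ("
                    + memoryLimitBytes + " bytes)");
        }
        if (xmxBytes > memoryLimitBytes) {
            warnings.add("-Xmx (" + xmxBytes + " bytes) exceeds available memory ("
                    + memoryLimitBytes + " bytes)");
        }
        return warnings;
    }

    public static void main(String[] args) {
        long MiB = 1024L * 1024;
        // Assumed scenario: 512 MiB limit from the orchestrator, -Xms64m -Xmx2g.
        for (String w : heapWarnings(64 * MiB, 2048 * MiB, 512 * MiB)) {
            System.err.println("Warning: " + w);
        }
    }
}
```

The same comparison applies with or without a container; only the source of the limit value would differ.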

Thanks,
Severin