PING: RFC for new JEP: Reduce metaspace waste by dynamically merging and splitting metaspace chunks.

Thu Oct 13 13:21:04 UTC 2016

Hi Erik,

On Thu, Oct 13, 2016 at 2:15 PM, Erik Helin <erik.helin at oracle.com> wrote:

> Hi Thomas,
>
> thanks for submitting the JEP and proposing this feature!
>
> On 2016-10-10, Thomas Stüfe wrote:
> > Hi all,
> >
> > May I please have some feedback on this enhancement proposal?
> >
> > https://bugs.openjdk.java.net/browse/JDK-8166690
> >
> >
> > In one very short sentence it proposes a better allocation scheme for
> > Metaspace Chunks in order to reduce fragmentation and metaspace waste.
> >
> > I also added a short presentation which describes the problem and how we
> > solved it in our VM port.
> >
> > https://bugs.openjdk.java.net/secure/attachment/63894/Metaspace%20Coalescation%20in%20the%20SAP%20JVM.pdf
>
> Do we really need the flag -XX:+CoalesceMetaspace? Having two different
> ways to handle the chunk free lists in Metaspace is unfortunate: it
> might introduce hard-to-detect bugs and will also require much more
> testing (running lots of tests with the flag both on and off).
>

You are right. If the new allocator works well, there is no reason to keep
the old allocator around.

We wanted to be able to switch between the old and the new allocator for a
transitional period, just to have a fallback if problems occur. But if the
new allocator works, it makes sense to keep only one, and the
"CoalesceMetaspace" flag can be removed. The code also becomes a lot simpler
because we no longer need both code paths.

>
> Do you think your proposed solution has low enough overhead (in terms
> of CPU and memory) to be on "by default"?
>

We decided to switch it on by default in our VM.

Memory overhead can be calculated almost exactly. The bitmasks take 2 bits
per specialized-chunk-sized area. For a specialized chunk size of 1K (128
meta words), that is 2 bits per 8192 bits, i.e. metaspace size / 4096. So
for 1G of metaspace we pay 256KB of overhead for the bitmasks, or roughly
0.02%.
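
Taking the stated 2 bits per 1K area at face value, the bitmask overhead can
be checked with a few lines of Python. This is only a back-of-the-envelope
sketch; the function name and its parameters are made up for illustration
and do not come from the patch.

```python
def bitmap_overhead_bytes(metaspace_bytes, area_bytes=1024, bits_per_area=2):
    """Bytes of bitmask needed to cover `metaspace_bytes` of metaspace.

    Assumes one fixed group of `bits_per_area` bits for every
    specialized-chunk-sized area of `area_bytes` bytes.
    """
    areas = metaspace_bytes // area_bytes   # number of smallest-chunk-sized areas
    total_bits = areas * bits_per_area
    return (total_bits + 7) // 8            # round up to whole bytes


GIB = 1024 ** 3
overhead = bitmap_overhead_bytes(GIB)
print(overhead // 1024, "KiB")              # 256 KiB
print(f"{100 * overhead / GIB:.3f}%")       # 0.024%
```

With 1 bit per area instead of 2, the same calculation gives 128 KiB per 1G.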

There is some CPU overhead, but in my tests I could not measure anything
above noise level.

> Thanks,
> Erik
>
>
Btw, I understand that it is difficult to evaluate this proposal without a
prototype to play around with. As I already mentioned, the patch currently
exists only in our code base and not yet in the OpenJDK. If you guys are
seriously interested in this JEP, I will invest the time to port the patch
to the OpenJDK, so that you can check it out for yourselves.

Kind Regards, Thomas

> > Thank you very much!
> >
> > Kind Regards, Thomas
> >
> >
> > On Tue, Sep 27, 2016 at 10:45 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
> > wrote:
> >
> > > Dear all,
> > >
> > > please take a look at this Enhancement Proposal for the metaspace
> > > allocator. I hope these are the right groups for this discussion.
> > >
> > > https://bugs.openjdk.java.net/browse/JDK-8166690
> > >
> > > Background:
> > >
> > > We at SAP occasionally see OOMs in Metaspace at customer installations
> > > (usually, with compressed class pointers enabled, in Compressed Class
> > > Space). The VM attempts to allocate metaspace and fails, hitting the
> > > CompressedClassSpaceSize limit. Note that we usually set the limit
> > > lower than the default, typically at 256M.
> > >
> > > When analyzing, we observed that a large part of the metaspace is
> > > indeed free but "locked in" to metaspace chunks of the wrong size: often
> > > we would find a lot of free small chunks, but the allocation request was
> > > for medium chunks, and failed.
> > >
> > > The reason was that at some point in time a lot of class loaders were
> > > alive, each with only a few small classes loaded. This leads to the
> > > metaspace being swamped with lots of small chunks, because each
> > > SpaceManager first allocates small chunks and switches to larger chunks
> > > only after a certain number of allocation requests.
> > >
> > > These small chunks are free and wait in the freelist, but cannot be
> > > reused for allocation requests which require larger chunks, even if they
> > > are physically adjacent in the virtual space.
> > >
> > > We (at SAP) added a patch which allows on-the-fly metaspace chunk
> > > merging: multiple adjacent smaller chunks are merged to form a larger
> > > chunk. This, in combination with the reverse direction (splitting a large
> > > chunk to get smaller chunks), partly negates the
> > > "chunks-are-locked-in-into-their-size" limitation and provides for better
> > > reuse of metaspace chunks. It also reduces fragmentation.
> > >
> > > I discussed this fix off-list with Coleen Phillimore and Jon Masamitsu,
> > > and instead of just offering this as a fix, both recommended opening a
> > > JEP for this, because its scope would be beyond that of a simple fix.
> > >
> > > So here is my first JEP :) I hope it follows the right form. Please, if
> > > you have time, take a look and tell us what you think.
> > >
> > > Thank you, and Kind Regards,
> > >
> > > Thomas Stüfe
> > >
> > >
> > >
> > >
>
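
The chunk merging described in the quoted proposal can be illustrated with a
toy model. This is only a sketch under assumptions, not the SAP patch: it
assumes the two bits per area mean "a chunk starts here" and "this area is in
use", that chunks are aligned to their own size, and that the whole space is
already carved into chunks. The names ChunkMap, add_chunk and try_merge are
hypothetical, and free-list bookkeeping is left out.

```python
class ChunkMap:
    """Toy model: one pair of flags per smallest-chunk-sized area."""

    def __init__(self, num_areas):
        self.is_start = [False] * num_areas  # assumed bit 1: a chunk begins here
        self.in_use = [False] * num_areas    # assumed bit 2: area is in a used chunk

    def add_chunk(self, first, size, used=False):
        """Record a chunk covering areas [first, first + size)."""
        self.is_start[first] = True
        for a in range(first, first + size):
            self.in_use[a] = used

    def try_merge(self, area, target):
        """Try to merge the free chunks around `area` into one chunk of
        `target` areas. Returns the first area of the merged chunk, or None."""
        begin = area - area % target         # merged chunk must be size-aligned
        end = begin + target
        n = len(self.is_start)
        if end > n:
            return None                      # region runs past the space
        if any(self.in_use[begin:end]):
            return None                      # some chunk in the region is live
        if not self.is_start[begin] or (end < n and not self.is_start[end]):
            return None                      # a chunk straddles a region boundary
        for a in range(begin + 1, end):
            self.is_start[a] = False         # inner chunk borders disappear
        return begin


m = ChunkMap(8)
for first in range(4):
    m.add_chunk(first, 1)                    # four free 1-area chunks
m.add_chunk(4, 4, used=True)                 # one live 4-area chunk
print(m.try_merge(2, 4))                     # 0: areas 0..3 become one chunk
print(m.try_merge(5, 4))                     # None: areas 4..7 are in use
```

In a real allocator the merged-away chunks would also have to be removed from
their free list and the new chunk added to the free list for its size; the
sketch only shows the adjacency check that the bitmasks make cheap.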