PING: RFC for new JEP: Reduce metaspace waste by dynamically merging and splitting metaspace chunks.

Thomas Stüfe thomas.stuefe at gmail.com
Tue Oct 25 07:33:46 UTC 2016


Hi Erik,

On Mon, Oct 24, 2016 at 4:01 PM, Erik Helin <erik.helin at oracle.com> wrote:

> On 2016-10-13, Thomas Stüfe wrote:
> > Hi Erik,
> >
> > On Thu, Oct 13, 2016 at 2:15 PM, Erik Helin <erik.helin at oracle.com> wrote:
> >
> > > Hi Thomas,
> > >
> > > thanks for submitting the JEP and proposing this feature!
> > >
> > > On 2016-10-10, Thomas Stüfe wrote:
> > > > Hi all,
> > > >
> > > > May I please have some feedback on this enhancement proposal?
> > > >
> > > > https://bugs.openjdk.java.net/browse/JDK-8166690
> > > >
> > > >
> > > > In one very short sentence it proposes a better allocation scheme for
> > > > Metaspace Chunks in order to reduce fragmentation and metaspace waste.
> > > >
> > > > I also added a short presentation which describes the problem and how
> > > > we solved it in our VM port.
> > > >
> > > > https://bugs.openjdk.java.net/secure/attachment/63894/Metaspace%20Coalescation%20in%20the%20SAP%20JVM.pdf
> > >
> > > Do we really need the flag -XX:+CoalesceMetaspace? Having two different
> > > ways to handle the chunk free lists in Metaspace is unfortunate; it
> > > might introduce hard-to-detect bugs and will also require much more
> > > testing (running lots of tests with the flag both on and off).
> > >
> >
> > You are right. If the new allocator works well, there is no reason to keep
> > the old allocator around.
> >
> > For a transitional period we wanted to be able to switch between the old
> > and the new allocator, just to have a fallback if problems occur. But if it
> > works, it makes sense to have only one allocator. The "CoalesceMetaspace"
> > flag can then be removed, and the code becomes a lot simpler because we do
> > not need both code paths.
>
> Yeah, I would strongly prefer to not introduce a new flag for this. Have
> you thought about testing? Do you intend to write new tests to stress
> the coalescing?
>

The current version of my patch contains a lot of verification code, which
is enabled by default in debug builds. We run many test suites and
benchmarks nightly on all our platforms, so the patch has already been
stressed a lot. We also have tests specific to the metaspace.

The patch only makes sense with thorough testing, so I consider writing
tests to be simply a part of implementing the JEP.


> > >
> > > Do you think your proposed solution has low enough overhead (in terms
> > > of CPU and memory) to be on "by default"?
> > >
> >
> > We decided to switch it on by default in our VM.
> >
> > Memory overhead can be almost exactly calculated. Bitmasks take 2 bits per
> > specialized-chunk-sized-area. That means, for specialized-chunk-size = 1k
> > (128 meta words): metaspace size / 8192. So, for 1G of metaspace we pay
> > 132KB overhead for the bitmasks, or roughly 0.1%.
> >
> > There is some CPU overhead, but in my tests I could not measure anything
> > above noise level.
>
> Those numbers seem low enough to me not to warrant a new flag.
>
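To make the overhead figures quoted above easier to check, here is a minimal
sketch of the arithmetic. It assumes a 64-bit VM with 8-byte meta words, so
that a 128-word specialized chunk covers 1 KB as stated in the mail; the
helper name bitmap_overhead_bytes is hypothetical and not part of the patch.
Under these assumptions, "metaspace size / 8192" corresponds to 1 bit per
area (128 KiB for 1 GiB), while 2 bits per area would give metaspace size /
4096 (256 KiB for 1 GiB). Either way the overhead stays well below 0.1%.

```python
# Assumptions (not taken from the patch itself): 64-bit VM, 8-byte meta words.
WORD_SIZE = 8                    # bytes per meta word
SPECIALIZED_CHUNK_WORDS = 128    # from the mail: 128 meta words = 1 KB


def bitmap_overhead_bytes(metaspace_bytes, bits_per_area):
    """Bytes of bitmap needed to cover metaspace_bytes.

    Hypothetical helper: one 'area' is one specialized-chunk-sized region,
    and each area costs bits_per_area bits of bitmap.
    """
    area_bytes = SPECIALIZED_CHUNK_WORDS * WORD_SIZE
    areas = metaspace_bytes // area_bytes
    return areas * bits_per_area // 8


GIB = 1 << 30
for bits in (1, 2):
    overhead = bitmap_overhead_bytes(GIB, bits)
    percent = 100 * overhead / GIB
    print(f"{bits} bit(s) per area: {overhead // 1024} KiB, {percent:.4f}%")
```

The two cases are printed side by side because the quoted text mentions both
"2 bits" and a divisor of 8192, which do not describe the same bitmap size.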
> > > Thanks,
> > > Erik
> > >
> > >
> > Btw, I understand that it is difficult to evaluate this proposal without a
> > prototype to play around with. As I already mentioned, the patch right now
> > exists only in our code base and not yet in the OpenJDK. If you guys are
> > seriously interested in this JEP, I will invest the time to port the patch
> > to the OpenJDK, so that you can check it out for yourselves.
>
> Yes, we are seriously interested :) I think the proposal sounds good. I
> guess the devil will be in the details, so I (we) would really appreciate
> it if you would port your internal patch to OpenJDK.
>
>
Ok, thanks Erik. I will implement a prototype then and come back to you
once it is done.

Kind Regards, Thomas



> Thanks,
> Erik
>
> > Kind Regards, Thomas
> >
> >
> >
> >
> > > > Thank you very much!
> > > >
> > > > Kind Regards, Thomas
> > > >
> > > >
> > > > On Tue, Sep 27, 2016 at 10:45 AM, Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> > > >
> > > > > Dear all,
> > > > >
> > > > > please take a look at this Enhancement Proposal for the metaspace
> > > > > allocator. I hope these are the right groups for this discussion.
> > > > >
> > > > > https://bugs.openjdk.java.net/browse/JDK-8166690
> > > > >
> > > > > Background:
> > > > >
> > > > > We at SAP sometimes see OOMs in Metaspace at customer installations
> > > > > (usually, with compressed class pointers enabled, in Compressed Class
> > > > > Space). The VM attempts to allocate metaspace and fails, hitting the
> > > > > CompressedClassSpaceSize limit. Note that we usually set the limit
> > > > > lower than the default, typically to 256M.
> > > > >
> > > > > When analyzing, we observed that a large part of the metaspace is
> > > > > indeed free but "locked in" to metaspace chunks of the wrong size:
> > > > > often we would find a lot of free small chunks, but the allocation
> > > > > request was for medium chunks, and failed.
> > > > >
> > > > > The reason was that at some point in time a lot of class loaders
> > > > > were alive, each with only a few small classes loaded. This leads to
> > > > > the metaspace being swamped with lots of small chunks, because each
> > > > > SpaceManager first allocates small chunks and switches to larger
> > > > > chunks only after a certain number of allocation requests.
> > > > >
> > > > > These small chunks are free and wait in the freelist, but cannot be
> > > > > reused for allocation requests which require larger chunks, even if
> > > > > they are physically adjacent in the virtual space.
> > > > >
> > > > > We (at SAP) added a patch which allows on-the-fly metaspace chunk
> > > > > merging: multiple adjacent smaller chunks are merged to form a larger
> > > > > chunk. This, in combination with the reverse direction (splitting a
> > > > > large chunk to get smaller chunks), partly negates the
> > > > > "chunks-are-locked-in-into-their-size" limitation and provides for
> > > > > better reuse of metaspace chunks. It also provides better
> > > > > defragmentation.
> > > > >
> > > > > I discussed this fix off-list with Coleen Phillimore and Jon Masamitsu,
> > > > > and instead of just offering this as a fix, both recommended opening a
> > > > > JEP for this, because its scope would be beyond that of a simple fix.
> > > > >
> > > > > So here is my first JEP :) I hope it follows the right form. Please, if
> > > > > you have time, take a look and tell us what you think.
> > > > >
> > > > > Thank you, and Kind Regards,
> > > > >
> > > > > Thomas Stüfe
> > > > >
> > > > >
> > > > >
> > > > >
> > >
>
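The merging and splitting described in the original proposal quoted above can
be illustrated with a toy model. This is only a sketch under stated
assumptions, not the code from the SAP patch: the class ChunkFreeLists, its
method names, and the 4:1 ratio between medium and small chunks are all
hypothetical and chosen for readability. It shows free small chunks being
merged into a medium chunk once every small chunk in an aligned medium-sized
area is free, and a medium chunk being split again when small chunks are
needed.

```python
SMALL = 1     # size of a small chunk, in arbitrary units (illustrative)
MEDIUM = 4    # a medium chunk spans four small chunks (illustrative ratio)


class ChunkFreeLists:
    """Toy model: free chunks are tracked by start offset, per size class."""

    def __init__(self):
        self.free_small = set()
        self.free_medium = set()

    def return_small(self, offset):
        """Put a small chunk back on the free list, then try to merge."""
        self.free_small.add(offset)
        self._try_merge(offset)

    def _try_merge(self, offset):
        # Only the aligned medium-sized area containing offset is considered.
        base = offset - offset % MEDIUM
        parts = {base + i * SMALL for i in range(MEDIUM // SMALL)}
        if parts <= self.free_small:
            # All neighbours are free: replace them with one medium chunk.
            self.free_small -= parts
            self.free_medium.add(base)

    def allocate_medium(self):
        """Hand out a free medium chunk, or None if there is none."""
        return self.free_medium.pop() if self.free_medium else None

    def allocate_small(self):
        """Hand out a small chunk, splitting a medium chunk if needed."""
        if not self.free_small and self.free_medium:
            base = self.free_medium.pop()
            self.free_small |= {base + i * SMALL
                                for i in range(MEDIUM // SMALL)}
        return self.free_small.pop() if self.free_small else None


lists = ChunkFreeLists()
for off in (0, 1, 2, 5):
    lists.return_small(off)
print(lists.allocate_medium())   # prints None: no aligned area is fully free
lists.return_small(3)
print(lists.allocate_medium())   # prints 0: offsets 0..3 were merged
```

Without the merge step, the first situation (plenty of free small chunks but
no medium chunk) would persist even after offset 3 is freed, which is the
"locked in" behaviour the proposal sets out to avoid.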


More information about the hotspot-dev mailing list