PING: RFC for new JEP: Reduce metaspace waste by dynamically merging and splitting metaspace chunks.

Thomas Stüfe thomas.stuefe at gmail.com
Tue Oct 25 07:39:15 UTC 2016


Hi Coleen,

thank you for the feedback and encouragement :) See further comments inline.

On Mon, Oct 24, 2016 at 6:32 PM, Coleen Phillimore <
coleen.phillimore at oracle.com> wrote:

>
> Hi Thomas,
>
> I agree with Erik.  If this works well for you, then it should just be
> implemented without an option.   If done early in JDK10, it'll get a lot of
> good testing.
>

Ok. I was hoping to get this into JDK 9, but thinking about it clearly, I see
that the risk may be too large. So let's implement this in JDK 10, and if it is
stable and works well, it can be backported to JDK 9, yes?


>
> This looks like a very good improvement.   We had discussed coalescing
> blocks and other improvements like this early on, but wanted to wait to see
> problems in the field to motivate the changes.   We've seen these sorts of
> problems now too.
>
> One of the things we've considered is that we wanted to use the operating
> system's version of malloc for the chunks, so that it can split and
> coalesce chunks for us.  The Solaris malloc has improved recently.  But I
> don't think it's time to make the change to use malloc'ed chunks yet because
> we have to consider all of the operating systems that we and the OpenJDK
> community support.
>

I do not see how you could get CompressedClassPointers to work with native
malloc, though. You would have to be sure that the pointers returned by malloc
are within the numerical range of the 32-bit class pointers. I thought that was
the reason for using a contiguous address range when allocating the compressed
class space.
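
(To illustrate what I mean - this is only a rough sketch, not the actual
HotSpot code, and the base/shift values are made up: a narrow class pointer
is essentially a 32-bit offset from one fixed base address, so every Klass
has to live inside a single contiguous reserved range, which malloc cannot
guarantee.)

  #include <cassert>
  #include <cstdint>

  // Hypothetical values - in reality the VM chooses base and shift at startup.
  static const uintptr_t kClassSpaceBase   = 0x800000000ULL;
  static const int       kNarrowKlassShift = 3;

  // Encoding only works if the Klass lies within (4G << shift) bytes of the
  // base; pointers handed out by malloc carry no such guarantee.
  static uint32_t encode_klass(uintptr_t klass_addr) {
    uintptr_t offset = klass_addr - kClassSpaceBase;
    assert((offset >> kNarrowKlassShift) <= UINT32_MAX &&
           "Klass outside the encodable range");
    return (uint32_t)(offset >> kNarrowKlassShift);
  }

  static uintptr_t decode_klass(uint32_t narrow_klass) {
    return kClassSpaceBase + ((uintptr_t)narrow_klass << kNarrowKlassShift);
  }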


>
> So, yes, I think the JEP looks good and your slides are absolutely
> beautiful.  Everyone should see these slides.
>
>
:) Thanks!

So I will provide a prototype, for now based on JDK 9, and we will see where
we go from there.

Thanks, and Kind Regards,

Thomas


> Thanks,
> Coleen
>
>
>
> On 10/24/16 10:01 AM, Erik Helin wrote:
>
>> On 2016-10-13, Thomas Stüfe wrote:
>>
>>> Hi Erik,
>>>
>>> On Thu, Oct 13, 2016 at 2:15 PM, Erik Helin <erik.helin at oracle.com>
>>> wrote:
>>>
>>>> Hi Thomas,
>>>>
>>>> thanks for submitting the JEP and proposing this feature!
>>>>
>>>> On 2016-10-10, Thomas Stüfe wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> May I please have some feedback for this enhancement proposal?
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8166690
>>>>>
>>>>>
>>>>> In one very short sentence it proposes a better allocation scheme for
>>>>> Metaspace Chunks in order to reduce fragmentation and metaspace waste.
>>>>>
>>>>> I also added a short presentation which describes the problem and how
>>>>> we
>>>>> solved it in our VM port.
>>>>>
>>>>> https://bugs.openjdk.java.net/secure/attachment/63894/Metaspace%20Coalescation%20in%20the%20SAP%20JVM.pdf
>>>>
>>>> Do we really need the flag -XX:+CoalesceMetaspace? Having two different
>>>> ways to handle the chunk free lists in Metaspace is unfortunate; it
>>>> might introduce hard-to-detect bugs and will also require much more
>>>> testing (running lots of tests with the flag both on and off).
>>>>
>>> You are right. If the new allocator works well, there is no reason to keep
>>> the old allocator around.
>>>
>>> We wanted, for a temporary period, to be able to switch between the old and
>>> the new allocator, just to have a fallback if problems occur. But if it
>>> works, it makes sense to have only one allocator; the "CoalesceMetaspace"
>>> flag can be removed, and the code can be made a lot simpler because we do
>>> not need both code paths.
>>>
>> Yeah, I would strongly prefer to not introduce a new flag for this. Have
>> you thought about testing? Do you intend to write new tests to stress
>> the coalescing?
>>
>>>> Do you think your proposed solution has low enough overhead (in terms
>>>> of CPU and memory) to be on "by default"?
>>>>
>>> We decided to switch it on by default in our VM.
>>>
>>> Memory overhead can be almost exactly calculated. Bitmasks take 2 bits per
>>> specialized-chunk-sized area. That means, for a specialized chunk size of 1K
>>> (128 meta words): metaspace size / 8192. So, for 1G of metaspace we pay
>>> 132KB overhead for the bitmasks, or roughly 0.1%.
>>>
>>> There is some CPU overhead, but in my tests I could not measure anything
>>> above noise level.
>>>
>> Those numbers seem low enough to me not to warrant a new flag.
>>
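
(As an aside, here is a rough sketch of what I mean by the bitmasks - this is
illustrative only, not our actual patch, and what the two bits per granule
stand for here, "in use" and "chunk start", is just an assumption made for the
example:)

  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // One instance covers a single virtual space. A granule is one
  // specialized-chunk-sized slice of that space; two bits of bookkeeping
  // are kept per granule.
  class OccupancyMap {
  public:
    OccupancyMap(size_t region_bytes, size_t granule_bytes)
      : _num_granules(region_bytes / granule_bytes),
        _in_use((region_bytes / granule_bytes + 7) / 8, 0),
        _chunk_start((region_bytes / granule_bytes + 7) / 8, 0) {}

    void mark_in_use(size_t g, bool v)      { set_bit(_in_use, g, v); }
    bool is_in_use(size_t g) const          { return get_bit(_in_use, g); }
    void mark_chunk_start(size_t g, bool v) { set_bit(_chunk_start, g, v); }
    bool is_chunk_start(size_t g) const     { return get_bit(_chunk_start, g); }

    // Total bookkeeping: 2 bits per granule.
    size_t footprint_bytes() const { return _in_use.size() + _chunk_start.size(); }

  private:
    static void set_bit(std::vector<uint8_t>& bits, size_t i, bool v) {
      if (v) bits[i / 8] |= (uint8_t)(1u << (i % 8));
      else   bits[i / 8] &= (uint8_t)~(1u << (i % 8));
    }
    static bool get_bit(const std::vector<uint8_t>& bits, size_t i) {
      return (bits[i / 8] >> (i % 8)) & 1u;
    }

    size_t _num_granules;
    std::vector<uint8_t> _in_use;      // one bit per granule
    std::vector<uint8_t> _chunk_start; // one bit per granule
  };

The exact footprint of course depends on how the bits are really stored, but
either way it scales linearly with the reserved space and stays at a small
fraction of a percent.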
>>>> Thanks,
>>>> Erik
>>>>
>>>>
>>> Btw, I understand that it is difficult to estimate this proposal without a
>>> prototype to play around with. As I already mentioned, the patch right now
>>> only exists in our code base and not yet in the OpenJDK. If you guys are
>>> seriously interested in this JEP, I will invest the time to port the patch
>>> to the OpenJDK, so that you can check it out for yourself.
>>>
>> Yes, we are seriously interested :) I think the proposal sounds good. I
>> guess the devil will be in the details, so I (we) would really appreciate it
>> if you would port your internal patch to OpenJDK.
>>
>> Thanks,
>> Erik
>>
>>> Kind Regards, Thomas
>>>
>>>>> Thank you very much!
>>>>>
>>>>> Kind Regards, Thomas
>>>>>
>>>>>
>>>>> On Tue, Sep 27, 2016 at 10:45 AM, Thomas Stüfe <
>>>>> thomas.stuefe at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> please take a look at this Enhancement Proposal for the metaspace
>>>>>> allocator. I hope these are the right groups for this discussion.
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8166690
>>>>>>
>>>>>> Background:
>>>>>>
>>>>>> At SAP we sometimes see OOMs in Metaspace at customer installations
>>>>>> (usually, with compressed class pointers enabled, in the Compressed Class
>>>>>> Space). The VM attempts to allocate metaspace and fails, hitting the
>>>>>> CompressedClassSpaceSize limit. Note that we usually set the limit lower
>>>>>> than the default, typically at 256M.
>>>>>>
>>>>>> When analyzing, we observed that a large part of the metaspace is indeed
>>>>>> free but "locked in" into metaspace chunks of the wrong size: often we
>>>>>> would find a lot of free small chunks, but the allocation request was for
>>>>>> medium chunks, and failed.
>>>>>>
>>>>>> The reason was that at some point in time a lot of class loaders were
>>>>>> alive, each with only a few small classes loaded. This would lead to the
>>>>>> metaspace being swamped with lots of small chunks. This is because each
>>>>>> SpaceManager first allocates small chunks and only switches to larger
>>>>>> chunks after a certain number of allocation requests.
>>>>>>
>>>>>> These small chunks are free and wait in the freelist, but cannot be reused
>>>>>> for allocation requests which require larger chunks, even if they are
>>>>>> physically adjacent in the virtual space.
>>>>>>
>>>>>> We (at SAP) added a patch which allows on-the-fly metaspace chunk merging
>>>>>> - to merge multiple adjacent smaller chunks to form a larger chunk. This,
>>>>>> in combination with the reverse direction - splitting a large chunk to get
>>>>>> smaller chunks - partly negates the "chunks-are-locked-into-their-size"
>>>>>> limitation and provides for better reuse of metaspace chunks. It also
>>>>>> provides better defragmentation.
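
(To make the merge step a bit more concrete, a stripped-down sketch - the
names and the granule bookkeeping are invented for illustration; the real
patch of course operates on the actual chunk and freelist structures:)

  #include <cstddef>
  #include <vector>

  // One granule = one specialized-chunk-sized slice of a virtual space.
  // true  = the slice belongs to a chunk sitting in a freelist,
  // false = the slice belongs to a chunk that is in use.
  typedef std::vector<bool> FreeMap;

  // Can the smaller free chunks covering granules [start, start + len) be
  // merged into one larger chunk? 'start' must be aligned to the larger
  // chunk size, and a single in-use chunk anywhere in the range blocks the
  // merge.
  bool can_merge(const FreeMap& map, size_t start, size_t len, size_t alignment) {
    if (start % alignment != 0 || start + len > map.size()) {
      return false;
    }
    for (size_t g = start; g < start + len; g++) {
      if (!map[g]) {
        return false;   // an in-use chunk lies in the middle of the range
      }
    }
    return true;        // remove the covered chunks from their freelists and
                        // hand out one chunk spanning the whole range instead
  }

  // Splitting is the reverse walk: take one large free chunk, carve it into
  // several smaller chunks, and put those back onto the smaller freelists.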
>>>>>>
>>>>>> I discussed this fix off-list with Coleen Phillimore and Jon Masamitsu,
>>>>>> and instead of just offering this as a fix, both recommended opening a JEP
>>>>>> for this, because its scope would be beyond that of a simple fix.
>>>>>>
>>>>>> So here is my first JEP :) I hope it follows the right form. Please, if
>>>>>> you have time, take a look and tell us what you think.
>>>>>>
>>>>>> Thank you, and Kind Regards,
>>>>>>
>>>>>> Thomas Stüfe
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>

