Request for review: JDK-8009561 NPG: Metaspace fragmentation when retiring a Metachunk
Srinivas Ramakrishna
ysr1729 at gmail.com
Tue Jul 30 14:45:38 UTC 2013
Ah, I see. Thanks Mikael, that approach sounds good then.
-- ramki
On Tue, Jul 30, 2013 at 1:49 AM, Mikael Gerdin <mikael.gerdin at oracle.com> wrote:
> Ramki,
>
>
> On 2013-07-29 22:13, Srinivas Ramakrishna wrote:
>>
>> Just an off-the-cuff, quite tangential remark...
>>
>> Haven't been following these discussions closely, but it would be
>> interesting to record the allocation patterns and determine if an
>> approach like having a big bag of pages might be more efficient for
>> allocation and result in less fragmentation. (This is a general issue
>> that might help CMS allocation as well.) It might be a good summer
>> intern project if, as is usually the case, folks are too busy with
>> normal work to go off on a research expedition for 2 months ;-)
>
>
> Indeed, this is more or less what the normal allocation path for Metaspace
> is: it does pointer-bumping on per-classloader chunks of memory.
> However, since we must support deallocation of arbitrary memory due to class
> redefinition and other monstrosities, we also have a free list to allocate
> from in order to reuse fragmented memory.
>
> /Mikael
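
A self-contained sketch of the shape described above (invented names and a
std::multimap stand-in for the dictionary, not the actual metaspace.cpp code):
pointer-bump allocation out of a per-classloader chunk, with a free list
consulted first so that memory freed by class redefinition can be reused.

  #include <cstddef>
  #include <map>

  struct LoaderChunkAllocator {
    char* _top = nullptr;                 // next free byte in the current chunk
    char* _end = nullptr;                 // end of the current chunk
    std::multimap<size_t, char*> _freed;  // size -> freed block (the "free list")

    char* allocate(size_t bytes) {
      // 1. Reuse a freed block of at least the requested size, if any.
      auto it = _freed.lower_bound(bytes);
      if (it != _freed.end()) {
        char* block = it->second;
        _freed.erase(it);
        return block;                     // any slack is the subject of this thread
      }
      // 2. Otherwise bump the pointer in the current chunk.
      if (_top != nullptr && bytes <= static_cast<size_t>(_end - _top)) {
        char* result = _top;
        _top += bytes;
        return result;
      }
      return nullptr;                     // caller would retire this chunk and get a new one
    }

    void deallocate(char* p, size_t bytes) {
      _freed.emplace(bytes, p);           // e.g. freed by class redefinition
    }
  };
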
>
>
>>
>> One could perhaps get a good idea even just based on an
>> instrumentation trace of births and deaths, and a simulation based on
>> such traces from real programs running on CMS and NPG.
>>
>> -- ramki
>>
>> On Mon, Jul 29, 2013 at 11:34 AM, Jon Masamitsu
>> <jon.masamitsu at oracle.com> wrote:
>>>
>>> Mikael,
>>>
>>> I get your point now about possible loss of the "slack".
>>>
>>> Jon
>>>
>>>
>>> On 7/28/13 5:50 AM, Mikael Gerdin wrote:
>>>>
>>>>
>>>> On 07/26/2013 10:29 PM, Jon Masamitsu wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 7/26/2013 6:22 AM, Mikael Gerdin wrote:
>>>>>>
>>>>>>
>>>>>> Jon,
>>>>>>
>>>>>> On 06/12/2013 12:51 AM, Jon Masamitsu wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 6/11/13 2:46 PM, Mikael Gerdin wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Jon
>>>>>>>>
>>>>>>>> On 06/07/2013 07:36 PM, Jon Masamitsu wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 6/7/2013 8:28 AM, Mikael Gerdin wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Jon,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2013-06-06 16:50, Jon Masamitsu wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Mikael,
>>>>>>>>>>>
>>>>>>>>>>> Thanks. I'd be interested in seeing the instrumentation you
>>>>>>>>>>> add. Might be worth adding as an enhancement in a later
>>>>>>>>>>> changeset.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I did a 1hr KS run today with and without block splitting; here's
>>>>>>>>>> what I came up with (in an entirely non-scientific way):
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/splitting.txt
>>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/splitting.png
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Good graphs.
>>>>>>>>>
>>>>>>>>> The behavior is what we expect (I think). With splitting we are able
>>>>>>>>> to do more small allocations from the dictionary (where we split a
>>>>>>>>> larger block to get a smaller block) and get fewer larger blocks
>>>>>>>>> allocated (some have been split).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> We hit the HWM 4 times with splitting and 5 times without
>>>>>>>>>> splitting.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Because we don't have to expand (get new chunks) as often, which is
>>>>>>>>> good, I would surmise.
>>>>>>>>>
>>>>>>>>>> On the other hand: splitting did leave us with more metaspace
>>>>>>>>>> memory committed in the end.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> One explanation would be that allocations of larger blocks need to
>>>>>>>>> come out of newly committed space instead of the dictionary (where
>>>>>>>>> the large blocks have been broken up).
>>>>>>>>>
>>>>>>>>> Is there a policy that we could use that says
>>>>>>>>>
>>>>>>>>> "break up a larger block for a smaller block allocation only if ..."
>>>>>>>>>
>>>>>>>>> You fill in the blank?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ...only if the larger block is less than 4 times larger than the
>>>>>>>> allocation? 2 times? 8 times?
>>>>>>>>
>>>>>>>> I could try some more KS runs but I'm unsure if the figures I come up
>>>>>>>> with are actually relevant.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I also don't know if more KS runs would be relevant. Can you ask the
>>>>>>> dictionary how many blocks there are of the size you're going to
>>>>>>> split? If we only split if there are more than 4 blocks of that size,
>>>>>>> that would moderate the splitting a bit.
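
A self-contained sketch of the moderated splitting policy Jon suggests here,
with a std::multimap standing in for the block dictionary (invented code, not
the actual HotSpot freelist API): split a larger block for a small request
only when the dictionary already holds more than 4 blocks of that larger size.

  #include <cstddef>
  #include <map>

  using BlockDictionary = std::multimap<size_t, char*>;  // block size -> block start

  char* allocate_from_dictionary(BlockDictionary& dict, size_t request) {
    // Exact fit first: no splitting, no slack.
    auto exact = dict.find(request);
    if (exact != dict.end()) {
      char* block = exact->second;
      dict.erase(exact);
      return block;
    }
    // Otherwise consider splitting the smallest block that is large enough,
    // but only if blocks of that size are plentiful (more than 4 of them).
    auto larger = dict.lower_bound(request);
    if (larger != dict.end() && dict.count(larger->first) > 4) {
      size_t found_size = larger->first;
      char* block = larger->second;
      dict.erase(larger);
      // Return the tail (the slack) to the dictionary so it is not lost.
      // A real policy would also require the tail to be big enough to track.
      dict.emplace(found_size - request, block + request);
      return block;
    }
    return nullptr;  // fall back to bump allocation from the current chunk
  }

The ratio policy mentioned earlier in the thread (only split when the found
block is less than, say, 4 times the request) would just change the condition
in the second branch.
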
>>>>>>
>>>>>>
>>>>>>
>>>>>> I thought about this again when I saw Fred's patch, and I think that
>>>>>> not splitting blocks won't help us here.
>>>>>> As Fred noticed, when we deallocate we deallocate based on the known
>>>>>> size of an object. So even if I don't split in the allocation path, the
>>>>>> rest of the Metablock returned from the freelist is wasted, because when
>>>>>> it is deallocated the deallocation path has no knowledge of the size of
>>>>>> the block originally returned from the block freelist, right?
>>>>>
>>>>>
>>>>>
>>>>> When you get a Metablock from the freelist, you know its size
>>>>> (accurately after Fred's change).
>>>>> Then you can calculate the size of the Metablock that is being put back
>>>>> on the freelist.
>>>>> I don't see a problem.
>>>>
>>>>
>>>>
>>>> Exactly, but if we switch to Dither::atLeast and _don't_ split the block,
>>>> the caller of allocate() won't know that the allocation size was actually
>>>> larger than requested. And since the deallocation path calculates the size
>>>> to deallocate based on the size of the object, we would waste the "slack"
>>>> if we don't put it back into the dictionary.
>>>>
>>>> Let's say we get an allocation request for 8 words and the dictionary
>>>> contains a 12-word block:
>>>>
>>>>     v = allocation request (8 words)
>>>> |========----|
>>>>          ^ = slack which could be returned to the block dictionary
>>>>
>>>> Since the allocation request was for 8 words and the deallocation path
>>>> uses the same size calculation, the 4 words will be wasted.
>>>>
>>>> /Mikael
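
A self-contained walk-through of that 8-word / 12-word example (sizes in
words, with a std::multimap and integer handles standing in for the block
dictionary; not the actual FreeBlockDictionary code), showing where the 4
words of slack end up:

  #include <cassert>
  #include <cstddef>
  #include <map>

  int main() {
    std::multimap<size_t, int> dict;   // block size in words -> opaque block handle
    dict.emplace(12, 1);               // one 12-word block on the freelist

    const size_t request = 8;

    // Dither::atLeast-style lookup: smallest block of at least 8 words.
    auto it = dict.lower_bound(request);
    size_t found_size = it->first;     // 12
    dict.erase(it);

    size_t slack = found_size - request;  // 4 words

    // No split: the caller only knows it asked for 8 words, so a later
    // deallocate(obj, 8) hands back 8 words and the 4-word slack is lost.
    //
    // Split: return the 4-word tail to the dictionary immediately, so the
    // caller's 8-word view matches what it actually received.
    dict.emplace(slack, 2);

    assert(slack == 4 && dict.count(4) == 1);
    return 0;
  }
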
>>>>
>>>>>
>>>>> Jon
>>>>>
>>>>>>
>>>>>> I should probably redo my runs with Fred's patch applied as well.
>>>>>>
>>>>>> /Mikael
>>>>>>
>>>>>>>
>>>>>>> Jon
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I put up the very simple instrumentation at:
>>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/instrument/webrev
>>>>>>>>>>
>>>>>>>>>> I also changed the allocation_from_dictionary_limit to 4k to force
>>>>>>>>>> us to make more freelist allocations.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Does it really make sense to have any
>>>>>>>>> allocation_from_dictionary_limit?
>>>>>>>>> I know it was initially added because allocation from a freelist
>>>>>>>>> takes longer, but to have a static limit like that just seems to put
>>>>>>>>> that space forever beyond reach.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I thought you had added the limit. I sort of feel that 64k is a bit
>>>>>>>> much, but the code would definitely be simpler if there was none.
>>>>>>>> We already take the hit of acquiring a Mutex for each Metaspace
>>>>>>>> allocation, so maybe the dictionary lookup isn't that expensive?
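
A sketch of the kind of static gate being discussed, assuming (as the
"forever beyond reach" remark suggests) that the limit is compared against
the dictionary's total size before a lookup is even attempted. Stand-in types
and invented names, not the actual metaspace.cpp code:

  #include <cstddef>
  #include <map>

  using BlockDictionary = std::multimap<size_t, char*>;  // block size -> block start

  // 64k is mentioned above as feeling like "a bit much"; 4k in the instrumented run.
  const size_t allocation_from_dictionary_limit = 64 * 1024;

  size_t total_size(const BlockDictionary& dict) {
    size_t sum = 0;                    // a real implementation would keep a running total
    for (const auto& entry : dict) sum += entry.first;
    return sum;
  }

  char* allocate_block(BlockDictionary& dict, size_t request) {
    // Searching the dictionary costs more than a pointer bump, so only bother
    // once the dictionary has grown past the limit. Anything sitting in a
    // dictionary that never reaches the limit is effectively out of reach.
    if (total_size(dict) > allocation_from_dictionary_limit) {
      auto it = dict.lower_bound(request);
      if (it != dict.end()) {
        char* block = it->second;
        dict.erase(it);
        return block;
      }
    }
    return nullptr;  // caller falls back to bump allocation from the chunk
  }
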
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks for the numbers.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> You're welcome.
>>>>>>>>
>>>>>>>> /Mikael
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Jon
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> /Mikael
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Jon
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 6/6/13 2:22 AM, Mikael Gerdin wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Jon,
>>>>>>>>>>>>
>>>>>>>>>>>> On 2013-06-06 04:41, Jon Masamitsu wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 6/5/2013 7:04 AM, Mikael Gerdin wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can I have some reviews of this small fix to the Metaspace
>>>>>>>>>>>>>> memory allocation path?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Problem:
>>>>>>>>>>>>>> When a Metaspace allocation request cannot be satisfied by the
>>>>>>>>>>>>>> current chunk, the chunk is retired and a new chunk is requested.
>>>>>>>>>>>>>> This causes whatever is left in the chunk to be effectively
>>>>>>>>>>>>>> leaked.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Suggested fix:
>>>>>>>>>>>>>> Put the remaining memory in each chunk on the Metablock freelist
>>>>>>>>>>>>>> so it can be used to satisfy future allocations.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Possible addition:
>>>>>>>>>>>>>> When allocating from the block free list, use
>>>>>>>>>>>>>> FreeBlockDictionary<Metablock>::atLeast instead of
>>>>>>>>>>>>>> FreeBlockDictionary<Metablock>::exactly and split the Metablock
>>>>>>>>>>>>>> if it's large enough.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> One might argue that this increases the fragmentation of the
>>>>>>>>>>>>>> memory on the block free list, but I think that we primarily
>>>>>>>>>>>>>> want to use the block free list for small allocations and
>>>>>>>>>>>>>> allocate from chunks for large allocations.
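
A self-contained sketch of the suggested fix (stand-in types and names, not
the actual webrev): when the current chunk cannot satisfy a request and is
about to be retired, its unused tail is handed to the block freelist instead
of being leaked.

  #include <cstddef>
  #include <map>

  using BlockDictionary = std::multimap<size_t, char*>;  // block size -> block start

  struct Chunk {
    char* _top;  // next free byte
    char* _end;  // end of the chunk
    size_t free_bytes() const { return static_cast<size_t>(_end - _top); }
  };

  // min_block_size: smallest block worth tracking on the freelist (assumed).
  void retire_chunk(Chunk& chunk, BlockDictionary& freelist, size_t min_block_size) {
    size_t remaining = chunk.free_bytes();
    if (remaining >= min_block_size) {
      // Instead of leaking the tail, make it allocatable again.
      freelist.emplace(remaining, chunk._top);
      chunk._top = chunk._end;  // the chunk is now fully used
    }
    // ...the caller then requests a new chunk for the pending allocation,
    // as it did before the fix.
  }
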
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>>> Only fix:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/webrev.0/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The "Only fix" looks good. Did you test with
>>>>>>>>>>>>> metaspace_slow_verify=true?
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Incremental webrev for splitting blocks:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/webrev.0%2b/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Change looks good.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Did you do any long-running tests with the block splitting? Such
>>>>>>>>>>>>> as 24 hours with kitchensink? Something that would reuse
>>>>>>>>>>>>> Metablocks so that we can see if we are fragmenting instead of
>>>>>>>>>>>>> reusing?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I did some runs earlier but I don't have any data from them.
>>>>>>>>>>>> I can try to get an instrumented build together and run KS over
>>>>>>>>>>>> the
>>>>>>>>>>>> weekend.
>>>>>>>>>>>>
>>>>>>>>>>>> /Mikael
>>>>>>>>>>>>
>>>>>>>>>>>>> Jon
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Bug links:
>>>>>>>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-8009561
>>>>>>>>>>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8009561
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> /Mikael
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>