Request for review: JDK-8009561 NPG: Metaspace fragmentation when retiring a Metachunk

Mikael Gerdin mikael.gerdin at oracle.com
Tue Jul 30 08:49:57 UTC 2013


Ramki,

On 2013-07-29 22:13, Srinivas Ramakrishna wrote:
> Just an off-the-cuff, quite tangential remark...
>
> Haven't been following these discussions closely, but it would be
> interesting to record the allocation patterns and determine if an
> approach like having a big bag of pages might be more efficient for
> allocation and result in less fragmentation. (This is a general issue
> that might help CMS allocation as well.) It might be a good summer
> intern project if, as is usually the case, folks are too busy with
> normal work to go off on a research expedition for 2 months ;-)

Indeed, this is more or less what the normal allocation path for 
Metaspace is: it does pointer-bumping on per-classloader chunks of memory.
However, since we must support deallocation of arbitrary memory due to 
class redefinition and other monstrosities, we also keep a free list to 
allocate from in order to reuse fragmented memory.
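A minimal C++ sketch of the two allocation paths described above; the names (SpaceManager, Chunk, a plain vector standing in for the block dictionary) are invented for illustration and are not the actual HotSpot code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Chunk {
  char* top;   // current bump pointer
  char* end;   // end of the chunk
};

struct FreeBlock {
  size_t word_size;
  void*  mem;
};

struct SpaceManager {
  Chunk                  current;
  std::vector<FreeBlock> free_blocks;  // stand-in for the block dictionary

  // Try to reuse a previously deallocated block of exactly the right size.
  void* allocate_from_freelist(size_t word_size) {
    for (size_t i = 0; i < free_blocks.size(); i++) {
      if (free_blocks[i].word_size == word_size) {  // "exactly" lookup
        void* mem = free_blocks[i].mem;
        free_blocks.erase(free_blocks.begin() + i);
        return mem;
      }
    }
    return nullptr;
  }

  // Fast path: bump the pointer in the current per-classloader chunk.
  void* allocate(size_t word_size) {
    if (void* mem = allocate_from_freelist(word_size)) return mem;
    size_t bytes = word_size * sizeof(void*);
    if (current.top + bytes <= current.end) {
      void* mem = current.top;
      current.top += bytes;
      return mem;
    }
    return nullptr;  // would retire the chunk and request a new one
  }

  // Class redefinition etc. can hand back arbitrary blocks.
  void deallocate(void* mem, size_t word_size) {
    free_blocks.push_back(FreeBlock{word_size, mem});
  }
};
```

The point of the free list is visible in use: a deallocated block is handed out again before the bump pointer moves.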

/Mikael

>
> One could perhaps get a good idea even just based on an
> instrumentation trace of births and deaths, and a simulation based on
> such traces from real programs running on CMS and NPG.
>
> -- ramki
>
> On Mon, Jul 29, 2013 at 11:34 AM, Jon Masamitsu
> <jon.masamitsu at oracle.com> wrote:
>> Mikael,
>>
>> I get your point now about possible loss of the "slack".
>>
>> Jon
>>
>>
>> On 7/28/13 5:50 AM, Mikael Gerdin wrote:
>>>
>>> On 07/26/2013 10:29 PM, Jon Masamitsu wrote:
>>>>
>>>>
>>>> On 7/26/2013 6:22 AM, Mikael Gerdin wrote:
>>>>>
>>>>> Jon,
>>>>>
>>>>> On 06/12/2013 12:51 AM, Jon Masamitsu wrote:
>>>>>>
>>>>>>
>>>>>> On 6/11/13 2:46 PM, Mikael Gerdin wrote:
>>>>>>>
>>>>>>> Jon
>>>>>>>
>>>>>>> On 06/07/2013 07:36 PM, Jon Masamitsu wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/7/2013 8:28 AM, Mikael Gerdin wrote:
>>>>>>>>>
>>>>>>>>> Jon,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2013-06-06 16:50, Jon Masamitsu wrote:
>>>>>>>>>>
>>>>>>>>>> Mikael,
>>>>>>>>>>
>>>>>>>>>> Thanks.  I'd be interested in seeing the instrumentation you
>>>>>>>>>> add.  Might be worth adding as an enhancement in a later
>>>>>>>>>> changeset.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I did a 1hr KS run today with and without block splitting, here's
>>>>>>>>> what
>>>>>>>>> I came up with (in an entirely non-scientific way)
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/splitting.txt
>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/splitting.png
>>>>>>>>
>>>>>>>>
>>>>>>>> Good graphs.
>>>>>>>>
>>>>>>>> The behavior is what we expect (I think). With splitting we are
>>>>>>>> able to do more small allocations from the dictionary (where we
>>>>>>>> split a larger block to get a smaller block) and fewer large
>>>>>>>> blocks are allocated (some have been split).
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> We hit the HWM 4 times with splitting and 5 times without splitting.
>>>>>>>>
>>>>>>>>
>>>>>>>> Because we don't have to expand (get new chunks) as often, which
>>>>>>>> is good, I would surmise.
>>>>>>>>
>>>>>>>>> On the other hand: splitting did leave us with more metaspace
>>>>>>>>> memory committed in the end.
>>>>>>>>
>>>>>>>>
>>>>>>>> One explanation would be that allocations of larger blocks need
>>>>>>>> to come out of newly committed space instead of the dictionary
>>>>>>>> (where the large blocks have been broken up).
>>>>>>>>
>>>>>>>> Is there a policy that we could use that says
>>>>>>>>
>>>>>>>> "break up a larger block for a smaller block allocation only if ..."
>>>>>>>>
>>>>>>>> You fill in the blank?
>>>>>>>
>>>>>>>
>>>>>>> ...only if the larger block is less than 4 times larger than the
>>>>>>> allocation? 2 times? 8 times?
>>>>>>>
>>>>>>> I could try some more KS runs but I'm unsure if the figures I come up
>>>>>>> with are actually relevant.
>>>>>>
>>>>>>
>>>>>> I also don't know if more KS runs would be relevant. Can you ask
>>>>>> the dictionary how many blocks there are of the size you're going
>>>>>> to split? If we only split when there are more than 4 blocks of
>>>>>> that size, that would moderate the splitting a bit.
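The two heuristics floated here (Mikael's size-ratio bound and Jon's block-count threshold) could be combined into one predicate. A hypothetical sketch with both thresholds set to the 4 mentioned in the thread; BlockDictionary and should_split are invented names, not HotSpot code:

```cpp
#include <cassert>
#include <cstddef>
#include <map>

struct BlockDictionary {
  std::map<size_t, size_t> count_by_size;  // block size -> number of blocks

  size_t count(size_t word_size) const {
    std::map<size_t, size_t>::const_iterator it = count_by_size.find(word_size);
    return it == count_by_size.end() ? 0 : it->second;
  }
};

// Decide whether a found block of found_size may be split to satisfy a
// request of request_size. Both thresholds are the "4" from the thread.
bool should_split(const BlockDictionary& dict,
                  size_t request_size, size_t found_size) {
  const size_t max_ratio = 4;  // Mikael: "...4 times larger? 2 times? 8 times?"
  const size_t min_count = 4;  // Jon: "more than 4 blocks of that size"
  if (found_size <= request_size) return false;            // nothing to split off
  if (found_size > request_size * max_ratio) return false; // too large: keep whole
  return dict.count(found_size) > min_count;               // enough spares to split
}
```

With this shape, large blocks are preserved for large requests unless the dictionary holds plenty of them.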
>>>>>
>>>>>
>>>>> I thought about this again, and when I saw Fred's patch I realized
>>>>> that not splitting blocks won't help us here.
>>>>> As Fred noticed, when we deallocate we do so based on the known size
>>>>> of an object. So even if I don't split in the allocation path, the
>>>>> rest of the Metablock returned from the freelist is wasted: if it is
>>>>> deallocated, the deallocation path has no knowledge of the size of
>>>>> the block originally returned from the block freelist, right?
>>>>
>>>>
>>>> When you get a Metablock from the freelist, you know its size
>>>> (accurately after Fred's change).
>>>> Then you can calculate the size of the Metablock that is being put back
>>>> on the freelist.
>>>> I don't see a problem.
>>>
>>>
>>> Exactly, but if we switch to Dither::atLeast and _don't_ split the block,
>>> the caller of allocate() won't know that the allocation size was actually
>>> larger than requested. And since the deallocation path calculates the
>>> size to deallocate based on the size of the object, we would waste the
>>> "slack" if we don't put it back into the dictionary.
>>>
>>> Let's say we get an allocation request for 8 words and the dictionary
>>> contains a 12-word block:
>>>
>>>      v= allocation request
>>> |========----|
>>>             ^= slack which could be returned to the block dictionary.
>>>
>>> Since the allocation request was for 8 words and the deallocation path
>>> uses the same size calculation the 4 words will be wasted.
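The waste in the diagram can be made concrete with a toy accounting model (invented names, not HotSpot code): without splitting, the slack is lost forever because deallocate() is sized from the object; with splitting, the slack goes back to the dictionary at allocation time:

```cpp
#include <cassert>
#include <cstddef>

struct Dictionary {
  size_t free_words;

  // atLeast lookup: the whole found block leaves the dictionary.
  // Returns the size the *caller* will believe it owns (the request).
  size_t allocate_at_least(size_t request, size_t found_block, bool split) {
    free_words -= found_block;
    if (split) {
      free_words += found_block - request;  // slack returned immediately
    }
    return request;  // caller only ever records the requested size
  }

  // Deallocation is sized from the object, not the original block.
  void deallocate(size_t object_size) {
    free_words += object_size;
  }
};
```

For the 8-from-12 scenario above, the non-splitting path permanently loses the 4-word slack, while the splitting path ends up back at 12 free words.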
>>>
>>> /Mikael
>>>
>>>>
>>>> Jon
>>>>
>>>>>
>>>>> I should probably redo my runs with Fred's patch applied as well.
>>>>>
>>>>> /Mikael
>>>>>
>>>>>>
>>>>>> Jon
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I put up the very simple instrumentation at:
>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/instrument/webrev
>>>>>>>>>
>>>>>>>>> I also changed the allocation_from_dictionary_limit to 4k to force
>>>>>>>>> us
>>>>>>>>> to make more freelist allocations.
>>>>>>>>
>>>>>>>>
>>>>>>>> Does it really make sense to have any
>>>>>>>> allocation_from_dictionary_limit? I know it was initially added
>>>>>>>> because allocation from a freelist takes longer, but to have a
>>>>>>>> static limit like that just seems to put that space forever
>>>>>>>> beyond reach.
>>>>>>>
>>>>>>>
>>>>>>> I thought you had added the limit. I sort of feel that 64k is a bit
>>>>>>> much but the code would definitely be simpler if there was none.
>>>>>>> We already take the hit of acquiring a Mutex for each Metaspace
>>>>>>> allocation so maybe the dictionary lookup isn't that expensive?
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for the numbers.
>>>>>>>
>>>>>>>
>>>>>>> You're welcome.
>>>>>>>
>>>>>>> /Mikael
>>>>>>>
>>>>>>>>
>>>>>>>> Jon
>>>>>>>>
>>>>>>>>>
>>>>>>>>> /Mikael
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Jon
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/6/13 2:22 AM, Mikael Gerdin wrote:
>>>>>>>>>>>
>>>>>>>>>>> Jon,
>>>>>>>>>>>
>>>>>>>>>>> On 2013-06-06 04:41, Jon Masamitsu wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 6/5/2013 7:04 AM, Mikael Gerdin wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can I have some reviews of this small fix to the Metaspace
>>>>>>>>>>>>> memory
>>>>>>>>>>>>> allocation path.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Problem:
>>>>>>>>>>>>> When a Metaspace allocation request cannot be satisfied by the
>>>>>>>>>>>>> current
>>>>>>>>>>>>> chunk the chunk is retired and a new chunk is requested. This
>>>>>>>>>>>>> causes
>>>>>>>>>>>>> whatever is left in the chunk to be effectively leaked.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Suggested fix:
>>>>>>>>>>>>> Put the remaining memory in each chunk on the Metablock freelist
>>>>>>>>>>>>> so it
>>>>>>>>>>>>> can be used to satisfy future allocations.
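A minimal sketch of the suggested fix, under assumed names (retire_chunk, BlockFreelist; this is not the webrev's code): before a chunk that cannot satisfy the next request is retired, its unused tail is deposited on the block free list instead of being leaked:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct BlockFreelist {
  std::vector<size_t> block_sizes;  // sizes of blocks available for reuse
  size_t total_words = 0;

  void return_block(size_t word_size) {
    block_sizes.push_back(word_size);
    total_words += word_size;
  }
};

struct Chunk {
  size_t capacity_words;
  size_t used_words;

  size_t free_words() const { return capacity_words - used_words; }
};

// Assumed minimum block size worth tracking; tinier tails are still dropped.
const size_t small_block_min_size = 2;

void retire_chunk(Chunk& c, BlockFreelist& fl) {
  size_t remaining = c.free_words();
  if (remaining >= small_block_min_size) {
    fl.return_block(remaining);     // the fix: the remainder stays reusable
  }
  c.used_words = c.capacity_words;  // chunk is now fully retired
}
```

Under this scheme the tail of every retired chunk can satisfy future small allocations rather than being lost until the chunk itself is freed.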
>>>>>>>>>>>>>
>>>>>>>>>>>>> Possible addition:
>>>>>>>>>>>>> When allocating from the block free list, use
>>>>>>>>>>>>> FreeBlockDictionary<Metablock>::atLeast instead of
>>>>>>>>>>>>> FreeBlockDictionary<Metablock>::exactly and split the
>>>>>>>>>>>>> Metablock if
>>>>>>>>>>>>> it's large enough.
>>>>>>>>>>>>>
>>>>>>>>>>>>> One might argue that this increases the fragmentation of the
>>>>>>>>>>>>> memory on
>>>>>>>>>>>>> the block free list but I think that we primarily want to use
>>>>>>>>>>>>> the
>>>>>>>>>>>>> block free list for small allocations and allocate from chunks
>>>>>>>>>>>>> for
>>>>>>>>>>>>> large allocations.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Webrev:
>>>>>>>>>>>>> Only fix:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/webrev.0/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The "Only fix" looks good.  Did you test with
>>>>>>>>>>>> metaspace_slow_verify=true?
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Incremental webrev for splitting blocks:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~mgerdin/8009561/webrev.0%2b/
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Change looks good.
>>>>>>>>>>>>
>>>>>>>>>>>> Did you do any long running tests with the block splitting?
>>>>>>>>>>>> Such as
>>>>>>>>>>>> 24hours with kitchensink?  Something that would reuse Metablocks
>>>>>>>>>>>> so that we can see if we are fragmenting instead of reusing?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I did some runs earlier but I don't have any data from them.
>>>>>>>>>>> I can try to get an instrumented build together and run KS over
>>>>>>>>>>> the
>>>>>>>>>>> weekend.
>>>>>>>>>>>
>>>>>>>>>>> /Mikael
>>>>>>>>>>>
>>>>>>>>>>>> Jon
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bug links:
>>>>>>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-8009561
>>>>>>>>>>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8009561
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> /Mikael
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>



More information about the hotspot-gc-dev mailing list