RFR (M): 8006952: Slow VM due to excessive code cache freelist iteration

Tue Apr 16 13:44:38 PDT 2013

On 2013-04-16 20:39, Vladimir Kozlov wrote:
> I asked about changes in CompileBroker::compiler_thread_loop():
>
> 1584       if (CodeCache::unallocated_capacity() < 
> CodeCacheMinimumFreeSpace) {
> 1585         // the code cache is really full
> 1586         handle_full_code_cache();
>
> It is executed before we compile a method.
>
> For example, if we have a hot method at the end of codecache which is 
> used then the check above will prevent us from compiling any methods 
> even when codecache has enough space in the middle (a block on 
> free_list). Do I get it right?

No, the space on the free list is included in 
CodeCache::unallocated_capacity().

But we might have 1500k of really fragmented space on the free list, and 
nothing but the adapter/stub-reserved space in the heap. Then compiles 
will pass the check at 1584 and 1587, but later fail on 
bufferblob/nmethod allocation. But that case is handled by calls too 
handle_full_code_cache() at the allocation sites.

//N

>
> Thanks,
> Vladimir
>
> On 4/16/13 11:18 AM, Nils Eliasson wrote:
>> Yes.
>>
>> So its pretty much like it was before:
>>
>> 1) Try to allocate from the free list using best fit
>> 2) Allocate a new chunk from the code cache heap (always increasing,
>> reclaimed memory always goes to the free list)
>>
>> With the addition that for both 1 and 2, stubs and adapters have a
>> strict monopoly on the last 500k of address space. But they are
>> themselves free to be placed wherever possible, and will essentially
>> only use the last part as a last resort.
>>
>> I belive this will allow us to decrease the CodeCacheMinimumThreshold
>> since we now have strong guarantees that nmethods will not eat up or
>> fragment the last part. But that is for another change.
>>
>> Thanks for your time!
>> //Nils
>>
>>
>>
>>
>> On 2013-04-16 17:00, Vladimir Kozlov wrote:
>>> Looks good.
>>>
>>> Do I understand correctly that now we bailout from in compileBroker
>>> conservatively if no enough space at the end of code cache? But in a
>>> middle we could have a big free block. Right?
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 4/16/13 3:16 AM, Nils Eliasson wrote:
>>>> Latest webrev: http://cr.openjdk.java.net/~neliasso/8006952/webrev.09/
>>>> <http://cr.openjdk.java.net/%7Eneliasso/8006952/webrev.09/>
>>>>
>>>> Updates after comments from Vladimir K. and Niclas.
>>>>
>>>> - Removed 'red_zone' naming and refined comments
>>>> - Refined naming in heap.cpp/hpp on conversion methods
>>>> size(...)/number_of_segments(...) to
>>>> size_to_segments(...)/segments_to_size(...)
>>>> - Bool flags as verbs (critical -> is_critical)
>>>> - Removed get_largest_block() and all its uses.
>>>>
>>>> Enjoy!
>>>>
>>>> //Nils
>>>>
>>>> On 2013-04-15 12:36, Nils Eliasson wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 2013-04-12 19:21, Vladimir Kozlov wrote:
>>>>>> Nils,
>>>>>>
>>>>>> It is sad that you have to reverse parfait fix but I agree that your
>>>>>> changes are more clean.
>>>>>
>>>>> Please - don't be! I taken extra precaution and have verified that
>>>>> there is a null check between the call to new and the constructor.
>>>>> This is consistent with both the C++ spec and the behaviour we 
>>>>> rely on
>>>>> when running out of code cache. The CodeCache::has_space would have
>>>>> forced me to do it more complex, it would have to copy/wrap the
>>>>> heap::allocate algorithm but returning a bool instead of actually
>>>>> allocating.
>>>>>
>>>>> Sorry for forgetting to add it to the change log in my request. I did
>>>>> not intend to revert it without being open about it.
>>>>>
>>>>> I don't understand why Parfait is confused about 'this' being null in
>>>>> the constructor. I will open a bug on Parfait if I can find where to
>>>>> submit it.
>>>>>
>>>>>>
>>>>>> After these changes largest_free_block() is used only in statistics
>>>>>> output which is useless since it is not participating in codecache
>>>>>> allocations.  May be we should remove it and use
>>>>>> heap_unallocated_capacity() instead (do we need lock then?):
>>>>>
>>>>> Yes, good idea. Must check that I don't break any log tool if I 
>>>>> change
>>>>> the output though.
>>>>>
>>>>>>
>>>>>> size_t CodeCache::largest_free_block() {
>>>>>>   return _heap->heap_unallocated_capacity();
>>>>>> }
>>>>>>
>>>>>> Also red_zone_start() is confusing because it does not show how 
>>>>>> it is
>>>>>> related to heap_unallocated_capacity(). In one place you use one:
>>>>>>
>>>>>>  211     if (size(length) > (heap_unallocated_capacity() -
>>>>>> CodeCacheMinimumFreeSpace)) {
>>>>>>
>>>>>> and in an other use second:
>>>>>>
>>>>>>  437     if (((size_t)cur + length) >= (size_t)red_zone_start()) {
>>>>>>
>>>>>> Can we just use only heap_unallocated_capacity()?
>>>>>
>>>>> I agree that it is confusing. The background is that they serve a bit
>>>>> different purpose. In one case we are allocating from the freelist 
>>>>> and
>>>>> need to check against the adress where the 'red zone/critical
>>>>> allocation area begins.' In the other it is a block allocation from
>>>>> the contiguous heap where we know heap_unallocated_capacity is the
>>>>> size of the block at the end of the heap. I'll come up with something
>>>>> better.
>>>>>
>>>>> Thanks for taking your time!
>>>>> //Nils
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 4/12/13 3:19 AM, Nils Eliasson wrote:
>>>>>>> I have now incorporated both Vladimirs and Andreas suggestion:
>>>>>>>
>>>>>>> Changes to previous revision includes:
>>>>>>> 1) Updated the checks in both heap::allocate and
>>>>>>> heap::search_freelist
>>>>>>> so that non critical allocation never stride into the
>>>>>>> CodeCacheMinimumFreeSpace.
>>>>>>> 2) Inverted the critical condition - now all allocations default
>>>>>>> to not
>>>>>>> critical. Only the places where we would go fatal if the allocation
>>>>>>> returns null are critical.
>>>>>>> 3) Refactored some names and removed duplicate methods in heap.cpp.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.08/
>>>>>>> <http://cr.openjdk.java.net/%7Eneliasso/8006952/webrev.08/>
>>>>>>>
>>>>>>> This changes also opens up for decreasing the
>>>>>>> CodeCacheMinimumFreeSpace
>>>>>>> since we now guarantee that only critical allocation will end up
>>>>>>> there.
>>>>>>> And they will not grow to 500K when compilation have stopped 
>>>>>>> anyways.
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> //Nils
>>>>>>>
>>>>>>>
>>>>>>> On 2013-02-08 15:54, Nils Eliasson wrote:
>>>>>>>> Vladimir and Andreas, thanks for your feedback! I am considering
>>>>>>>> your
>>>>>>>> suggestions.
>>>>>>>>
>>>>>>>> My fix is a step in the right direction but I realize it is not
>>>>>>>> enough.
>>>>>>>>
>>>>>>>> Initially, it works good reserving the last part of the heap 
>>>>>>>> block.
>>>>>>>> But if the application has a bigger active code footprint that 
>>>>>>>> what
>>>>>>>> can fit in the code cache, there will be new flushes that will
>>>>>>>> increase fragmentation and slowly reduce the usable part of the 
>>>>>>>> code
>>>>>>>> cache. So I will add functionality for backing off.
>>>>>>>>
>>>>>>>> //Nils
>>>>>>>>
>>>>>>>> Schoesser, Andreas skrev 2013-02-01 10:58:
>>>>>>>>> Hi Nils,
>>>>>>>>>
>>>>>>>>> this is IMO a good idea since we experienced the same problem 
>>>>>>>>> with
>>>>>>>>> one of our ports. I think in our case it was not the nmethod
>>>>>>>>> sweeper
>>>>>>>>> continuously running (there's MinCodeCacheFlushingInterval to 
>>>>>>>>> only
>>>>>>>>> speculatively disconnect nmethods every 30 seconds) but from the
>>>>>>>>> compiler loop constantly polling the freelist without the nmethod
>>>>>>>>> sweeper making any progress.
>>>>>>>>>
>>>>>>>>> You also might consider to propagate the "critical" flag down to
>>>>>>>>> search_freelist(). Then, a simple address compare would do to 
>>>>>>>>> keep
>>>>>>>>> the end of the code cache free from nmethods:
>>>>>>>>> - For non-critical blobs, search_freelist() may only fit free
>>>>>>>>> blocks
>>>>>>>>> < CodeCache::end() - CodeCacheMinimumFreeSpace
>>>>>>>>> - For critical blocks, search_freelist() may fit all free blocks
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Andreas
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: hotspot-compiler-dev-bounces at openjdk.java.net
>>>>>>>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On 
>>>>>>>>> Behalf Of
>>>>>>>>> Nils Eliasson
>>>>>>>>> Sent: Donnerstag, 31. Januar 2013 19:41
>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net
>>>>>>>>> Subject: Re: RFR (M): 8006952: Slow VM due to excessive code 
>>>>>>>>> cache
>>>>>>>>> freelist iteration
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.04/
>>>>>>>>>
>>>>>>>>> //N
>>>>>>>>>
>>>>>>>>> Vladimir Kozlov skrev 2013-01-30 21:16:
>>>>>>>>>> Nils, you sent webrev for other bug fix.
>>>>>>>>>>
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>> On 1/30/13 7:43 AM, Nils Eliasson wrote:
>>>>>>>>>>> Thanks for the link, interesting background.
>>>>>>>>>>>
>>>>>>>>>>> Even if finding the largest block on the freelist is free, we
>>>>>>>>>>> would
>>>>>>>>>>> still have a misbehavior when the free heap block is less than
>>>>>>>>>>> 1,5M
>>>>>>>>>>> and
>>>>>>>>>>> the flushing begins to create a 1,5M freelist block.
>>>>>>>>>>>
>>>>>>>>>>> Unallocated memory is a good measurement if have some 
>>>>>>>>>>> continuous
>>>>>>>>>>> memory
>>>>>>>>>>> in reserve for critical blobs. I am thinking of using the
>>>>>>>>>>> thresholds
>>>>>>>>>>> against unallocated memory, but forbidding nmethods from using
>>>>>>>>>>> the
>>>>>>>>>>> last
>>>>>>>>>>> 0,5M (MinimumFreeSpace) from the heap block. That would prevent
>>>>>>>>>>> the VM
>>>>>>>>>>> from fragmenting the last continuous memory space, and allow 
>>>>>>>>>>> the
>>>>>>>>>>> VM to
>>>>>>>>>>> use all of the freelist.
>>>>>>>>>>>
>>>>>>>>>>> Like this:
>>>>>>>>>>> http://cr.openjdk.java.net/~neliasso/8002364/webrev.03
>>>>>>>>>>>
>>>>>>>>>>> //Nils
>>>>>>>>>>>
>>>>>>>>>>> Vladimir Kozlov skrev 2013-01-29 20:35:
>>>>>>>>>>>> Nils,
>>>>>>>>>>>>
>>>>>>>>>>>> You are reversing 7025742 fix. I would prefer to keep track 
>>>>>>>>>>>> of a
>>>>>>>>>>>> largest free block during allocation in CodeCache (by using
>>>>>>>>>>>> ordered
>>>>>>>>>>>> list or other technique) to avoid scanning the list.
>>>>>>>>>>>>
>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>
>>>>>>>>>>>> On 1/29/13 2:56 AM, Nils Eliasson wrote:
>>>>>>>>>>>>> Remove continuous free block requirement for code cache
>>>>>>>>>>>>> flushing and
>>>>>>>>>>>>> minimum free space.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This causes a degenerate behavior when the VM are repeatedly
>>>>>>>>>>>>> flushing
>>>>>>>>>>>>> the code cache trying to free up a continuous 1,5M block.
>>>>>>>>>>>>> Since the
>>>>>>>>>>>>> freelist is fragmented a significant part of the code 
>>>>>>>>>>>>> cache is
>>>>>>>>>>>>> kept
>>>>>>>>>>>>> unallocated. A significant amount of time is spent
>>>>>>>>>>>>> traversing the
>>>>>>>>>>>>> freelist holding the code cache lock or waiting for other
>>>>>>>>>>>>> threads
>>>>>>>>>>>>> doing
>>>>>>>>>>>>> the same.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I don't see any reason for why a continuous block is 
>>>>>>>>>>>>> required.
>>>>>>>>>>>>> Code
>>>>>>>>>>>>> cache flushing threshold 
>>>>>>>>>>>>> (CodeCacheFlushingMinimumFreeSpace) is
>>>>>>>>>>>>> default
>>>>>>>>>>>>> 1,5M and code cache minimum threshold
>>>>>>>>>>>>> (CodeCacheMinimumFreeSpace) is
>>>>>>>>>>>>> 0,5M. Finding such a block on the freelist is often
>>>>>>>>>>>>> unrealistic,
>>>>>>>>>>>>> and has
>>>>>>>>>>>>> not any purpose. The largest nmethods are in the order of
>>>>>>>>>>>>> 250k, and
>>>>>>>>>>>>> they
>>>>>>>>>>>>> are not allocated when code cache is starting to run out. All
>>>>>>>>>>>>> critical
>>>>>>>>>>>>> adapters are small and will fit easily in the freelist or in
>>>>>>>>>>>>> the
>>>>>>>>>>>>> remaining heap space.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.02/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Nils Eliasson
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>