RFR (M): 8006952: Slow VM due to excessive code cache freelist iteration

Tue Apr 16 11:39:01 PDT 2013

I asked about changes in CompileBroker::compiler_thread_loop():

1584       if (CodeCache::unallocated_capacity() < 
CodeCacheMinimumFreeSpace) {
1585         // the code cache is really full
1586         handle_full_code_cache();

It is executed before we compile a method.

For example, if we have a hot method at the end of codecache which is 
used then the check above will prevent us from compiling any methods 
even when codecache has enough space in the middle (a block on 
free_list). Do I get it right?

Thanks,
Vladimir

On 4/16/13 11:18 AM, Nils Eliasson wrote:
> Yes.
>
> So its pretty much like it was before:
>
> 1) Try to allocate from the free list using best fit
> 2) Allocate a new chunk from the code cache heap (always increasing,
> reclaimed memory always goes to the free list)
>
> With the addition that for both 1 and 2, stubs and adapters have a
> strict monopoly on the last 500k of address space. But they are
> themselves free to be placed wherever possible, and will essentially
> only use the last part as a last resort.
>
> I belive this will allow us to decrease the CodeCacheMinimumThreshold
> since we now have strong guarantees that nmethods will not eat up or
> fragment the last part. But that is for another change.
>
> Thanks for your time!
> //Nils
>
>
>
>
> On 2013-04-16 17:00, Vladimir Kozlov wrote:
>> Looks good.
>>
>> Do I understand correctly that now we bailout from in compileBroker
>> conservatively if no enough space at the end of code cache? But in a
>> middle we could have a big free block. Right?
>>
>> Thanks,
>> Vladimir
>>
>> On 4/16/13 3:16 AM, Nils Eliasson wrote:
>>> Latest webrev: http://cr.openjdk.java.net/~neliasso/8006952/webrev.09/
>>> <http://cr.openjdk.java.net/%7Eneliasso/8006952/webrev.09/>
>>>
>>> Updates after comments from Vladimir K. and Niclas.
>>>
>>> - Removed 'red_zone' naming and refined comments
>>> - Refined naming in heap.cpp/hpp on conversion methods
>>> size(...)/number_of_segments(...) to
>>> size_to_segments(...)/segments_to_size(...)
>>> - Bool flags as verbs (critical -> is_critical)
>>> - Removed get_largest_block() and all its uses.
>>>
>>> Enjoy!
>>>
>>> //Nils
>>>
>>> On 2013-04-15 12:36, Nils Eliasson wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 2013-04-12 19:21, Vladimir Kozlov wrote:
>>>>> Nils,
>>>>>
>>>>> It is sad that you have to reverse parfait fix but I agree that your
>>>>> changes are more clean.
>>>>
>>>> Please - don't be! I taken extra precaution and have verified that
>>>> there is a null check between the call to new and the constructor.
>>>> This is consistent with both the C++ spec and the behaviour we rely on
>>>> when running out of code cache. The CodeCache::has_space would have
>>>> forced me to do it more complex, it would have to copy/wrap the
>>>> heap::allocate algorithm but returning a bool instead of actually
>>>> allocating.
>>>>
>>>> Sorry for forgetting to add it to the change log in my request. I did
>>>> not intend to revert it without being open about it.
>>>>
>>>> I don't understand why Parfait is confused about 'this' being null in
>>>> the constructor. I will open a bug on Parfait if I can find where to
>>>> submit it.
>>>>
>>>>>
>>>>> After these changes largest_free_block() is used only in statistics
>>>>> output which is useless since it is not participating in codecache
>>>>> allocations.  May be we should remove it and use
>>>>> heap_unallocated_capacity() instead (do we need lock then?):
>>>>
>>>> Yes, good idea. Must check that I don't break any log tool if I change
>>>> the output though.
>>>>
>>>>>
>>>>> size_t CodeCache::largest_free_block() {
>>>>>   return _heap->heap_unallocated_capacity();
>>>>> }
>>>>>
>>>>> Also red_zone_start() is confusing because it does not show how it is
>>>>> related to heap_unallocated_capacity(). In one place you use one:
>>>>>
>>>>>  211     if (size(length) > (heap_unallocated_capacity() -
>>>>> CodeCacheMinimumFreeSpace)) {
>>>>>
>>>>> and in an other use second:
>>>>>
>>>>>  437     if (((size_t)cur + length) >= (size_t)red_zone_start()) {
>>>>>
>>>>> Can we just use only heap_unallocated_capacity()?
>>>>
>>>> I agree that it is confusing. The background is that they serve a bit
>>>> different purpose. In one case we are allocating from the freelist and
>>>> need to check against the adress where the 'red zone/critical
>>>> allocation area begins.' In the other it is a block allocation from
>>>> the contiguous heap where we know heap_unallocated_capacity is the
>>>> size of the block at the end of the heap. I'll come up with something
>>>> better.
>>>>
>>>> Thanks for taking your time!
>>>> //Nils
>>>>
>>>>>
>>>>> Thanks,
>>>>> Vladimir
>>>>>
>>>>> On 4/12/13 3:19 AM, Nils Eliasson wrote:
>>>>>> I have now incorporated both Vladimirs and Andreas suggestion:
>>>>>>
>>>>>> Changes to previous revision includes:
>>>>>> 1) Updated the checks in both heap::allocate and
>>>>>> heap::search_freelist
>>>>>> so that non critical allocation never stride into the
>>>>>> CodeCacheMinimumFreeSpace.
>>>>>> 2) Inverted the critical condition - now all allocations default
>>>>>> to not
>>>>>> critical. Only the places where we would go fatal if the allocation
>>>>>> returns null are critical.
>>>>>> 3) Refactored some names and removed duplicate methods in heap.cpp.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.08/
>>>>>> <http://cr.openjdk.java.net/%7Eneliasso/8006952/webrev.08/>
>>>>>>
>>>>>> This changes also opens up for decreasing the
>>>>>> CodeCacheMinimumFreeSpace
>>>>>> since we now guarantee that only critical allocation will end up
>>>>>> there.
>>>>>> And they will not grow to 500K when compilation have stopped anyways.
>>>>>>
>>>>>> Kind regards,
>>>>>> //Nils
>>>>>>
>>>>>>
>>>>>> On 2013-02-08 15:54, Nils Eliasson wrote:
>>>>>>> Vladimir and Andreas, thanks for your feedback! I am considering
>>>>>>> your
>>>>>>> suggestions.
>>>>>>>
>>>>>>> My fix is a step in the right direction but I realize it is not
>>>>>>> enough.
>>>>>>>
>>>>>>> Initially, it works good reserving the last part of the heap block.
>>>>>>> But if the application has a bigger active code footprint that what
>>>>>>> can fit in the code cache, there will be new flushes that will
>>>>>>> increase fragmentation and slowly reduce the usable part of the code
>>>>>>> cache. So I will add functionality for backing off.
>>>>>>>
>>>>>>> //Nils
>>>>>>>
>>>>>>> Schoesser, Andreas skrev 2013-02-01 10:58:
>>>>>>>> Hi Nils,
>>>>>>>>
>>>>>>>> this is IMO a good idea since we experienced the same problem with
>>>>>>>> one of our ports. I think in our case it was not the nmethod
>>>>>>>> sweeper
>>>>>>>> continuously running (there's MinCodeCacheFlushingInterval to only
>>>>>>>> speculatively disconnect nmethods every 30 seconds) but from the
>>>>>>>> compiler loop constantly polling the freelist without the nmethod
>>>>>>>> sweeper making any progress.
>>>>>>>>
>>>>>>>> You also might consider to propagate the "critical" flag down to
>>>>>>>> search_freelist(). Then, a simple address compare would do to keep
>>>>>>>> the end of the code cache free from nmethods:
>>>>>>>> - For non-critical blobs, search_freelist() may only fit free
>>>>>>>> blocks
>>>>>>>> < CodeCache::end() - CodeCacheMinimumFreeSpace
>>>>>>>> - For critical blocks, search_freelist() may fit all free blocks
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Andreas
>>>>>>>>
>>>>>>>> -----Original Message-----
>>>>>>>> From: hotspot-compiler-dev-bounces at openjdk.java.net
>>>>>>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of
>>>>>>>> Nils Eliasson
>>>>>>>> Sent: Donnerstag, 31. Januar 2013 19:41
>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net
>>>>>>>> Subject: Re: RFR (M): 8006952: Slow VM due to excessive code cache
>>>>>>>> freelist iteration
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.04/
>>>>>>>>
>>>>>>>> //N
>>>>>>>>
>>>>>>>> Vladimir Kozlov skrev 2013-01-30 21:16:
>>>>>>>>> Nils, you sent webrev for other bug fix.
>>>>>>>>>
>>>>>>>>> Vladimir
>>>>>>>>>
>>>>>>>>> On 1/30/13 7:43 AM, Nils Eliasson wrote:
>>>>>>>>>> Thanks for the link, interesting background.
>>>>>>>>>>
>>>>>>>>>> Even if finding the largest block on the freelist is free, we
>>>>>>>>>> would
>>>>>>>>>> still have a misbehavior when the free heap block is less than
>>>>>>>>>> 1,5M
>>>>>>>>>> and
>>>>>>>>>> the flushing begins to create a 1,5M freelist block.
>>>>>>>>>>
>>>>>>>>>> Unallocated memory is a good measurement if have some continuous
>>>>>>>>>> memory
>>>>>>>>>> in reserve for critical blobs. I am thinking of using the
>>>>>>>>>> thresholds
>>>>>>>>>> against unallocated memory, but forbidding nmethods from using
>>>>>>>>>> the
>>>>>>>>>> last
>>>>>>>>>> 0,5M (MinimumFreeSpace) from the heap block. That would prevent
>>>>>>>>>> the VM
>>>>>>>>>> from fragmenting the last continuous memory space, and allow the
>>>>>>>>>> VM to
>>>>>>>>>> use all of the freelist.
>>>>>>>>>>
>>>>>>>>>> Like this:
>>>>>>>>>> http://cr.openjdk.java.net/~neliasso/8002364/webrev.03
>>>>>>>>>>
>>>>>>>>>> //Nils
>>>>>>>>>>
>>>>>>>>>> Vladimir Kozlov skrev 2013-01-29 20:35:
>>>>>>>>>>> Nils,
>>>>>>>>>>>
>>>>>>>>>>> You are reversing 7025742 fix. I would prefer to keep track of a
>>>>>>>>>>> largest free block during allocation in CodeCache (by using
>>>>>>>>>>> ordered
>>>>>>>>>>> list or other technique) to avoid scanning the list.
>>>>>>>>>>>
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>> On 1/29/13 2:56 AM, Nils Eliasson wrote:
>>>>>>>>>>>> Remove continuous free block requirement for code cache
>>>>>>>>>>>> flushing and
>>>>>>>>>>>> minimum free space.
>>>>>>>>>>>>
>>>>>>>>>>>> This causes a degenerate behavior when the VM are repeatedly
>>>>>>>>>>>> flushing
>>>>>>>>>>>> the code cache trying to free up a continuous 1,5M block.
>>>>>>>>>>>> Since the
>>>>>>>>>>>> freelist is fragmented a significant part of the code cache is
>>>>>>>>>>>> kept
>>>>>>>>>>>> unallocated. A significant amount of time is spent
>>>>>>>>>>>> traversing the
>>>>>>>>>>>> freelist holding the code cache lock or waiting for other
>>>>>>>>>>>> threads
>>>>>>>>>>>> doing
>>>>>>>>>>>> the same.
>>>>>>>>>>>>
>>>>>>>>>>>> I don't see any reason for why a continuous block is required.
>>>>>>>>>>>> Code
>>>>>>>>>>>> cache flushing threshold (CodeCacheFlushingMinimumFreeSpace) is
>>>>>>>>>>>> default
>>>>>>>>>>>> 1,5M and code cache minimum threshold
>>>>>>>>>>>> (CodeCacheMinimumFreeSpace) is
>>>>>>>>>>>> 0,5M. Finding such a block on the freelist is often
>>>>>>>>>>>> unrealistic,
>>>>>>>>>>>> and has
>>>>>>>>>>>> not any purpose. The largest nmethods are in the order of
>>>>>>>>>>>> 250k, and
>>>>>>>>>>>> they
>>>>>>>>>>>> are not allocated when code cache is starting to run out. All
>>>>>>>>>>>> critical
>>>>>>>>>>>> adapters are small and will fit easily in the freelist or in
>>>>>>>>>>>> the
>>>>>>>>>>>> remaining heap space.
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.02/
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Nils Eliasson
>>>>>>>
>>>>>>
>>>>
>>>
>