RFR (M): 8006952: Slow VM due to excessive code cache freelist iteration

Tue Apr 16 14:34:20 PDT 2013

On 4/16/13 1:44 PM, Nils Eliasson wrote:
>
> On 2013-04-16 20:39, Vladimir Kozlov wrote:
>> I asked about changes in CompileBroker::compiler_thread_loop():
>>
>> 1584       if (CodeCache::unallocated_capacity() <
>> CodeCacheMinimumFreeSpace) {
>> 1585         // the code cache is really full
>> 1586         handle_full_code_cache();
>>
>> It is executed before we compile a method.
>>
>> For example, if we have a hot method at the end of codecache which is
>> used then the check above will prevent us from compiling any methods
>> even when codecache has enough space in the middle (a block on
>> free_list). Do I get it right?
>
> No, the space on the free list is included in
> CodeCache::unallocated_capacity().

I am sorry, I mistook CodeHeap::heap_unallocated_capacity() method for 
CodeHeap::unallocated_capacity(). And unallocated_capacity() takes into 
account free_list as you said.

Changes are good for push.

Thanks,
Vladimir

>
> But we might have 1500k of really fragmented space on the free list, and
> nothing but the adapter/stub-reserved space in the heap. Then compiles
> will pass the check at 1584 and 1587, but later fail on
> bufferblob/nmethod allocation. But that case is handled by calls too
> handle_full_code_cache() at the allocation sites.
>
> //N
>
>>
>> Thanks,
>> Vladimir
>>
>> On 4/16/13 11:18 AM, Nils Eliasson wrote:
>>> Yes.
>>>
>>> So its pretty much like it was before:
>>>
>>> 1) Try to allocate from the free list using best fit
>>> 2) Allocate a new chunk from the code cache heap (always increasing,
>>> reclaimed memory always goes to the free list)
>>>
>>> With the addition that for both 1 and 2, stubs and adapters have a
>>> strict monopoly on the last 500k of address space. But they are
>>> themselves free to be placed wherever possible, and will essentially
>>> only use the last part as a last resort.
>>>
>>> I belive this will allow us to decrease the CodeCacheMinimumThreshold
>>> since we now have strong guarantees that nmethods will not eat up or
>>> fragment the last part. But that is for another change.
>>>
>>> Thanks for your time!
>>> //Nils
>>>
>>>
>>>
>>>
>>> On 2013-04-16 17:00, Vladimir Kozlov wrote:
>>>> Looks good.
>>>>
>>>> Do I understand correctly that now we bailout from in compileBroker
>>>> conservatively if no enough space at the end of code cache? But in a
>>>> middle we could have a big free block. Right?
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 4/16/13 3:16 AM, Nils Eliasson wrote:
>>>>> Latest webrev: http://cr.openjdk.java.net/~neliasso/8006952/webrev.09/
>>>>> <http://cr.openjdk.java.net/%7Eneliasso/8006952/webrev.09/>
>>>>>
>>>>> Updates after comments from Vladimir K. and Niclas.
>>>>>
>>>>> - Removed 'red_zone' naming and refined comments
>>>>> - Refined naming in heap.cpp/hpp on conversion methods
>>>>> size(...)/number_of_segments(...) to
>>>>> size_to_segments(...)/segments_to_size(...)
>>>>> - Bool flags as verbs (critical -> is_critical)
>>>>> - Removed get_largest_block() and all its uses.
>>>>>
>>>>> Enjoy!
>>>>>
>>>>> //Nils
>>>>>
>>>>> On 2013-04-15 12:36, Nils Eliasson wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> On 2013-04-12 19:21, Vladimir Kozlov wrote:
>>>>>>> Nils,
>>>>>>>
>>>>>>> It is sad that you have to reverse parfait fix but I agree that your
>>>>>>> changes are more clean.
>>>>>>
>>>>>> Please - don't be! I taken extra precaution and have verified that
>>>>>> there is a null check between the call to new and the constructor.
>>>>>> This is consistent with both the C++ spec and the behaviour we
>>>>>> rely on
>>>>>> when running out of code cache. The CodeCache::has_space would have
>>>>>> forced me to do it more complex, it would have to copy/wrap the
>>>>>> heap::allocate algorithm but returning a bool instead of actually
>>>>>> allocating.
>>>>>>
>>>>>> Sorry for forgetting to add it to the change log in my request. I did
>>>>>> not intend to revert it without being open about it.
>>>>>>
>>>>>> I don't understand why Parfait is confused about 'this' being null in
>>>>>> the constructor. I will open a bug on Parfait if I can find where to
>>>>>> submit it.
>>>>>>
>>>>>>>
>>>>>>> After these changes largest_free_block() is used only in statistics
>>>>>>> output which is useless since it is not participating in codecache
>>>>>>> allocations.  May be we should remove it and use
>>>>>>> heap_unallocated_capacity() instead (do we need lock then?):
>>>>>>
>>>>>> Yes, good idea. Must check that I don't break any log tool if I
>>>>>> change
>>>>>> the output though.
>>>>>>
>>>>>>>
>>>>>>> size_t CodeCache::largest_free_block() {
>>>>>>>   return _heap->heap_unallocated_capacity();
>>>>>>> }
>>>>>>>
>>>>>>> Also red_zone_start() is confusing because it does not show how
>>>>>>> it is
>>>>>>> related to heap_unallocated_capacity(). In one place you use one:
>>>>>>>
>>>>>>>  211     if (size(length) > (heap_unallocated_capacity() -
>>>>>>> CodeCacheMinimumFreeSpace)) {
>>>>>>>
>>>>>>> and in an other use second:
>>>>>>>
>>>>>>>  437     if (((size_t)cur + length) >= (size_t)red_zone_start()) {
>>>>>>>
>>>>>>> Can we just use only heap_unallocated_capacity()?
>>>>>>
>>>>>> I agree that it is confusing. The background is that they serve a bit
>>>>>> different purpose. In one case we are allocating from the freelist
>>>>>> and
>>>>>> need to check against the adress where the 'red zone/critical
>>>>>> allocation area begins.' In the other it is a block allocation from
>>>>>> the contiguous heap where we know heap_unallocated_capacity is the
>>>>>> size of the block at the end of the heap. I'll come up with something
>>>>>> better.
>>>>>>
>>>>>> Thanks for taking your time!
>>>>>> //Nils
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 4/12/13 3:19 AM, Nils Eliasson wrote:
>>>>>>>> I have now incorporated both Vladimirs and Andreas suggestion:
>>>>>>>>
>>>>>>>> Changes to previous revision includes:
>>>>>>>> 1) Updated the checks in both heap::allocate and
>>>>>>>> heap::search_freelist
>>>>>>>> so that non critical allocation never stride into the
>>>>>>>> CodeCacheMinimumFreeSpace.
>>>>>>>> 2) Inverted the critical condition - now all allocations default
>>>>>>>> to not
>>>>>>>> critical. Only the places where we would go fatal if the allocation
>>>>>>>> returns null are critical.
>>>>>>>> 3) Refactored some names and removed duplicate methods in heap.cpp.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.08/
>>>>>>>> <http://cr.openjdk.java.net/%7Eneliasso/8006952/webrev.08/>
>>>>>>>>
>>>>>>>> This changes also opens up for decreasing the
>>>>>>>> CodeCacheMinimumFreeSpace
>>>>>>>> since we now guarantee that only critical allocation will end up
>>>>>>>> there.
>>>>>>>> And they will not grow to 500K when compilation have stopped
>>>>>>>> anyways.
>>>>>>>>
>>>>>>>> Kind regards,
>>>>>>>> //Nils
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2013-02-08 15:54, Nils Eliasson wrote:
>>>>>>>>> Vladimir and Andreas, thanks for your feedback! I am considering
>>>>>>>>> your
>>>>>>>>> suggestions.
>>>>>>>>>
>>>>>>>>> My fix is a step in the right direction but I realize it is not
>>>>>>>>> enough.
>>>>>>>>>
>>>>>>>>> Initially, it works good reserving the last part of the heap
>>>>>>>>> block.
>>>>>>>>> But if the application has a bigger active code footprint that
>>>>>>>>> what
>>>>>>>>> can fit in the code cache, there will be new flushes that will
>>>>>>>>> increase fragmentation and slowly reduce the usable part of the
>>>>>>>>> code
>>>>>>>>> cache. So I will add functionality for backing off.
>>>>>>>>>
>>>>>>>>> //Nils
>>>>>>>>>
>>>>>>>>> Schoesser, Andreas skrev 2013-02-01 10:58:
>>>>>>>>>> Hi Nils,
>>>>>>>>>>
>>>>>>>>>> this is IMO a good idea since we experienced the same problem
>>>>>>>>>> with
>>>>>>>>>> one of our ports. I think in our case it was not the nmethod
>>>>>>>>>> sweeper
>>>>>>>>>> continuously running (there's MinCodeCacheFlushingInterval to
>>>>>>>>>> only
>>>>>>>>>> speculatively disconnect nmethods every 30 seconds) but from the
>>>>>>>>>> compiler loop constantly polling the freelist without the nmethod
>>>>>>>>>> sweeper making any progress.
>>>>>>>>>>
>>>>>>>>>> You also might consider to propagate the "critical" flag down to
>>>>>>>>>> search_freelist(). Then, a simple address compare would do to
>>>>>>>>>> keep
>>>>>>>>>> the end of the code cache free from nmethods:
>>>>>>>>>> - For non-critical blobs, search_freelist() may only fit free
>>>>>>>>>> blocks
>>>>>>>>>> < CodeCache::end() - CodeCacheMinimumFreeSpace
>>>>>>>>>> - For critical blocks, search_freelist() may fit all free blocks
>>>>>>>>>>
>>>>>>>>>> Best regards,
>>>>>>>>>> Andreas
>>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: hotspot-compiler-dev-bounces at openjdk.java.net
>>>>>>>>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On
>>>>>>>>>> Behalf Of
>>>>>>>>>> Nils Eliasson
>>>>>>>>>> Sent: Donnerstag, 31. Januar 2013 19:41
>>>>>>>>>> To: hotspot-compiler-dev at openjdk.java.net
>>>>>>>>>> Subject: Re: RFR (M): 8006952: Slow VM due to excessive code
>>>>>>>>>> cache
>>>>>>>>>> freelist iteration
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.04/
>>>>>>>>>>
>>>>>>>>>> //N
>>>>>>>>>>
>>>>>>>>>> Vladimir Kozlov skrev 2013-01-30 21:16:
>>>>>>>>>>> Nils, you sent webrev for other bug fix.
>>>>>>>>>>>
>>>>>>>>>>> Vladimir
>>>>>>>>>>>
>>>>>>>>>>> On 1/30/13 7:43 AM, Nils Eliasson wrote:
>>>>>>>>>>>> Thanks for the link, interesting background.
>>>>>>>>>>>>
>>>>>>>>>>>> Even if finding the largest block on the freelist is free, we
>>>>>>>>>>>> would
>>>>>>>>>>>> still have a misbehavior when the free heap block is less than
>>>>>>>>>>>> 1,5M
>>>>>>>>>>>> and
>>>>>>>>>>>> the flushing begins to create a 1,5M freelist block.
>>>>>>>>>>>>
>>>>>>>>>>>> Unallocated memory is a good measurement if have some
>>>>>>>>>>>> continuous
>>>>>>>>>>>> memory
>>>>>>>>>>>> in reserve for critical blobs. I am thinking of using the
>>>>>>>>>>>> thresholds
>>>>>>>>>>>> against unallocated memory, but forbidding nmethods from using
>>>>>>>>>>>> the
>>>>>>>>>>>> last
>>>>>>>>>>>> 0,5M (MinimumFreeSpace) from the heap block. That would prevent
>>>>>>>>>>>> the VM
>>>>>>>>>>>> from fragmenting the last continuous memory space, and allow
>>>>>>>>>>>> the
>>>>>>>>>>>> VM to
>>>>>>>>>>>> use all of the freelist.
>>>>>>>>>>>>
>>>>>>>>>>>> Like this:
>>>>>>>>>>>> http://cr.openjdk.java.net/~neliasso/8002364/webrev.03
>>>>>>>>>>>>
>>>>>>>>>>>> //Nils
>>>>>>>>>>>>
>>>>>>>>>>>> Vladimir Kozlov skrev 2013-01-29 20:35:
>>>>>>>>>>>>> Nils,
>>>>>>>>>>>>>
>>>>>>>>>>>>> You are reversing 7025742 fix. I would prefer to keep track
>>>>>>>>>>>>> of a
>>>>>>>>>>>>> largest free block during allocation in CodeCache (by using
>>>>>>>>>>>>> ordered
>>>>>>>>>>>>> list or other technique) to avoid scanning the list.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 1/29/13 2:56 AM, Nils Eliasson wrote:
>>>>>>>>>>>>>> Remove continuous free block requirement for code cache
>>>>>>>>>>>>>> flushing and
>>>>>>>>>>>>>> minimum free space.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This causes a degenerate behavior when the VM are repeatedly
>>>>>>>>>>>>>> flushing
>>>>>>>>>>>>>> the code cache trying to free up a continuous 1,5M block.
>>>>>>>>>>>>>> Since the
>>>>>>>>>>>>>> freelist is fragmented a significant part of the code
>>>>>>>>>>>>>> cache is
>>>>>>>>>>>>>> kept
>>>>>>>>>>>>>> unallocated. A significant amount of time is spent
>>>>>>>>>>>>>> traversing the
>>>>>>>>>>>>>> freelist holding the code cache lock or waiting for other
>>>>>>>>>>>>>> threads
>>>>>>>>>>>>>> doing
>>>>>>>>>>>>>> the same.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't see any reason for why a continuous block is
>>>>>>>>>>>>>> required.
>>>>>>>>>>>>>> Code
>>>>>>>>>>>>>> cache flushing threshold
>>>>>>>>>>>>>> (CodeCacheFlushingMinimumFreeSpace) is
>>>>>>>>>>>>>> default
>>>>>>>>>>>>>> 1,5M and code cache minimum threshold
>>>>>>>>>>>>>> (CodeCacheMinimumFreeSpace) is
>>>>>>>>>>>>>> 0,5M. Finding such a block on the freelist is often
>>>>>>>>>>>>>> unrealistic,
>>>>>>>>>>>>>> and has
>>>>>>>>>>>>>> not any purpose. The largest nmethods are in the order of
>>>>>>>>>>>>>> 250k, and
>>>>>>>>>>>>>> they
>>>>>>>>>>>>>> are not allocated when code cache is starting to run out. All
>>>>>>>>>>>>>> critical
>>>>>>>>>>>>>> adapters are small and will fit easily in the freelist or in
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> remaining heap space.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.02/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Nils Eliasson
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>