RFR (M): 8006952: Slow VM due to excessive code cache freelist iteration

Mon Apr 15 03:36:26 PDT 2013

Hi,

On 2013-04-12 19:21, Vladimir Kozlov wrote:
> Nils,
>
> It is sad that you have to reverse parfait fix but I agree that your 
> changes are more clean.

Please - don't be! I taken extra precaution and have verified that there 
is a null check between the call to new and the constructor. This is 
consistent with both the C++ spec and the behaviour we rely on when 
running out of code cache. The CodeCache::has_space would have forced me 
to do it more complex, it would have to copy/wrap the heap::allocate 
algorithm but returning a bool instead of actually allocating.

Sorry for forgetting to add it to the change log in my request. I did 
not intend to revert it without being open about it.

I don't understand why Parfait is confused about 'this' being null in 
the constructor. I will open a bug on Parfait if I can find where to 
submit it.

>
> After these changes largest_free_block() is used only in statistics 
> output which is useless since it is not participating in codecache 
> allocations.  May be we should remove it and use 
> heap_unallocated_capacity() instead (do we need lock then?):

Yes, good idea. Must check that I don't break any log tool if I change 
the output though.

>
> size_t CodeCache::largest_free_block() {
>   return _heap->heap_unallocated_capacity();
> }
>
> Also red_zone_start() is confusing because it does not show how it is 
> related to heap_unallocated_capacity(). In one place you use one:
>
>  211     if (size(length) > (heap_unallocated_capacity() - 
> CodeCacheMinimumFreeSpace)) {
>
> and in an other use second:
>
>  437     if (((size_t)cur + length) >= (size_t)red_zone_start()) {
>
> Can we just use only heap_unallocated_capacity()?

I agree that it is confusing. The background is that they serve a bit 
different purpose. In one case we are allocating from the freelist and 
need to check against the adress where the 'red zone/critical allocation 
area begins.' In the other it is a block allocation from the contiguous 
heap where we know heap_unallocated_capacity is the size of the block at 
the end of the heap. I'll come up with something better.

Thanks for taking your time!
//Nils

>
> Thanks,
> Vladimir
>
> On 4/12/13 3:19 AM, Nils Eliasson wrote:
>> I have now incorporated both Vladimirs and Andreas suggestion:
>>
>> Changes to previous revision includes:
>> 1) Updated the checks in both heap::allocate and heap::search_freelist
>> so that non critical allocation never stride into the
>> CodeCacheMinimumFreeSpace.
>> 2) Inverted the critical condition - now all allocations default to not
>> critical. Only the places where we would go fatal if the allocation
>> returns null are critical.
>> 3) Refactored some names and removed duplicate methods in heap.cpp.
>>
>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.08/
>> <http://cr.openjdk.java.net/%7Eneliasso/8006952/webrev.08/>
>>
>> This changes also opens up for decreasing the CodeCacheMinimumFreeSpace
>> since we now guarantee that only critical allocation will end up there.
>> And they will not grow to 500K when compilation have stopped anyways.
>>
>> Kind regards,
>> //Nils
>>
>>
>> On 2013-02-08 15:54, Nils Eliasson wrote:
>>> Vladimir and Andreas, thanks for your feedback! I am considering your
>>> suggestions.
>>>
>>> My fix is a step in the right direction but I realize it is not enough.
>>>
>>> Initially, it works good reserving the last part of the heap block.
>>> But if the application has a bigger active code footprint that what
>>> can fit in the code cache, there will be new flushes that will
>>> increase fragmentation and slowly reduce the usable part of the code
>>> cache. So I will add functionality for backing off.
>>>
>>> //Nils
>>>
>>> Schoesser, Andreas skrev 2013-02-01 10:58:
>>>> Hi Nils,
>>>>
>>>> this is IMO a good idea since we experienced the same problem with
>>>> one of our ports. I think in our case it was not the nmethod sweeper
>>>> continuously running (there's MinCodeCacheFlushingInterval to only
>>>> speculatively disconnect nmethods every 30 seconds) but from the
>>>> compiler loop constantly polling the freelist without the nmethod
>>>> sweeper making any progress.
>>>>
>>>> You also might consider to propagate the "critical" flag down to
>>>> search_freelist(). Then, a simple address compare would do to keep
>>>> the end of the code cache free from nmethods:
>>>> - For non-critical blobs, search_freelist() may only fit free blocks
>>>> < CodeCache::end() - CodeCacheMinimumFreeSpace
>>>> - For critical blocks, search_freelist() may fit all free blocks
>>>>
>>>> Best regards,
>>>> Andreas
>>>>
>>>> -----Original Message-----
>>>> From: hotspot-compiler-dev-bounces at openjdk.java.net
>>>> [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of
>>>> Nils Eliasson
>>>> Sent: Donnerstag, 31. Januar 2013 19:41
>>>> To: hotspot-compiler-dev at openjdk.java.net
>>>> Subject: Re: RFR (M): 8006952: Slow VM due to excessive code cache
>>>> freelist iteration
>>>>
>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.04/
>>>>
>>>> //N
>>>>
>>>> Vladimir Kozlov skrev 2013-01-30 21:16:
>>>>> Nils, you sent webrev for other bug fix.
>>>>>
>>>>> Vladimir
>>>>>
>>>>> On 1/30/13 7:43 AM, Nils Eliasson wrote:
>>>>>> Thanks for the link, interesting background.
>>>>>>
>>>>>> Even if finding the largest block on the freelist is free, we would
>>>>>> still have a misbehavior when the free heap block is less than 1,5M
>>>>>> and
>>>>>> the flushing begins to create a 1,5M freelist block.
>>>>>>
>>>>>> Unallocated memory is a good measurement if have some continuous
>>>>>> memory
>>>>>> in reserve for critical blobs. I am thinking of using the thresholds
>>>>>> against unallocated memory, but forbidding nmethods from using the
>>>>>> last
>>>>>> 0,5M (MinimumFreeSpace) from the heap block. That would prevent 
>>>>>> the VM
>>>>>> from fragmenting the last continuous memory space, and allow the 
>>>>>> VM to
>>>>>> use all of the freelist.
>>>>>>
>>>>>> Like this:
>>>>>> http://cr.openjdk.java.net/~neliasso/8002364/webrev.03
>>>>>>
>>>>>> //Nils
>>>>>>
>>>>>> Vladimir Kozlov skrev 2013-01-29 20:35:
>>>>>>> Nils,
>>>>>>>
>>>>>>> You are reversing 7025742 fix. I would prefer to keep track of a
>>>>>>> largest free block during allocation in CodeCache (by using ordered
>>>>>>> list or other technique) to avoid scanning the list.
>>>>>>>
>>>>>>> Vladimir
>>>>>>>
>>>>>>> On 1/29/13 2:56 AM, Nils Eliasson wrote:
>>>>>>>> Remove continuous free block requirement for code cache 
>>>>>>>> flushing and
>>>>>>>> minimum free space.
>>>>>>>>
>>>>>>>> This causes a degenerate behavior when the VM are repeatedly
>>>>>>>> flushing
>>>>>>>> the code cache trying to free up a continuous 1,5M block. Since 
>>>>>>>> the
>>>>>>>> freelist is fragmented a significant part of the code cache is 
>>>>>>>> kept
>>>>>>>> unallocated. A significant amount of time is spent traversing the
>>>>>>>> freelist holding the code cache lock or waiting for other threads
>>>>>>>> doing
>>>>>>>> the same.
>>>>>>>>
>>>>>>>> I don't see any reason for why a continuous block is required. 
>>>>>>>> Code
>>>>>>>> cache flushing threshold (CodeCacheFlushingMinimumFreeSpace) is
>>>>>>>> default
>>>>>>>> 1,5M and code cache minimum threshold 
>>>>>>>> (CodeCacheMinimumFreeSpace) is
>>>>>>>> 0,5M. Finding such a block on the freelist is often unrealistic,
>>>>>>>> and has
>>>>>>>> not any purpose. The largest nmethods are in the order of 250k, 
>>>>>>>> and
>>>>>>>> they
>>>>>>>> are not allocated when code cache is starting to run out. All
>>>>>>>> critical
>>>>>>>> adapters are small and will fit easily in the freelist or in the
>>>>>>>> remaining heap space.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~neliasso/8006952/webrev.02/
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Nils Eliasson
>>>
>>