[9] RFR(M): 8029799: vm/mlvm/anonloader/stress/oome prints warning: CodeHeap: # of free blocks > 10000

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Feb 6 09:29:33 PST 2014


Albert,

Did you try CodeCacheMinBlockLength less than 4 when CodeCacheSegmentSize is 128?
It is for completeness of the experiment. As I said, with C1 it is 1, and we use C1 with Tiered.
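
For reference, a minimal sketch of the arithmetic behind this question. It assumes the minimum allocatable block in bytes is simply CodeCacheSegmentSize * CodeCacheMinBlockLength, as in the "64*4=256bytes" example quoted further down. The function name is hypothetical and is not part of HotSpot.

```python
def min_block_bytes(segment_size, min_block_length):
    """Smallest allocatable block in bytes (assumed: segment size times min block length)."""
    return segment_size * min_block_length


# Current x86 default quoted in the thread: 64-byte segments, 4 segments minimum.
print(64, 4, min_block_bytes(64, 4))

# Candidate settings with 128-byte segments.
for length in (4, 2, 1):
    print(128, length, min_block_bytes(128, length))
```

Under this assumption, a CodeCacheMinBlockLength of 2 with 128-byte segments gives the same 256-byte minimum block as the current 64*4 setting.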

Thanks,
Vladimir

On 2/6/14 8:32 AM, Albert wrote:
> Hi,
>
> I have done more experiments to see the impact of CodeCacheMinBlockLength and CodeCacheSegmentSize.
> Both factors have an impact on the length of the freelist as well as on the memory that is possibly wasted.
>
> The table below contains detailed results. Here is a description of the numbers and how they are
> calculated:
>
> * freelist length: number of HeapBlocks that are in the freelist when the program finishes.
> * freelist[kB]: total memory [kB] that is in the freelist when the program finishes.
> * unused bytes in cb: unused bytes in all CodeBlobs that are in the code cache when the program
>                          finishes. This number is calculated by subtracting the nmethod size from the
>                          size of the HeapBlock in which the nmethod is stored. Note that the HeapBlock
>                          size is a multiple of CodeCacheMinBlockLength * CodeCacheSegmentSize.
> * segmap[kB]: size of the segment map that is used to map addresses to HeapBlocks (i.e., find the
>                          beginning of an nmethod). Increasing CodeCacheSegmentSize decreases the segmap
>                          size. For example, a CodeCacheSegmentSize of 32 bytes requires 32kB of segmap
>                          memory per allocated MB in the code cache. A CodeCacheSegmentSize of 64 bytes
>                          requires 16kB of segmap memory per allocated MB in the code cache.
> * max_used: maximum allocated memory in the code cache.
> * wasted_memory: SUM(freelist + unused bytes in cb + segmap)
> * memory overhead: wasted_memory / max_used
>
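
The definitions above can be checked with a minimal sketch. It assumes the segment map stores one byte per segment (which matches the 32kB-per-MB figure for 32-byte segments) and that the percentages in the table are wasted memory divided by max_used. The function names are hypothetical. The input values are the first run of the failing test case with 4 blocks and 64-byte segments.

```python
def segmap_kb_per_mb(segment_size_bytes):
    """Segment-map size in kB per allocated MB (assumed: one map byte per segment)."""
    segments_per_mb = (1024 * 1024) // segment_size_bytes
    return segments_per_mb / 1024


def wasted_kb(freelist_kb, unused_cb_kb, segmap_kb):
    """Sum of the three 'wasted' components, all in kB."""
    return freelist_kb + unused_cb_kb + segmap_kb


def memory_overhead_percent(wasted, max_used_kb):
    """Wasted memory as a percentage of the maximum used code cache."""
    return 100.0 * wasted / max_used_kb


w = wasted_kb(2299, 902, 274)
print(w, round(memory_overhead_percent(w, 16436), 2))  # 3475 21.14
print(segmap_kb_per_mb(32), segmap_kb_per_mb(64))      # 32.0 16.0
```

The printed values agree with the "wasted" and "memory overhead" entries reported for that run.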
> The executive summary of the results is that increasing CodeCacheSegmentSize has no negative
> impact on the memory overhead (also no positive). Increasing CodeCacheSegmentSize reduces
> the freelist length, which makes searching the freelist faster.
>
> Note that the results were obtained with a modified freelist search algorithm. In the changed version,
> the compiler chooses the first block that is large enough from the freelist (first-fit). In the old version,
> the compiler looked for the smallest possible block in the freelist into which the code fits (best-fit).
> My experiments indicate that best-fit does not provide better results (less memory overhead) than
> first-fit.
>
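
The two search strategies described above can be illustrated with a minimal sketch. This is not the HotSpot CodeHeap code: the freelist is modelled as a plain list of block lengths in segments, and the function names and sample values are hypothetical.

```python
def first_fit(freelist, needed):
    """Index of the first block with at least `needed` segments, or None."""
    for i, length in enumerate(freelist):
        if length >= needed:
            return i
    return None


def best_fit(freelist, needed):
    """Index of the smallest block with at least `needed` segments, or None."""
    best = None
    for i, length in enumerate(freelist):
        if length >= needed and (best is None or length < freelist[best]):
            best = i
            if length == needed:
                break  # exact match, no smaller fitting block can exist
    return best


blocks = [9, 4, 6, 5, 4]  # hypothetical free block lengths in segments
print(first_fit(blocks, 5), best_fit(blocks, 5))  # 0 3
```

First-fit can stop at the first block that is large enough, while best-fit has to walk the whole list unless it finds an exact match.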
> To summarize, switching to a larger CodeCacheSegmentSize seems reasonable.
>
>
> Here are the detailed results:
>
>
> failing test case
>
> 4 Blocks, 64 bytes
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>            3085          2299                 902         274     16436    3475           21.14%
>            3993          3366                 887         283     16959    4536           26.75%
>            3843          2204                 900         273     16377    3377           20.62%
>            3859          2260                 898         273     16382    3431           20.94%
>            3860          2250                 897         273     16385    3420           20.87%
> Average memory overhead: 22.07%
>
> 4 Blocks, 128 bytes
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             474          1020                2073         137     17451    3230           18.51%
>             504          1192                2064         136     17413    3392           19.48%
>             484          1188                2064         126     17414    3378           19.40%
>             438          1029                2061         136     17399    3226           18.54%
> Average memory overhead: 18.98%
>
>
> Nashorn
>
> 4 Blocks, 64 bytes
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             709          1190                 662        1198     76118    3050            4.01%
>             688          4200                 635        1234     78448    6069            7.74%
>             707          2617                 648        1178     74343    4443            5.98%
>             685          1703                 660        1205     76903    3568            4.64%
>             760          1638                 675        1174     74563    3487            4.68%
> Average memory overhead: 5.41%
>
> 4 Blocks, 128 bytes
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             206           824                1253         607     77469    2684            3.46%
>             247          2019                1265         583     74017    3867            5.22%
>             239           958                1230         641     81588    2829            3.47%
>             226          1477                1246         595     76119    3318            4.36%
>             225          2390                1239         596     76051    4225            5.56%
> Average memory overhead: 4.41%
>
>
> compiler.compiler
>
> 4 Blocks, 64 bytes
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             440           943                 263         298     18133    1504            8.29%
>             458           480                 272         295     18443    1047            5.68%
>             536          1278                 260         306     18776    1844            9.82%
>             426           684                 268         304     18789    1256            6.68%
>             503          1430                 258         310     18872    1998           10.59%
> Average memory overhead: 8.21%
>
> 4 Blocks, 128 bytes
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             163           984                 510         157     19233    1651            8.58%
>             132           729                 492         151     18614    1372            7.37%
>             187          1212                 498         152     18630    1862            9.99%
>             198          1268                 496         155     18974    1919           10.11%
>             225          1268                 496         152     18679    1916           10.26%
> Average memory overhead: 9.26%
>
>
> On 02/05/2014 07:57 PM, Vladimir Kozlov wrote:
>> On 2/5/14 8:28 AM, Albert wrote:
>>> Hi Vladimir,
>>>
>>> thanks for looking at this. I've done the proposed measurements. The code which I used to
>>> get the data is included in the following webrev:
>>>
>>> http://cr.openjdk.java.net/~anoll/8029799/webrev.01/
>>
>> Good.
>>
>>>
>>> I think some people might be interested in getting that data, so we might want to keep
>>> that additional output. The exact output format can be changed later (JDK-8005885).
>>
>> I agree that it is useful information.
>>
>>>
>>> Here are the results:
>>>
>>> - failing test case:
>>>     - original: allocated in freelist: 2168kB, unused bytes in CodeBlob:  818kB, max_used:  21983kB
>>>     - patch   : allocated in freelist: 1123kB, unused bytes in CodeBlob: 2188kB, max_used:  17572kB
>>> - nashorn:
>>>     - original: allocated in freelist: 2426kB, unused bytes in CodeBlob: 1769kB, max_used: 201886kB
>>>     - patch   : allocated in freelist: 1150kB, unused bytes in CodeBlob: 3458kB, max_used: 202394kB
>>> - SPECJVM2008: compiler.compiler:
>>>     - original: allocated in freelist:  168kB, unused bytes in CodeBlob:  342kB, max_used:  19837kB
>>>     - patch   : allocated in freelist:  873kB, unused bytes in CodeBlob:  671kB, max_used:  21184kB
>>>
>>> The minimum size that can be allocated from the code cache is platform-dependent,
>>> i.e., the minimum size depends on CodeCacheSegmentSize and CodeCacheMinBlockLength.
>>> On x86, for example, the minimum allocatable size from the code cache is 64*4=256 bytes.
>>
>> There is this comment in CodeHeap::search_freelist():
>>   // Don't leave anything on the freelist smaller than CodeCacheMinBlockLength.
>>
>> What happens if we scale down CodeCacheMinBlockLength when we increase CodeCacheSegmentSize, to keep the same
>> byte size for the minimum block?
>>
>> +     FLAG_SET_DEFAULT(CodeCacheSegmentSize, CodeCacheSegmentSize * 2);
>> +     FLAG_SET_DEFAULT(CodeCacheMinBlockLength, CodeCacheMinBlockLength/2);
>>
>> Based on your table below, those small nmethods will use only 256-byte blocks instead of 512-byte blocks (128*4).
>>
>> Note for C1 in Client VM CodeCacheMinBlockLength is 1. I don't know why for C2 it is 4. Could you also try
>> CodeCacheMinBlockLength = 1?
>>
>> All of the above is with a CodeCacheSegmentSize of 128 bytes.
>>
>>> The size of adapters ranges from 400b to 600b.
>>> Here is the beginning of the nmethod size distribution of the failing
>>> test case:
>>>
>>
>> Is it possible that it is in number of segments and not in bytes? If it really is bytes, what do such (32-48 bytes) nmethods look like?
>>
>> Thanks,
>> Vladimir
>>
>>>
>>> nmethod size distribution (non-zombie java)
>>> -------------------------------------------------
>>> 0-16 bytes                                0[bytes]
>>> 16-32 bytes                                0
>>> 32-48 bytes                                45
>>> 48-64 bytes                                0
>>> 64-80 bytes                                41
>>> 80-96 bytes                                0
>>> 96-112 bytes                               6247
>>> 112-128 bytes                               0
>>> 128-144 bytes                               249
>>> 144-160 bytes                               0
>>> 160-176 bytes                               139
>>> 176-192 bytes                               0
>>> 192-208 bytes                               177
>>> 208-224 bytes                               0
>>> 224-240 bytes                               180
>>> 240-256 bytes                               0
>>> ...
>>>
>>>
>>> I do not see a problem with increasing the CodeCacheSegmentSize if tiered compilation
>>> is enabled.
>>>
>>> What do you think?
>>>
>>>
>>> Best,
>>> Albert
>>>
>>>
>>> On 02/04/2014 05:52 PM, Vladimir Kozlov wrote:
>>>> I think the suggestion is reasonable since we increase CodeCache *5
>>>> for Tiered.
>>>> Albert, is it possible to collect data on how much space is wasted in %
>>>> before and after this change: free space in which we can't allocate +
>>>> unused bytes at the end of nmethods/adapters? Can we squeeze an
>>>> adapter into 64 bytes?
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 2/4/14 7:41 AM, Albert wrote:
>>>>> Hi,
>>>>>
>>>>> could I get reviews for this patch (nightly failure)?
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~anoll/8029799/webrev.00/
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8029799
>>>>>
>>>>> problem:  The freelist of the code cache exceeds 10'000 items, which results in a VM warning.
>>>>>           The problem behind the warning is that the freelist is populated by a large number
>>>>>           of small free blocks. For example, in the failing test case (see header), the freelist
>>>>>           grows to more than 3500 items, where the largest item on the list is 9 segments
>>>>>           (one segment is 64 bytes). That experiment was done on my laptop. Such a large
>>>>>           freelist can indeed be a performance problem, since we use a linear search to
>>>>>           traverse the freelist.
>>>>> solution: One way to solve the problem is to increase the minimal allocation size in the
>>>>>           code cache. This can be done by two means: we can increase CodeCacheMinBlockLength
>>>>>           and/or CodeCacheSegmentSize. This patch follows the latter approach, since increasing
>>>>>           CodeCacheSegmentSize decreases the size that is required by the segment map. More
>>>>>           concretely, the patch doubles the CodeCacheSegmentSize from 64 bytes to 128 bytes
>>>>>           if tiered compilation is enabled.
>>>>>           The patch also contains an optimization in the freelist search (stop searching if
>>>>>           we found the appropriate size) and contains some code cleanups.
>>>>> testing:  With the proposed change, the size of the freelist is reduced to 200 items. There
>>>>>           is only a slight increase in the memory required by the code cache, by at most 3%
>>>>>           (all data measured for the failing test case on a Linux 64-bit system, 4 cores).
>>>>>           To summarize, increasing the minimum allocation size in the code cache results in
>>>>>           potentially more unused memory in the code cache due to unused bits at the end of
>>>>>           an nmethod. The advantage is that we potentially have less fragmentation.
>>>>>
>>>>> proposal: - I think we could remove CodeCacheMinBlockLength without loss of generality or
>>>>>             usability and instead adapt the parameter CodeCacheSegmentSize at VM startup.
>>>>>             Any opinions?
>>>>>
>>>>> Many thanks in advance,
>>>>> Albert
>>>>>
>>>
>


More information about the hotspot-compiler-dev mailing list