[9] RFR(M): 8029799: vm/mlvm/anonloader/stress/oome prints warning: CodeHeap: # of free blocks > 10000

Albert Noll albert.noll at oracle.com
Thu Feb 6 08:45:05 PST 2014


My previous mail contains an error. The size of a HeapBlock must be a multiple of CodeCacheSegmentSize and at least CodeCacheSegmentSize * CodeCacheMinBlockLength.
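
As a minimal sketch of that rule (an illustration only, not HotSpot code): the hypothetical helper `heap_block_size` rounds a requested size up to a whole number of segments and enforces the minimum block length. The defaults of 64 bytes and 4 segments are the x86 values quoted further down in this thread; the sketch assumes the request already includes any header the VM adds.

```python
def heap_block_size(requested_bytes, segment_size=64, min_block_length=4):
    """Return the HeapBlock size in bytes for a request (hypothetical helper)."""
    # Round up to a whole number of segments (ceiling division).
    segments = -(-requested_bytes // segment_size)
    # Never hand out fewer than min_block_length segments.
    segments = max(segments, min_block_length)
    return segments * segment_size


print(heap_block_size(100))                    # 256: below the minimum, so 4 segments
print(heap_block_size(300))                    # 320: 5 segments of 64 bytes
print(heap_block_size(100, segment_size=128))  # 512: 4 segments of 128 bytes
```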

Albert

Sent from my iPhone

> On 06.02.2014 at 17:32, Albert <albert.noll at oracle.com> wrote:
> 
> Hi,
> 
> I have done more experiments to see the impact of CodeCacheMinBlockLength and CodeCacheSegmentSize.
> Both factors have an impact on the length of the freelist as well as on the memory that is possibly wasted.
> 
> The table below contains detailed results. Here is a description of the numbers and how they are
> calculated:
> 
> * freelist length: number of HeapBlocks that are in the freelist when the program finishes.
> * freelist[kB]: total memory [kB] that is in the freelist when the program finishes.
> * unused bytes in cb: unused bytes in all CodeBlobs that are in the code cache when the program
>                         finishes. This number is calculated by subtracting the nmethod size from the
>                         size of the HeapBlock in which the nmethod is stored. Note that the HeapBlock size is
>                         a multiple of CodeCacheMinBlockLength * CodeCacheSegmentSize.
> * segmap[kB]: size of the segment map that is used to map addresses to HeapBlocks (i.e., find the
>                         beginning of an nmethod). Increasing CodeCacheSegmentSize decreases the segmap
>                         size. For example, a CodeCacheSegmentSize of 32 bytes requires 32kB of segmap
>                         memory per allocated MB in the code cache. A CodeCacheSegmentSize of 64 bytes
>                         requires 16kB of segmap memory per allocated MB in the code cache.
> * max_used: maximum allocated memory in the code cache.
> * wasted: freelist[kB] + unused bytes in cb + segmap[kB]
> * memory overhead: wasted / max_used
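
To make these definitions concrete, here is a small sketch that recomputes the derived columns for the first row of the "failing test case" results below (freelist 2299, unused 902, segmap 274, max_used 16436). It assumes one byte of segment map per segment, which is what the 32kB and 16kB per MB figures imply, and it computes the memory overhead as wasted divided by max_used, the ratio that reproduces the percentages in the results. The function names are made up for this illustration.

```python
def segmap_kb_per_mb(segment_size):
    """Segment map size in kB needed per MB of code cache."""
    # Assumption: the segment map stores one byte per segment.
    segments_per_mb = (1024 * 1024) // segment_size
    return segments_per_mb / 1024  # bytes -> kB


def wasted_kb(freelist_kb, unused_cb_kb, segmap_kb):
    """Sum of the three sources of wasted memory."""
    return freelist_kb + unused_cb_kb + segmap_kb


def memory_overhead(wasted, max_used_kb):
    """Wasted memory as a fraction of the maximum used memory."""
    return wasted / max_used_kb


print(segmap_kb_per_mb(32))                # 32.0
print(segmap_kb_per_mb(64))                # 16.0
w = wasted_kb(2299, 902, 274)
print(w)                                   # 3475
print(f"{memory_overhead(w, 16436):.2%}")  # 21.14%
```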
> 
> The executive summary of the results is that increasing CodeCacheSegmentSize has no negative
> impact on the memory overhead (also no positive). Increasing CodeCacheSegmentSize reduces
> the freelist length, which makes searching the freelist faster.
> 
> Note that the results were obtained with a modified freelist search algorithm. In the changed version,
> the compiler chooses the first block from the freelist that is large enough (first-fit). In the old version,
> the compiler looked for the smallest block in the freelist into which the code fits (best-fit).
> My experiments indicate that best-fit does not provide better results (less memory overhead) than
> first-fit.
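
The difference between the two strategies can be sketched as follows. This is an illustration only, not the CodeHeap::search_freelist() implementation: the freelist is modelled as a plain Python list of block lengths in segments, and both function names are made up.

```python
def first_fit(freelist, length):
    """Index of the first block with at least `length` segments, or None."""
    for i, block_length in enumerate(freelist):
        if block_length >= length:
            return i  # stop at the first block that is large enough
    return None


def best_fit(freelist, length):
    """Index of the smallest block with at least `length` segments, or None."""
    best = None
    for i, block_length in enumerate(freelist):
        if block_length >= length and (best is None or block_length < freelist[best]):
            best = i
            if block_length == length:
                break  # an exact fit cannot be improved
    return best


freelist = [9, 4, 6, 5]
print(first_fit(freelist, 5))  # 0 (the 9-segment block)
print(best_fit(freelist, 5))   # 3 (the 5-segment block)
```

Best-fit has to walk the whole list unless it finds an exact fit, whereas first-fit returns as soon as any block is large enough.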
> 
> To summarize, switching to a larger CodeCacheSegmentSize seems reasonable.
> 
> 
> Here are the detailed results:
> 
> failing test case
> 
> 4 Blocks, 64 bytes
> 
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>            3085          2299                 902         274     16436    3475           21.14%
>            3993          3366                 887         283     16959    4536           26.75%
>            3843          2204                 900         273     16377    3377           20.62%
>            3859          2260                 898         273     16382    3431           20.94%
>            3860          2250                 897         273     16385    3420           20.87%
> Average memory overhead: 22.07%
> 
> 4 Blocks, 128 bytes
> 
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             474          1020                2073         137     17451    3230           18.51%
>             504          1192                2064         136     17413    3392           19.48%
>             484          1188                2064         126     17414    3378           19.40%
>             438          1029                2061         136     17399    3226           18.54%
> Average memory overhead: 18.98%
> 
> Nashorn
> 
> 4 Blocks, 64 bytes
> 
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             709          1190                 662        1198     76118    3050            4.01%
>             688          4200                 635        1234     78448    6069            7.74%
>             707          2617                 648        1178     74343    4443            5.98%
>             685          1703                 660        1205     76903    3568            4.64%
>             760          1638                 675        1174     74563    3487            4.68%
> Average memory overhead: 5.41%
> 
> 4 Blocks, 128 bytes
> 
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             206           824                1253         607     77469    2684            3.46%
>             247          2019                1265         583     74017    3867            5.22%
>             239           958                1230         641     81588    2829            3.47%
>             226          1477                1246         595     76119    3318            4.36%
>             225          2390                1239         596     76051    4225            5.56%
> Average memory overhead: 4.41%
> 
> compiler.compiler
> 
> 4 Blocks, 64 bytes
> 
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             440           943                 263         298     18133    1504            8.29%
>             458           480                 272         295     18443    1047            5.68%
>             536          1278                 260         306     18776    1844            9.82%
>             426           684                 268         304     18789    1256            6.68%
>             503          1430                 258         310     18872    1998           10.59%
> Average memory overhead: 8.21%
> 
> 4 Blocks, 128 bytes
> 
> freelist length  freelist[kB]  unused bytes in cb  segmap[kB]  max_used  wasted  memory overhead
>             163           984                 510         157     19233    1651            8.58%
>             132           729                 492         151     18614    1372            7.37%
>             187          1212                 498         152     18630    1862            9.99%
>             198          1268                 496         155     18974    1919           10.11%
>             225          1268                 496         152     18679    1916           10.26%
> Average memory overhead: 9.26%
> 
>> On 02/05/2014 07:57 PM, Vladimir Kozlov wrote:
>>> On 2/5/14 8:28 AM, Albert wrote: 
>>> Hi Vladimir, 
>>> 
>>> thanks for looking at this. I've done the proposed measurements. The 
>>> code which I used to 
>>> get the data is included in the following webrev: 
>>> 
>>> http://cr.openjdk.java.net/~anoll/8029799/webrev.01/
>> 
>> Good. 
>> 
>>> 
>>> I think some people might be interested in getting that data, so we 
>>> might want to keep 
>>> that additional output. The exact output format can be changed later 
>>> (JDK-8005885).
>> 
>> I agree that it is useful information. 
>> 
>>> 
>>> Here are the results: 
>>> 
>>> - failing test case: 
>>>    - original: allocated in freelist: 2168kB, unused bytes in CodeBlob:  818kB, max_used: 21983kB 
>>>    - patch   : allocated in freelist: 1123kB, unused bytes in CodeBlob: 2188kB, max_used: 17572kB 
>>> - nashorn: 
>>>    - original: allocated in freelist: 2426kB, unused bytes in CodeBlob: 1769kB, max_used: 201886kB 
>>>    - patch   : allocated in freelist: 1150kB, unused bytes in CodeBlob: 3458kB, max_used: 202394kB 
>>> - SPECJVM2008: compiler.compiler: 
>>>    - original: allocated in freelist:  168kB, unused bytes in CodeBlob:  342kB, max_used: 19837kB 
>>>    - patch   : allocated in freelist:  873kB, unused bytes in CodeBlob:  671kB, max_used: 21184kB 
>>> 
>>> The minimum size that can be allocated from the code cache is 
>>> platform-dependent. 
>>> I.e., the minimum size depends on CodeCacheSegmentSize and 
>>> CodeCacheMinBlockLength. 
>>> On x86, for example, the min. allocatable size from the code cache is 
>>> 64*4=256bytes.
>> 
>> There is this comment in CodeHeap::search_freelist(): 
>>   // Don't leave anything on the freelist smaller than CodeCacheMinBlockLength. 
>> 
>> What happens if we scale down CodeCacheMinBlockLength when we increase CodeCacheSegmentSize, so that the minimum block size in bytes stays the same?
>> 
>> +     FLAG_SET_DEFAULT(CodeCacheSegmentSize, CodeCacheSegmentSize * 2); 
>> +     FLAG_SET_DEFAULT(CodeCacheMinBlockLength, CodeCacheMinBlockLength/2); 
>> 
>> Based on your table below, those small nmethods would then use only 256-byte blocks instead of 512-byte blocks (128*4). 
>> 
>> Note that for C1 in the Client VM, CodeCacheMinBlockLength is 1. I don't know why it is 4 for C2. Could you also try CodeCacheMinBlockLength = 1? 
>> 
>> All above is with CodeCacheSegmentSize 128 bytes. 
>> 
>>> The size of adapters ranges from 400b to 600b. 
>>> Here is the beginning of the nmethod size distribution of the failing 
>>> test case:
>> 
>> Is it possible that this is in number of segments and not in bytes? If it really is bytes, what do such (32-48 bytes) nmethods look like? 
>> 
>> Thanks, 
>> Vladimir 
>> 
>>> 
>>> nmethod size distribution (non-zombie java) 
>>> ------------------------------------------------- 
>>> 0-16 bytes                                0[bytes] 
>>> 16-32 bytes                                0 
>>> 32-48 bytes                                45 
>>> 48-64 bytes                                0 
>>> 64-80 bytes                                41 
>>> 80-96 bytes                                0 
>>> 96-112 bytes                               6247 
>>> 112-128 bytes                               0 
>>> 128-144 bytes                               249 
>>> 144-160 bytes                               0 
>>> 160-176 bytes                               139 
>>> 176-192 bytes                               0 
>>> 192-208 bytes                               177 
>>> 208-224 bytes                               0 
>>> 224-240 bytes                               180 
>>> 240-256 bytes                               0 
>>> ... 
>>> 
>>> 
>>> I do not see a problem for increasing the CodeCacheSegmentSize if tiered 
>>> compilation 
>>> is enabled. 
>>> 
>>> What do you think? 
>>> 
>>> 
>>> Best, 
>>> Albert 
>>> 
>>> 
>>>> On 02/04/2014 05:52 PM, Vladimir Kozlov wrote: 
>>>> I think the suggestion is reasonable since we increase CodeCache *5 
>>>> for Tiered. 
>>>> Albert, is it possible to collect data on how much space is wasted in % 
>>>> before and after this change: free space in which we can't allocate + 
>>>> unused bytes at the end of nmethods/adapters? Can we squeeze an 
>>>> adapter into 64 bytes? 
>>>> 
>>>> Thanks, 
>>>> Vladimir 
>>>> 
>>>>> On 2/4/14 7:41 AM, Albert wrote: 
>>>>> Hi, 
>>>>> 
>>>>> could I get reviews for this patch (nightly failure)? 
>>>>> 
>>>>> webrev: http://cr.openjdk.java.net/~anoll/8029799/webrev.00/ 
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8029799 
>>>>> 
>>>>> problem:  The freelist of the code cache exceeds 10'000 items, which results in a VM warning. 
>>>>>           The problem behind the warning is that the freelist is populated by a large number 
>>>>>           of small free blocks. For example, in the failing test case (see header), the freelist 
>>>>>           grows to more than 3500 items, where the largest item on the list is 9 segments 
>>>>>           (one segment is 64 bytes). That experiment was done on my laptop. Such a large 
>>>>>           freelist can indeed be a performance problem, since we use a linear search to 
>>>>>           traverse the freelist. 
>>>>> solution: One way to solve the problem is to increase the minimal allocation size in the 
>>>>>           code cache. This can be done by two means: we can increase CodeCacheMinBlockLength 
>>>>>           and/or CodeCacheSegmentSize. This patch follows the latter approach, since increasing 
>>>>>           CodeCacheSegmentSize decreases the size that is required by the segment map. More 
>>>>>           concretely, the patch doubles the CodeCacheSegmentSize from 64 bytes to 128 bytes 
>>>>>           if tiered compilation is enabled. 
>>>>>           The patch also contains an optimization in the freelist search (stop searching if 
>>>>>           we found the appropriate size) and contains some code cleanups. 
>>>>> testing:  With the proposed change, the size of the freelist is reduced to 200 items. There 
>>>>>           is only a slight increase in memory required by the code cache, by at most 3% (all 
>>>>>           data measured for the failing test case on a Linux 64-bit system, 4 cores). 
>>>>>           To summarize, increasing the minimum allocation size in the code cache results in 
>>>>>           potentially more unused memory in the code cache due to unused bytes at the end of 
>>>>>           an nmethod. The advantage is that we potentially have less fragmentation. 
>>>>> 
>>>>> proposal: I think we could remove CodeCacheMinBlockLength without loss of generality or 
>>>>>           usability and instead adapt the parameter CodeCacheSegmentSize at VM startup. 
>>>>>           Any opinions? 
>>>>> 
>>>>> Many thanks in advance, 
>>>>> Albert
> 