[9] RFR(M): 8029799: vm/mlvm/anonloader/stress/oome prints warning: CodeHeap: # of free blocks > 10000
Albert
albert.noll at oracle.com
Thu Feb 6 08:32:07 PST 2014
Hi,
I have done more experiments to see the impact of CodeCacheMinBlockLength and
CodeCacheSegmentSize. Both factors have an impact on the length of the freelist
as well as on the memory that is possibly wasted.

The tables below contain the detailed results. Here is a description of the
numbers and how they are calculated (a worked example follows after the list):
* freelist length: number of HeapBlocks that are on the freelist when the
  program finishes.
* freelist[kB]: total memory [kB] that is on the freelist when the program
  finishes.
* unused in cb[kB]: unused memory [kB] in all CodeBlobs that are in the code
  cache when the program finishes. This number is calculated by subtracting
  the nmethod size from the size of the HeapBlock in which the nmethod is
  stored. Note that the HeapBlock size is a multiple of
  CodeCacheMinBlockLength * CodeCacheSegmentSize.
* segmap[kB]: size of the segment map that is used to map addresses to
  HeapBlocks (i.e., to find the beginning of an nmethod). Increasing
  CodeCacheSegmentSize decreases the segmap size. For example, a
  CodeCacheSegmentSize of 32 bytes requires 32kB of segmap memory per
  allocated MB in the code cache; a CodeCacheSegmentSize of 64 bytes requires
  16kB of segmap memory per allocated MB in the code cache.
* max_used[kB]: maximum allocated memory in the code cache.
* wasted[kB]: freelist[kB] + unused in cb[kB] + segmap[kB]
* overhead: wasted[kB] / max_used[kB]
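To make the bookkeeping concrete, here is the calculation for the first row of
the first table below. The last line is an assumption on my part: the per-MB
segmap figures above imply one segment map byte per code cache segment.

  wasted[kB]  = 2299 kB + 902 kB + 274 kB = 3475 kB
  overhead    = 3475 kB / 16436 kB        = 21.14%
  segmap size ~ allocated size / CodeCacheSegmentSize
                (e.g. 1 MB / 64 bytes = 16384 segments * 1 byte = 16 kB per MB)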
The executive summary of the results is that increasing CodeCacheSegmentSize
has no negative impact on the memory overhead (and also no positive one).
Increasing CodeCacheSegmentSize does reduce the freelist length, which makes
searching the freelist faster.
Note that the results were obtained with a modified freelist search algorithm.
In the changed version, the compiler takes the first block from the freelist
that is large enough (first-fit). In the old version, the compiler looked for
the smallest block in the freelist into which the code fits (best-fit). My
experiments indicate that best-fit does not provide better results (i.e., less
memory overhead) than first-fit; a sketch of the first-fit search follows
below.
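For illustration, here is a minimal first-fit sketch over a singly linked
freelist. The FreeBlock layout and the function name are simplified stand-ins,
not the actual CodeHeap::search_freelist code; the sketch only shows the
early-exit idea:

  #include <stddef.h>

  // Simplified stand-in for a code heap free block; sizes are in segments.
  struct FreeBlock {
    size_t     length;  // block size in segments
    FreeBlock* link;    // next free block, NULL at the end of the list
  };

  // First-fit: take the first block that is large enough and unlink it.
  // Best-fit would scan the whole list for the smallest block that fits.
  static FreeBlock* take_first_fit(FreeBlock** head, size_t segments_needed) {
    FreeBlock* prev = NULL;
    for (FreeBlock* cur = *head; cur != NULL; prev = cur, cur = cur->link) {
      if (cur->length >= segments_needed) {
        if (prev == NULL) {
          *head = cur->link;       // the found block was the list head
        } else {
          prev->link = cur->link;  // unlink from the middle of the list
        }
        return cur;                // stop searching as soon as a block fits
      }
    }
    return NULL;                   // nothing fits; the caller allocates fresh space
  }

With the early return, the search cost is bounded by the position of the first
fitting block rather than by the full list length, which matters once the
freelist grows to thousands of entries.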
To summarize, switching to a larger CodeCacheSegmentSize seems reasonable.
Here are the detailed results:
failing test case

4 Blocks, 64 bytes
freelist length  freelist[kB]  unused in cb[kB]  segmap[kB]  max_used[kB]  wasted[kB]  overhead
           3085          2299               902         274         16436        3475    21.14%
           3993          3366               887         283         16959        4536    26.75%
           3843          2204               900         273         16377        3377    20.62%
           3859          2260               898         273         16382        3431    20.94%
           3860          2250               897         273         16385        3420    20.87%
average overhead: 22.07%

4 Blocks, 128 bytes
freelist length  freelist[kB]  unused in cb[kB]  segmap[kB]  max_used[kB]  wasted[kB]  overhead
            474          1020              2073         137         17451        3230    18.51%
            504          1192              2064         136         17413        3392    19.48%
            484          1188              2064         126         17414        3378    19.40%
            438          1029              2061         136         17399        3226    18.54%
average overhead: 18.98%

Nashorn

4 Blocks, 64 bytes
freelist length  freelist[kB]  unused in cb[kB]  segmap[kB]  max_used[kB]  wasted[kB]  overhead
            709          1190               662        1198         76118        3050     4.01%
            688          4200               635        1234         78448        6069     7.74%
            707          2617               648        1178         74343        4443     5.98%
            685          1703               660        1205         76903        3568     4.64%
            760          1638               675        1174         74563        3487     4.68%
average overhead: 5.41%

4 Blocks, 128 bytes
freelist length  freelist[kB]  unused in cb[kB]  segmap[kB]  max_used[kB]  wasted[kB]  overhead
            206           824              1253         607         77469        2684     3.46%
            247          2019              1265         583         74017        3867     5.22%
            239           958              1230         641         81588        2829     3.47%
            226          1477              1246         595         76119        3318     4.36%
            225          2390              1239         596         76051        4225     5.56%
average overhead: 4.41%

compiler.compiler

4 Blocks, 64 bytes
freelist length  freelist[kB]  unused in cb[kB]  segmap[kB]  max_used[kB]  wasted[kB]  overhead
            440           943               263         298         18133        1504     8.29%
            458           480               272         295         18443        1047     5.68%
            536          1278               260         306         18776        1844     9.82%
            426           684               268         304         18789        1256     6.68%
            503          1430               258         310         18872        1998    10.59%
average overhead: 8.21%

4 Blocks, 128 bytes
freelist length  freelist[kB]  unused in cb[kB]  segmap[kB]  max_used[kB]  wasted[kB]  overhead
            163           984               510         157         19233        1651     8.58%
            132           729               492         151         18614        1372     7.37%
            187          1212               498         152         18630        1862     9.99%
            198          1268               496         155         18974        1919    10.11%
            225          1268               496         152         18679        1916    10.26%
average overhead: 9.26%
On 02/05/2014 07:57 PM, Vladimir Kozlov wrote:
> On 2/5/14 8:28 AM, Albert wrote:
>> Hi Vladimir,
>>
>> thanks for looking at this. I've done the proposed measurements. The
>> code which I used to
>> get the data is included in the following webrev:
>>
>> http://cr.openjdk.java.net/~anoll/8029799/webrev.01/
>
> Good.
>
>>
>> I think some people might be interested in getting that data, so we
>> might want to keep
>> that additional output. The exact output format can be changed later
>> (JDK-8005885).
>
> I agree that it is useful information.
>
>>
>> Here are the results:
>>
>> - failing test case:
>> - original: allocated in freelist: 2168kB, unused bytes in CodeBlob:
>> 818kB, max_used: 21983kB
>> - patch : allocated in freelist: 1123kB, unused bytes in CodeBlob:
>> 2188kB, max_used: 17572kB
>> - nashorn:
>> - original : allocated in freelist: 2426kB, unused bytes in CodeBlob:
>> 1769kB, max_used: 201886kB
>> - patch : allocated in freelist: 1150kB, unused bytes in CodeBlob:
>> 3458kB, max_used: 202394kB
>> - SPECJVM2008: compiler.compiler:
>> - original : allocated in freelist: 168kB, unused bytes in
>> CodeBlob: 342kB, max_used: 19837kB
>> - patch : allocated in freelist: 873kB, unused bytes in
>> CodeBlob: 671kB, max_used: 21184kB
>>
>> The minimum size that can be allocated from the code cache is
>> platform-dependent.
>> I.e., the minimum size depends on CodeCacheSegmentSize and
>> CodeCacheMinBlockLength.
>> On x86, for example, the min. allocatable size from the code cache is
>> 64*4 = 256 bytes.
>
> There is this comment in CodeHeap::search_freelist():
> // Don't leave anything on the freelist smaller than CodeCacheMinBlockLength.
>
> What happens if we scale down CodeCacheMinBlockLength when we increase
> CodeCacheSegmentSize, to keep the same byte size for the minimum block?:
>
> + FLAG_SET_DEFAULT(CodeCacheSegmentSize, CodeCacheSegmentSize * 2);
> + FLAG_SET_DEFAULT(CodeCacheMinBlockLength, CodeCacheMinBlockLength/2);
>
> Based on your table below those small nmethods will use only 256-byte
> blocks instead of 512-byte blocks (128*4).
>
> Note that for C1 in the Client VM, CodeCacheMinBlockLength is 1. I don't
> know why it is 4 for C2. Could you also try CodeCacheMinBlockLength = 1?
>
> All above is with CodeCacheSegmentSize 128 bytes.
>
>> The size of adapters ranges from 400b to 600b.
>> Here is the beginning of the nmethod size distribution of the failing
>> test case:
>>
>
> Is it possible that it is in number of segments and not in bytes? If it
> really is bytes, what do such (32-48 byte) nmethods look like?
>
> Thanks,
> Vladimir
>
>>
>> nmethod size distribution (non-zombie java)
>> -------------------------------------------------
>> 0-16 bytes 0[bytes]
>> 16-32 bytes 0
>> 32-48 bytes 45
>> 48-64 bytes 0
>> 64-80 bytes 41
>> 80-96 bytes 0
>> 96-112 bytes 6247
>> 112-128 bytes 0
>> 128-144 bytes 249
>> 144-160 bytes 0
>> 160-176 bytes 139
>> 176-192 bytes 0
>> 192-208 bytes 177
>> 208-224 bytes 0
>> 224-240 bytes 180
>> 240-256 bytes 0
>> ...
>>
>>
>> I do not see a problem with increasing the CodeCacheSegmentSize if tiered
>> compilation is enabled.
>>
>> What do you think?
>>
>>
>> Best,
>> Albert
>>
>>
>> On 02/04/2014 05:52 PM, Vladimir Kozlov wrote:
>>> I think the suggestion is reasonable since we increase the CodeCache *5
>>> for Tiered.
>>> Albert, is it possible to collect data on how much space is wasted in %
>>> before and after this change: free space in which we can't allocate +
>>> unused bytes at the end of nmethods/adapters? Can we squeeze an
>>> adapter into 64 bytes?
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 2/4/14 7:41 AM, Albert wrote:
>>>> Hi,
>>>>
>>>> could I get reviews for this patch (nightly failure)?
>>>>
>>>> webrev: http://cr.openjdk.java.net/~anoll/8029799/webrev.00/
>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8029799
>>>>
>>>> problem: The freelist of the code cache exceeds 10'000 items, which
>>>> results in a VM warning. The problem behind the warning is that the
>>>> freelist is populated by a large number of small free blocks. For example,
>>>> in the failing test case (see header), the freelist grows to more than
>>>> 3500 items, where the largest item on the list is 9 segments (one segment
>>>> is 64 bytes). That experiment was done on my laptop. Such a large freelist
>>>> can indeed be a performance problem, since we use a linear search to
>>>> traverse the freelist.
>>>>
>>>> solution: One way to solve the problem is to increase the minimal
>>>> allocation size in the code cache. This can be done by two means: we can
>>>> increase CodeCacheMinBlockLength and/or CodeCacheSegmentSize. This patch
>>>> follows the latter approach, since increasing CodeCacheSegmentSize
>>>> decreases the size that is required by the segment map. More concretely,
>>>> the patch doubles the CodeCacheSegmentSize from 64 bytes to 128 bytes if
>>>> tiered compilation is enabled. The patch also contains an optimization in
>>>> the freelist search (stop searching once a block of the appropriate size
>>>> has been found) and some code cleanups.
>>>>
>>>> testing: With the proposed change, the size of the freelist is reduced to
>>>> 200 items. There is only a slight increase in the memory required by the
>>>> code cache, by at most 3% (all data measured for the failing test case on
>>>> a Linux 64-bit system, 4 cores). To summarize, increasing the minimum
>>>> allocation size in the code cache results in potentially more unused
>>>> memory in the code cache due to unused bytes at the end of an nmethod.
>>>> The advantage is that we potentially have less fragmentation.
>>>>
>>>> proposal: I think we could remove CodeCacheMinBlockLength without loss of
>>>> generality or usability and instead adapt the parameter
>>>> CodeCacheSegmentSize at VM startup. Any opinions?
>>>>
>>>> Many thanks in advance,
>>>> Albert
>>>>
>>