[9] RFR(M): 8029799: vm/mlvm/anonloader/stress/oome prints warning: CodeHeap: # of free blocks > 10000
Albert
albert.noll at oracle.com
Fri Feb 7 08:06:54 PST 2014
Vladimir, Chris, thanks for looking at this.
The measurement results are attached to the bug.
https://bugs.openjdk.java.net/browse/JDK-8029799
I've tried various settings, and the current configuration seems to
be good for non-tiered compilation. With the current settings, the minimum
size that can be allocated from the code cache is 64 bytes for C1 and
256 bytes for C2. This is fine, since C1-compiled code is typically smaller
than C2-compiled code. The tradeoff we are facing here is that a smaller
minimum size can lead to more fragmentation (a lot of small chunks end up
on the freelist), but it wastes less memory at the end of each method.
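
To make the tradeoff concrete, here is a minimal sketch of the rounding
involved. It is a simplified model, not the HotSpot code: it assumes a
request is rounded up to whole segments and never gets fewer than
CodeCacheMinBlockLength segments, and it ignores the heap block header.
The settings 64 x 1 (C1) and 64 x 4 (C2) are taken from the quoted
discussion below; the 100-byte request is a made-up example.

```python
import math


def allocated_bytes(request_bytes, segment_size, min_block_length):
    """Bytes reserved for a request in this simplified model.

    The request is rounded up to whole segments and receives at least
    min_block_length segments.
    """
    segments = max(math.ceil(request_bytes / segment_size), min_block_length)
    return segments * segment_size


def tail_waste(request_bytes, segment_size, min_block_length):
    """Unused bytes at the end of the reserved block."""
    reserved = allocated_bytes(request_bytes, segment_size, min_block_length)
    return reserved - request_bytes


if __name__ == "__main__":
    request = 100  # hypothetical method size in bytes
    for label, seg, min_len in [("C1, 64 x 1", 64, 1), ("C2, 64 x 4", 64, 4)]:
        print(label,
              "reserved:", allocated_bytes(request, seg, min_len),
              "wasted:", tail_waste(request, seg, min_len))
```

In this model the smaller minimum wastes fewer bytes per method, but it
also leaves smaller blocks behind when the method is freed.
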
When tiered compilation is enabled, C1 and C2 methods are stored
in the same code heap, so we have to settle on a single
minimum-allocatable size.
The current implementation chooses the C2 setting. Since we compile C1
methods when the application starts, we allocate small memory units that
might be too small to fit a C2 version of the method. This is why the
freelist can grow to > 10,000 items.
The current patch increases the minimum-allocatable size to 512 bytes,
but ONLY when tiered compilation is enabled. This leads to less memory
overhead, since more of the memory units that are initially used for C1
methods can later be reused by C2 methods.
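
The reuse argument can be illustrated with a toy model. This is only a
sketch under stated assumptions: blocks are sized as
max(ceil(size / segment), min block length) * segment, freed blocks are
never merged with their neighbours (the real freelist may coalesce
adjacent blocks), and the method sizes are invented. The 128 x 4 = 512
setting is the one mentioned in the quoted discussion below.

```python
import math


def allocated_bytes(request_bytes, segment_size, min_block_length):
    """Block size in this simplified model (no block header)."""
    segments = max(math.ceil(request_bytes / segment_size), min_block_length)
    return segments * segment_size


def count_reusable(freed_method_sizes, request_bytes, segment_size, min_block_length):
    """Count freed blocks that are large enough to hold a later request."""
    needed = allocated_bytes(request_bytes, segment_size, min_block_length)
    blocks = [allocated_bytes(size, segment_size, min_block_length)
              for size in freed_method_sizes]
    return sum(1 for block in blocks if block >= needed)


if __name__ == "__main__":
    c1_sizes = [100, 200, 300, 450, 600]  # hypothetical freed C1 methods
    c2_request = 500                      # hypothetical C2 method
    print("64 x 4 :", count_reusable(c1_sizes, c2_request, 64, 4))
    print("128 x 4:", count_reusable(c1_sizes, c2_request, 128, 4))
```

With the larger minimum, every block left behind by a small C1 method is
at least 512 bytes, so more of them can satisfy a later, larger request.
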
I think that this problem can be solved by the segmented code cache that
we plan to integrate into 9. If we have multiple code heaps, individual
code heaps can use different values for CodeCacheSegmentSize and
CodeCacheMinBlockLength.
I think we should look at this, since a memory overhead of > 20%, as in the
failing test case, seems unreasonably large. Shall I file an RFE that
suggests looking into this once the segmented code cache patch is
integrated?
For now, I think, there is not much more we can do.
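
For reference, here is a small sketch that turns the webrev.01 figures
quoted further down in this thread into percentages. These are the older
numbers, not the measurements attached to the bug. Treating
"allocated in freelist" plus "unused bytes in CodeBlob", relative to
max_used, as the wasted share is an assumption about how to combine the
three columns.

```python
def wasted_percent(freelist_kb, unused_kb, max_used_kb):
    """Freelist plus unused tail bytes as a percentage of max_used."""
    return 100.0 * (freelist_kb + unused_kb) / max_used_kb


# (freelist kB, unused kB, max_used kB), copied from the quoted results.
MEASUREMENTS = {
    ("failing test case", "original"): (2168, 818, 21983),
    ("failing test case", "patch"): (1123, 2188, 17572),
    ("nashorn", "original"): (2426, 1769, 201886),
    ("nashorn", "patch"): (1150, 3458, 202394),
    ("SPECjvm2008 compiler.compiler", "original"): (168, 342, 19837),
    ("SPECjvm2008 compiler.compiler", "patch"): (873, 671, 21184),
}

if __name__ == "__main__":
    for (benchmark, variant), values in MEASUREMENTS.items():
        print(f"{benchmark:30s} {variant:9s} {wasted_percent(*values):5.1f}%")
```
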
Concerning the small method sizes:
The size that is provided by PrintCodeCache2 is the instruction size
(nm->insts_size) and not the size of the nmethod. I changed that in this
patch, since the output suggests something different
("nmethod size distribution").
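
A short sketch of why this matters for reading the histogram: the same
methods land in very different 16-byte buckets depending on which size
is counted. The bucketing below assumes half-open ranges (a bucket
"32-48" holds sizes 32 to 47), and the (instruction size, total size)
pairs are invented for illustration, not taken from the test.

```python
from collections import Counter


def size_distribution(sizes, bucket_width=16):
    """Map each bucket's lower bound to the number of sizes that fall in it."""
    counts = Counter((size // bucket_width) * bucket_width for size in sizes)
    return dict(sorted(counts.items()))


def format_distribution(distribution, bucket_width=16):
    """Render the buckets as 'low-high bytes  count' lines."""
    return [f"{low}-{low + bucket_width} bytes {count}"
            for low, count in distribution.items()]


if __name__ == "__main__":
    # Hypothetical (instruction size, total nmethod size) pairs in bytes.
    methods = [(40, 296), (100, 360), (104, 376)]
    print("by instruction size:")
    print("\n".join(format_distribution(size_distribution(m[0] for m in methods))))
    print("by nmethod size:")
    print("\n".join(format_distribution(size_distribution(m[1] for m in methods))))
```
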
Here is the link to the webrev:
http://cr.openjdk.java.net/~anoll/8029799/webrev.02/
Best,
Albert
On 02/06/2014 10:29 PM, Christian Thalinger wrote:
> On Feb 5, 2014, at 10:57 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>
>> On 2/5/14 8:28 AM, Albert wrote:
>>> Hi Vladimir,
>>>
>>> thanks for looking at this. I've done the proposed measurements. The
>>> code that I used to get the data is included in the following webrev:
>>>
>>> http://cr.openjdk.java.net/~anoll/8029799/webrev.01/
>> Good.
>>
>>> I think some people might be interested in getting that data, so we
>>> might want to keep that additional output. The exact output format
>>> can be changed later (JDK-8005885).
>> I agree that it is useful information.
>>
>>> Here are the results:
>>>
>>> - failing test case:
>>>   - original: allocated in freelist: 2168kB, unused bytes in CodeBlob: 818kB, max_used: 21983kB
>>>   - patch   : allocated in freelist: 1123kB, unused bytes in CodeBlob: 2188kB, max_used: 17572kB
>>> - nashorn:
>>>   - original: allocated in freelist: 2426kB, unused bytes in CodeBlob: 1769kB, max_used: 201886kB
>>>   - patch   : allocated in freelist: 1150kB, unused bytes in CodeBlob: 3458kB, max_used: 202394kB
>>> - SPECjvm2008: compiler.compiler:
>>>   - original: allocated in freelist: 168kB, unused bytes in CodeBlob: 342kB, max_used: 19837kB
>>>   - patch   : allocated in freelist: 873kB, unused bytes in CodeBlob: 671kB, max_used: 21184kB
>>>
>>> The minimum size that can be allocated from the code cache is
>>> platform-dependent, i.e., it depends on CodeCacheSegmentSize and
>>> CodeCacheMinBlockLength. On x86, for example, the minimum allocatable
>>> size from the code cache is 64*4=256 bytes.
>> There is this comment in CodeHeap::search_freelist():
>> // Don't leave anything on the freelist smaller than CodeCacheMinBlockLength.
>>
>> What happens if we scale down CodeCacheMinBlockLength when we increase CodeCacheSegmentSize, so that the minimum block stays the same size in bytes?
>>
>> + FLAG_SET_DEFAULT(CodeCacheSegmentSize, CodeCacheSegmentSize * 2);
>> + FLAG_SET_DEFAULT(CodeCacheMinBlockLength, CodeCacheMinBlockLength/2);
>>
>> Based on your table below, those small nmethods will use only 256-byte blocks instead of 512-byte ones (128*4).
>>
>> Note that for C1 in the Client VM, CodeCacheMinBlockLength is 1. I don't know why it is 4 for C2. Could you also try CodeCacheMinBlockLength = 1?
>>
>> All of the above is with a CodeCacheSegmentSize of 128 bytes.
>>
>>> The size of adapters ranges from 400b to 600b.
>>> Here is the beginning of the nmethod size distribution of the failing
>>> test case:
>>>
>> Is it possible that it is in number of segments and not in bytes? If it really is in bytes, what do such (32-48 byte) nmethods look like?
> This is just a guess but these methods could be method handle trampolines. They are very small.
>
>> Thanks,
>> Vladimir
>>
>>> nmethod size distribution (non-zombie java)
>>> -------------------------------------------------
>>> 0-16 bytes 0[bytes]
>>> 16-32 bytes 0
>>> 32-48 bytes 45
>>> 48-64 bytes 0
>>> 64-80 bytes 41
>>> 80-96 bytes 0
>>> 96-112 bytes 6247
>>> 112-128 bytes 0
>>> 128-144 bytes 249
>>> 144-160 bytes 0
>>> 160-176 bytes 139
>>> 176-192 bytes 0
>>> 192-208 bytes 177
>>> 208-224 bytes 0
>>> 224-240 bytes 180
>>> 240-256 bytes 0
>>> ...
>>>
>>>
>>> I do not see a problem with increasing the CodeCacheSegmentSize if
>>> tiered compilation is enabled.
>>>
>>> What do you think?
>>>
>>>
>>> Best,
>>> Albert
>>>
>>>
>>> On 02/04/2014 05:52 PM, Vladimir Kozlov wrote:
>>>> I think the suggestion is reasonable since we increase CodeCache *5
>>>> for Tiered.
>>>> Albert, is it possible to collect data on how much space is wasted
>>>> (in %) before and after this change: free space in which we can't
>>>> allocate + unused bytes at the end of nmethods/adapters? Can we
>>>> squeeze an adapter into 64 bytes?
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 2/4/14 7:41 AM, Albert wrote:
>>>>> Hi,
>>>>>
>>>>> could I get reviews for this patch (nightly failure)?
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~anoll/8029799/webrev.00/
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8029799
>>>>>
>>>>> problem:  The freelist of the code cache exceeds 10'000 items, which
>>>>>           results in a VM warning. The problem behind the warning is
>>>>>           that the freelist is populated by a large number of small
>>>>>           free blocks. For example, in the failing test case (see
>>>>>           header), the freelist grows to more than 3500 items, where
>>>>>           the largest item on the list is 9 segments (one segment is
>>>>>           64 bytes). That experiment was done on my laptop. Such a
>>>>>           large freelist can indeed be a performance problem, since
>>>>>           we use a linear search to traverse the freelist.
>>>>> solution: One way to solve the problem is to increase the minimal
>>>>>           allocation size in the code cache. This can be done in two
>>>>>           ways: we can increase CodeCacheMinBlockLength and/or
>>>>>           CodeCacheSegmentSize. This patch follows the latter
>>>>>           approach, since increasing CodeCacheSegmentSize decreases
>>>>>           the size that is required by the segment map. More
>>>>>           concretely, the patch doubles the CodeCacheSegmentSize
>>>>>           from 64 bytes to 128 bytes if tiered compilation is
>>>>>           enabled. The patch also contains an optimization in the
>>>>>           freelist search (stop searching once we have found the
>>>>>           appropriate size) and contains some code cleanups.
>>>>> testing:  With the proposed change, the size of the freelist is
>>>>>           reduced to 200 items. There is only a slight increase, of
>>>>>           at most 3%, in the memory required by the code cache (all
>>>>>           data measured for the failing test case on a Linux 64-bit
>>>>>           system, 4 cores).
>>>>>           To summarize, increasing the minimum allocation size in
>>>>>           the code cache results in potentially more unused memory
>>>>>           in the code cache due to unused bits at the end of an
>>>>>           nmethod. The advantage is that we potentially have less
>>>>>           fragmentation.
>>>>>
>>>>> proposal: - I think we could remove CodeCacheMinBlockLength without
>>>>>             loss of generality or usability and instead adapt the
>>>>>             parameter CodeCacheSegmentSize at VM startup.
>>>>>             Any opinions?
>>>>>
>>>>> Many thanks in advance,
>>>>> Albert