RFR(xs): 8185033: On Metaspace OOM, ChunkManager composition should be logged.
Thomas Stüfe
thomas.stuefe at gmail.com
Sat Jul 22 08:37:06 UTC 2017
Hi Coleen, Mikael,
new webrev:
http://cr.openjdk.java.net/~stuefe/webrevs/8185033-On-Metaspace-OOM-chunkManager-composition-should-be-logged/webrev.01/webrev/
I only applied Mikaels patch atop of mine, rebased to top (after the UL
refactoring change Eric pushed yesterday) and fixed merge errors.
Thanks, Thomas
On Fri, Jul 21, 2017 at 8:54 PM, <coleen.phillimore at oracle.com> wrote:
>
>
> On 7/21/17 2:16 PM, Thomas Stüfe wrote:
>
> Hi Coleen,
>
> On Fri, Jul 21, 2017 at 5:34 PM, <coleen.phillimore at oracle.com> wrote:
>
>>
>> Hi Thomas,
>> Thank you for working on this.
>>
>>
> thanks for reviewing!
>
>
>>
>> On 7/21/17 2:54 AM, Thomas Stüfe wrote:
>>
>>> Dear all,
>>>
>>> May I please get reviews and a sponsor for the following small
>>> enhancement.
>>>
>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8185033
>>> Webrev:
>>> http://cr.openjdk.java.net/~stuefe/webrevs/8185033-On-Metasp
>>> ace-OOM-chunkManager-composition-should-be-logged/webrev.00/webrev/
>>>
>>> This patch adds a bit of logging if a Metaspace OOM happens. We will now
>>> get a printout of the composition of the global chunk managers (non-class
>>> and, if necessary, class).
>>>
>>> Before it looked like this:
>>>
>>> [272.727s][info][gc,metaspace,freelist] All Metaspace:
>>> [272.728s][info][gc,metaspace,freelist] data space: Chunk accounting:
>>> used in chunks 100752K + unused in chunks 650K + capacity in free
>>> chunks
>>> 8293K = 109696K capacity in allocated chunks 101403K
>>> [272.728s][info][gc,metaspace,freelist] class space: Chunk accounting:
>>> used in chunks 4047K + unused in chunks 97K + capacity in free chunks
>>> 6085K = 10230K capacity in allocated chunks 4145K
>>> [272.729s][info][gc,metaspace,freelist] Total fragmentation waste
>>> (words)
>>> doesn't count free space
>>> [272.729s][info][gc,metaspace,freelist] data: 183 specialized(s) 960,
>>> 169
>>> small(s) 41, 1507 medium(s) 4480, large count 1
>>> [272.729s][info][gc,metaspace,freelist] class: 183 specialized(s) 0, 13
>>> small(s) 12, 111 medium(s) 8, large count 1
>>>
>>> Now:
>>> [250.548s][info][gc,metaspace,freelist] All Metaspace:
>>> [250.549s][info][gc,metaspace,freelist] data space: Chunk accounting:
>>> used in chunks 100748K + unused in chunks 638K + capacity in free
>>> chunks
>>> 10357K = 111744K capacity in allocated chunks 101387K
>>> [250.550s][info][gc,metaspace,freelist] class space: Chunk accounting:
>>> used in chunks 4047K + unused in chunks 97K + capacity in free chunks
>>> 6085K = 10230K capacity in allocated chunks 4145K
>>> [250.550s][info][gc,metaspace,freelist] Total fragmentation waste
>>> (words)
>>> doesn't count free space
>>> [250.550s][info][gc,metaspace,freelist] data: 183 specialized(s) 960,
>>> 165
>>> small(s) 27, 1507 medium(s) 4504, large count 1
>>> [250.550s][info][gc,metaspace,freelist] class: 183 specialized(s) 0, 13
>>> small(s) 12, 111 medium(s) 14, large count 1
>>> [250.550s][info][gc,metaspace,freelist] Chunkmanager (non-class):
>>> [250.550s][info][gc,metaspace,freelist] 89 specialized (128 bytes)
>>> chunks, total 91136 bytes
>>> [250.550s][info][gc,metaspace,freelist] 2567 small (512 bytes) chunks,
>>> total 10514432 bytes
>>> [250.550s][info][gc,metaspace,freelist] 0 medium (8192 bytes) chunks,
>>> total 0 bytes
>>> [250.550s][info][gc,metaspace,freelist] 0 humongous chunks, total 0
>>> bytes
>>> [250.550s][info][gc,metaspace,freelist] total size: 10605568.
>>> [250.550s][info][gc,metaspace,freelist] Chunkmanager (class):
>>> [250.550s][info][gc,metaspace,freelist] 87 specialized (128 bytes)
>>> chunks, total 89088 bytes
>>> [250.550s][info][gc,metaspace,freelist] 2999 small (256 bytes) chunks,
>>> total 6141952 bytes
>>> [250.550s][info][gc,metaspace,freelist] 0 medium (4096 bytes) chunks,
>>> total 0 bytes
>>> [250.550s][info][gc,metaspace,freelist] 0 humongous chunks, total 0
>>> bytes
>>> [250.550s][info][gc,metaspace,freelist] total size: 6231040.
>>>
>>
>> Yes, this looks like it'll help. Your numbers are less mystifying than
>> the existing ones.
>>
>>>
>>> This helps in understanding the underlying cause of the OOM, if that
>>> cause
>>> is related to chunks lingering around in the freelist. See also [1], a
>>> Jep
>>> proposing to improve the metaspace chunk allocation to increase the
>>> chance
>>> of chunk reuse. This patch here is also a preparation for the prototype
>>> work of [1].
>>>
>>> Note that admittedly, metaspace.cpp will again gain a bit of complexity
>>> by
>>> this patch. When examining the coding, I was tempted to refactor but
>>> refrained from doing so to keep the patch focused. But I think
>>> metaspace.cpp (especially the coding associated with statistics) could be
>>> cleaned up a bit. I opened a follow up item [2] to collect
>>> cleanup-issues.
>>> I am willing to do that myself, but would prefer it to be done after
>>> functional patches - which still have a slight chance of being backported
>>> to jdk9 - are all being added.
>>>
>>
>> My first impression was that get_statistics() and print_statistics()
>> should be member functions of ChunkManagerStatistics because of all the
>> stat->lines. But ChunkManager has to own these also. I can't think of
>> anything better.
>>
>>
> I wanted to keep the structure a dumb data holder (I actually would have
> preferred a more C-ish name like cmstat_t or similar but I kept the name in
> line with existing name conventions). But I could make "ChunkManager::print_statistics"
> a member function of ChunkManagerStatistics, if you prefer it that way. I
> think it does not matter much, just a question of aesthetics.
>
>
> Yes, it seemed that you were trying to encapsulate the data and I don't
> see a better way. I'm glad you kept the name of the class with the
> existing naming conventions because you may find an improvement to make it
> aesthetic at some point. It's not worth changing now.
>
>
>
>> Yes, the metaspace code could definitely use refactoring with respect to
>> statistics collection. Please! Also I added to your RFE to rename
>> MetaspaceAux.
>>
>>
> Great! Lets use that RFE to collect more thoughts.
>
>
>> I agree with Mikael about adding the code to avoid casting. You could
>> change the type to 'int' with another patch (or this one, I don't mind
>> looking at it again).
>>
>>
> I will do as Mikael requested, his additions make sense. Changing the
> numbers to "int" can be part of that cleanup to come.
>
>
>> I could sponsor this when it's ready.
>>
>>
> Thanks! I'll post an updated webrev soon.
>
>
> I'll watch for it.
> Thanks,
> Coleen
>
>
>
>> Thanks!
>> Coleen
>>
>>
> Kind Regards Thomas
>
>>
>>
>>
>>> Patch was built and tested on Windows x64, Linux x64 and AIX. I also ran
>>> jtreg hotspot/runtime/Metaspace tests, no issues.
>>>
>>> Kind Regards, Thomas
>>>
>>> ----
>>>
>>> [1] https://bugs.openjdk.java.net/browse/JDK-8166690
>>> [2] https://bugs.openjdk.java.net/browse/JDK-8185034
>>>
>>
>>
>
>
More information about the hotspot-dev
mailing list