RFR: 8241361: ZGC: Implement memory related JFR events
Per Liden
per.liden at oracle.com
Fri Mar 27 09:24:45 UTC 2020
On 3/27/20 9:56 AM, Stefan Karlsson wrote:
> Talked to Per and here's the latest changes:
> https://cr.openjdk.java.net/~stefank/8241361/webrev.03.delta/
> https://cr.openjdk.java.net/~stefank/8241361/webrev.03/
>
> ZPageFlush:
> - Report the logical flushed value and not the transient value that
> occurs during overflushing.
> - Don't report "requested".
> - For allocations, this value is always the same as the flushed value
> - For uncommit, this value is usually inflated and not interesting
>
> ZRelocationSet:
> - Make the event span the duration of the selection
> - Remove the add_all function. FTR, I would have preferred this to stay
> instead of duplicating the summations.
>
> ZRelocationSetGroup:
> - Added an event to report the relocation info per group (small, medium,
> large [currently not used])
Thanks for doing those adjustments, looks good!
cheers,
Per
>
> Thanks,
> StefanK
>
> On 2020-03-26 13:30, Stefan Karlsson wrote:
>> On 2020-03-26 12:01, Per Liden wrote:
>>> Hi,
>>>
>>> On 3/23/20 10:30 AM, Stefan Karlsson wrote:
>>> [...]
>>>>>
>>>>> * I think cl->_flushed user here:
>>>>>
>>>>> 604 event.commit(cl->_requested, cl->_flushed, for_allocation);
>>>>>
>>>>> should instead just be:
>>>>>
>>>>> 604 event.commit(cl->_requested, flushed, for_allocation);
>>>>>
>>>>> Right?
>>>>
>>>> I intentionally used cl->_flushed since that describes how much we
>>>> flushed including overflushed parts of pages. Maybe we should report
>>>> both values? Maybe also rename the local variable flushed to destroyed?
>>>
>>> Hmm, not sure I see the point of reporting anything except what was
>>> actually flushed. When would the other numbers be of interest? Keep
>>> in mind that the overflushed part of this is immediately put back
>>> into the cache, and is never unmapped/destroyed or anything like
>>> that. Outside of flush_cache() no one will know (or care) if we
>>> overflushed or not, right?
>>
>> Overflushing causes malloc calls for pages and bitmaps. I thought that
>> could be of interest when looking at latencies. If you don't want it,
>> I'll remove it.
>>
>>>
>>>>
>>>>>
>>>>>
>>>>> src/hotspot/share/gc/z/zPageCache.hpp
>>>>> -------------------------------------
>>>>>
>>>>> Instead of:
>>>>>
>>>>> friend class ZPageAllocator;
>>>>>
>>>>> add a getter for requested()?
>>>>>
>>>>
>>>> I also want _flushed, depending on the resolution of the above. I
>>>> don't think its bad to friend our closures that are pure extensions
>>>> to the "owning" class. I don't have a very strong opinion here, but
>>>> gravitated towards a friend declaration to minimize the exposure of
>>>> the implementation details. If you still want me to add getters,
>>>> I'll do it.
>>>
>>> In this case, I'd prefer getters. Assuming my comment above is
>>> accepted, we only need one new getter.
>>
>> OK.
>>
>>>
>>>>
>>>>>
>>>>> src/hotspot/share/gc/z/zRelocationSetSelector.cpp
>>>>> -------------------------------------------------
>>>>>
>>>>> * Same here, instead of:
>>>>>
>>>>> #include "jfrfiles/jfrEventClasses.hpp"
>>>>>
>>>>> I think we should do:
>>>>>
>>>>> #include "jfr/jfrEvents.hpp"
>>>>
>>>> Yes
>>>>
>>>>>
>>>>>
>>>>> * You don't think we should use ZPageTypeType that you introduced,
>>>>> and send three different ZRelocationSet events, one for each page
>>>>> type? Shouldn't this event also be timed, and sent from within
>>>>> ZRelocationSetSelectorGroup::select()?
>>>>
>>>> JMC is not always great at handling normalized events. If we want
>>>> events per type I think we should add them in _addition_ to the
>>>> event I added.
>>>
>>> Ok, I'm sure you're right but still want to understand. When you say
>>> "normalized events", what are you thinking of in this context?
>>
>> What I'm meaning is that if you have the minimally sufficient
>> information like:
>> - Relocated bytes in small pages: x
>> - Relocated bytes in medium pages: y
>> - Relocated bytes in large pages: z
>>
>> You need to go through extra lengths to figure out:
>> - Relocated total bytes: x + y + x
>>
>> and unless you want to fiddle around with JMC too much, it's better to
>> also (or instead) reported the sum of the values.
>>
>> StefanK
>>
>>>
>>> cheers,
>>> Per
>>>
>>>>
>>>>>
>>>>>
>>>>> src/hotspot/share/gc/z/zTracer.cpp
>>>>> ----------------------------------
>>>>>
>>>>> 43 writer.write("small");
>>>>> 44 writer.write_key(ZPageTypeMedium);
>>>>> 45 writer.write("medium");
>>>>> 46 writer.write_key(ZPageTypeLarge);
>>>>> 47 writer.write("large");
>>>>>
>>>>> How about "Small", "Medium" and "Large"? I could only find one
>>>>> other place (in jfrStackTraceRepository.cpp) where names were
>>>>> given, and those start with a capital letter.
>>>>
>>>> OK
>>>>
>>>> Here's the updated webrevs with the easy fixes:
>>>> https://cr.openjdk.java.net/~stefank/8241361/webrev.02.delta/
>>>> https://cr.openjdk.java.net/~stefank/8241361/webrev.02
>>>>
>>>> Waiting for answers and comments to the rest.
>>>>
>>>> Thanks,
>>>> StefanK
>>>>
>>>>>
>>>>> cheers,
>>>>> Per
>>>>>
>>>>>
>>>>>>
>>>>>> Added events:
>>>>>>
>>>>>> ZAllocationStall - Record when we run out of heap memory and the
>>>>>> Java threads stall, waiting for the GC to free up memory.
>>>>>>
>>>>>> ZPageAllocation - Updated the existing event to also record the
>>>>>> duration of the event. Updated the event to only be reported if
>>>>>> the allocation takes longer than 1 ms.
>>>>>>
>>>>>> ZPageCacheFlush - Record when the page cache needs to be flushed.
>>>>>> This usually happens when we run out of a specific page size and
>>>>>> have to detach the physical and virtual memory to materialize a
>>>>>> new ZPage. We also flush pages when we uncommit memory.
>>>>>>
>>>>>> ZRelocationSet - Record information about the selected relocation
>>>>>> set.
>>>>>>
>>>>>> ZUncommit - Record when we uncommit and hand back memory to the OS.
>>>>>>
>>>>>> The patch also contains some small cosmetic changes to existing
>>>>>> events, whitespace fixes.
>>>>
>>
More information about the hotspot-gc-dev
mailing list