RFR: 8241361: ZGC: Implement memory related JFR events
Stefan Karlsson
stefan.karlsson at oracle.com
Fri Mar 27 08:56:45 UTC 2020
Talked to Per and here's the latest changes:
https://cr.openjdk.java.net/~stefank/8241361/webrev.03.delta/
https://cr.openjdk.java.net/~stefank/8241361/webrev.03/
ZPageFlush:
- Report the logical flushed value and not the transient value that
occurs during overflushing.
- Don't report "requested".
- For allocations, this value is always the same as the flushed value
- For uncommit, this value is usually inflated and not interesting
ZRelocationSet:
- Make the event span the duration of the selection
- Remove the add_all function. FTR, I would have preferred this to stay
instead of duplicating the summations.
ZRelocationSetGroup:
- Added an event to report the relocation info per group (small, medium,
large [currently not used])
Thanks,
StefanK
On 2020-03-26 13:30, Stefan Karlsson wrote:
> On 2020-03-26 12:01, Per Liden wrote:
>> Hi,
>>
>> On 3/23/20 10:30 AM, Stefan Karlsson wrote:
>> [...]
>>>>
>>>> * I think cl->_flushed user here:
>>>>
>>>> 604 event.commit(cl->_requested, cl->_flushed, for_allocation);
>>>>
>>>> should instead just be:
>>>>
>>>> 604 event.commit(cl->_requested, flushed, for_allocation);
>>>>
>>>> Right?
>>>
>>> I intentionally used cl->_flushed since that describes how much we
>>> flushed including overflushed parts of pages. Maybe we should report
>>> both values? Maybe also rename the local variable flushed to destroyed?
>>
>> Hmm, not sure I see the point of reporting anything except what was
>> actually flushed. When would the other numbers be of interest? Keep in
>> mind that the overflushed part of this is immediately put back into
>> the cache, and is never unmapped/destroyed or anything like that.
>> Outside of flush_cache() no one will know (or care) if we overflushed
>> or not, right?
>
> Overflushing causes malloc calls for pages and bitmaps. I thought that
> could be of interest when looking at latencies. If you don't want it,
> I'll remove it.
>
>>
>>>
>>>>
>>>>
>>>> src/hotspot/share/gc/z/zPageCache.hpp
>>>> -------------------------------------
>>>>
>>>> Instead of:
>>>>
>>>> friend class ZPageAllocator;
>>>>
>>>> add a getter for requested()?
>>>>
>>>
>>> I also want _flushed, depending on the resolution of the above. I
>>> don't think its bad to friend our closures that are pure extensions
>>> to the "owning" class. I don't have a very strong opinion here, but
>>> gravitated towards a friend declaration to minimize the exposure of
>>> the implementation details. If you still want me to add getters, I'll
>>> do it.
>>
>> In this case, I'd prefer getters. Assuming my comment above is
>> accepted, we only need one new getter.
>
> OK.
>
>>
>>>
>>>>
>>>> src/hotspot/share/gc/z/zRelocationSetSelector.cpp
>>>> -------------------------------------------------
>>>>
>>>> * Same here, instead of:
>>>>
>>>> #include "jfrfiles/jfrEventClasses.hpp"
>>>>
>>>> I think we should do:
>>>>
>>>> #include "jfr/jfrEvents.hpp"
>>>
>>> Yes
>>>
>>>>
>>>>
>>>> * You don't think we should use ZPageTypeType that you introduced,
>>>> and send three different ZRelocationSet events, one for each page
>>>> type? Shouldn't this event also be timed, and sent from within
>>>> ZRelocationSetSelectorGroup::select()?
>>>
>>> JMC is not always great at handling normalized events. If we want
>>> events per type I think we should add them in _addition_ to the event
>>> I added.
>>
>> Ok, I'm sure you're right but still want to understand. When you say
>> "normalized events", what are you thinking of in this context?
>
> What I'm meaning is that if you have the minimally sufficient
> information like:
> - Relocated bytes in small pages: x
> - Relocated bytes in medium pages: y
> - Relocated bytes in large pages: z
>
> You need to go through extra lengths to figure out:
> - Relocated total bytes: x + y + x
>
> and unless you want to fiddle around with JMC too much, it's better to
> also (or instead) reported the sum of the values.
>
> StefanK
>
>>
>> cheers,
>> Per
>>
>>>
>>>>
>>>>
>>>> src/hotspot/share/gc/z/zTracer.cpp
>>>> ----------------------------------
>>>>
>>>> 43 writer.write("small");
>>>> 44 writer.write_key(ZPageTypeMedium);
>>>> 45 writer.write("medium");
>>>> 46 writer.write_key(ZPageTypeLarge);
>>>> 47 writer.write("large");
>>>>
>>>> How about "Small", "Medium" and "Large"? I could only find one other
>>>> place (in jfrStackTraceRepository.cpp) where names were given, and
>>>> those start with a capital letter.
>>>
>>> OK
>>>
>>> Here's the updated webrevs with the easy fixes:
>>> https://cr.openjdk.java.net/~stefank/8241361/webrev.02.delta/
>>> https://cr.openjdk.java.net/~stefank/8241361/webrev.02
>>>
>>> Waiting for answers and comments to the rest.
>>>
>>> Thanks,
>>> StefanK
>>>
>>>>
>>>> cheers,
>>>> Per
>>>>
>>>>
>>>>>
>>>>> Added events:
>>>>>
>>>>> ZAllocationStall - Record when we run out of heap memory and the
>>>>> Java threads stall, waiting for the GC to free up memory.
>>>>>
>>>>> ZPageAllocation - Updated the existing event to also record the
>>>>> duration of the event. Updated the event to only be reported if the
>>>>> allocation takes longer than 1 ms.
>>>>>
>>>>> ZPageCacheFlush - Record when the page cache needs to be flushed.
>>>>> This usually happens when we run out of a specific page size and
>>>>> have to detach the physical and virtual memory to materialize a new
>>>>> ZPage. We also flush pages when we uncommit memory.
>>>>>
>>>>> ZRelocationSet - Record information about the selected relocation set.
>>>>>
>>>>> ZUncommit - Record when we uncommit and hand back memory to the OS.
>>>>>
>>>>> The patch also contains some small cosmetic changes to existing
>>>>> events, whitespace fixes.
>>>
>
More information about the hotspot-gc-dev
mailing list