RFR: 8241361: ZGC: Implement memory related JFR events

Stefan Karlsson stefan.karlsson at oracle.com
Fri Mar 27 08:56:45 UTC 2020


Talked to Per and here's the latest changes:
  https://cr.openjdk.java.net/~stefank/8241361/webrev.03.delta/
  https://cr.openjdk.java.net/~stefank/8241361/webrev.03/

ZPageFlush:
- Report the logical flushed value and not the transient value that 
occurs during overflushing.
- Don't report "requested".
  - For allocations, this value is always the same as the flushed value
  - For uncommit, this value is usually inflated and not interesting

ZRelocationSet:
- Make the event span the duration of the selection
- Remove the add_all function. FTR, I would have preferred this to stay 
instead of duplicating the summations.

ZRelocationSetGroup:
- Added an event to report the relocation info per group (small, medium, 
large [currently not used])

Thanks,
StefanK

On 2020-03-26 13:30, Stefan Karlsson wrote:
> On 2020-03-26 12:01, Per Liden wrote:
>> Hi,
>>
>> On 3/23/20 10:30 AM, Stefan Karlsson wrote:
>> [...]
>>>>
>>>> * I think cl->_flushed user here:
>>>>
>>>>  604   event.commit(cl->_requested, cl->_flushed, for_allocation);
>>>>
>>>> should instead just be:
>>>>
>>>>  604   event.commit(cl->_requested, flushed, for_allocation);
>>>>
>>>> Right?
>>>
>>> I intentionally used cl->_flushed since that describes how much we 
>>> flushed including overflushed parts of pages. Maybe we should report 
>>> both values? Maybe also rename the local variable flushed to destroyed?
>>
>> Hmm, not sure I see the point of reporting anything except what was 
>> actually flushed. When would the other numbers be of interest? Keep in 
>> mind that the overflushed part of this is immediately put back into 
>> the cache, and is never unmapped/destroyed or anything like that. 
>> Outside of flush_cache() no one will know (or care) if we overflushed 
>> or not, right?
> 
> Overflushing causes malloc calls for pages and bitmaps. I thought that 
> could be of interest when looking at latencies. If you don't want it, 
> I'll remove it.
> 
>>
>>>
>>>>
>>>>
>>>> src/hotspot/share/gc/z/zPageCache.hpp
>>>> -------------------------------------
>>>>
>>>> Instead of:
>>>>
>>>>   friend class ZPageAllocator;
>>>>
>>>> add a getter for requested()?
>>>>
>>>
>>> I also want _flushed, depending on the resolution of the above. I 
>>> don't think its bad to friend our closures that are pure extensions 
>>> to the "owning" class. I don't have a very strong opinion here, but 
>>> gravitated towards a friend declaration to minimize the exposure of 
>>> the implementation details. If you still want me to add getters, I'll 
>>> do it.
>>
>> In this case, I'd prefer getters. Assuming my comment above is 
>> accepted, we only need one new getter.
> 
> OK.
> 
>>
>>>
>>>>
>>>> src/hotspot/share/gc/z/zRelocationSetSelector.cpp
>>>> -------------------------------------------------
>>>>
>>>> * Same here, instead of:
>>>>
>>>>   #include "jfrfiles/jfrEventClasses.hpp"
>>>>
>>>> I think we should do:
>>>>
>>>>   #include "jfr/jfrEvents.hpp"
>>>
>>> Yes
>>>
>>>>
>>>>
>>>> * You don't think we should use ZPageTypeType that you introduced, 
>>>> and send three different ZRelocationSet events, one for each page 
>>>> type? Shouldn't this event also be timed, and sent from within 
>>>> ZRelocationSetSelectorGroup::select()?
>>>
>>> JMC is not always great at handling normalized events. If we want 
>>> events per type I think we should add them in _addition_ to the event 
>>> I added.
>>
>> Ok, I'm sure you're right but still want to understand. When you say 
>> "normalized events", what are you thinking of in this context?
> 
> What I'm meaning is that if you have the minimally sufficient 
> information like:
> - Relocated bytes in small pages: x
> - Relocated bytes in medium pages: y
> - Relocated bytes in large pages: z
> 
> You need to go through extra lengths to figure out:
> - Relocated total bytes: x + y + x
> 
> and unless you want to fiddle around with JMC too much, it's better to 
> also (or instead) reported the sum of the values.
> 
> StefanK
> 
>>
>> cheers,
>> Per
>>
>>>
>>>>
>>>>
>>>> src/hotspot/share/gc/z/zTracer.cpp
>>>> ----------------------------------
>>>>
>>>>   43     writer.write("small");
>>>>   44     writer.write_key(ZPageTypeMedium);
>>>>   45     writer.write("medium");
>>>>   46     writer.write_key(ZPageTypeLarge);
>>>>   47     writer.write("large");
>>>>
>>>> How about "Small", "Medium" and "Large"? I could only find one other 
>>>> place (in jfrStackTraceRepository.cpp) where names were given, and 
>>>> those start with a capital letter.
>>>
>>> OK
>>>
>>> Here's the updated webrevs with the easy fixes:
>>>   https://cr.openjdk.java.net/~stefank/8241361/webrev.02.delta/
>>>   https://cr.openjdk.java.net/~stefank/8241361/webrev.02
>>>
>>> Waiting for answers and comments to the rest.
>>>
>>> Thanks,
>>> StefanK
>>>
>>>>
>>>> cheers,
>>>> Per
>>>>
>>>>
>>>>>
>>>>> Added events:
>>>>>
>>>>> ZAllocationStall - Record when we run out of heap memory and the 
>>>>> Java threads stall, waiting for the GC to free up memory.
>>>>>
>>>>> ZPageAllocation - Updated the existing event to also record the 
>>>>> duration of the event. Updated the event to only be reported if the 
>>>>> allocation takes longer than 1 ms.
>>>>>
>>>>> ZPageCacheFlush - Record when the page cache needs to be flushed. 
>>>>> This usually happens when we run out of a specific page size and 
>>>>> have to detach the physical and virtual memory to materialize a new 
>>>>> ZPage. We also flush pages when we uncommit memory.
>>>>>
>>>>> ZRelocationSet - Record information about the selected relocation set.
>>>>>
>>>>> ZUncommit - Record when we uncommit and hand back memory to the OS.
>>>>>
>>>>> The patch also contains some small cosmetic changes to existing 
>>>>> events, whitespace fixes.
>>>
> 



More information about the hotspot-gc-dev mailing list