RFR: 8241361: ZGC: Implement memory related JFR events

Stefan Karlsson stefan.karlsson at oracle.com
Fri Mar 27 09:29:29 UTC 2020


Thanks for reviewing and providing some of the changes for webrev.03.

StefanK

On 2020-03-27 10:24, Per Liden wrote:
> On 3/27/20 9:56 AM, Stefan Karlsson wrote:
>> Talked to Per and here's the latest changes:
>>   https://cr.openjdk.java.net/~stefank/8241361/webrev.03.delta/
>>   https://cr.openjdk.java.net/~stefank/8241361/webrev.03/
>>
>> ZPageFlush:
>> - Report the logical flushed value and not the transient value that 
>> occurs during overflushing.
>> - Don't report "requested".
>>   - For allocations, this value is always the same as the flushed value
>>   - For uncommit, this value is usually inflated and not interesting
>>
>> ZRelocationSet:
>> - Make the event span the duration of the selection
>> - Remove the add_all function. FTR, I would have preferred this to 
>> stay instead of duplicating the summations.
>>
>> ZRelocationSetGroup:
>> - Added an event to report the relocation info per group (small, 
>> medium, large [currently not used])
> 
> Thanks for doing those adjustments, looks good!
> 
> cheers,
> Per
> 
>>
>> Thanks,
>> StefanK
>>
>> On 2020-03-26 13:30, Stefan Karlsson wrote:
>>> On 2020-03-26 12:01, Per Liden wrote:
>>>> Hi,
>>>>
>>>> On 3/23/20 10:30 AM, Stefan Karlsson wrote:
>>>> [...]
>>>>>>
>>>>>> * I think cl->_flushed user here:
>>>>>>
>>>>>>  604   event.commit(cl->_requested, cl->_flushed, for_allocation);
>>>>>>
>>>>>> should instead just be:
>>>>>>
>>>>>>  604   event.commit(cl->_requested, flushed, for_allocation);
>>>>>>
>>>>>> Right?
>>>>>
>>>>> I intentionally used cl->_flushed since that describes how much we 
>>>>> flushed including overflushed parts of pages. Maybe we should 
>>>>> report both values? Maybe also rename the local variable flushed to 
>>>>> destroyed?
>>>>
>>>> Hmm, not sure I see the point of reporting anything except what was 
>>>> actually flushed. When would the other numbers be of interest? Keep 
>>>> in mind that the overflushed part of this is immediately put back 
>>>> into the cache, and is never unmapped/destroyed or anything like 
>>>> that. Outside of flush_cache() no one will know (or care) if we 
>>>> overflushed or not, right?
>>>
>>> Overflushing causes malloc calls for pages and bitmaps. I thought 
>>> that could be of interest when looking at latencies. If you don't 
>>> want it, I'll remove it.
>>>
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> src/hotspot/share/gc/z/zPageCache.hpp
>>>>>> -------------------------------------
>>>>>>
>>>>>> Instead of:
>>>>>>
>>>>>>   friend class ZPageAllocator;
>>>>>>
>>>>>> add a getter for requested()?
>>>>>>
>>>>>
>>>>> I also want _flushed, depending on the resolution of the above. I 
>>>>> don't think its bad to friend our closures that are pure extensions 
>>>>> to the "owning" class. I don't have a very strong opinion here, but 
>>>>> gravitated towards a friend declaration to minimize the exposure of 
>>>>> the implementation details. If you still want me to add getters, 
>>>>> I'll do it.
>>>>
>>>> In this case, I'd prefer getters. Assuming my comment above is 
>>>> accepted, we only need one new getter.
>>>
>>> OK.
>>>
>>>>
>>>>>
>>>>>>
>>>>>> src/hotspot/share/gc/z/zRelocationSetSelector.cpp
>>>>>> -------------------------------------------------
>>>>>>
>>>>>> * Same here, instead of:
>>>>>>
>>>>>>   #include "jfrfiles/jfrEventClasses.hpp"
>>>>>>
>>>>>> I think we should do:
>>>>>>
>>>>>>   #include "jfr/jfrEvents.hpp"
>>>>>
>>>>> Yes
>>>>>
>>>>>>
>>>>>>
>>>>>> * You don't think we should use ZPageTypeType that you introduced, 
>>>>>> and send three different ZRelocationSet events, one for each page 
>>>>>> type? Shouldn't this event also be timed, and sent from within 
>>>>>> ZRelocationSetSelectorGroup::select()?
>>>>>
>>>>> JMC is not always great at handling normalized events. If we want 
>>>>> events per type I think we should add them in _addition_ to the 
>>>>> event I added.
>>>>
>>>> Ok, I'm sure you're right but still want to understand. When you say 
>>>> "normalized events", what are you thinking of in this context?
>>>
>>> What I'm meaning is that if you have the minimally sufficient 
>>> information like:
>>> - Relocated bytes in small pages: x
>>> - Relocated bytes in medium pages: y
>>> - Relocated bytes in large pages: z
>>>
>>> You need to go through extra lengths to figure out:
>>> - Relocated total bytes: x + y + x
>>>
>>> and unless you want to fiddle around with JMC too much, it's better 
>>> to also (or instead) reported the sum of the values.
>>>
>>> StefanK
>>>
>>>>
>>>> cheers,
>>>> Per
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> src/hotspot/share/gc/z/zTracer.cpp
>>>>>> ----------------------------------
>>>>>>
>>>>>>   43     writer.write("small");
>>>>>>   44     writer.write_key(ZPageTypeMedium);
>>>>>>   45     writer.write("medium");
>>>>>>   46     writer.write_key(ZPageTypeLarge);
>>>>>>   47     writer.write("large");
>>>>>>
>>>>>> How about "Small", "Medium" and "Large"? I could only find one 
>>>>>> other place (in jfrStackTraceRepository.cpp) where names were 
>>>>>> given, and those start with a capital letter.
>>>>>
>>>>> OK
>>>>>
>>>>> Here's the updated webrevs with the easy fixes:
>>>>>   https://cr.openjdk.java.net/~stefank/8241361/webrev.02.delta/
>>>>>   https://cr.openjdk.java.net/~stefank/8241361/webrev.02
>>>>>
>>>>> Waiting for answers and comments to the rest.
>>>>>
>>>>> Thanks,
>>>>> StefanK
>>>>>
>>>>>>
>>>>>> cheers,
>>>>>> Per
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Added events:
>>>>>>>
>>>>>>> ZAllocationStall - Record when we run out of heap memory and the 
>>>>>>> Java threads stall, waiting for the GC to free up memory.
>>>>>>>
>>>>>>> ZPageAllocation - Updated the existing event to also record the 
>>>>>>> duration of the event. Updated the event to only be reported if 
>>>>>>> the allocation takes longer than 1 ms.
>>>>>>>
>>>>>>> ZPageCacheFlush - Record when the page cache needs to be flushed. 
>>>>>>> This usually happens when we run out of a specific page size and 
>>>>>>> have to detach the physical and virtual memory to materialize a 
>>>>>>> new ZPage. We also flush pages when we uncommit memory.
>>>>>>>
>>>>>>> ZRelocationSet - Record information about the selected relocation 
>>>>>>> set.
>>>>>>>
>>>>>>> ZUncommit - Record when we uncommit and hand back memory to the OS.
>>>>>>>
>>>>>>> The patch also contains some small cosmetic changes to existing 
>>>>>>> events, whitespace fixes.
>>>>>
>>>



More information about the hotspot-gc-dev mailing list