RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions

Hohensee, Paul hohensee at amazon.com
Thu Aug 2 18:14:56 UTC 2018


Inline.

On 8/1/18, 12:03 PM, "Thomas Schatzl" <thomas.schatzl at oracle.com> wrote:

    Hi,
    
    On Mon, 2018-07-30 at 19:18 +0000, Hohensee, Paul wrote:
    > At JVMLS, so can't look in depth this instant, but I'm fine with your
    > approach, except I'd get the new JMX and jstat structure in place
    > before fixing the data that gets reported. Imo it'll be easier to fit
    > correct data into the new JMX/jstat setup than into the old one, and
    > doing it the new way will give us a good idea of exactly what we
    > should do for the legacy ones.
    
    What I want to first do is polishing the existing code a bit so that
    changes to the jmx/jstat structures does not touch G1CollectedHeap code
    as much as possible, and remove cruft.

Good.
    
    Some of these refactorings are interesting for other somewhat unrelated
    improvements too :)

What are those? :)
    
    Before that is finished, we can meanwhile figure out what values we
    want in the MXBeans too...
    
    Which brings me back to the question of which pools will there be in
    the "new" mxbeans, what their contents could be, and the relations
    between the returned values (i.e. consistency).
    
    So my question has been, what do you think about the following
    additions to the scheme proposed in the CSR for the "new" set of pools:
    
     - remove the "to" survivor space pool
     - add a "free regions" pool

There's only one survivor space MemoryPool right now, so that doesn't change. Adding a free regions pool seems good to me, but I'm not sure how to define its content. Call it "G1 Free Space". 'committed' and 'used' would be the same value, just as for humongous and archive, and got from the free region set. 'init' would be (total committed for all regions - all the other memory pool 'init' values), 'max' would be -1 (undefined).

There's a 'to space' jstat counter which is explicitly disabled in G1 (always reads zero and is never updated). I assume this is for compatibility with existing jstat usage, if so I'd leave it alone, which I've done in my proposed patch.
     
    There should probably be a third change to the pools in the CSR: in G1
    it is possible that there are no regions for memory areas that are
    actually committed iirc. This is because region size can be different
    to page size - just consider 1G pages. G1 at G1CollectedHeap level may
    ask the HeapRegionManager to uncommit a certain amount of memory due to
    shrinking, but the HeapRegionManager may not find a contiguous set of
    regions that fits full 1G pages.
    What do you think about a "logically uncommited" pool?
    (I need to check how HeapRegionManager works again, but I think there
    is the possibility to have such)

This seems like it'd be a bit confusing for the average user. If the HeapRegionManager can't actually uncommit the memory, then from the user's pov it really is committed (i.e., not available to other processes).
    
    About returned value consistency:
    
    - it seems that the best we can achieve for MemoryPools that the values
    within a single MemoryUsage after a call to getUsage() are consistent.

Yes.
    
    - values of further calls to get the MemoryUsage of a different pool
    may contradict the first result.

Yes, if there's been a call to recalculate_sizes() in between. Given that there's usually a few seconds between gc events that cause that, if the JMX user grabs all the memory pool values at once, they're effectively always consistent.
    
    - a corollary of that is that the sum of values of a particular members
    (basically all of them) of MemoryUsage may not add up to particular
    values. E.g. the sum of the "max" fields will not sum up to -Xmx.

Yes. That's why the 'max' values are all -1 (undefined), except for the old space, which is -Xmx. So they do add up to -Xmx by definition.
    
    - only some values of MemoryUsage of the same pool are guaranteed to
    always be <= or >= a previous value between two GC induced safepoints
    (that includes Remark and Cleanup as we may free memory there). E.g.
    used() does, but others may not.
    
    - no consistency guarantees for other ways of getting memory sizes

Yes and yes, but see above. They’re 'almost always consistent'.
    
    About expected values:
    
    - the only pool with "init" size > 0 will be the "Free region" pool
    (and potentially the "logically uncommitted" one).

Depends on when you create the G1MemoryPool objects relative to when you call recalculate_sizes() for the first time. I put the latter as the first thing in initialize_serviceability(), so some 'init' fields may be non-zero, e.g., the archive pool.
    
    - "used" values will not include filler objects like in the other
    collectors.

Right now, all values are multiples of the region size, because it'd be a lot of overhead to calculate partial use. You can also argue that from the point of view of a user worrying about tracking memory pool size, region size granularity is good enough.
    
    - "max" is basically always the maximum heap capacity for all pools.
    (Probably it is more interesting to use "current" maximum heap
    capacity, not absolute maximum, i.e. the -Xmx value?)

The way people use the memory pools is typically to just add up the max values to get -Xmx, because they have code that works with all collectors. We don't want to break that invariant.
    
    The values in the compatibility pools will be some sum of the new ones.

Yes. The legacy value of the old pool 'used' is the sum of the new 'old', 'humongous' and 'archive' pools.
    
    > Your archive region set webrev looks pretty much the same as what I
    > wrote, but I got a trace trap when I tried to execute the resulting
    > JVM. Not a clue why, so I abandoned it.
    
    Can you describe this trace trap a bit more? On OSX, in debug?

It happened during the build when the new JVM tried to execute. I've merged my old repo with jdk tip and am running a build to try again. The webrev (some cruft in it) is at http://cr.openjdk.java.net/~phh/8196989/webrev.01/.
    
    > I'd not have thought of making a G1MonitoringScope, looks good.
    
      I was not soliciting reviews yet, but thanks anyway :) I will take
    these into consideration in the proper review.

:)

Paul
    
    Thanks,
      Thomas
    
    > 
    > Thanks,
    > 
    > Paul
    > 
    > On 7/30/18, 6:04 AM, "Thomas Schatzl" <thomas.schatzl at oracle.com>
    > wrote:
    > 
    >     Hi Paul,
    >     
    >       did some prototyping and wanted to show you the results and get
    > your
    >     input:
    >     
    >     On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote:
    >     > 
    >     [...]
    >     > Could we work together on first refactoring the code before
    > adding
    >     > new
    >     > kinds of spaces to the MXBeans?
    >     > 
    >     > Looking at this change and mine roughly the following issues
    > would
    >     > need to be resolved first:
    >     > - find a solution for archive regions as suggested above :) At
    > the
    >     > moment, without doing the change, I would tend to make archive
    >     > regions separate from old regions.
    >     
    >     I went with that and I am currently testing https://bugs.openjdk.
    > java.n
    >     et/browse/JDK-8208498 ; here's a webrev to look at: http://cr.ope
    > njdk.j
    >     ava.net/~tschatzl/8208498/webrev/
    >     
    >     > - move serviceability stuff as much as possible to
    >     > g1MonitoringSupport
    >     
    >     Preliminary webrev:
    >     http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/we
    > brev/
    >     
    >     I think this came out better than expected: while we maybe want
    > to add
    >     a ServiceabilitySupport interface that collects the
    >     get_memory_manager/pools/* methods in the future, imho this is a
    > lot
    >     better than current code as it tightens the G1MonitoringSupport
    >     interface quite a bit.
    >     
    >     Particularly of note should be the G1MonitoringScope class that
    >     collects both TraceCollectorStats and TraceMemoryManagerStats
    > into a
    >     single class. (Instead of the two bools passed to it something
    >     indicating the GC directly would probably be better too).
    >     
    >     It would be nice if something similar could be made for the
    > concurrent Trace*Stats.
    >     
    >     > - clean up MemoryPool, remove duplicate information
    >     > - provide and return sane memory pool used/committed values to
    > the
    >     > MXBeans
    >     > - clean up G1MonitoringSupport, e.g. avoid "*used/*committed"
    >     > variables for every single memory pool. Use MemoryUsage structs
    >     > for them. Make reading of memory pool information atomic wrt to
    its readers
    > (note
    >     > that I think it is currently just impossible to get consistent
    > output
    >     > for other statistics like jstat) - that's JDK-8207200.
    >     > - add whatever serviceability stuff for the new pools/jstat/*
    > in
    >     > steps.
    >     
    >     
    >     Thanks,
    >       Thomas
    >     
    >     
    > 
    
    



More information about the serviceability-dev mailing list