RFR (S): 8195115: G1 Old Gen MemoryPool CollectionUsage.used values don't reflect mixed GC results
Kirk Pepperdine
kirk at kodewerk.com
Thu Feb 1 07:15:02 UTC 2018
Hi Paul,
Is this is targeted for 11? It seems like a positive useful change.
<TLDR>
My opinion is that if you’re going to break the tool chain.. make sure the change is well worth the pain and get in everything that you want or makes sense or what ever. Just make sure it’s worth the pain. It’s been far too often that this hasn’t been the case.
As for this change, bundling values together hides information. This often leaves you either guessing as to what the underlying constraint is or more commonly, one responds to poor performance by expanding the heap size until the underlying supporting data structure is accidently at the a size it needed to be. A poor way of dealing with the issue at hand. Much better would be to understand which resource is stressed and only tune that bit leaving the other stuff alone meaning more information here is better then less. As the moment I calculate tenured size but subtracting young from total. The answer is of course incorrect in that is muxs in other stuff (reserved) but it’s much better to perform analytics on an incorrect tenured than a noisy total heap value. This It would be even better if the breakout was more precise and this work seems to provide a better breakout.
As an aside, a number of years ago with I was working on Crays, CRI published a not that their processors could overflow under certain conditions and they wanted to redesign the CPU around the IEEE standard to prevent/correct for this. The resounding response from the Cray users was NO. Reason being, these researchers had years of accumulated data and results and techniques that had been adjusted for the bugs in the Cray CPU. Everyone knew they were there… everyone knew that IEEE was better and a sensible fix to make except when you added in the disruption that would have been caused by the loss of years and years of very valuable historical data…
Kind regards,
Kirk
> On Jan 31, 2018, at 2:30 AM, Hohensee, Paul <hohensee at amazon.com> wrote:
>
> It’s true that my patch doesn’t completely solve the larger problem, but it fixes the most immediately important part of it, particularly for JDK8 where current expected behavior is entrenched. If we’re going to fix the larger problem, imo we should file another bug/rfe to do it. I’d be happy to fix that one too, once we have a spec.
>
> What do you think of my suggestions? To summarize:
>
> - Keep the existing memory pools and add humongous and archive pools.
> - Make the archive pool part of the old gen, and generalize it to all collectors.
> - Split humongous regions off from the old gen pool into their own pool. The old gen and humongous pools are disjoint region sets.
> - Keep the existing “G1 Young Generation” and “G1 Old Generation” collectors and their associated memory pools (net of this patch). We add the humongous pool to them.
> - Add “G1 Full” as an alias of the existing “G1 Old Generation” collector.
> - Add the “G1 Young”, “G1 Mixed” and “G1 Concurrent Cycle” collectors.
> - Set the G1 old gen memory pool max size to –Xmx, the archive space max size to whatever it is, and the rest of the G1 memory pool max sizes to -1 == undefined, as now.
>
> The resulting memory pools:
>
> “G1 Eden Space”
> “G1 Survivor Space”
> “G1 Old Gen”
> “G1 Humongous Space”
> “Archive Space”
>
> The resulting collectors and their memory pools:
>
> “G1 Young Generation” (the existing young/mixed collector), “G1 Old Generation”/”G1 Full”, “G1 Mixed”
> - “G1 Eden Space”
> - “G1 Survivor Space”
> - “G1 Old Gen”
> - “G1 Humongous Space”
> “G1 Young”
> - “G1 Eden Space”
> - “G1 Survivor Space”
> - “G1 Humongous Space”
> “G1 Concurrent Cycle”
> - “G1 Old Gen”
> - “G1 Humongous Space”
>
> I’m not religious about the names, but I like my suggestions. :)
>
> The significant addition to my previous email, and an incompatible change, is splitting humongous regions off from the old gen pool. This means that apps that specifically monitor old gen occupancy will no longer see humongous regions. Monitoring apps that just add up info about all a collector’s pools won’t see a difference. I may be corrected (by Kirk, perhaps), but imo it’s not as bad a compatibility issue as one might think, because the type of app that uses a lot of humongous regions isn’t all that common. E.g., apps that cache data in the heap in the form of large compressed arrays, and apps with large hashmap bucket list arrays. The heaps such apps use are very often large enough to use 32mb regions, hence need really big objects to actually allocate humongous regions.
>
> Thanks,
>
> Paul
>
> On 1/30/18, 5:51 AM, "Erik Helin" <erik.helin at oracle.com> wrote:
>
> On 01/30/2018 03:07 AM, Hohensee, Paul wrote:
>> That’s one reviewer who’s ok with a short term patch. Anyone else? And,
>> any reviewers for said short term patch? :)
>
> Well, the patch is not really complete as it is. The problem is the
> definitions of the MemoryPoolMXBeans and GarbageCollectorMXBeans, which,
> as I tried to hint at in my first email, is a mess for G1. The names and
> implementations of these MemoryPoolMXBeans and GarbageCollectionMXBeans
> for G1 are very old, G1 has changed a lot since those were implemented
> (hence my suggestion for finally fixing this).
>
> The issue with your patch is that the MemoryPoolMXBean named "G1 Old
> Gen" consists of both old and humongous regions (it will also include
> archive regions). Old regions can be collected by mixed, concurrent and
> full collections. Humongous regions can be collected by young, mixed or
> full collections and the concurrent cycle. Archive regions will never be
> collected. Your patch will update the pool in the case of a mixed
> collection collecting old regions or humongous regions, but misses the
> following cases:
> - a young collection collecting humongous regions
> - a concurrent cycle collecting humongous regions
> - a concurrent cycle collecting old regions
>
> Unfortunately I could not come up with a good way to solve the above
> without re-designing the pools. I'm not sure about accepting your patch
> as is, since it might cause even more confusion for users compared to
> the current (already confusing) situation.
>
> One idea we have discussed is to implement the re-design but also add a
> flag, -XX:+UseG1LegacyPoolsAndBeans (false by default), to allow for a
> smoother transition. Would that solution work for you?
>
> Thanks,
> Erik
>
>> Thanks,
>>
>> Paul
>>
>> *From: *mandy chung <mandy.chung at oracle.com>
>> *Organization: *Oracle Corporation
>> *Date: *Monday, January 29, 2018 at 1:41 PM
>> *To: *"Hohensee, Paul" <hohensee at amazon.com>
>> *Cc: *"serviceability-dev at openjdk.java.net"
>> <serviceability-dev at openjdk.java.net>, "hotspot-gc-dev at openjdk.java.net"
>> <hotspot-gc-dev at openjdk.java.net>
>> *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool
>> CollectionUsage.used values don't reflect mixed GC results
>>
>> I created JDK-8196362 to look into whether it makes sense to provide
>> some categorization to differentiate eden space vs the heap space for
>> long-lived objects.
>>
>> W.r.t. to JDK-8195115, I have to defer to GC team to comment on the
>> mixed collection update. If they are okay, I have no objection to
>> implement a short-term fix and do the proper G1 memory pools as a
>> separate patch.
>>
>> Mandy
>>
>> On 1/29/18 1:02 PM, Hohensee, Paul wrote:
>>
>> We don’t use getType, and you guessed correctly: we use the memory
>> pool name as an indicator of the specific characteristics of a
>> memory pool, in particular eden.
>>
>> What we want is an indication of long term heap occupancy. We
>> calculate it using CollectionUsage for non-eden heap memory pools,
>> regardless of collector. We don’t use JMX notification, rather we
>> periodically poll CollectionUsage for memory pools with names that
>> contain “Old”, “Tenured”, or “Survivor”. We get the memory pools
>> from the GarbageCollectorMXBeans (we don’t care what the collector
>> names are). For the named memory pools, we sum CollectionUsage.used
>> and divide by the sum of CollectionUsage.max to get a long term heap
>> occupancy percentage. We don’t want to include eden because it’s
>> really just an allocation buffer and not part of the storage for
>> long-lived objects. I suppose we could use a negative test instead
>> by using all memory pools with names that don’t include “Eden”.
>>
>> The bug is that the “G1 Old Gen” memory pool isn’t being updated
>> when the “G1 Young Generation” collector runs a mixed collection. As
>> far as JMX is concerned, that collector only knows about eden and
>> the survivor space. The patch adds the old gen to the memory pools
>> it knows about and has mixed collections update the old gen’s
>> CollectionUsage.
>>
>> I managed to get a submit repo run to succeed last week and it found
>> a problem. I’ve uploaded a new webrev that fixes the failure of the
>> jtreg test TestMemoryMXBeansAndPoolsPresence.java due to the young
>> gen collector being expected to know only about eden and the
>> survivor space.
>>
>> http://cr.openjdk.java.net/~phh/8195115/webrev.hs.01/
>> <http://cr.openjdk.java.net/%7Ephh/8195115/webrev.hs.01/>
>>
>> Waiting on the submit repo to come back with a result on it.
>>
>> Thanks,
>>
>> Paul
>>
>> *From: *mandy chung <mandy.chung at oracle.com>
>> <mailto:mandy.chung at oracle.com>
>> *Organization: *Oracle Corporation
>> *Date: *Monday, January 29, 2018 at 10:52 AM
>> *To: *"Hohensee, Paul" <hohensee at amazon.com>
>> <mailto:hohensee at amazon.com>, Erik Helin <erik.helin at oracle.com>
>> <mailto:erik.helin at oracle.com>, David Holmes
>> <david.holmes at oracle.com> <mailto:david.holmes at oracle.com>
>> *Cc: *"serviceability-dev at openjdk.java.net"
>> <mailto:serviceability-dev at openjdk.java.net>
>> <serviceability-dev at openjdk.java.net>
>> <mailto:serviceability-dev at openjdk.java.net>,
>> "hotspot-gc-dev at openjdk.java.net"
>> <mailto:hotspot-gc-dev at openjdk.java.net>
>> <hotspot-gc-dev at openjdk.java.net>
>> <mailto:hotspot-gc-dev at openjdk.java.net>
>> *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool
>> CollectionUsage.used values don't reflect mixed GC results
>>
>> On 1/29/18 10:35 AM, mandy chung wrote:
>>
>> Thanks for the reply Paul. Try to understand a little more on
>> the specific from gc-specific memory pool you depend on.
>>
>> On 1/29/18 8:27 AM, Hohensee, Paul wrote:
>>
>> A name change would affect Amazon’s heap monitoring, and
>> thus I expect it would affect other users as well.
>>
>> As long as there are gc-specific memory pools, we’re going
>> to need to be able to identify them, and right now that’s
>> done via name.
>>
>>
>> MemoryPoolMXBean::getType returns "heap" memory type for
>> GC-specific memory pools. Are you using this method? Do you
>> use the name to build in specific characteristic of a memory
>> pool (e.g. eden vs old gen)?
>>
>>
>>
>>
>> All the mxbeans are identified by name, so that’s a general
>> design principle. The only way I can think of to get rid of
>> name dependency would be to figure out what abstract metrics
>> users want to monitor and implement them for all collectors.
>> HeapUsage (instantaneous occupancy) is one, CollectionUsage
>> (long-lived occupancy) is another, both of these for the
>> entire heap, not just particular memory pools.
>>
>>
>> The sum of HeapUsage and CollectionUsage of all heap memory
>> pools was expected to give an incorrect approximation for the
>> entire heap usage. Are you seeing issue/bug with the sum result?
>>
>>
>> typo: s/an incorrect approximation/an approximation.
>>
>> Mandy
>>
>>
>>
>> Mandy
>>
>>
>>
>> That said, imo there will always be a demand for the ability
>> to get collector and memory pool specific details, so I
>> don’t see a way to get around providing named entities.
>>
>> Paul
>>
>> *From: *mandy chung <mandy.chung at oracle.com>
>> <mailto:mandy.chung at oracle.com>
>> *Organization: *Oracle Corporation
>> *Date: *Friday, January 26, 2018 at 2:38 PM
>> *To: *"Hohensee, Paul" <hohensee at amazon.com>
>> <mailto:hohensee at amazon.com>, Erik Helin
>> <erik.helin at oracle.com> <mailto:erik.helin at oracle.com>,
>> David Holmes <david.holmes at oracle.com>
>> <mailto:david.holmes at oracle.com>
>> *Cc: *"serviceability-dev at openjdk.java.net"
>> <mailto:serviceability-dev at openjdk.java.net>
>> <serviceability-dev at openjdk.java.net>
>> <mailto:serviceability-dev at openjdk.java.net>,
>> "hotspot-gc-dev at openjdk.java.net"
>> <mailto:hotspot-gc-dev at openjdk.java.net>
>> <hotspot-gc-dev at openjdk.java.net>
>> <mailto:hotspot-gc-dev at openjdk.java.net>
>> *Subject: *Re: RFR (S): 8195115: G1 Old Gen MemoryPool
>> CollectionUsage.used values don't reflect mixed GC results
>>
>> On 1/25/18 1:04 PM, Hohensee, Paul wrote:
>>
>>
>>> The JMX API spec doesn’t specify what the memory pool or
>> garbage > collector names are, but the current names are
>> de-facto part of the > API, so if we change the existing
>> ones, imo a CSR should be filed.
>>
>> The names are implementation details but I can see how an application
>>
>> might be impacted if they depend on it. CSR approval is not strictly
>>
>> necessary while I think filing one to document the change would be
>>
>> good.
>>
>> Does the name change impact any application you know of? I'm trying to
>>
>> understand if any improvement to API is needed so that applications
>>
>> don't need to depend on the names.
>>
>>
>> Mandy
>>
>>
>>
>>
>>
>>
>>
>>
>
>
More information about the hotspot-gc-dev
mailing list