RFR (M): 8027959: Investigate early reclamation of large objects in G1
Bengt Rutisson
bengt.rutisson at oracle.com
Wed Jul 16 12:35:12 UTC 2014
Hi Thomas,
Thanks for fixing this! Overall it looks very good.
I had an offline conversation with Thomas about using more explicit
testing for humongous objects instead of relying on failed allocations
in PLABs. We need to evaluate that from a performance perspective.
Some other minor comments:
g1_globals.hpp
Indentation of the trailing \ looks wrong. At least in the webrev.
g1CollectedHeap.hpp
void set_in_cset(uintptr_t index) {
  assert(get_by_index(index) != IsHumongous,
         "Should not overwrite InCSetOrHumongous value");
  set_by_index(index, InCSet);
}
I think the assert message should be: "Should not overwrite IsHumongous
value"
g1CollectedHeap.cpp
Style comment (not a strong opinion here). I think I would prefer to
move the check for G1ReclaimDeadHumongousObjectsAtYoungGC inside
clear_humongous_is_live_table() rather than having it in gc_prologue().
Also, can clear_humongous_is_live_table() skip the clearing if
_has_humongous_reclaim_candidates is false?
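Roughly something like this is what I have in mind (just a sketch, not
compiled against your webrev; I am guessing at the body of the method, the
interesting part is only where the checks end up):

  void G1CollectedHeap::clear_humongous_is_live_table() {
    // Guard here instead of in gc_prologue().
    if (!G1ReclaimDeadHumongousObjectsAtYoungGC) {
      return;
    }
    // If the previous GC found no reclaim candidates, the table should not
    // have been touched (assuming that holds), so clearing can be skipped.
    if (!_has_humongous_reclaim_candidates) {
      return;
    }
    _humongous_is_live.clear();  // guessed name for the actual clearing
  }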
Similarly, I would prefer to move the
G1ReclaimDeadHumongousObjectsAtYoungGC tests from
do_collection_pause_at_safepoint() into
register_humongous_regions_with_in_cset_fast_test() and
eagerly_reclaim_humongous_regions().
The test in TestGCLogMessages just tests the syntax of the new log
messages. Is it worth adding a test that actually verifies that
humongous objects are reclaimed?
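Something along these lines might do (very rough sketch, not tried out;
class names are placeholders and the heap sizing would need tuning): run a
workload in a child VM that allocates and drops humongous arrays far in
excess of the heap size, and fail if the log shows a full GC, since with
eager reclaim the dead humongous regions should be freed at young GCs:

  import java.io.BufferedReader;
  import java.io.InputStreamReader;
  import java.util.ArrayList;
  import java.util.List;

  public class TestHumongousObjectsAreReclaimed {
      // Child workload: allocate a humongous array (4M with 1M regions) and
      // drop it right away, plus small allocations to trigger young GCs.
      // Total humongous allocation far exceeds the 128m heap, so this only
      // runs without a full GC if dead humongous regions are reclaimed
      // at young GCs.
      public static class Workload {
          static volatile Object sink;
          public static void main(String[] args) {
              for (int i = 0; i < 2000; i++) {
                  sink = new byte[4 * 1024 * 1024];
                  sink = null;
                  for (int j = 0; j < 1024; j++) {
                      sink = new byte[1024];
                  }
              }
          }
      }

      public static void main(String[] args) throws Exception {
          List<String> cmd = new ArrayList<String>();
          cmd.add(System.getProperty("java.home") + "/bin/java");
          cmd.add("-XX:+UseG1GC");
          cmd.add("-Xms128m");
          cmd.add("-Xmx128m");
          cmd.add("-XX:G1HeapRegionSize=1m");
          cmd.add("-XX:+PrintGC");
          cmd.add("-cp");
          cmd.add(System.getProperty("java.class.path"));
          cmd.add(Workload.class.getName());

          Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
          boolean sawFullGC = false;
          BufferedReader r = new BufferedReader(
              new InputStreamReader(p.getInputStream()));
          String line;
          while ((line = r.readLine()) != null) {
              if (line.contains("Full GC")) {
                  sawFullGC = true;
              }
          }
          r.close();
          p.waitFor();
          if (sawFullGC) {
              throw new RuntimeException("Saw a full GC; humongous objects "
                  + "do not seem to be reclaimed at young GCs");
          }
      }
  }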
Thanks,
Bengt
On 2014-07-15 11:10, Thomas Schatzl wrote:
> Hi all,
>
> could I have reviews for the following change that allows G1 to
> eagerly/early reclaim humongous objects on every GC?
>
> Problem:
>
> In G1 large objects are always allocated in the old generation,
> currently requiring a complete heap liveness analysis (full gc, marking)
> to reclaim them.
>
> This is far from ideal for many transaction-based enterprise
> applications that create large objects that are only live until a
> (typically short-lived) transaction has been completed (e.g. the
> ResultSet of a JDBC query that generates a large result,
> byte output streams, etc).
> This results in the heap filling up relatively quickly, typically
> leading to unnecessary marking cycles just to reclaim them.
>
> The solution implemented here is to directly target these types of
> objects by using remembered set and reachability information from any GC
> to make (conservatively) sure that we can reclaim the space.
>
> You can quickly determine this: if there are no references from the roots
> or the young gen to that object, and there are no remembered set entries
> for that object, it is dead. This is sufficient because:
> - G1 root processing always walks over all roots and the young gen, which
> are the sources of potential references.
> - the remembered set contains the potential locations that reference this
> object. These are all such locations, as humongous objects are always
> allocated into their own regions (so there can be no intra-region
> references).
>
> These are all the potential reference locations during the GC pause,
> because the pause makes sure that the remembered set is current at pause
> time.
>
> We can also reclaim a region that is only considered live by the marking
> because it has been allocated during marking (as long as the other
> conditions hold). At reclaim time, if something actually referenced that
> object, there must have been either a remembered set entry or a reference
> from the roots or the young gen; since there is neither, nobody can
> install a reference to it any more.
> (If there has ever been a reference from another old region, it must
> have had a remembered set entry).
> When marking continues after the GC, it will simply notice that the region
> has been freed, and skip over it.
>
> After putting the humongous region into the collection set, liveness
> detection occurs by intercepting the slow path for allocation of space
> for that humongous object. As it is humongous, we always end up there.
>
> The change includes some minor optimizations:
> - while registering the humongous regions for inclusion in the
> collection set, we already check whether that humongous object is
> actually one we can potentially remove, e.g. it has no remembered set
> entries. This makes it a "humongous candidate" (note there is no actual
> state for this, just a name for these regions)
> - after finding out that the region is live once, remove that humongous
> region from the collection set so that further references to it do not
> cause us to go into the slow path. This is to avoid going into the slow
> path too often if that object is referenced a lot. (Most likely, if that
> object had many references it would not be a "humongous candidate" btw)
> - if there were no candidates at the start of the GC, then do not
> bother trying to reclaim later.
>
> In total I found no particular slowdown when enabling this feature by
> default. I.e. if there are no humongous candidate objects, there will be
> no change to the current code path at all because none will be added to
> the collection set.
>
> The feature can be disabled completely by disabling
> G1ReclaimDeadHumongousObjectsAtYoungGC.
>
> There is a new log line "Humongous Reclaim" measuring reclaim time, and
> if G1LogLevel=finest is set it prints some statistics about total,
> candidate and reclaimed humongous objects on the heap.
>
> The CR contains a graph showing large improvements on average humongous
> object reclamation delay. In total we have seen some benchmarks
> reclaiming GBs of heap space over time using this functionality (instead
> of waiting for the marking/full GC). This improves throughput
> significantly as there is more space available for the young gen on
> average now.
>
> Also, it might save users from manually increasing the heap region size
> just to avoid humongous object troubles.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8027959
>
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8027959/webrev/
>
> Testing:
> jprt, aurora adhoc, various internal benchmarks
>
> Thanks,
> Thomas
>
>