RFR(M): 8218049: Survivor MemoryMXBean used() size granularity is region based

Wed Apr 3 22:43:38 UTC 2019

> On Apr 3, 2019, at 4:05 AM, sangheon.kim at oracle.com wrote:
> 
> Hi all,
> 
> Here's the second webrev which includes some recommendations from Thomas:
> 1) Fixing memory ordering issue
> 2) Changing how G1SurvivorRegions collect data.
> 3) Some minor refactoring.
> 
> http://cr.openjdk.java.net/~sangheki/8218049/webrev.1
> http://cr.openjdk.java.net/~sangheki/8218049/webrev.1_inc_0/
> Testing: hs-tier 1 ~ 5, survivorAlignment/* tests
> 
> Thanks,
> Sangheon

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1SurvivorRegions.cpp
  31 G1SurvivorRegions::G1SurvivorRegions() : _regions(new (ResourceObj::C_HEAP, mtGC) GrowableArray<HeapRegion*>(8, true, mtGC)),
  32                                          _used_bytes(0) {}

This is kind of weirdly formatted.  How about putting a line break
after the colon, and indenting the initializers "normally".

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1CollectedHeap.cpp
4589   OrderAccess::storestore();

src/hotspot/share/gc/g1/g1MonitoringSupport.cpp
 233   OrderAccess::loadload();

(Note: I wrote the comment below before reading Thomas's reply.  Short
summary: I don't believe the newly introduced barriers are sufficient.)

I think these new uses of OrderAccess are an attempt to ensure the
monitoring code gets a consistent pair of values.  But I don't think
it accomplishes that goal.  We have

(G1CollectedHeap::retire_mutator_alloc_region)
T1: (a) update heap used, storestore, (b) updating eden used

(G1MonitoringSupport::recalculate_sizes)
T2: (a) get heap used, loadload, (b) getting eden used

Consider the following interleaving:
T2-a: get heap used
T1-a: update heap used
T1-b: update eden used
T2-b: get eden used

The monitoring code's heap-used value is out of sync with it's
eden-used value, the latter containing the new update but the former
not. And that could lead to failed assertions and inconstent reported
values. I don't know how bad inconsistent values might be, other than
the assert in recalculate_sizes looks like it could theoretically get
tripped. (Consider before any objects have been promoted to old.)

The other interleaving resulting in inconsistent values being used for
reporting (T1-a, T2-a, T2-b, T1-b) at least doesn't risk tripping the
assert.

I'm not sure what to suggest here, because I'm not sure what the goals
are. I don't think it's possible to ensure the monitoring side is
using consistent values without additional synchronization. I guess
there is some reason not to use a lock here? Double buffering or
GlobalCounter could be used instead.

------------------------------------------------------------------------------