RFR(M): 8218049: Survivor MemoryMXBean used() size granularity is region based

Thu Apr 4 09:02:40 UTC 2019

On Wed, 2019-04-03 at 18:43 -0400, Kim Barrett wrote:
> > On Apr 3, 2019, at 4:05 AM, sangheon.kim at oracle.com wrote:
> > 
> > Hi all,
> > 
> > Here's the second webrev which includes some recommendations from
> > Thomas:
> > 1) Fixing memory ordering issue
> > 2) Changing how G1SurvivorRegions collect data.
> > 3) Some minor refactoring.
> > 
> > http://cr.openjdk.java.net/~sangheki/8218049/webrev.1
> > http://cr.openjdk.java.net/~sangheki/8218049/webrev.1_inc_0/
> > Testing: hs-tier 1 ~ 5, survivorAlignment/* tests
> > 
> > Thanks,
> > Sangheon
> 
> -------------------------------------------------------------------
> -----------
> src/hotspot/share/gc/g1/g1SurvivorRegions.cpp
>   31 G1SurvivorRegions::G1SurvivorRegions() : _regions(new
> (ResourceObj::C_HEAP, mtGC) GrowableArray<HeapRegion*>(8, true,
> mtGC)),
>   32                                          _used_bytes(0) {}
> 
> This is kind of weirdly formatted.  How about putting a line break
> after the colon, and indenting the initializers "normally".
> 
> -------------------------------------------------------------------
> -----------
> src/hotspot/share/gc/g1/g1CollectedHeap.cpp
> 4589   OrderAccess::storestore();
> 
> src/hotspot/share/gc/g1/g1MonitoringSupport.cpp
>  233   OrderAccess::loadload();
> 
> (Note: I wrote the comment below before reading Thomas's
> reply.  Short
> summary: I don't believe the newly introduced barriers are
> sufficient.)
> 
> I think these new uses of OrderAccess are an attempt to ensure the
> monitoring code gets a consistent pair of values.  But I don't think
> it accomplishes that goal.  We have
> 
> (G1CollectedHeap::retire_mutator_alloc_region)
> T1: (a) update heap used, storestore, (b) updating eden used
> 
> (G1MonitoringSupport::recalculate_sizes)
> T2: (a) get heap used, loadload, (b) getting eden used
> 
> Consider the following interleaving:
> T2-a: get heap used
> T1-a: update heap used
> T1-b: update eden used
> T2-b: get eden used

:(

> The other interleaving resulting in inconsistent values being used
> for reporting (T1-a, T2-a, T2-b, T1-b) at least doesn't risk tripping
> the assert.

:( :(

> 
> I'm not sure what to suggest here, because I'm not sure what the
> goals are. I don't think it's possible to ensure the monitoring side
> is using consistent values without additional synchronization. I 

At this time the only condition is to get "consistent" values into the
MemoryUsage output, however they are achieved.

That's why there has previously been this "subtract_until_zero" and
similar weird calculations to get at least some resemblance of
internally consistent values.

Ideally of course all values are consistent and actually reflect the
state of what the GC sees.

> guess there is some reason not to use a lock here? Double buffering
> or GlobalCounter could be used instead.

Double buffering (if I think it is what you think it is) would have
been what I would have suggested, but it seems at least some effort.

For this change I suggest going back to use the subtract_until_zero()
method (whether we use the racy region counts or the actual "used
bytes" makes not difference with respect to the problem) and
investigate improvements in a separate CR.

Thanks,
  Thomas