RFR: 8138920: Refactor the sampling thread from ConcurrentG1RefineThread

Thu Oct 22 08:53:39 UTC 2015

Hi Derek,

On Wed, 2015-10-21 at 12:29 -0400, Derek White wrote:
> Thanks Thomas,
> 
> On 10/21/15 4:51 AM, Thomas Schatzl wrote:
> 
> > Hi,h
> > 
> > On Tue, 2015-10-20 at 15:56 -0400, Derek White wrote:
> > > 4th Webrev for review please: 
> > > 
> > > This version includes the cleanups suggested by Per.
> > > 
> > > RFE: JDK-8138920 Refactor the sampling thread from
> > > ConcurrentG1RefineThread 
> > > 
> > > Webrev: http://cr.openjdk.java.net/~drwhite/8138920/webrev.04/
> > > 
> > > jprt: waiting for results (which might take a long time given current
> > > jprt status).
> > > 
> > > Thanks for looking!
> > Looks good overall. Some minor comments:
> > 
> > - I am okay with deferring the move of the sample thread to another CR.
> > Is there a CR for that already?
> JDK-8140255 Move the management of G1YoungRemSetSamplingThread from
> ConcurrentG1Refine

Thanks.
> 
> > - the comment for ConcurrentG1Refine::sampling_thread() might need some
> > elaboration.
> I agree. :-) If you have any suggestions...

I think we can just remove that if we describe the purpose of the
G1YoungListSampleThread.

> > - also add some description of the purpose of the new
> > G1YoungListSampleThread class. 
> You are assuming that I understand what this code is really doing (and
> why) :-)

The remembered set sampling thread is used to re-assess the validity of
the prediction for the remembered set lengths of the young generation.

At the end of the GC G1 determines the length of the young gen based on
how much time the next GC can take, and when the next GC may occur
according to the MMU.

The assumption is that a significant part of the GC is spent on scanning
the remembered sets (and many other components): for some reason the
original implementors of G1 thought that remembered sets were the most
critical and volatile ones (and probably most easiest). So they added
that thread that constantly reevaluates the prediction for the
remembered set scanning costs, and resizes the young gen. This
potentially makes the GC do a premature GC (or even increase the young
gen because we can :)) to keep pause time length.

Now somebody needs to summarize above in a more coherent way :)

That's why I also mentioned that G1YoungListSampleThread is not a very
good name as the thread does not sample the young "list" (whatever this
is) but the remembered set size to adapt the young gen size.

Maybe something with young gen length (re-)sizing/evaluating or
something with remembered set sampling in the name would be a better
fit.

I personally think that the initial assumption about remembered sets is
an oversimplification. There are lots of other reasons to e.g. increase
the young gen if the MMU allows it.
> 
> OK, in order to understand this better I really needed a definition of
> "MMU". Surprisingly I could find no definition of MMU in the source
> code. Although that seems to be a more major deficiency in our
> documentation, I'd like to finish 8138920 someday. See "JDK-8140251
> Define the G1 term MMU somewhere in the source code".

The change is fine, I think it's already pushed too.
> 
> Still looking into how this all fits together. I ask myself - How is
> this related to
> ConcurrentMarkThread::handle_adaptive_young_list_length()? Does that
> really have to do with marking?

The MMU tracker considers all (GC related) pauses. Since marking also
contains a few of them, they must be considered in the MMU.

The advantage of marking pauses are that they can be scheduled somewhat
flexibly - like it does not really matter (simplified) if GC remark or
GC cleanup are postponed a little.

Handle_adaptive_young_list_length() is a complete misnomer though,
probably named by the condition that enables the actual work.

I.e. only if G1CollectedHeap::adaptive_young_list_length() is true,
there is a point in scheduling the marking pauses. Sizing the young gen
is the main (and only) way G1 can space GC pauses at the moment (G1
cannot actively slow down mutators at this time), so this delaying of
the marking pauses is only done if G1 can choose young gen size by
itself.

> Where are other parts of this functionality hidden? What is that
> beautiful house?  Where does that highway go to?  Am I right?...Am I
> wrong? My God!...What have I done?! 

:)

> Same as it ever was... Still looking...
> > I am okay with the name, it could be more
> > specific like G1YoungRemSetSamplingThread. I will let you decide whether
> > to keep it.
> OK, changed.
> > - please rename
> > ConcurrentMarkThread::handle_adaptive_young_list_length() to something
> > more meaningful. The method delays the current thread so that the next
> > expected pauses observe the MMU. (I think it is really good to extract
> > this into a method!)
> > So something like observe_mmu() or delay_thread_to_keep_mmu() would be
> > imo more appropriate. These are just suggestions I came up within a few
> > seconds though...
> Those sound better, but still looking...

See above for background.

> > - please also add some description of what these new methods do if not
> > really obvious. E.g. that handle_adaptive_young_list_length() method
> > does not qualify for obvious :)
> Lots of non-obvious here. I found it pretty confusing that the
> variable "adaptive_young_list_length" is not actually a length, but a
> boolean!

Either rename it (probably best) or wrap it in a condition with a useful
name.

> >  I saw that the change fixes existing
> > comments, but they are few and far between.
> > 
> > Thanks for handling this issue,
> >   Thomas
> 
> OK, thanks Thomas. I'll get a new webrev out after some more research.

Thanks,
  Thomas