Request for review (S): 7068625 Testing 8 bytes of card table entries at a time speeds up card-scanning

John Coomes John.Coomes at oracle.com
Mon Mar 12 23:08:38 UTC 2012


Alexey Ragozin (alexey.ragozin at gmail.com) wrote:
> Hi,
> 
> Updated patch is here
> http://cr.openjdk.java.net/~brutisso/7068625/webrev.01/

Hi Alexey,

Can you change back to the single decrement of cur_entry in
ClearNoncleanCardWrapper::do_MemRegion()?  With the updated code,
having the increment at the end of the loop (line 226 in the new
version) keeps the preceding comment relevant.

In cardTableModRefBS.hpp, I think

  75   // a word worth row of clean card values

should be

  75   // a word's worth (row) of clean card values

Aside from those two nits, looks good.

-John

> From: John Coomes <John.Coomes at oracle.com>
> > Subject: Re: Fwd: (resend) Request for review (S): 7068625 Testing 8
> >        bytes of        card table entries at a time speeds up card-scanning
> > To: Bengt Rutisson <bengt.rutisson at oracle.com>
> > Cc: "hotspot-gc-dev at openjdk.java.net"
> >        <hotspot-gc-dev at openjdk.java.net>
> > Message-ID: <20300.16537.700270.474085 at oracle.com>
> > Content-Type: text/plain; charset=us-ascii
> >
> > Bengt Rutisson (bengt.rutisson at oracle.com) wrote:
> > >
> > > Hi all,
> > >
> > > Just pinging this review request. Does anybody have some time to look at
> > > it? It is a fairly small and straight forward change...
> >
> > Semantically looks ok.  Some style issues, though.
> >
> > cardTableRS.cpp:
> > ---------------
> >
> > I'd rather see the loop on 207 rewritten as:
> >
> >  207         while (cur_row >= limit &&
> >  208                *((intptr_t*)cur_row) ==
> > CardTableRS::clean_card_row()) {
> >  209           cur_row -= BytesPerWord;
> >  210         }
> >
> 
> done
> 
> as it's shorter and eliminates the 'break'.
> >
> > There are also whitespace problems (inconsistent indentation, 4 spaces
> > instead of 2, no space after while, etc.).
> >
> > The comment that starts with 'Reset the dirty window' is now in only
> > one of the places that does the reset.  And this comment:
> >
> >    // Note that "cur_entry" leads "start_of_non_clean" in
> >    // its leftward excursion after this point
> >    // in the loop and, when we hit the left end of "mr",
> >    // will point off of the left end of the card-table
> >    // for "mr".
> >  }
> >
> > is in an odd position (no code follows it) and the 'after' should be
> > changed to 'at'.
> >
> > But instead of changing the comments, the new behavior can be achieved
> > with a single new hunk of code in do_MemRegion that keeps the comments
> > relevant:
> >
> > --- cardTableRS.cpp.org
> > +++ cardTableRS.cpp
> > @@ -17,10 +17,22 @@
> >       // "dirty" range accumulated so far.
> >       if (start_of_non_clean < end_of_non_clean) {
> >         const MemRegion mrd(start_of_non_clean, end_of_non_clean);
> >         _dirty_card_closure->do_MemRegion(mrd);
> >       }
> > +
> > +      // fast forward through potential continuous range of clean cards
> > +      if (is_word_aligned(cur_entry)) {
> > +        jbyte* cur_row = cur_entry - BytesPerWord;
> > +        while (cur_row >= limit &&
> > +               *((intptr_t*)cur_row) == CardTableRS::clean_card_row()) {
> > +          cur_row -= BytesPerWord;
> > +        }
> > +        cur_entry = cur_row + BytesPerWord;
> > +        cur_hw = _ct->addr_for(cur_row + BytesPerWord);
> > +      }
> > +
> >       // Reset the dirty window, while continuing to look
> >       // for the next dirty card that will start a
> >       // new dirty window.
> >       end_of_non_clean = cur_hw;
> >       start_of_non_clean = cur_hw;
> >
> > I think it's easier to follow.
> >
> >
> agreed
> 
> 
> > cardTableRS.hpp:
> > ---------------
> >
> >  48   // making this constants into inline methods kills performance for
> > some reason
> >  49   static intptr_t clean_card_row() {
> >  50     return CardTableModRefBS::clean_card_row;
> >  51   }
> >
> > The comment seems to warn against the very thing the code is doing.
> > If the comment is accurate, it's fine to eliminate the method and use
> > the enum directly.
> >
> 
> comment was included in patch by accident, removed
> 
> 
> >  53   static bool  card_is_dirty_wrt_gen_iter(jbyte cv) {
> >  54     return CardTableModRefBS::card_is_dirty_wrt_gen_iter(cv);
> >
> > This whitespace change looks spurios.
> >
> 
> true
> 
> 
> >
> > -John
> >
> > > -------- Original Message --------
> > > Subject:      Request for review (S): 7068625 Testing 8 bytes of card
> > table
> > > entries at a time speeds up card-scanning
> > > Date:         Tue, 21 Feb 2012 12:03:50 +0400
> > > From:         Alexey Ragozin <alexey.ragozin at gmail.com>
> > > To:   hotspot-gc-dev at openjdk.java.net
> > > CC:   Bengt Rutisson <bengt.rutisson at oracle.com>
> > >
> > >
> > >
> > > Hi,
> > >
> > > I would like few volunteers to review changes for
> > > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7068625
> > > WebRev: http://cr.openjdk.java.net/~brutisso/7068625/webrev.00/
> > > <http://cr.openjdk.java.net/%7Ebrutisso/7068625/webrev.00/>
> > >
> > > Change summary
> > > For large heaps (I was focusing on 8GiB and above) it is common to have
> > > long continuous ranges of clean cards.
> > > Patch is introducing a short path for skipping ranges of clean cards
> > > using word aligned memory access instead of byte aligned.
> > >
> > > Patch affects serial and CMS collectors. For CMS collector stride size
> > > should be increase to see any performance gains (I was using
> > > -XX:+UnlockDiagnosticVMOptions
> > > -XX:ParGCCardsPerStrideChunk=4096)
> > >
> > > For testing I was mainly using synthetic benchmark randomly modifying
> > > hash tables in heap, thus uniformly touching cards across heaps.
> > > Average duration of young GC pause were used as KPI.
> > > More details about testing can be found at
> > >
> > http://blog.ragozin.info/2011/07/openjdk-patch-cutting-down-gc-pause.html
> > > Though article is referring jdk6, my resent tests with trunk jdk7 show
> > > no difference.
> > > I was also tested patch with real application (Oracle Coherence storage
> > > node).
> > > With 16GiB of heap and CMS/ParNew GC, enabling patch have shortened GC
> > > pauses roughly in 2 times.
> > >
> > > Source code of benchmark used in test are available at
> > >
> > https://gridkit.googlecode.com/svn/branches/aragozin-sandbox/young-gc-bench
> > > Main class YoungGCPauseBenchmark
> > >
> > > Regards,
> > > Alexey
> > >
> >



More information about the hotspot-gc-dev mailing list