CRR: 7046558 G1: concurrent marking optimizations (S)
Tony Printezis
tony.printezis at oracle.com
Wed Jun 1 12:19:48 UTC 2011
Hi,
Could I please have a couple of code reviews for the following change:
http://cr.openjdk.java.net/~tonyp/7046558/webrev.0/webrev.all/
It includes a few optimizations in the concurrent marking code. The fact
that webrev shows that almost 400 lines of code have been changed is
actually a bit misleading given that the changes are actually relatively
simple. Here's a breakdown to make your life a bit easier:
Change 1: Specialize the G1CMOopClosure to avoid the virtual call to
do_oop() every time we visit an oop during marking (the closure used to
be called CMOopClosure, I prefixed its name with G1 since it's been
moved out of the concurrentMark.cpp file). Note: I didn't change much
code here, just moved things around.
http://cr.openjdk.java.net/~tonyp/7046558/webrev.0/webrev.G1CMOptSpecial/
Change 2 (builds on Change 1): Move methods CM::deal_with_reference()
and CM::push() to the .inline.hpp file, without changing them, to make
sure they get inlined, as they are called from the fast path. You can
either check line by line that they are unchanged. Or you can take my
word for it. :-)
http://cr.openjdk.java.net/~tonyp/7046558/webrev.0/webrev.G1CMOptInline/
Change 3 (builds on Change 2): A few minor performance improvements here
and there (plus a couple of cosmetic changes to introduce some missing
curly brackets).
http://cr.openjdk.java.net/~tonyp/7046558/webrev.0/webrev.G1CMOptMisc/
I kept track of the separate patches to be able to evaluate them
separately so I thought I'd present the changes separately to make the
code review process a bit easier. I will of course push them as a single
changeset.
I was not quite sure how to evaluate the performance impact of the
changes given that marking runs concurrently and not always in a
deterministic fashion (from a performance point-of-view!!!). So I did
the evaluation using collect and analyze. When looking at time-based
profiles, I see that the above changes decrease the time we spend in the
marking code (i.e., how long we spend in the
CMConcurrentMarkingTask::work(int i) method) by between 7%-10%. The
improvement is similar when looking at cycle counts. And I also see a
nice decrease in L2 cache misses. Each separate Change does improve on
the previous one, but the Change with the biggest impact was 1.
Tony
More information about the hotspot-gc-dev
mailing list