<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<a class="moz-txt-link-freetext" href="http://cr.openjdk.java.net/~tschatzl/8146987/webrev.1/src/share/vm/gc/parallel/parMarkBitMap.cpp.frames.html">http://cr.openjdk.java.net/~tschatzl/8146987/webrev.1/src/share/vm/gc/parallel/parMarkBitMap.cpp.frames.html</a><br>
<br>
<pre><span class="new"> 143 } else if (end_obj < last_obj) {</span>
<span class="new"> 144 if (pointer_delta((HeapWord*)end_obj, (HeapWord*)beg_addr) > pointer_delta((HeapWord*)last_obj, (HeapWord*)end_obj)) {</span>
<span class="new"> 145 last_ret = last_ret - live_words_in_range_helper((HeapWord*)end_obj, last_obj);</span>
<span class="new"> 146 } else {</span>
<span class="new"> 147 last_ret = live_words_in_range_helper(beg_addr, end_obj);</span>
<span class="new"> 148 }
</span></pre>
<span class="new">Did you measure the performance improvement
afforded by lines 144 - 145?<br>
<br>
The calculation of the new address is used in two cases. One is
when<br>
the live object is being moved to its new location. In that case<br>
I would expect that the overwhelmingly common case would be<br>
end_obj > last_obj. The calculation of the new location for a
live object<br>
(where live_words_in_range() is used) proceeds from left to right
(lower<br>
to higher addresses) as each region is scanned looking for live<br>
objects. I would expect the execution of line 145 to be seldom<br>
if ever, so that just using 147 would be fine. The other case is<br>
less clear. When an object is being moved, the object references<br>
within it are updated. That access pattern seems like it would
be more<br>
random to me (fewer cache hits) but if you have data that shows <br>
that line 145 is beneficial, that would be a good data point.<br>
<br>
Jon<br>
</span><span class="new"></span><br>
<br>
<br>
<div class="moz-cite-prefix">On 01/14/2016 06:54 AM, ray alex wrote:<br>
</div>
<blockquote
cite="mid:CAETT6NwX6T7ExrxqsL3r-GZh9ZtnuHm8GGUJNmRwF=4ccZmyog@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><br>
</div>
<div>We previously evaluated some applications from JOlden,
Dacapo and SpecJVM2008 benchmark suites.</div>
<div>We believe these tests could reflect the essential
behaviors of parallel GC's full gc.</div>
<div><br>
</div>
<div>Specifically, the applications are: </div>
<div>BiSort, Em3d, MST, TreeAdd, Health, Perimeter (JOlden)</div>
<div>GCBench, </div>
<div>H2 (Dacapo)</div>
<div>Xml.validation, Compiler.compiler (SPECjvm2008)</div>
<div>with a 1GB heap size for all of them (except Health with
3GB).</div>
<div>The results are averagely 50% for them above.</div>
<div><br>
</div>
<div>Actually, the heap size is a factor that impacts the
frequency of full gcs, but does not influence the effect of
this approach.</div>
<div>As long as a full gc is triggered, we believe this caching
could bring improvement, and the percentage could vary
according to the access patterns of objects in different
programs. </div>
<div>Though in some scenarios the improvement may be less, it
almost does not impose any overhead.</div>
<div><br>
</div>
<div>Hoping these could be helpful for the testing.</div>
<div><br>
</div>
<div>I'm looking forward to your new test result of the patch
and the upstream process.</div>
<div>Please remind me if I could help on the test.</div>
<div><br>
</div>
<div>Thanks!</div>
<div><br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">2016-01-13 23:51 GMT+08:00 Thomas
Schatzl <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:thomas.schatzl@oracle.com" target="_blank">thomas.schatzl@oracle.com</a>></span>:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Hi
Tianyang,<br>
<br>
On Sun, 2016-01-10 at 22:16 +0800, ray alex wrote:<br>
> Hi,<br>
> By following the conventions, I remove the prefix
"get_" and rename<br>
> getter/setter to "last_query_*" in this patch version
4 (attached).<br>
> Besides, I ran the measurements mentioned before
(both for version 3<br>
> and recheck this version 4 ), and there was no
regressions.<br>
><br>
> May I have your reviews on this patch?<br>
><br>
> Thanks!<br>
<br>
first, sorry for the delay, holidays and such...<br>
<br>
I created a CR for this enhancement at<br>
<a moz-do-not-send="true"
href="https://bugs.openjdk.java.net/browse/JDK-8146987"
rel="noreferrer" target="_blank">https://bugs.openjdk.java.net/browse/JDK-8146987</a><br>
<br>
I also made some webrevs for this change that fix some
minor issues:<br>
<br>
base patch that includes some compilation fixes:<br>
<a moz-do-not-send="true"
href="http://cr.openjdk.java.net/%7Etschatzl/8146987/webrev.0/"
rel="noreferrer" target="_blank">http://cr.openjdk.java.net/~tschatzl/8146987/webrev.0/</a><br>
<br>
- recent changes introduced some merge errors in
psParallelCompact,<br>
mostly due to Unified logging<br>
- the assert in psParallelCompact.cpp:3137 missed a
parameter so the<br>
sources did not compile in any debug mode.<br>
<br>
Some minor style fixes:<br>
<a moz-do-not-send="true"
href="http://cr.openjdk.java.net/%7Etschatzl/8146987/webrev.1/"
rel="noreferrer" target="_blank">http://cr.openjdk.java.net/~tschatzl/8146987/webrev.1/</a>
(full)<br>
<a moz-do-not-send="true"
href="http://cr.openjdk.java.net/%7Etschatzl/8146987/webrev.0_to_1/"
rel="noreferrer" target="_blank">http://cr.openjdk.java.net/~tschatzl/8146987/webrev.0_to_1/</a>
(diff)<br>
<br>
- made a few of the new methods private<br>
- some coding style fixes: indentation, trailing white
spaces, copyright dates<br>
- removed the new parMarkBitMap.inline.hpp again because
its methods were only called from parMarkBitmap.cpp, fixed
includes to parMarkBitmap.cpp.<br>
- prepare some push message, please check<br>
<br>
I did run internal gc testing on the .0 webrev (vm.gc
testlist), and did some performance verification (mainly
on specjbb201x as this was what was on hand). It does (as
expected) improve the perf of full gc, although the
improvements are not as high as mentioned in your initial
email, around 15% average full gc time (for some rather
specific configuration). That is still quite good imo.<br>
<br>
Reproducing the results has been one reason why this reply
took a while, it needs a bit of carefully setting up tests
to get results that are not drowned in noise due to gc
occurring on widely different heap contents, and
particularly ergonomics.<br>
Not sure how you managed to do this with something like
dacapo. SPECjvm2008 in "realistic" configurations does not
do a lot of (if any) full gcs.<br>
<br>
The change looks good to me now, but I need to do some
more testing again with the .1 version, to make sure the
cleanup did not introduce any regressions anywhere.<br>
<br>
Thanks,<br>
Thomas<br>
<br>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div class="gmail_signature">Lei Tianyang
<div><br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>