<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Jungwoo,<br>
<br>
Thanks for the information.<br>
<br>
There's a bug in the current code which tries to patch up the BOT
after<br>
a block is split. I don't understand yet what is wrong but a simple<br>
workaround (which is known to be slower than necessary) avoids the <br>
much of the long pauses. After this has been fixed, we should
compare<br>
the performances of our two implementations.<br>
<br>
Jon<br>
<br>
<br>
<div class="moz-cite-prefix">On 5/14/2015 2:52 PM, Jungwoo Ha wrote:<br>
</div>
<blockquote
cite="mid:CA+n_jhgXinMQr6zc5EfivtXivUxs9rqqn9C49+5sN6x4R6SnwA@mail.gmail.com"
type="cite">
<div dir="ltr">We have a faster implementation for card table
scan, and on our implementation ~20s pause goes down to 1~2s
with above-mentioned experiments. It doesn't solve the problem
but it alleviates it (and actually shows that card table scan is
faster).
<div>If someone can sponsor this, I will create a patch for the
review.<br>
<div><br>
</div>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, May 13, 2015 at 6:36 PM,
Srinivas Ramakrishna <span dir="ltr"><<a
moz-do-not-send="true" href="mailto:ysr1729@gmail.com"
target="_blank">ysr1729@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div dir="ltr">Hi Jon --
<div><br>
</div>
<div>thanks, just a comment inline below....<br>
<div class="gmail_extra"><br>
<div class="gmail_quote"><span class="">On Wed, May
13, 2015 at 4:06 PM, Jon Masamitsu <span
dir="ltr"><<a moz-do-not-send="true"
href="mailto:jon.masamitsu@oracle.com"
target="_blank">jon.masamitsu@oracle.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0
0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span> <br>
<br>
<div>On 5/13/2015 11:40 AM, Srinivas
Ramakrishna wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Hi Jon --
<div><br>
</div>
<div>My expectation is that for any
large block spanning multiple cards,
if you walk the BOT entries for that
block from lower addresses to higher
addresses, the BOT entries should be
monotone increasing (albeit not
strictly monotone) until the last card
for that block (but one). In the
example you gave below, is D > C
> N_words, with the BOT entries in
some interior window of the same
block/object going (low to high
addresses in the window) like " ....
DDDDCCCCDDDD...." ? In other words,
there are lower-valued entries C
within a range of higher-valued
entries D, with both C and D denoting
logarithmic entries for the block in
question?</div>
</div>
</blockquote>
<br>
</span> Yes, that's what I'm seeing.<span><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>If you look in the BOT methods,
there was a parameter "reducing"
passed to some of the mark_block()
methods which was supposed to do some
kind of assertion checking towards
maintenance of this monotonicity, but
looking at it now, that may not be
sufficient and the assert could be
strengthened at the "end" of the range
being marked. More worryingly, I
believe I had turned off even that
assert before I left (ca 6uXX?)
because it was failing at that time,
but the corresponding bug hadn't been
fixed, and seems as though it still
isn't. This might possibly have some
bearing on the non-monotonicity in the
entries you see.</div>
</div>
</blockquote>
<br>
</span> Thanks for that bit of information.
So what I'm seeing is not<br>
necessarily the cause for the long pauses.
But see my next comment.<span><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>I'll hopefully get some time to go
over this code again. I was looking at
the code on the train ride into work
this morning, and trying to remember
the details, but have not quite delved
again into all of its details yet or
why the assert I mention above should
be failing.</div>
</div>
</blockquote>
<br>
</span> After the logarithmic stride back to
find the block containing the<br>
start of the object, there is a object by
object walk forward to find the<br>
to find the start of the object<br>
<br>
<blockquote type="cite"> 558 while (n
<= addr) {<br>
559 debug_only(HeapWord* last =
q); // for debugging<br>
560 q = n;<br>
561 n += _sp->block_size(n);<br>
</blockquote>
[...]<br>
<br>
From instrumentation I think the long pauses
originate in this forward walk<br>
(mostly in the block_size() method). So
maybe the BOT "DDDDDCCCCC...CDDDD"<br>
indicates that BOT at the start of an objects
is being trashed and we're<br>
going way too far back. I'm trying to use
VerifyBlockOffsetArray but it seems to<br>
have bit rotted. I'll keep trying.</div>
</blockquote>
<div><br>
</div>
</span>
<div>Ah, that's what I was afraid of. This must
likely mean that when a large block is split, the
suffix from the split</div>
<div>(which is typically returned to the free list)
isn't getting fixed up as was the intention, or
when the sweeper</div>
<div>coalesces smaller blocks into a larger one, the
new larger block isn't getting fixed correctly
(although that seems</div>
<div>unlikely). Using VerifyBlockOffsetArray and
asserting if the forward walk is more than a
card's worth in length</div>
<div>would seem to be one way to catch that badness
closer to when it happens...</div>
<div><br>
</div>
<div>Thanks for the update, Jon!</div>
<div>-- ramki</div>
<div>
<div class="h5">
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0
0 0 .8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span><font
color="#888888"><br>
<br>
Jon</font></span>
<div>
<div><br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>It might be worthwhile (if it
is possible) to do a bisect on
whatever the first known JDK
(7uXX?) where the performance
regression appears and try to
narrow down the change that may
have caused the regression, using
James' test program. I have a
sneaking suspicion (because of the
elided assert) that it might well
be a change I had made towards
closing some
BOT-walking/card-scanning cracks
when initializing CMS objects ca
6uXX.</div>
<div><br>
</div>
<div>James, in case you have read
thus far, can you provide the
exact version number of the 6uXX
Oracle jdk you used that did _not_
show the performance regression
for your test program?</div>
<div><br>
</div>
<div>Thanks!</div>
<div>-- ramki</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, May
12, 2015 at 1:50 PM, Jon Masamitsu
<span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:jon.masamitsu@oracle.com"
target="_blank">jon.masamitsu@oracle.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex">
<div bgcolor="#FFFFFF"
text="#000000"> Ramki,<br>
<br>
The logarithmic BOT is taking
big steps back to<br>
find the start of a large
object. But something is odd.<br>
<br>
Is there a reason why the BOT
entries would decrease for a
while<br>
and then increase without a
value that gives the start of
an object?<br>
<br>
&_array->_offset_array[index]
= 0xffffffff6550f600
"DDDDDDDDDDDDDD ..."<br>
<br>
and then down the road.<br>
<br>
print
&_array->_offset_array[0x110600U],
_array->_offset_array[0x10f600U
><br>
&_array->_offset_array[0x110600U]
= 0xffffffff65510600
"CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC<br>
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCDDDDDDDDDDDDDDDD<br>
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD<br>
DDDDDDDDDDDDDDDDDDDDDDDDDD<br>
<br>
'D' is 0x44<br>
<br>
N_words is 0x40<span><font
color="#888888"><br>
<br>
Jon</font></span>
<div>
<div><br>
<br>
<div>On 05/12/2015 08:46
AM, Srinivas Ramakrishna
wrote:<br>
</div>
<blockquote type="cite">
<div>Thanks for that
background Jon; I'll
try and look at the
8uxx code and run
James's example when I
get a few spare cycles
(which won't be soon,
unfortunately).</div>
<div><br>
</div>
<div>-- ramki<br>
<br>
ysr1729</div>
<div><br>
On May 8, 2015, at
14:25, Jon Masamitsu
<<a
moz-do-not-send="true"
href="mailto:jon.masamitsu@oracle.com" target="_blank">jon.masamitsu@oracle.com</a>>
wrote:<br>
<br>
</div>
<blockquote type="cite">
<div> Ramki,<br>
<br>
I compared the
memory/blockOffsetTable.*
files between 6u35
and 7 and there <br>
were essentially no
differences. The
.cpp file had some
changes in
assertions.<br>
The other two were
the same except for
comments.<br>
<br>
Between 6 and 6u35
(didn't pin point
it), we changed from
the rather<br>
weak N_powers=3 :-)
to N_powers=14.<br>
<br>
There seem to be two
problematic cases.<br>
<br>
1) Spike in ParNew
times that lasts for
2 (or a few) GC's<br>
2) Continually
growing ParNew times<br>
<br>
From just reading
the CR<br>
<br>
<a
moz-do-not-send="true"
href="https://bugs.openjdk.java.net/browse/JDK-8079274" target="_blank">https://bugs.openjdk.java.net/browse/JDK-8079274</a><br>
<br>
the test case seems
to reproduce 1).
The test case<br>
has a large object
that lives till the
application exits.<br>
If it were a broken
logarithmic BOT, I
would not expect<br>
the ParNew pauses to
shrink again. <br>
<br>
Jon<br>
<br>
<br>
<br>
<div>On 5/8/2015
11:49 AM, Srinivas
Ramakrishna wrote:<br>
</div>
<blockquote
type="cite">
<div dir="ltr">This
is indeed
interesting...
<div><br>
</div>
<div>Jesper, did
you figure out
why (as James
seems to note)
the problem is
seen in JDK 8
but not in JDK
6 (he's using
ParNew in both
cases). Also,
isn't the BOT
logarithmic,
which will
allow threads
to "quickly"
traverse back
to the start
of even a
large object?
Did something
change in the
BOT that might
have led to
what James
sees as a
regression, or
did the range
that he
characterizes
as a "blind
spot" merely
move from one
place to
another?</div>
<div><br>
</div>
<div>Unfortunately,
I have not
been keeping
up with the
changes in
this area and
am not working
with JDK 6
code much, so
can't offer
any clues, but
am surprised
that this
would be a
regression. (I
would expect
the same
performance
pathologies in
this code to
exist in 6 and
forward -- has
it been
established
conclusively
that this is
indeed a
regression?)</div>
<div><br>
</div>
<div>thanks!</div>
<div>-- ramki</div>
</div>
<div
class="gmail_extra"><br>
<div
class="gmail_quote">On
Fri, May 8,
2015 at 6:40
AM, James Lent
<span
dir="ltr"><<a
moz-do-not-send="true" href="mailto:jlent@tivo.com" target="_blank">jlent@tivo.com</a>></span>
wrote:<br>
<blockquote
class="gmail_quote"
style="margin:0
0 0
.8ex;border-left:1px
#ccc
solid;padding-left:1ex">
<div
bgcolor="#FFFFFF"
text="#000000">
The
explanation of
the problem
added by
Jesper
Wilhelmsson on
<span> <span
title="2015-05-06
08:59">2015-05-06
08:59 </span></span>to
JDK-8079274 is
interesting,
but, not very
satisfying. <br>
<br>
1) The problem
affects more
than this one
specific
breakpoint.
It impacts
objects in at
least the
range of about
480M to 512M.
The comment "A
larger object
probably
triggers some
higher level
code that are
able to skip
scanning of
this part of
the memory"
does not
inspire
confidence. It
makes even
more concerned
that we do not
understand
this issue and
that there
could be other
triggers. It
seems like
"the higher
level code"
has a 32M
blind spot.
Further I can
create
scenarios
where the GC
code takes
progressively
longer to
clear the
issue.
Presumably it
completes the
lower block
scan during
each GC
cycle. Why
does it take
longer and
longer each
time (my guess
is it has
something to
do with the
number of
objects moved
to survivor
space). If I
cache 1 in 100
instead of 1
in 10 then
there are 11
progressively
slower
ParNewGC
(after several
fast one) and
the issue
takes minutes
to clear: <br>
<br>
$
/usr/lib/jvm/java-7-oracle/bin/java
-verbose:gc
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-Xmx4G -Xms4G
LongParNewPause
536870384 100
100 <br>
0.308:
[GC0.308:
[ParNew:
272640K->3029K(306688K),
0.0113180
secs]
796927K->527316K(4160256K),
0.0114040
secs] [Times:
user=0.05
sys=0.00,
real=0.02
secs] <br>
0.367:
[GC0.367:
[ParNew:
275669K->7766K(306688K),
0.0140840
secs]
799956K->532053K(4160256K),
0.0141630
secs] [Times:
user=0.04
sys=0.01,
real=0.02
secs] <br>
0.430:
[GC0.430:
[ParNew:
280406K->10057K(306688K),
0.0120810
secs]
804693K->534344K(4160256K),
0.0121690
secs] [Times:
user=0.05
sys=0.00,
real=0.02
secs] <br>
0.490:
[GC0.490:
[ParNew:
282697K->15383K(306688K),
0.0128540
secs]
806984K->539670K(4160256K),
0.0129390
secs] [Times:
user=0.05
sys=0.00,
real=0.02
secs] <br>
0.551:
[GC0.551:
[ParNew:
288023K->21006K(306688K),
0.0146480
secs]
812310K->545294K(4160256K),
0.0147650
secs] [Times:
user=0.05
sys=0.01,
real=0.01
secs] <br>
0.613:
[GC0.613:
[ParNew:
293646K->26805K(306688K),
0.0141250
secs]
817934K->551092K(4160256K),
0.0142050
secs] [Times:
user=0.05
sys=0.00,
real=0.01
secs] <br>
0.675:
[GC0.675:
[ParNew:
299445K->24258K(306688K),
0.4816840
secs]
823732K->551436K(4160256K),
0.4817620
secs] [Times:
user=1.90
sys=0.00,
real=0.48
secs] <br>
1.205:
[GC1.205:
[ParNew:
296898K->17914K(306688K),
3.4216550
secs]
824076K->547788K(4160256K),
3.4217180
secs] [Times:
user=13.50
sys=0.00,
real=3.42
secs] <br>
4.674:
[GC4.674:
[ParNew:
290554K->17433K(306688K),
6.3406820
secs]
820428K->550031K(4160256K),
6.3407410
secs] [Times:
user=25.00
sys=0.02,
real=6.34
secs] <br>
11.062:
[GC11.062:
[ParNew:
290073K->17315K(306688K),
10.9493130
secs]
822671K->552636K(4160256K),
10.9493890
secs] [Times:
user=42.98
sys=0.03,
real=10.95
secs] <br>
22.059:
[GC22.059:
[ParNew:
289955K->17758K(306688K),
16.0539650
secs]
825276K->555798K(4160256K),
16.0540250
secs] [Times:
user=62.44
sys=0.09,
real=16.05
secs] <br>
38.161:
[GC38.161:
[ParNew:
290398K->17745K(306688K),
20.6900820
secs]
828438K->558512K(4160256K),
20.6901420
secs] [Times:
user=80.79
sys=0.08,
real=20.69
secs] <br>
58.898:
[GC58.898:
[ParNew:
290385K->21826K(306688K),
24.6781700
secs]
831152K->565310K(4160256K),
24.6782710
secs] [Times:
user=95.58
sys=0.05,
real=24.68
secs] <br>
83.625:
[GC83.625:
[ParNew:
294466K->21834K(306688K),
30.3501800
secs]
837950K->568044K(4160256K),
30.3502610
secs] [Times:
user=115.80
sys=0.11,
real=30.35
secs] <br>
114.022:
[GC114.022:
[ParNew:
294474K->21836K(306688K),
33.6648800
secs]
840684K->570772K(4160256K),
33.6649390
secs] [Times:
user=127.38
sys=0.14,
real=33.67
secs] <br>
147.736:
[GC147.736:
[ParNew:
294476K->18454K(306688K),
38.9828430
secs]
843412K->570117K(4160256K),
38.9829040
secs] [Times:
user=149.14
sys=0.15,
real=38.99
secs] <br>
186.766:
[GC186.766:
[ParNew:
291094K->26602K(306688K),
39.8726730
secs]
842757K->580977K(4160256K),
39.8727330
secs] [Times:
user=156.35
sys=0.11,
real=39.88
secs] <br>
226.686:
[GC226.686:
[ParNew:
299242K->26612K(306688K),
0.0174470
secs]
853617K->583714K(4160256K),
0.0175300
secs] [Times:
user=0.06
sys=0.00,
real=0.02
secs] <br>
<br>
2) The
suggestion to
use only 1
thread is not
acceptable to
me. Use
parallel GC
with one
thread?
Further it did
not work for
me (in fact it
made things
worse).
"user" time
does now equal
"real" time
(and the new
size is 1/4 as
big) so I
think the
parameter took
effect: <br>
<br>
$
/usr/lib/jvm/java-8-oracle/bin/java
-verbose:gc
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-Xmx4G -Xms4G
-XX:ParallelGCThreads=1
LongParNewPause
536870384 100
10 <br>
0.226: [GC
(Allocation
Failure)
0.226:
[ParNew:
68160K->6980K(76672K),
0.0489756
secs]
592447K->531267K(4185792K),
0.0490718
secs] [Times:
user=0.05
sys=0.00,
real=0.05
secs] <br>
0.291: [GC
(Allocation
Failure)
0.291:
[ParNew:
75140K->8512K(76672K),
29.8707281
secs]
599427K->539479K(4185792K),
29.8707954
secs] [Times:
user=29.80
sys=0.00,
real=29.88
secs] <br>
30.180: [GC
(Allocation
Failure)
30.180:
[ParNew:
76672K->8093K(76672K),
30.6357699
secs]
607639K->545772K(4185792K),
30.6358257
secs] [Times:
user=30.57
sys=0.00,
real=30.63
secs] <br>
60.837: [GC
(Allocation
Failure)
60.837:
[ParNew:
76253K->6830K(76672K),
69.2115255
secs]
613932K->552169K(4185792K),
69.2115783
secs] [Times:
user=69.05
sys=0.01,
real=69.22
secs] <br>
130.068: [GC
(Allocation
Failure)
130.068:
[ParNew:
74990K->8226K(76672K),
132.5118903
secs]
620329K->560381K(4185792K),
132.5119459
secs] [Times:
user=132.21
sys=0.00,
real=132.52
secs] <br>
262.603: [GC
(Allocation
Failure)
262.603:
[ParNew:
76386K->8512K(76672K),
0.0400138
secs]
628541K->568748K(4185792K),
0.0400687
secs] [Times:
user=0.04
sys=0.00,
real=0.04
secs] <br>
<br>
3)
Applications
should not be
expected to
tune the GC
settings to
account for
the size of
one static
object. The
connection
between the
size of this
object and the
GC performance
was not easy
to even find.
Further the
parameter
ParGCCardsPerStrideChunk
is pretty low
level and not
one I think
many
application
developers are
familiar with.<br>
<br>
Error: VM
option
'ParGCCardsPerStrideChunk'
is diagnostic
and must be
enabled via
-XX:+UnlockDiagnosticVMOptions.<span><br>
Error: Could
not create the
Java Virtual
Machine.<br>
Error: A fatal
exception has
occurred.
Program will
exit.<br>
<br>
</span> 4) I
can not
reproduce this
issue in
Oracle 6.
This appears
to be a
regression.
How can fixing
a regression
be called an
enhancement
request? <br>
<span> <br>
<br>
<br>
<br>
<br>
<br>
<br>
<hr> <font
color="Gray"
face="Arial"
size="1"><br>
This email and
any
attachments
may contain
confidential
and privileged
material for
the sole use
of the
intended
recipient. Any
review,
copying, or
distribution
of this email
(or any
attachments)
by others is
prohibited. If
you are not
the intended
recipient,
please contact
the sender
immediately
and
permanently
delete this
email and any
attachments.
No employee or
agent of TiVo
Inc. is
authorized to
conclude any
binding
agreement on
behalf of TiVo
Inc. by email.
Binding
agreements
with TiVo Inc.
may only be
made by a
signed written
agreement.<br>
</font> </span></div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</blockquote>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
<br>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div class="gmail_signature">
<div dir="ltr">
<div><span style="font-size:12.8000001907349px">Jungwoo Ha |
Java Platform Team | <a moz-do-not-send="true"
href="mailto:jwha@google.com" target="_blank">jwha@google.com</a></span><br>
</div>
<div><br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</body>
</html>