RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator
Stefan Karlsson
stefan.karlsson at oracle.com
Thu Jun 19 12:45:13 UTC 2014
Hi all,
I have a patch that we have been using in the G1 Class Unloading project
to lower the remark times. This changes Compiler code, so I would like
to get feedback from the Compiler team.
http://cr.openjdk.java.net/~stefank/8047362/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8047362
The patch builds upon the patch in:
http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html
Summary from the bug report:
---
Creation of RelocIterators show up high in profiles of the remark phase,
in the G1 Class Unloading project.
There's a pattern in the nmethod/codecache code to create a
RelocIterator and then materialize a CompiledIC:
RelocIterator iter(this, low_boundary);
while(iter.next()) {
if (iter.type() == relocInfo::virtual_call_type) {
CompiledIC *ic = CompiledIC_at(iter.reloc());
CompiledIC_at is implemented as:
new CompiledIC(call_site->code(), nativeCall_at(call_site->addr()));
And one of the first thing CompiledIC::CompiledIC(const nmethod* nm,
NativeCall* call) does is to create a new RelocIterator:
...
address ic_call = call->instruction_address();
...
RelocIterator iter(nm, ic_call, ic_call+1);
bool ret = iter.next();
assert(ret == true, "relocInfo must exist at this address");
assert(iter.addr() == ic_call, "must find ic_call");
I would like to propose that we pass down the RelocIterator that we
already have, instead of creating a new.
---
I've previously received feedback that this seems like reasonable thing
to do, but that the parameter to the new CompileIC_at should take a
const RelocIterator* instead of RelocIterator*. I couldn't do that
without changing a significant amount of Compiler code, so I have left
it out for now. Any opinions on how to handle that?
To give an idea of the performance difference, I temporarily added the
following code:
void CodeCache::iterate_through_CIs(int style) {
int count;
FOR_ALL_ALIVE_NMETHODS(nm) {
RelocIterator iter(nm);
while(iter.next()) {
if (iter.type() == relocInfo::virtual_call_type ||
iter.type() == relocInfo::opt_virtual_call_type) {
if (style > 0) {
CompiledIC *ic = style == 1 ? CompiledIC_at(&iter) :
CompiledIC_at(iter.reloc());
if (ic->ic_destination() == (address)0xdeadb000) {
gclog_or_tty->print_cr("ShouldNotReachHere");
}
}
}
}
}
}
and then measured how long time it took to execute
iterate_through_CIs(style) 1000 times with style == {0, 1, 2}.
The results are:
iterate_through_CIs(0): 1.210833 s // No CompiledICs created
iterate_through_CIs(1): 1.976557 s // New style
iterate_through_CIs(2): 9.924209 s // Old style
Testing:
A similar version has been used and thoroughly been tested together
with the other G1 Class Unloading changes. This exact version has so far
only been tested with Kitchensink and SpecJVM2008 compiler.compiler.
What test lists would be appropriate to test this with?
thanks,
StefanK
More information about the hotspot-gc-dev
mailing list