RFR: 8047326: Add a version of CompiledIC_at that doesn't create a new RelocIterator

Stefan Karlsson stefan.karlsson at oracle.com
Thu Jun 19 12:45:13 UTC 2014


Hi all,

I have a patch that we have been using in the G1 Class Unloading project 
to lower the remark times.  This changes Compiler code, so I would like 
to get feedback from the Compiler team.

http://cr.openjdk.java.net/~stefank/8047362/webrev.00/
https://bugs.openjdk.java.net/browse/JDK-8047362

The patch builds upon the patch in:
http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-June/014358.html


Summary from the bug report:
---
Creation of RelocIterators show up high in profiles of the remark phase, 
in the G1 Class Unloading project.

There's a pattern in the nmethod/codecache code to create a 
RelocIterator and then materialize a CompiledIC:

     RelocIterator iter(this, low_boundary);
     while(iter.next()) {
       if (iter.type() == relocInfo::virtual_call_type) {
         CompiledIC *ic = CompiledIC_at(iter.reloc());

CompiledIC_at is implemented as:
   new CompiledIC(call_site->code(), nativeCall_at(call_site->addr()));

And one of the first thing CompiledIC::CompiledIC(const nmethod* nm, 
NativeCall* call) does is to create a new RelocIterator:
...
address ic_call = call->instruction_address();
...
   RelocIterator iter(nm, ic_call, ic_call+1);
   bool ret = iter.next();
   assert(ret == true, "relocInfo must exist at this address");
   assert(iter.addr() == ic_call, "must find ic_call");

I would like to propose that we pass down the RelocIterator that we 
already have, instead of creating a new.
---


I've previously received feedback that this seems like reasonable thing 
to do, but that the parameter to the new CompileIC_at should take a 
const RelocIterator* instead of RelocIterator*. I couldn't do that 
without changing a significant amount of Compiler code, so I have left 
it out for now. Any opinions on how to handle that?


To give an idea of the performance difference, I temporarily added the 
following code:
void CodeCache::iterate_through_CIs(int style) {
   int count;
   FOR_ALL_ALIVE_NMETHODS(nm) {
     RelocIterator iter(nm);
     while(iter.next()) {
       if (iter.type() == relocInfo::virtual_call_type ||
           iter.type() == relocInfo::opt_virtual_call_type) {
         if (style > 0) {
           CompiledIC *ic = style == 1 ? CompiledIC_at(&iter) : 
CompiledIC_at(iter.reloc());
           if (ic->ic_destination() == (address)0xdeadb000) {
             gclog_or_tty->print_cr("ShouldNotReachHere");
           }
         }
       }
     }
   }
}

and then measured how long time it took to execute 
iterate_through_CIs(style) 1000 times with style == {0, 1, 2}.

The results are:
  iterate_through_CIs(0): 1.210833 s // No CompiledICs created
  iterate_through_CIs(1): 1.976557 s // New style
  iterate_through_CIs(2): 9.924209 s // Old style


Testing:
   A similar version has been used and thoroughly been tested together 
with the other G1 Class Unloading changes. This exact version has so far 
only been tested with Kitchensink and SpecJVM2008 compiler.compiler. 
What test lists would be appropriate to test this with?


thanks,
StefanK




More information about the hotspot-gc-dev mailing list