[9] RFR (M): 8057967: CallSite dependency tracking scales devastatingly poorly

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Wed Apr 1 20:56:50 UTC 2015


http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/hotspot/
http://cr.openjdk.java.net/~vlivanov/8057967/webrev.00/jdk/
https://bugs.openjdk.java.net/browse/JDK-8057967

HotSpot JITs inline very aggressively through CallSites. The 
optimistically treat CallSite target as constant, but record a nmethod 
dependency to invalidate the compiled code once CallSite target changes.

Right now, such dependencies have call site class as a context. This 
context is too coarse and it leads to context pollution: if some 
CallSite target changes, VM needs to enumerate all nmethods which 
depends on call sites of such type.

As performance analysis in the bug report shows, it can sum to 
significant amount of work.

While working on the fix, I investigated 3 approaches:
   (1) unique context per call site
   (2) use CallSite target class
   (3) use a class the CallSite instance is linked to

Considering call sites are ubiquitous (e.g. 10,000s on some octane 
benchmarks), loading a dedicated class for every call site is an 
overkill (even VM anonymous).

CallSite target class 
(MethodHandle.form->LambdaForm.vmentry->MemberName.clazz->Class<?>) is 
also not satisfactory, since it is a compiled LambdaForm VM anonymous 
class, which is heavily shared. It gets context pollution down, but 
still the overhead is quite high.

So, I decided to focus on (3) and ended up with a mixture of (2) & (3).

Comparing to other options, the complications of (3) are:
   - CallSite can stay unlinked (e.g. CallSite.dynamicInvoker()), so 
there should be some default context VM can use

   - CallSite instances can be shared and it shouldn't keep the context 
class from unloading;

It motivated a scheme where CallSite context is initialized lazily and 
can change during lifetime. When CallSite is linked with an indy 
instruction, it's context is initialized. Usually, JIT sees CallSite 
instances with initialized context (since it reaches them through indy), 
but if it's not the case and there's no context yet, JIT sets it to 
"default context", which means "use target call site".

I introduced CallSite$DependencyContext, which represents a nmethod 
dependency context and points (indirectly) to a Class<?> used as a context.

Context class is referenced through a phantom reference 
(sun.misc.Cleaner to simplify cleanup). Though it's impossible to 
extract referent using Reference.get(), VM can access it directly by 
reading corresponding field. Unlike other types of references, phantom 
references aren't cleared automatically. It allows VM to access context 
class until cleanup is performed. And cleanup resets the context to 
NULL, in addition to invalidating all relevant dependencies.

There are 3 context states a CallSite instance can be in:
   (1) NULL: no depedencies
   (2) DependencyContext.DEFAULT_CONTEXT: dependencies are stored in 
call site target class
   (3) DependencyContext for some class: dependencies are stored on the 
class DependencyContext instance points to

Every CallSite starts w/o a context (1) and then lazily gets one ((2) or 
(3) depending on the situation).

State transitions:
   (1->3): When a CallSite w/o a context (1) is linked with some indy 
call site, it's owner is recorded as a context (3).

   (1->2): When JIT needs to record a dependency on a target of a 
CallSite w/o a context(1), it sets the context to DEFAULT_CONTEXT and 
uses target class to store the dependency.

   (3->1): When context class becomes unreachable, a cleanup hook 
invalidates all dependencies on that CallSite and resets the context to 
NULL (1).

Only (3->1) requires dependency invalidation, because there are no 
depedencies in (1) and (2->1) isn't performed.

(1->3) is done in Java code (CallSite.initContext) and (1->2) is 
performed in VM (ciCallSite::get_context()). The updates are performed 
by CAS, so there's no need in additional synchronization. Other 
operations on VM side are volatile (to play well with Java code) and 
performed with Compile_lock held (to avoid races between VM operations).

Some statistics:
   Box2D, latest jdk9-dev
     - CallSite instances: ~22000

     - invalidated nmethods due to CallSite target changes: ~60

     - checked call_site_target_value dependencies:
       - before the fix: ~1,600,000
       - after the fix:        ~600

Testing:
   - dedicated test which excercises different state transitions
   - jdk/java/lang/invoke, hotspot/test/compiler/jsr292, nashorn

Thanks!

Best regards,
Vladimir Ivanov


More information about the mlvm-dev mailing list