Request for review (L): 7071653: JSR 292: call site change notification should be pushed not pulled

Vladimir Kozlov vladimir.kozlov at oracle.com
Sun Aug 7 15:35:29 PDT 2011


Christian,

You need to add big comment to the new code in templateTable_<arch>.cpp explaining what it does and why.

Why on sparc you use ld_ptr() to load from cache but on X86 and X64 you use movl() (only 32  bit)?

Add assert(byte_no == -1, ) to default: case to make sure you got all cases above it.

I am concern about using next short branch in new code in templateTable_sparc.cpp:

cmp_and_br_short(..., L_patch_done);  // don't patch

There is __ stop() call which generates a lot of code so that label L_patch_done could be far.


Why you added new #include into ciEnv.cpp and nmethod.cpp, what code needs it? Nothing else is changed in these files.

I don't like assignments in condition and implicit NULL checks. Can you change check_dependency() to next?:

       klassOop check_dependency() {
         klassOop result = check_klass_dependency(NULL);
         if (result != NULL) return result;
         return check_call_site_dependency(NULL);
       }

In interpreterRuntime.cpp initialize marked:  int marked = 0;

Why you did not leave "volatile" call site inlining with guard? You did not explain why virtual call is fine for it.


Vladimir

On 8/5/11 6:32 AM, Christian Thalinger wrote:
> http://cr.openjdk.java.net/~twisti/7071653
>
> 7071653: JSR 292: call site change notification should be pushed not pulled
> Reviewed-by:
>
> Currently every speculatively inlined method handle call site has a
> guard that compares the current target of the CallSite object to the
> inlined one.  This per-invocation overhead can be removed if the
> notification is changed from pulled to pushed (i.e. deoptimization).
>
> I had to change the logic in TemplateTable::patch_bytecode to skip
> bytecode quickening for putfield instructions when the put_code
> written to the constant pool cache is zero.  This is required so that
> every execution of a putfield to CallSite.target calls out to
> InterpreterRuntime::resolve_get_put to do the deoptimization of
> depending compiled methods.
>
> I also had to change the dependency machinery to understand other
> dependencies than class hierarchy ones.  DepChange got the super-type
> of two new dependencies, KlassDepChange and CallSiteDepChange.
>
> Tested with JRuby tests and benchmarks, hand-written testcases, JDK
> tests and vm.mlvm tests.
>
> Here is the speedup for the JRuby fib benchmark (first is JDK 7 b147,
> second with 7071653).  Since the CallSite targets don't change during
> the runtime of this benchmark we can see the performance benefit of
> eliminating the guard:
>
> $ jruby --server bench/bench_fib_recursive.rb 5 35
>    0.883000   0.000000   0.883000 (  0.854000)
>    0.715000   0.000000   0.715000 (  0.715000)
>    0.712000   0.000000   0.712000 (  0.712000)
>    0.713000   0.000000   0.713000 (  0.713000)
>    0.713000   0.000000   0.713000 (  0.712000)
>
> $ jruby --server bench/bench_fib_recursive.rb 5 35
>    0.772000   0.000000   0.772000 (  0.742000)
>    0.624000   0.000000   0.624000 (  0.624000)
>    0.621000   0.000000   0.621000 (  0.621000)
>    0.622000   0.000000   0.622000 (  0.622000)
>    0.622000   0.000000   0.622000 (  0.621000)
>


More information about the hotspot-compiler-dev mailing list