Invokedynamic deoptimization issues

Wed Jan 14 18:57:36 UTC 2015

I'm using invokedynamic bytecode:

https://github.com/spullara/mustache.java/blob/master/indy/src/main/java/com/github/mustachejava/indy/IndyWrapper.java

If you run the benchmark with "-direct -indy -codegenReflectionOH -n 10"
you will see interesting behavior. The JIT is able to optimize indy to be as
fast as the direct call, but around the 7th iteration it compiles it again
and makes it more than 10x slower. With your flags, however, I see little
difference in the compilation output between the fast and slow versions:

samair11:indybenchmark sam$ diff indy1.txt indy2.txt
1c1,2
<    2408  396 %     4       indybench.Main::timeIndy @ 16 (73 bytes)
---
>    6482  412       4       indybench.Main::timeIndy (73 bytes)
>                               @ 0   java.lang.System::currentTimeMillis (0 bytes)   (intrinsic)
4c5
<                                \-> TypeProfile (604309/604309 counts) = com/github/mustachejava/indy/W_Main_someMethod_a36851e6_61ca_4b15_b94e_7c73cdc71d1c
---
>                                \-> TypeProfile (4204749/4204749 counts) = com/github/mustachejava/indy/W_Main_someMethod_a36851e6_61ca_4b15_b94e_7c73cdc71d1c
8,9c9,10
<                                    \-> TypeProfile (6124/56502 counts) = com/github/mustachejava/codegen/CompiledGuards1
<                                    \-> TypeProfile (50378/56502 counts) = com/github/mustachejava/codegen/CompiledGuards3
---
>                                    \-> TypeProfile (6124/100293 counts) = com/github/mustachejava/codegen/CompiledGuards1
>                                    \-> TypeProfile (94169/100293 counts) = com/github/mustachejava/codegen/CompiledGuards3
33a35
>                               @ 21   java.lang.Integer::intValue (5 bytes)   accessor
55a58
>

Sam

Output:

samair11:indybenchmark sam$ ./target/appassembler/bin/indybench -direct -indy -codegenReflectionOH -n 10
codegen reflection OH: 736
indy wrapper: 66
direct: 16
-----------------
codegen reflection OH: 1136
indy wrapper: 40
direct: 25
-----------------
codegen reflection OH: 577
indy wrapper: 4
direct: 5
-----------------
codegen reflection OH: 658
indy wrapper: 4
direct: 6
-----------------
codegen reflection OH: 567
indy wrapper: 3
direct: 6
-----------------
codegen reflection OH: 658
indy wrapper: 4
direct: 6
-----------------
codegen reflection OH: 625
indy wrapper: 5
direct: 6
-----------------
codegen reflection OH: 648
indy wrapper: 121
direct: 6
-----------------
codegen reflection OH: 618
indy wrapper: 65
direct: 7
-----------------
codegen reflection OH: 553
indy wrapper: 68
direct: 6
-----------------

On Wed, Jan 14, 2015 at 10:40 AM, Vladimir Ivanov <
vladimir.x.ivanov at oracle.com> wrote:

> Sam,
>
> The first thing I usually do is run with:
> -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
>
> I haven't looked at your benchmarks yet, but my main suspicion is that
> MethodHandles aren't compile-time constants, so inlining fails. Do you use
> invokedynamic bytecode or call MethodHandle.invokeExact()/invoke()? In
> the latter case, you should store MethodHandles in static final fields.
> Otherwise, JIT won't inline them.
>
> Best regards,
> Vladimir Ivanov
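
To make the static final point in the reply above concrete, here is a minimal sketch of the two ways of holding a MethodHandle. The class name StaticHandleSketch and the methods viaConstant and viaField are hypothetical and are not taken from mustache.java or the benchmark; String.length() just stands in for whatever method is being wrapped. Both paths return the same result; the difference, going by the advice above, is only in whether the JIT can treat the handle as a constant and inline through it.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical illustration; names are assumptions, not benchmark code.
public class StaticHandleSketch {
    // Held in a static final field: per the advice above, the JIT can
    // treat this handle as a constant and inline the target.
    private static final MethodHandle LENGTH;

    static {
        try {
            LENGTH = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Same handle held in an instance field: this is the case the advice
    // above warns about, where the call is not expected to be inlined.
    private final MethodHandle length = LENGTH;

    static int viaConstant(String s) {
        try {
            // invokeExact requires the call-site types to match (String)int exactly.
            return (int) LENGTH.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    int viaField(String s) {
        try {
            return (int) length.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(viaConstant("indy"));                       // prints 4
        System.out.println(new StaticHandleSketch().viaField("indy")); // prints 4
    }
}
```

Running each path in a hot loop with the flags quoted above should show whether the handle's target appears in the inlining output; this sketch does not measure that itself.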
>
>
> On 1/14/15 8:23 PM, Sam Pullara wrote:
>
>> Hi,
>>
>> I've implemented a bunch of different ways for mustache.java to get data
>> from named fields and methods. One thing I have noticed running benchmarks
>> is that accessing the fields/methods in different ways can cause
>> invokedynamic to either never be optimized or actually be deoptimized later
>> during the running of the benchmarks (20x+ difference in performance). Here
>> is the code I have been running:
>>
>> https://github.com/spullara/indybenchmark
>>
>> I was wondering what the best way to debug this would be?
>>
>> Thanks,
>> Sam
>>
>>