Boxed types and constant propagation

Rémi Forax forax at univ-mlv.fr
Sat Apr 21 17:12:54 PDT 2012


See inline comments :)

On 04/22/2012 12:02 AM, Kohsuke Kawaguchi wrote:
> Hi,
>
> I was inspired by the talk by Charles at JAX 2012 and was playing with
> invokedynamic a bit. I'm observing what seems like a constant
> propagation failure, which I'd imagine would affect some important use
> cases, so I wanted to check if I'm not doing something stupid.
>
> I've used Apache JEXL [1] as my toy "dynamic language". My basic
> strategy was to convert an expression into a graph of MethodHandles.
> This is a fairly straightforward process, where each node in the JEXL
> AST is converted into a function of the type (JexlContext)->Object.
> JexlContext represents the context object for an evaluation.
>
> I then compiled the expression "30+12" to see how well it'd optimize,
> which essentially does the following:
>
> --------------------
> import static java.lang.invoke.MethodHandles.*;
>
>      public void test1() throws Throwable {
>          // builds 30+12 as tree
>          MethodHandle a = constant(Object.class, 30);
>          MethodHandle b = constant(Object.class, 12);
>
>          MethodHandle h = lookup().unreflect(
>              getClass().getMethod("add", int.class, int.class));
>          MethodHandle r = foldArguments(
>              foldArguments(h, asReturnType(int.class, a)),
>              asReturnType(int.class, b));
>          r = Sandbox.wrap(r);
>
>          assertEquals(42, r.invokeWithArguments());
>      }
>
>      public static int add(int a, int b) {
>          return a+b;
>      }
>
>      public static MethodHandle asReturnType(Class type, MethodHandle h) {
>          return h.asType(MethodType.methodType(type,h.type()));
>      }
> --------------------
>
> I was hoping that this would optimize to "return 42", but on my JDK7u3
> on linux-amd64, it only gets optimized to the following:
>
> --------------------
>    # {method} 'invokedynamic' '()I' in 'Gen0'
>    #           [sp+0x20]  (sp of caller)
>    0x00007f785ca29700: push   %rbp
>    0x00007f785ca29701: sub    $0x10,%rsp
>    0x00007f785ca29705: nop                       ;*synchronization entry
>                                                  ; - Gen0::invokedynamic at -1
>    0x00007f785ca29706: mov    $0x7d66619d8,%r10  ;   {oop(a 'java/lang/Integer' = 30)}
>    0x00007f785ca29710: mov    0xc(%r10),%eax
>    0x00007f785ca29714: mov    $0x7d66618b8,%r10  ;   {oop(a 'java/lang/Integer' = 12)}
>    0x00007f785ca2971e: add    0xc(%r10),%eax     ;*iadd
>                                                  ; - GuardedIntAddTest::add at 2 (line 27)
>                                                  ; - java.lang.invoke.MethodHandle::invokeExact at 14
>                                                  ; - Gen0::invokedynamic at 0
>    0x00007f785ca29722: add    $0x10,%rsp
>    0x00007f785ca29726: pop    %rbp
>    0x00007f785ca29727: test   %eax,0x5b2c8d3(%rip)        # 0x00007f7862556000
>                                                  ;   {poll_return}
>    0x00007f785ca2972d: retq
> --------------------
>
> So as you can see, 30 and 12 are not recognized as constants.

You're right, 30 and 12 are not recognized as int constants,
but they are recognized as java/lang/Integer constants:

0x00007f785ca29706: mov    $0x7d66619d8,%r10  ;   {oop(a 'java/lang/Integer' = 30)}

So escape analysis works, but HotSpot doesn't trust
final fields and thus doesn't treat the value inside the Integer
as a constant.
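
For comparison, here is a minimal sketch of the same 30+12 tree with
int-typed leaves, so that no Integer has to be unboxed on the way to add.
The class name IntConstantTree and the helper eval() are made up for the
example, and the sketch only checks the result; whether the JIT then
folds it down to a single constant is not something it verifies.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Hypothetical example class, not part of the original test.
class IntConstantTree {
    public static int add(int a, int b) {
        return a + b;
    }

    // Builds 30+12 as a ()int handle whose leaves are int constants.
    static MethodHandle build() throws ReflectiveOperationException {
        MethodHandle add = MethodHandles.lookup().findStatic(
                IntConstantTree.class, "add",
                MethodType.methodType(int.class, int.class, int.class));
        // constant(int.class, ...) yields a ()int handle, no asType needed
        MethodHandle a = MethodHandles.constant(int.class, 30);
        MethodHandle b = MethodHandles.constant(int.class, 12);
        // each foldArguments call feeds the combiner's result in as the
        // leading argument: (int,int)int -> (int)int -> ()int
        return MethodHandles.foldArguments(
                MethodHandles.foldArguments(add, a), b);
    }

    // Evaluates the tree; checked exceptions are wrapped for convenience.
    static int eval() {
        try {
            return (int) build().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(eval());
    }
}
```

This only helps where the front end already knows a node is an int; for
a language that types everything as Object, the boxing question remains.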

>
> I think this would affect dynamic languages that treat primitives and
> reference types interchangeably, which is the majority.  If I
> understand correctly, those languages need to compose method handles
> of the type "(...)->Object", like I did, and rely on inlining to
> discover unnecessary boxing/unboxing. p.18 in the JSR 292 cookbook [2] is
> affected by this, too, since it uses similar MethodHandle types.
>
>
> After a few more experiments, I realized the root cause of this isn't
> so much in JSR-292 as in HotSpot. For example, the following
> method produces the following assembly code, and as you can see it's
> failing to optimize body() into just "return false".

In fact, Hotspot doesn't fail.
But Hotspot also obfuscates the code :(

> So my question is:
>
>   - Am I missing something?

Yes :)
In that case, the compiled body of 'body' is the following
(there is a ret at 0x00007fadf244130c):

   0x00007fadf24412e6: mov    $0x7d6602fe0,%r10  ;   {oop(a 'java/lang/Class' = 'java/lang/Boolean')}
   0x00007fadf24412f0: mov    0x74(%r10),%r8d    ;*getstatic FALSE
                                                 ; - java.lang.Boolean::valueOf at 10 (line 149)
                                                 ; - BoxedBooleanInlineTest::bool1 at 1 (line 29)
                                                 ; - BoxedBooleanInlineTest::body at 1 (line 21)
   0x00007fadf24412f4: movzbl 0xc(%r12,%r8,8),%r11d
   0x00007fadf24412fa: test   %r11d,%r11d
   0x00007fadf24412fd: jne    0x00007fadf244130d  ;*iconst_0
                                                 ; - BoxedBooleanInlineTest::body at 24 (line 21)
   0x00007fadf24412ff: xor    %eax,%eax          ;*ireturn


The code is fully optimized: xor %eax,%eax puts 0 in eax,
which by convention is the register holding the return value,
and the rest of the code is not needed.
The question here is why HotSpot generates the unnecessary code
above the xor instruction. That code tests that zero is equal to zero
before returning zero :)

>   - Is there any reason behind why HotSpot fails to treat boxed
> constants like real constants? Is that because HotSpot doesn't trust
> 'final'?

Yes.
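
One reason a JIT has to be careful here is that a final instance field
can still be written after construction, for example through reflection.
A minimal sketch, using a made-up Box class as a stand-in for a box like
java.lang.Integer (the names FinalFieldDemo and Box are only for the
example). Newer JDKs may warn about or refuse this kind of write.

```java
import java.lang.reflect.Field;

// Hypothetical example class, not taken from the JDK or the original test.
class FinalFieldDemo {
    // Stand-in for a boxed value with a final payload field.
    static final class Box {
        private final int value;

        Box(int value) {
            this.value = value;
        }
    }

    // Writes the final field reflectively, then reads it back reflectively.
    static int readAfterReflectiveWrite() {
        try {
            Box box = new Box(30);
            Field f = Box.class.getDeclaredField("value");
            f.setAccessible(true);   // lifts the 'final' check for this Field
            f.setInt(box, 12);       // the "constant" 30 is now 12
            return f.getInt(box);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAfterReflectiveWrite());
    }
}
```

If the compiler had baked 30 into the generated code, it would no longer
agree with what the field actually holds after the write.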

>   - How do other language implementers cope with this?

You are the first, as far as I know, to use only a tree of method handles
to implement expressions. The rest of us generate bytecode
and have a compiler that does constant propagation.
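
To make "a compiler that does constant propagation" concrete, here is a
minimal sketch of a folding pass run on the expression tree before any
method handle or bytecode is produced. The node types (Const, Var, Add)
are hypothetical and are not JEXL's AST; a real front end would handle
many more node kinds.

```java
// Hypothetical example class with a toy expression AST.
class ConstantFolding {
    interface Expr {}

    static final class Const implements Expr {
        final int value;
        Const(int value) { this.value = value; }
    }

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class Add implements Expr {
        final Expr left;
        final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // Folds bottom-up: an Add whose operands both fold to constants
    // is replaced by a single Const; everything else is kept.
    static Expr fold(Expr e) {
        if (e instanceof Add) {
            Add add = (Add) e;
            Expr l = fold(add.left);
            Expr r = fold(add.right);
            if (l instanceof Const && r instanceof Const) {
                return new Const(((Const) l).value + ((Const) r).value);
            }
            return new Add(l, r);
        }
        return e; // Const and Var are already in folded form
    }

    // Folds the expression 30+12 and returns the resulting constant.
    static int foldThirtyPlusTwelve() {
        Expr folded = fold(new Add(new Const(30), new Const(12)));
        return ((Const) folded).value;
    }

    public static void main(String[] args) {
        System.out.println(foldThirtyPlusTwelve());
    }
}
```

With a pass like this, "30+12" reaches the back end as a single constant
node, so nothing depends on the JIT seeing through the boxes.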

Rémi



More information about the mlvm-dev mailing list