Boxed types and constant propagation

Kohsuke Kawaguchi kk at kohsuke.org
Sat Apr 21 15:02:23 PDT 2012


Hi,

I was inspired by Charles's talk at JAX 2012 and have been playing with
invokedynamic a bit. I'm observing what seems like a constant
propagation failure, which I'd imagine would affect some important use
cases, so I wanted to check that I'm not doing something stupid.

I've used Apache JEXL [1] as my toy "dynamic language". My basic
strategy was to convert an expression into a graph of MethodHandles.
This is a fairly straightforward process, where each node in the JEXL
AST is converted into a function of the type (JexlContext)->Object.
JexlContext represents the context object for an evaluation.
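
To make the shape of such a graph concrete, here is a minimal sketch of how
AST nodes might be turned into handles of one common node type. It is not the
actual JEXL integration: a plain Map stands in for JexlContext, and the class
and helper names (NodeGraphSketch, literal, variable, plus) are made up for
illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: every AST node becomes a handle of type (Map)Object,
// where Map is only a stand-in for JexlContext.
final class NodeGraphSketch {
    static final MethodType NODE_TYPE = MethodType.methodType(Object.class, Map.class);

    // Runtime helper for "+"; this sketch only handles Integer operands.
    public static Object add(Object a, Object b) {
        int sum = (Integer) a + (Integer) b;
        return sum;
    }

    // Literal node: ignores the context and returns a boxed constant.
    static MethodHandle literal(Object value) {
        return MethodHandles.dropArguments(
                MethodHandles.constant(Object.class, value), 0, Map.class);
    }

    // Variable node: looks the name up in the context.
    static MethodHandle variable(String name) throws ReflectiveOperationException {
        MethodHandle get = MethodHandles.lookup().findVirtual(
                Map.class, "get", MethodType.methodType(Object.class, Object.class));
        return MethodHandles.insertArguments(get, 1, name); // (Map)Object
    }

    // "+" node: evaluates both children against the same context, then adds.
    static MethodHandle plus(MethodHandle left, MethodHandle right)
            throws ReflectiveOperationException {
        MethodHandle add = MethodHandles.lookup().findStatic(
                NodeGraphSketch.class, "add",
                MethodType.methodType(Object.class, Object.class, Object.class));
        // (Map,Map)Object: each child handle filters one argument of add
        MethodHandle both = MethodHandles.filterArguments(add, 0, left, right);
        // (Map)Object: feed the single context to both children
        return MethodHandles.permuteArguments(both, NODE_TYPE, 0, 0);
    }

    // Builds and evaluates "x + 12" with x bound to 30 in the context.
    public static Object evalSample() {
        try {
            Map<String, Object> ctx = new HashMap<>();
            ctx.put("x", 30);
            MethodHandle expr = plus(variable("x"), literal(12));
            return expr.invoke(ctx);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(evalSample());
    }
}
```

Because every node has the same (context)->Object type, all intermediate
values travel through the graph boxed, which is what makes the unboxing
question below relevant.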

I then compiled the expression "30+12" to see how well it'd optimize,
which essentially does the following:

--------------------
import static java.lang.invoke.MethodHandles.*;

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodType;

    public void test1() throws Throwable {
        // builds 30+12 as tree
        MethodHandle a = constant(Object.class, 30);
        MethodHandle b = constant(Object.class, 12);

        MethodHandle h = lookup().unreflect(
                getClass().getMethod("add", int.class, int.class));
        // fold the two unboxed constants into add(int,int), leaving a ()int handle
        MethodHandle r = foldArguments(
                foldArguments(h, asReturnType(int.class, a)),
                asReturnType(int.class, b));
        r = Sandbox.wrap(r);

        assertEquals(42, r.invokeWithArguments());
    }

    public static int add(int a, int b) {
        return a+b;
    }

    public static MethodHandle asReturnType(Class type, MethodHandle h) {
        return h.asType(MethodType.methodType(type,h.type()));
    }
--------------------
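
For anyone who wants to try this without JUnit or the Sandbox helper, the
following is a self-contained sketch of the same handle construction. The
class name BoxedConstantSketch and the helpers sum, boxed, primitive and run
are made up for illustration; Sandbox.wrap is simply left out. The
primitive() variant, which binds int constants directly, is included only as
a comparison case. Whether the JIT folds either form down to a constant is
exactly the open question and is not something this sketch shows by itself.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Standalone variant of the test above; Sandbox.wrap and JUnit are left out.
final class BoxedConstantSketch {
    public static int add(int a, int b) {
        return a + b;
    }

    // Adapts only the return type, e.g. ()Object -> ()int (adds an unboxing step).
    static MethodHandle asReturnType(Class<?> type, MethodHandle h) {
        return h.asType(h.type().changeReturnType(type));
    }

    // Folds two zero-argument int-returning handles into add(int,int), giving ()int.
    static MethodHandle sum(MethodHandle a, MethodHandle b)
            throws ReflectiveOperationException {
        MethodHandle h = MethodHandles.lookup().findStatic(
                BoxedConstantSketch.class, "add",
                MethodType.methodType(int.class, int.class, int.class));
        return MethodHandles.foldArguments(MethodHandles.foldArguments(h, a), b);
    }

    // Boxed constants of type ()Object, unboxed via asType as in the original test.
    static MethodHandle boxed() throws ReflectiveOperationException {
        return sum(asReturnType(int.class, MethodHandles.constant(Object.class, 30)),
                   asReturnType(int.class, MethodHandles.constant(Object.class, 12)));
    }

    // Comparison case: constants bound directly as ()int, so no unboxing is involved.
    static MethodHandle primitive() throws ReflectiveOperationException {
        return sum(MethodHandles.constant(int.class, 30),
                   MethodHandles.constant(int.class, 12));
    }

    // Builds the chosen handle and invokes it once.
    public static int run(boolean useBoxed) {
        try {
            MethodHandle r = useBoxed ? boxed() : primitive();
            return (int) r.invokeExact();
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

Both variants evaluate to 42; any difference between them would only show up
in the generated machine code.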

I was hoping that this would optimize to "return 42", but on my JDK7u3
on linux-amd64, it only gets optimized to the following:

--------------------
  # {method} 'invokedynamic' '()I' in 'Gen0'
  #           [sp+0x20]  (sp of caller)
  0x00007f785ca29700: push   %rbp
  0x00007f785ca29701: sub    $0x10,%rsp
  0x00007f785ca29705: nop                       ;*synchronization entry
                                                ; - Gen0::invokedynamic at -1
  0x00007f785ca29706: mov    $0x7d66619d8,%r10  ;   {oop(a 'java/lang/Integer' = 30)}
  0x00007f785ca29710: mov    0xc(%r10),%eax
  0x00007f785ca29714: mov    $0x7d66618b8,%r10  ;   {oop(a 'java/lang/Integer' = 12)}
  0x00007f785ca2971e: add    0xc(%r10),%eax     ;*iadd
                                                ; - GuardedIntAddTest::add at 2 (line 27)
                                                ; - java.lang.invoke.MethodHandle::invokeExact at 14
                                                ; - Gen0::invokedynamic at 0
  0x00007f785ca29722: add    $0x10,%rsp
  0x00007f785ca29726: pop    %rbp
  0x00007f785ca29727: test   %eax,0x5b2c8d3(%rip)        # 0x00007f7862556000
                                                ;   {poll_return}
  0x00007f785ca2972d: retq
--------------------

So as you can see, 30 and 12 are not recognized as constants.

I think this would affect dynamic languages that treat primitives and
reference types interchangeably, which is the majority.  If I
understand correctly, those languages need to compose method handles
of the type "(...)->Object", like I did, and rely on inlining to
discover unnecessary boxing/unboxing. p.18 in the JSR 292 cookbook [2] is
affected by this, too, since it uses similar MethodHandle types.


After a few more experiments, I realized that the root cause of this lies
not so much in JSR 292 as in HotSpot itself. For example, the method
shown after the questions below compiles to the assembly code that follows
it, and as you can see HotSpot fails to optimize body() into just
"return false".

So my questions are:

 - Am I missing something?
 - Is there any reason behind why HotSpot fails to treat boxed
constants like real constants? Is that because HotSpot doesn't trust
'final'?
 - How do other language implementers cope with this?

--------------------
    public boolean body() {
        return bool1() && bool2();
    }

    private Boolean bool2() {
        return true;
    }

    private Boolean bool1() {
        return false;
    }
--------------------

  # {method} 'body' '()Z' in 'BoxedBooleanInlineTest'
  #           [sp+0x20]  (sp of caller)
  0x00007fadf24412c0: mov    0x8(%rsi),%r10d
  0x00007fadf24412c4: shl    $0x3,%r10
  0x00007fadf24412c8: cmp    %r10,%rax
  0x00007fadf24412cb: jne    0x00007fadf24138a0  ;   {runtime_call}
  0x00007fadf24412d1: xchg   %ax,%ax
  0x00007fadf24412d4: nopl   0x0(%rax,%rax,1)
  0x00007fadf24412dc: xchg   %ax,%ax
[Verified Entry Point]
  0x00007fadf24412e0: push   %rbp
  0x00007fadf24412e1: sub    $0x10,%rsp
  0x00007fadf24412e5: nop                       ;*synchronization entry
                                                ; - BoxedBooleanInlineTest::body at -1 (line 21)
  0x00007fadf24412e6: mov    $0x7d6602fe0,%r10  ;   {oop(a 'java/lang/Class' = 'java/lang/Boolean')}
  0x00007fadf24412f0: mov    0x74(%r10),%r8d    ;*getstatic FALSE
                                                ; - java.lang.Boolean::valueOf at 10 (line 149)
                                                ; - BoxedBooleanInlineTest::bool1 at 1 (line 29)
                                                ; - BoxedBooleanInlineTest::body at 1 (line 21)
  0x00007fadf24412f4: movzbl 0xc(%r12,%r8,8),%r11d
  0x00007fadf24412fa: test   %r11d,%r11d
  0x00007fadf24412fd: jne    0x00007fadf244130d  ;*iconst_0
                                                ; - BoxedBooleanInlineTest::body at 24 (line 21)
  0x00007fadf24412ff: xor    %eax,%eax          ;*ireturn
                                                ; - BoxedBooleanInlineTest::body at 25 (line 21)
  0x00007fadf2441301: add    $0x10,%rsp
  0x00007fadf2441305: pop    %rbp
  0x00007fadf2441306: test   %eax,0x5b42cf4(%rip)        # 0x00007fadf7f84000
                                                ;   {poll_return}
  0x00007fadf244130c: retq
  0x00007fadf244130d: mov    0x70(%r10),%r10d   ;*getstatic TRUE
                                                ; - java.lang.Boolean::valueOf at 4 (line 149)
                                                ; - BoxedBooleanInlineTest::bool2 at 1 (line 25)
                                                ; - BoxedBooleanInlineTest::body at 11 (line 21)
  0x00007fadf2441311: movzbl 0xc(%r12,%r10,8),%r11d
  0x00007fadf2441317: test   %r11d,%r11d
  0x00007fadf244131a: je     0x00007fadf24412ff  ;*ifeq
                                                ; - BoxedBooleanInlineTest::body at 17 (line 21)
  0x00007fadf244131c: mov    $0x1,%eax
  0x00007fadf2441321: jmp    0x00007fadf2441301

--------------------
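
For reference, a listing like the one above can be obtained with a small
harness along these lines. This is only a sketch: the class name is taken
from the listing, but bodyDesugared, warmUp and the iteration count are
assumptions added for illustration. bodyDesugared spells out by hand the
boxing (Boolean.valueOf) and unboxing (booleanValue) that javac inserts,
which corresponds to the "getstatic FALSE" / "getstatic TRUE" loads and the
subsequent byte reads in the assembly.

```java
// Hypothetical harness; the class name matches the listing, the rest is assumed.
final class BoxedBooleanInlineTest {
    public boolean body() {
        return bool1() && bool2();
    }

    private Boolean bool2() {
        return true;
    }

    private Boolean bool1() {
        return false;
    }

    // body() with the helpers inlined and the autoboxing/unboxing written out by hand.
    public boolean bodyDesugared() {
        return Boolean.valueOf(false).booleanValue()
                && Boolean.valueOf(true).booleanValue();
    }

    // Calls body() often enough that HotSpot is likely to compile it.
    public static int warmUp(int iterations) {
        BoxedBooleanInlineTest t = new BoxedBooleanInlineTest();
        int trueCount = 0;
        for (int i = 0; i < iterations; i++) {
            if (t.body()) {
                trueCount++;
            }
        }
        return trueCount;
    }

    public static void main(String[] args) {
        // Example invocation (needs the hsdis disassembler plugin):
        //   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly BoxedBooleanInlineTest
        System.out.println(warmUp(1_000_000));
    }
}
```

The loop count is arbitrary; it just needs to be large enough for the method
to cross the JIT compilation threshold.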


[1] http://commons.apache.org/jexl/reference/syntax.html
[2] http://wiki.jvmlangsummit.com/images/9/93/2011_Forax.pdf
-- 
Kohsuke Kawaguchi


More information about the mlvm-dev mailing list