[9] RFR (L): 8072008: Emit direct call instead of linkTo* for recursive indy/MH.invoke* calls

Thu Nov 19 15:25:48 UTC 2015

Thanks a lot for the thorough review, John!

Updated version:
   http://cr.openjdk.java.net/~vlivanov/8072008/webrev.02/

Latest changes only:
   http://cr.openjdk.java.net/~vlivanov/8072008/webrev.01_02/

> I'm slightly uncomfortable with making CallGenerator into a
> state-ful type.  Maybe:
>
> -+   cg->set_override_symbolic_info(true); ++   cg =
> cg->with_override_symbolic_info(true);
>
> To do this cleanly requires a virtual (re-constructor) with lots of
> defs.
>
> Perhaps it can be done dirtily also, as long as there is a note
> saying "if these objects are ever cached, a fresh copy must be
> created here", inside the dirty implementation of
> with_override_symbolic_info.
>
> (Or add virtual CallGenerator::fresh_copy, and keep the
> side-effecting thing?)
Oh, I haven't thought about caching CallGenerators. Having 
override_symbolic_info in CallGenerator doesn't play well with it.

override_symbolic_info is entangled with JVMState passed into 
CallGenerator::generate(JVMState*). So, depending on the context 
symbolic info can be either valid or invalid.

I haven't found a good way to fix it yet. I'll leave it as is for now, 
but will continue thinking about alternatives.

Maybe a special check in CallGenerator::generate() to catch failed 
inlines through MH.linkTo/invokeBasic?

> /SharedRuntime::resolve_static_call_C/s/static/opt_virtual/ — for
> invokeBasic, which is not static
I haven't seen opt_virtual calls for invokeBasic. I think it can be 
either static or opt_virtual depending on where
mh.lform.vmentry.vmtarget points to. But right now compiled LFs contain 
only static method, so it is always compiled into a static call:
method, so it is e.g.:
   0x0000000112721797: callq  0x000000011263ba80  ; ImmutableOopMap{}
                                                 ;*invokevirtual
invokeBasic {reexecute=0 rethrow=0 return_oop=1}
                                                 ; -
java.lang.invoke.InvokeTest::invokeBasic at 3 (line 138)
                                                 ;   {static_call
java.lang.invoke.LambdaForm$reinvoker/559450121::dontInline}

> Too bad linkToFoo methods don't profile.  Even static methods should
> profile the first argument (if a reference).  File a bug for this,
> and point out the place in for_method_handle_inline which suffers?
> It would help indy also.  I think Roland did most of the work
> already.
Are you talking about JDK-8015417: "profile pollution after call through
invokestatic to shared code" [1]?

What kind of benefits do you expect for MH.linkTo* case?
Not sure profiling would help MH.linkTo* methods, since they are used
only in compiled LambdaForms for DirectMethodHandles and hence heavily
shared.

Profile pollution is not the only reason inlining can fail. Inlining
policy decisions (e.g. recursive call pruning) also affects that.
Actually, recursive call case was initially reported by Remi.

> The change near "seemingly deficient out-count" seems like a
> stand-alone bug fix.  How did you find it?
The tests I wrote crashed. It was not a problem before, since call site
argument count matched between bytecode and a method in call generator.

But as a side effect of replacing MH.linkTo* with a direct/virtual call
appendix argument is pruned. So, they don't match anymore and I have to
switch to using bytecode information.

> Suggest a comment in callGenerator.hpp:
>
> // If override_symbolic_info is true, then the CG::_method value may
> differ from the symbolic reference at the current bytecode, and the
> runtime linker (e.g., SharedRuntime::resolve_static_call_C) will be
> told to consult the CG's method, not the current bytecode's method.
> It is an error if the methods have incompatible descriptors, or if
> the methods differ but override_symbolic_info is false.
Added. It is not as strict as described here. For example, replacing
MH.linkTo* by the target means method arities don't match since appendix
argument is not removed.

> Suggest adding asserts somewhere central (doCall?) to check the "it
> is an error" statements (and maybe move those statements next to the
> asserts, instead of in the header file).
Done (see doCall.cpp:607).

> The _invokeBasic intrinsic corresponds to Bytecode::_invokehandle,
> not Bytecode::_invokestatic.  The sharedRuntime code associates it
> with invokestatic.  This seems murky to me.
Fixed.

It's invokehandle on bytecode level, but after the compiler inlines
through the call, it becomes a direct call in compiled code.

Feeding Bytecode::_invokehandle into LinkResolver doesn't work, since it
expects resolved klass to be MethodHandle. But it is not the case
anymore, since the compiler already eliminated the indirection.

Do you see a better solution than patching bc before performing method 
resolution?

> Suggest renaming compute_bc to
> MethodHandles::signature_polymorphic_intrinsic_bytecode (next to
> spi_name).
Done.

> It's way confusing to have variables "callinfo" and "call_info".
> Suggest s/call_info/attached_method/.  Likewise
> s/extract_attached_call_info/extract_attached_method/.  And
> s/resolve_attached_call_info/resolve_attached_method/.
Renamed as you suggest.

> s/use it's holder/use its holder/  — Its genitive is "its"; yes it's
> confusing.
Fixed.

> It would be much better if there were one call to retrieve_receiver
> instead of two.  The duplication of logic is confusing.  Strongly
> suggest factoring the existing logic into a common subroutine call.
I've refactored SharedRuntime::find_callee_info_helper to avoid repeated
work and cover attached resolved method case more uniformly.

> Also, the near-copy resolve_attached_call_info of
> LinkResolver::resolve_invoke (in a different file!) is frustratingly
> confusing.  ("Now, where have I seen that switch statement before?")
> I suggest factoring them together somehow, or at least putting the
> twin copies side by side in linkResolver.cpp.
Moved resolve_attached_call_info to LinkResolver::resolve_invoke overload.

> Basically, the logic says:  a. get receiver r1, b. resolve method m1,
> c. if it is a redirectable intrinsic, then: d. get receiver r2, e.
> get attached method m2, f. resolve method m2.  This is too
> complicated for what is really intended, which is (get attached
> method || get symbolic method) && get receiver && resolve.
>
> The whole thing about the dynamic receiver doesn't really belong in
> the static resolution mix, except that we are supporting monomorphic
> inline caches along this path.  I'd rather get the method completely
> determined, and then (if needed) do the receiver trick.
I tried to achieve that with SharedRuntime::find_callee_info_helper
refactoring. Do you like how it is shaped now?

> Where else in the system is there code like
> extract_attached_call_info (which loops over reloc-info looking for
> something)?  Most uses of RelocIterator are in codeBlob, nmethod, or
> relocInfo, and the only ones in sharedRuntime are all about call site
> patching, not (initial) linking.  Suggest putting it in nmethod or
> relocInfo, so it is easy to find.
Moved iteration logic into nmethod::attached_method().

> Again, this is a good change.  Let's treat symbolic method
> references and attached methods as symmetrically as possible.  The
> result will clarify the JVM's core logic, IMO, since currently
> bytecode unpacking is mixed confusingly into the immediately
> following phase of method resolution.
I've spent some time thinking about how to treat attached methods more 
uniformly and avoid specializing them for MH linkers case. But it 
requires additional information along with attached method - invocation 
mode (to be able to resolve a method w.r.t. receiver). It would make 
SharedRuntime::find_callee_info_helper completely MH agnostic. But does 
it worth it?

Best regards,
Vladimir Ivanov

[1] https://bugs.openjdk.java.net/browse/JDK-8015417