The Great Startup Problem

Mon Sep 1 09:28:58 UTC 2014

On 01.09.2014 09:07, Vladimir Ivanov wrote:
> Jochen,
>
> The stack traces you provide are so long due to LambdaForm
> interpretation. Most of the stack frames are the following:
> java.lang.invoke.LambdaForm$NamedFunction.invokeWithArguments(LambdaForm.java:1147)
> java.lang.invoke.LambdaForm.interpretName(LambdaForm.java:625)
> java.lang.invoke.LambdaForm.interpretWithArguments(LambdaForm.java:604)
> java.lang.invoke.LambdaForm$LFI.479397964.interpret_L(LambdaForm$LFI:-1)
>
> We are aware of it and plan to improve the situation in 8u40.
>
> The idea is to precompile (to bytecode) every element of the method
> handle chain when the indy call site is bound. This allows skipping LF
> interpretation and hence reduces worst-case stack usage.
>
> Stack usage won't be constant though. Each compiled LF being executed
> consumes 1 stack frame, so for a method handle chain of N elements, its
> invocation consumes ~N stack frames.
>
> Is that acceptable, and does it solve the problem for you?

Let me ask you what you consider acceptable. I am quite interested in
the JVM engineers' point of view here.

As for Groovy... the 55 frames are for a cached one-argument method
call. The lines you mention appear there 9 times, so I would save 27
frames, leaving 28. That is half of what it was before and as such
certainly better, but 28 is still a lot, I would think. But I think
that is not what you meant. You meant that every handle involved
becomes one frame. So how many frames will it be in this simple case:

I have a no-arg method call, and I need a guard to check that the
runtime class of the receiver stays the same. In my reading that makes
1 frame for dropping arguments from the callsite, since the callsite
has more arguments than the method (meta information). Then 1 for the
guard, even if the guard is true. One for the type transformation of
the guard, since we work with exact types, and the callsite as well as
the target method have types. Then the handle for the target method
itself, together with another type transformation.

That makes 5 frames in between. 5 is worlds better than 53.
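
To make the chain I am counting concrete, here is a minimal sketch
using the plain java.lang.invoke API. It is not Groovy's actual
callsite code: the class name, sameClass, fallback and call are made
up for illustration, String.toUpperCase() merely stands in for a
no-arg target method, and the leading String argument stands in for
the meta information. How many frames each combinator really costs
depends on the JVM implementation.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Illustration only: a guarded no-arg call built from separate handles.
class GuardedCallSiteSketch {

    // Hypothetical guard: does the receiver still have the cached class?
    static boolean sameClass(Class<?> expected, Object receiver) {
        return receiver != null && receiver.getClass() == expected;
    }

    // Hypothetical slow path, standing in for re-selecting the method.
    static Object fallback(Object receiver) {
        return "fallback:" + receiver;
    }

    // Resulting handle type: (String metaInfo, Object receiver)Object
    static MethodHandle buildChain() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();

        // Handle for the target method itself: (String)String
        MethodHandle target = lookup.findVirtual(
                String.class, "toUpperCase",
                MethodType.methodType(String.class));

        // Type transformation of the target to the callsite types.
        MethodHandle typedTarget = target.asType(
                MethodType.methodType(Object.class, Object.class));

        // Guard test, bound to the cached class: (Object)boolean
        MethodHandle test = lookup.findStatic(
                GuardedCallSiteSketch.class, "sameClass",
                MethodType.methodType(boolean.class, Class.class, Object.class));
        test = MethodHandles.insertArguments(test, 0, String.class);

        MethodHandle slowPath = lookup.findStatic(
                GuardedCallSiteSketch.class, "fallback",
                MethodType.methodType(Object.class, Object.class));

        // The guard: take the target if the test is true, else the slow path.
        MethodHandle guarded =
                MethodHandles.guardWithTest(test, typedTarget, slowPath);

        // Drop the meta information the callsite passes but the method ignores.
        return MethodHandles.dropArguments(guarded, 0, String.class);
    }

    // Convenience wrapper so callers do not have to handle Throwable.
    static Object call(String metaInfo, Object receiver) {
        try {
            return buildChain().invoke(metaInfo, receiver);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(call("meta", "hello")); // guard true: HELLO
        System.out.println(call("meta", 42));      // guard false: fallback:42
    }
}
```

Every combinator above (dropArguments, guardWithTest, asType, plus
the target handle) is a separate link in the chain, which is where my
count of frames in between comes from.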

> We discussed an idea to generate custom bytecode (a single method) for
> the whole method handle chain (and have only 1 extra stack frame per MH
> invocation), but it defeats the memory footprint reduction we are
> trying to achieve with LambdaForm sharing.

I wonder if that is the case for Groovy as well. Our old callsite
mechanism has only 1 frame (from the second execution on), because by
then we have generated a class for the callsite that does all the
argument transformation, the checks and the target method execution.
So compared to that I would not expect a memory increase.

bye Jochen

-- 
Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org