The Great Startup Problem

Mon Sep 1 13:24:31 UTC 2014

Jochen,

>> Is it acceptable and solves the problem for you?
>
> Let me ask you what you consider as acceptable. I am quite interested in
> the JVM-engineers point of view here.
"N frames per chain of N method handles" looks reasonable to me, but it
depends on the average number of transformations users apply. If deep
method handle chains are common in practice, we need to optimize for
them as well, and a linear dependency in stack space may be too much.
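
To make "a chain of N method handles" concrete, here is a minimal sketch that stacks N return-value filters on top of an identity handle. The class DeepChainSketch and its methods are hypothetical names used only for illustration. The sketch shows how such a chain is built, not how many stack frames a given JVM spends executing it; that depends on the implementation.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class DeepChainSketch {
    // Hypothetical transformation applied at every level of the chain.
    static int increment(int x) {
        return x + 1;
    }

    // Wraps an identity handle in `depth` filterReturnValue adapters,
    // so the resulting chain consists of `depth` stacked transformations.
    static MethodHandle buildChain(int depth) throws ReflectiveOperationException {
        MethodHandle inc = MethodHandles.lookup().findStatic(
                DeepChainSketch.class, "increment",
                MethodType.methodType(int.class, int.class));
        MethodHandle chain = MethodHandles.identity(int.class);
        for (int i = 0; i < depth; i++) {
            chain = MethodHandles.filterReturnValue(chain, inc);
        }
        return chain;
    }

    // Invokes the chain on 0; each level adds 1, so the result equals `depth`.
    static int run(int depth) {
        try {
            return (int) buildChain(depth).invokeExact(0);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

With a linear cost per handle, the stack space needed by a call through buildChain(depth) would grow with depth, which is the concern raised above.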

> As for Groovy... the 55 frames are the case of a one argument method
> call and cached. The lines you mention appear there 9 times, so I would
> save 27 frames, leaving 28. That's half of what it was before and as
> such better for sure, but 28 is still a lot I would think. But I think
> you did not mean that. You did mean every handle involved becomes one
> frame. So how many frames will it be in this simple case:
>
> I have a no-arg method call, I have to do a guard to see if the receiver
> runtime class stays the same. In my reading that makes 1 for dropping
> stuff from the callsite, since the callsite has more arguments than the
> method (meta information). Then 1 for the guard, even if the guard is
> true. One for the type transformation of the guard, since we work with
> exact types and the callsite as well as the target method do have types.
> Then the handle for the target method itself, together with another type
> transformation.
>
> That makes 5 frames in between. 5 is worlds better than 53.
OK, 5 additional frames for the simple case. Is such overhead tolerable for
you, or do you need a smaller number of intermediate frames?
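
For reference, the simple case quoted above can be sketched with the public java.lang.invoke API roughly as follows. This is only an illustration under assumptions: the class GuardedCallSiteSketch, the sameClass guard, the fallback method, and the choice of String.toUpperCase as the target are hypothetical and are not taken from Groovy. The mapping of handles to stack frames is approximate and implementation dependent.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class GuardedCallSiteSketch {
    // Hypothetical guard: true while the receiver's runtime class is unchanged.
    static boolean sameClass(Class<?> expected, Object receiver) {
        return receiver != null && receiver.getClass() == expected;
    }

    // Hypothetical slow path taken when the guard fails.
    static Object fallback(Object receiver) {
        return "fallback";
    }

    static MethodHandle buildChain() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();

        // Handle for the no-arg target method itself: (String)String.
        MethodHandle target = lookup.findVirtual(
                String.class, "toUpperCase", MethodType.methodType(String.class));

        // Type transformation of the target to the generic shape (Object)Object.
        MethodHandle typedTarget = target.asType(
                MethodType.methodType(Object.class, Object.class));

        // Guard test with the expected class bound in: (Object)boolean.
        MethodHandle test = lookup.findStatic(
                GuardedCallSiteSketch.class, "sameClass",
                MethodType.methodType(boolean.class, Class.class, Object.class))
                .bindTo(String.class);

        MethodHandle slowPath = lookup.findStatic(
                GuardedCallSiteSketch.class, "fallback",
                MethodType.methodType(Object.class, Object.class));

        // The guard: call the target while the receiver class matches.
        MethodHandle guarded = MethodHandles.guardWithTest(test, typedTarget, slowPath);

        // The call site carries extra meta information (here a method name)
        // that the target does not take, so it is dropped: (Object,String)Object.
        return MethodHandles.dropArguments(guarded, 1, String.class);
    }

    // Invokes the chain as a call site of type (Object,String)Object would.
    static Object call(Object receiver) {
        try {
            return buildChain().invoke(receiver, "toUpperCase");
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

Calling call("hello") takes the guarded fast path, while call(42) fails the guard and goes to the fallback.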

What is your estimate for the complex case? What's the worst case in Groovy?

>> We discussed an idea to generate custom bytecodes (single method) for
>> the whole method handle chain (and have only 1 extra stack frame per MH
>> invocation), but it defeats memory footprint reduction we are trying to
>> achieve with LambdaForm sharing.
>
> I wonder if that is the case for Groovy as well. Our old callsite
> mechanism does have only 1 frame (upon second execution). Because by
> then we generated a class for the callsite that does all the argument
> transformation, checks and target method execution. So compared to that
> I would not expect a memory increase.
We are looking for ways to significantly reduce the memory consumption of
the JSR292 implementation. Inlining of LFs from the call site means 1 anonymous
class per indy call site. Compared to fully customized LambdaForms, it
should give noticeable savings because fewer anonymous classes are
loaded. But it doesn't comply with the ultimate goal of a fixed
set of combinators used to implement all possible behaviors.

Best regards,
Vladimir Ivanov