deduplicating lambda methods

Sat Mar 3 01:17:19 UTC 2018

I think this is a great idea.  And it synergizes well with some other 
work we've got in the pipe.

It's a shame that if you use the same lambda twice in the same source 
file, we desugar two separate lambda$nnn methods, and spin two separate 
lambda proxy classes.  Deduplicating the lambda$nnn methods will address 
the former; a separate effort, where we are using the new 
"constantdynamic" instead of "invokedynamic" to evaluate method refs and 
non-capturing lambdas, will address the latter (once the former is 
addressed).  Together, these should make Java programs more efficient 
overall.  
(The condy translation will get us deduplication for free for method 
references, but not for lambdas.)
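
To make the duplication concrete, here is a minimal sketch. The class and 
field names are made up for illustration. It declares the same 
non-capturing lambda twice and uses reflection to count the synthetic 
lambda$... methods in the class file. How many it reports depends on the 
compiler version and flags, so I'm not claiming a particular number.

```java
import java.lang.reflect.Method;
import java.util.function.Function;

final class DuplicateLambdas {
    // Two textually identical, non-capturing lambdas in one compilation unit.
    static final Function<String, String> FIRST = s -> s.trim();
    static final Function<String, String> SECOND = s -> s.trim();

    // Counts the synthetic lambda$... methods the compiler emitted for this class.
    static long lambdaMethodCount() {
        long count = 0;
        for (Method m : DuplicateLambdas.class.getDeclaredMethods()) {
            if (m.isSynthetic() && m.getName().startsWith("lambda$")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The count is compiler-dependent; the behavior of the two lambdas is not.
        System.out.println("lambda methods: " + lambdaMethodCount());
        System.out.println("same result: "
                + FIRST.apply(" a ").equals(SECOND.apply(" a ")));
    }
}
```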

I think AST comparison is likely to be easier and more effective than 
comparing generated code. And it doesn't have to be perfect; if it gets 
fooled by occasional differences, that's OK, as long as it doesn't merge 
lambdas that are actually different.  And you don't have the generated 
code until later, when it's likely harder to do the merging.  There's a 
whole pass in the compiler pipeline for lambda method desugaring 
(LambdaToMethod), so there's an obvious place to do this transformation.
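
Here is a rough sketch of what "conservative" means for the comparison. 
The Node class is a hypothetical toy AST, not javac's real tree types, and 
sameBody is a name I made up. The only property that matters is that it 
answers "same" when the two trees match exactly and "different" whenever 
there is any doubt. Parameters are recorded by position, so p -> p and 
x -> x would compare equal under this scheme.

```java
import java.util.Arrays;
import java.util.List;

final class LambdaDiffSketch {
    /** Hypothetical toy AST node; javac's real trees are much richer. */
    static final class Node {
        final String kind;          // e.g. "param", "literal", "call"
        final String text;          // identifier or literal text; position for "param"
        final List<Node> children;

        Node(String kind, String text, Node... children) {
            this.kind = kind;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    /** Returns true only on an exact structural match; any doubt means "different". */
    static boolean sameBody(Node a, Node b) {
        if (a == null || b == null) {
            return false; // unknown tree: refuse to merge
        }
        if (!a.kind.equals(b.kind) || !a.text.equals(b.text)) {
            return false;
        }
        if (a.children.size() != b.children.size()) {
            return false;
        }
        for (int i = 0; i < a.children.size(); i++) {
            if (!sameBody(a.children.get(i), b.children.get(i))) {
                return false;
            }
        }
        return true;
    }

    // Body of p -> System.out.println(p), with the parameter recorded as position 0.
    static Node printlnOfParam() {
        return new Node("call", "System.out.println", new Node("param", "0"));
    }

    // Body of p -> p.
    static Node identity() {
        return new Node("param", "0");
    }

    public static void main(String[] args) {
        System.out.println(sameBody(printlnOfParam(), printlnOfParam())); // true
        System.out.println(sameBody(printlnOfParam(), identity()));       // false
    }
}
```

A false "different" just costs a missed merge; a false "same" would 
change program behavior, which is why every uncertain case returns false.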

Another consideration is serializable lambdas; the scheme for 
serializable lambdas involves a parallel generation path for 
deserialization.  I suspect that it's probably best to just avoid 
serializable lambdas entirely, at least at first.
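
For reference, a lambda is serializable when its target type includes 
Serializable, for example through an intersection cast. The sketch below 
uses made-up names and checks this at run time purely for illustration; a 
real pass would make the decision from the target type at compile time.

```java
import java.io.Serializable;
import java.util.function.Function;

final class SerializableLambdaCheck {
    // Ordinary lambda: the target type does not include Serializable.
    static final Function<String, String> PLAIN = s -> s;

    // Same body, but the intersection cast makes the lambda serializable.
    static final Function<String, String> SERIALIZABLE =
            (Function<String, String> & Serializable) s -> s;

    // Hypothetical helper: lambdas a first version of the pass would leave alone.
    static boolean mustSkip(Object lambda) {
        return lambda instanceof Serializable;
    }

    public static void main(String[] args) {
        System.out.println(mustSkip(PLAIN));        // false
        System.out.println(mustSkip(SERIALIZABLE)); // true
    }
}
```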

Is there any trickiness with capturing lambdas?  I don't think so -- I 
think we can merge these too, although I suspect the return on that 
effort is lower.  I'll bet the most common case is lambdas like e -> e, 
x -> System.out.println(x), etc.
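
As a sketch of why capturing lambdas look mergeable: the desugared method 
for a lambda that captures a local takes the captured value as a leading 
parameter, so two lambdas with the same body and the same kind of capture 
could in principle bind to one method. The names below are made up, and 
merged is a hand-written stand-in for such a shared method, not compiler 
output.

```java
import java.util.function.Function;

final class CapturingLambdaSketch {
    // Two capturing lambdas with identical bodies at different sites.
    static Function<String, String> first(String prefix) {
        return s -> prefix + s;
    }

    static Function<String, String> second(String prefix) {
        return s -> prefix + s;
    }

    // Hand-written stand-in for one shared implementation method:
    // the captured value comes first, then the lambda's own parameter.
    static String merged(String prefix, String s) {
        return prefix + s;
    }

    public static void main(String[] args) {
        System.out.println(first("a").apply("b"));  // ab
        System.out.println(second("a").apply("b")); // ab
        System.out.println(merged("a", "b"));       // ab
    }
}
```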

I know you have some good tools at Google for codebase statistics.  
Maybe you could pull together data on how often lambdas are duplicated 
within a source file, and of the duplicated lambdas, what percentage are 
stateless and non-serializable?

On 3/2/2018 8:03 PM, Liam Miller-Cushon wrote:
> Hello,
>
> I'm interested in adding support for deduplicating lambda methods to 
> javac. The idea is that if a compilation unit contains two lambdas 
> that are identical (including any captured state and the functional 
> interface they implement) we could re-use the same implementation 
> method for both.
>
> I understand there might have been some prior discussion about this. 
> Is there interest in investigating the feature? What sort of technical 
> considerations have been identified so far?
>
> I have been thinking about a couple of questions:
>
> 1) How to identify duplicates: I have a prototype that runs during 
> lambda desugaring and identifies duplicates by diffing ASTs. Is that 
> the best place for deduplication, or is it worth considering comparing 
> generated code instead of ASTs?
>
> 2) Debug info: the optimization is safe if line numbers are not being 
> emitted. If they are, is there a way to deduplicate the methods 
> without breaking debug info?
>
> Thanks,
> Liam