deduplicating lambda methods

Mon Mar 5 11:37:46 UTC 2018

The overall plan seems pretty solid; thanks for looking into this. As 
for (1), I think comparing bytecodes would be simpler from a technical 
perspective, as there's less fuzziness about it - you just compare the 
stream of bytecodes and, if they are the same, deduplicate. But that is 
harder to fit into our compiler, which happens to be very AST-centric: 
lots of passes are described in terms of AST manipulations, not bytecode 
ones - which I think means that this one should be AST-based too.
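
To make the bytecode-based option concrete, here is a minimal sketch, not javac code: it treats each lambda body as an opaque byte array and maps every method to the first one seen with identical bytes. The class name, method names and byte values are all hypothetical stand-ins, and a real comparison would presumably need more than raw byte equality (exception tables and other method attributes, for instance).

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: deduplicate method bodies by comparing their raw bytes.
class BytecodeDedupSketch {

    /** Maps each method name to the first method seen with identical code bytes. */
    static Map<String, String> canonicalize(Map<String, byte[]> methodBodies) {
        // ByteBuffer.equals/hashCode are content-based, so it works as a map key.
        Map<ByteBuffer, String> firstSeen = new HashMap<>();
        Map<String, String> canonical = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> e : methodBodies.entrySet()) {
            ByteBuffer key = ByteBuffer.wrap(e.getValue());
            String earlier = firstSeen.putIfAbsent(key, e.getKey());
            // No earlier method with the same bytes: this one is its own canonical form.
            canonical.put(e.getKey(), earlier == null ? e.getKey() : earlier);
        }
        return canonical;
    }

    public static void main(String[] args) {
        // Placeholder bytes, not real bytecode.
        Map<String, byte[]> bodies = new LinkedHashMap<>();
        bodies.put("lambda$main$0", new byte[] {1, 2, 3});
        bodies.put("lambda$main$1", new byte[] {1, 2, 3});
        bodies.put("lambda$main$2", new byte[] {4, 5});
        // lambda$main$1 maps to lambda$main$0; the other two map to themselves.
        System.out.println(canonicalize(bodies));
    }
}
```

An AST-based pass would replace the byte comparison with a structural tree comparison, but the bookkeeping (first occurrence wins, later duplicates point at it) could look much the same.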

As Brian mentioned, I would start simple, by ignoring edge cases such as 
serializable lambdas and/or deduplicating in the presence of debugging info.

One question I have for the VM gurus: how does deduplication affect JIT 
optimizations? E.g. assume we have two lambdas used in two very 
different ways in the same source file, which happen to share the same 
code - e.g. something like:

(Object o) -> { if (o instanceof String) { System.err.println("Hello!"); } }

where one use site always calls it with a string parameter, and another 
always calls it with an int parameter.
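
For reference, the scenario above could be written out as the following runnable sketch. The class name, the `run` helper and the `hellos` counter are hypothetical additions (the counter exists only so the effect can be observed), and the int is boxed to an `Integer` because the parameter type is `Object`.

```java
import java.util.function.Consumer;

// Hypothetical sketch: two textually identical lambdas with very different use sites.
class TwoUseSites {
    static int hellos = 0; // counts how often the String branch is taken

    static int run(int iterations) {
        hellos = 0;
        // Same body twice: candidates for sharing one implementation method.
        Consumer<Object> first =
            (Object o) -> { if (o instanceof String) { hellos++; System.err.println("Hello!"); } };
        Consumer<Object> second =
            (Object o) -> { if (o instanceof String) { hellos++; System.err.println("Hello!"); } };
        for (int i = 0; i < iterations; i++) {
            first.accept("text"); // this site only ever passes a String
            second.accept(i);     // this site only ever passes a boxed Integer
        }
        return hellos;
    }

    public static void main(String[] args) {
        // Only the first site takes the String branch, so this prints 3.
        System.out.println(run(3));
    }
}
```

If the two bodies were deduplicated, both sites would run the same method, so the `instanceof` check would see both Strings and Integers, which is exactly what the profiling question is about.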

Now, with indy, I believe we can still optimize them fully, given that 
indy has some knowledge/state about the site making the call, so the two 
calls won't be treated as 'identical' by the JIT and the profiling info 
won't be merged (I guess). But what if, someday, we were to replace indy 
with condy here? Would we lose performance?

Cheers
Maurizio

On 03/03/18 01:03, Liam Miller-Cushon wrote:
> Hello,
>
> I'm interested in adding support for deduplicating lambda methods to javac.
> The idea is that if a compilation unit contains two lambdas that are
> identical (including any captured state and the functional interface they
> implement) we could re-use the same implementation method for both.
>
> I understand there might have been some prior discussion about this. Is
> there interest in investigating the feature? What sort of technical
> considerations have been identified so far?
>
> I have been thinking about a couple of questions:
>
> 1) How to identify duplicates: I have a prototype that runs during
> lambda desugaring and identifies duplicates by diffing ASTs. Is that the
> best place for deduplication, or is it worth considering comparing generated
> code instead of ASTs?
>
> 2) Debug info: the optimization is safe if line numbers are not being
> emitted. If they are, is there a way to deduplicate the methods without
> breaking debug info?
>
> Thanks,
> Liam