deduplicating lambda methods

Vicente Romero vicente.romero at oracle.com
Tue Mar 27 00:24:53 UTC 2018


Hi Liam,

looks good! one minor comment:

at TreeDiffer::visitIdent there is some code that was probably added for 
testing:

         Consumer<Integer> c = x -> {
             class Foo {}
             foo(Foo.class);
         };

along with the empty method: `void foo(Class<?> clazz) {}` that is 
defined above ::visitIdent.

there is no need for another iteration for this. Regarding the fuzzing 
test, if Maurizio agrees, I'm OK with going with this version of the 
patch, and add that one and more tests later on.

Thanks!
Vicente


On 03/26/2018 07:14 PM, Liam Miller-Cushon wrote:
> Hi Vicente,
>
> I pushed an updated webrev to:
> http://cr.openjdk.java.net/~cushon/lambdadedup/webrev.04/ 
> <http://cr.openjdk.java.net/%7Ecushon/lambdadedup/webrev.04/>
>
> The main changes are:
> * I incorporated your patch for handling constant values.
> * The change now includes a test harness for lambda deduplication.
> * Deduplication is now skipped if debug info is enabled.
>
> The test asserts different groups of lambdas in the test input are
> deduplicated as expected, and has targeted coverage of diffing
> and hashing for all pairwise combinations of lambdas in the test input.
>
> I like Maurizio's idea of applying fuzzing here, but haven't yet
> made progress on that. As an aside, there might be a separate larger
> project here to investigate ways to fuzz javac APIs. I noticed there
> was some work done to fuzz the class reader [1].
>
> [1] 
> https://bugs.openjdk.java.net/browse/JDK-8054284?jql=labels%20%3D%20class_fuzzing
>
> Re: deduplicating lambdas that declare locals,
> e.g. : `() -> { int y = 42; return y; }`. I think that comparing locals by
> name is correct, assuming that the diff is not used on trees
> that can refer to locals declared in enclosing scopes (this is true for
> desugared lambda bodies, but not in general). Another option is to have
> the diff record local variable declarations it sees, and maintain a map
> of symbols that can be treated as equivalent. The current lambda parameter
> handling could be folded in to this--the lambda parameters would be 
> initial
> values in the map. This could also be generalized to other types of
> declarations, but that becomes problematic for declarations whose identity
> could escape the lambda. E.g. deduplicating two instances of
> `x -> { class Foo {}; baz(Foo.class); }` could change behaviour.
>
> Finally, I agree with Brian and Louis that we should have stronger
> evidence that more sophisticated handling of no-ops would be a win before
> adding complexity to support that. Inter-file handling of very common 
> lambdas
> might be a more beneficial future improvement.
>
>
> On Mon, Mar 26, 2018 at 6:21 AM Vicente Romero 
> <vicente.romero at oracle.com <mailto:vicente.romero at oracle.com>> wrote:
>
>     Hi all,
>
>     Although there are still areas under discussion I think that we should
>     go for a 'final', pushable version of the patch, which I think
>     should be
>     based on the last iteration sent by Liam, plus code based in the
>     changes
>     I proposed to don't go deeper in the tree comparison as soon as a
>     constant is found, plus regression tests. We can continue discussing
>     about the no-op filtering proposed by Bernard and if we have data
>     supporting that change, we can include it in a next iteration. I think
>     that we all mostly agree on the consistency of the state I'm
>     describing,
>     so I think we should go for it,
>
>     Thanks,
>     Vicente
>
>     On 03/02/2018 08:03 PM, Liam Miller-Cushon wrote:
>     > Hello,
>     >
>     > I'm interested in adding support for deduplicating lambda methods to
>     > javac. The idea is that if a compilation unit contains two lambdas
>     > that are identical (including any captured state and the functional
>     > interface they implement) we could re-use the same implementation
>     > method for both.
>     >
>     > I understand there might have been some prior discussion about this.
>     > Is there interest in investigating the feature? What sort of
>     technical
>     > considerations have been identified so far?
>     >
>     > I have been thinking about a couple of questions:
>     >
>     > 1) How to identifying duplicates: I have a prototype that runs
>     during
>     > lambda desugaring and identifies duplicates by diffing ASTs. Is that
>     > the best place for deduplication, or it worth considering comparing
>     > generated code instead of ASTs?
>     >
>     > 2) Debug info: the optimization is safe if line numbers are not
>     being
>     > emitted. If they are, is there a way to deduplicate the methods
>     > without breaking debug info?
>     >
>     > Thanks,
>     > Liam
>



More information about the amber-dev mailing list