deduplicating lambda methods
Louis Wasserman
lowasser at google.com
Mon Mar 5 19:55:19 UTC 2018
So, here's a start on some of the data questions that have been asked in
this thread. There's an attached PDF with a bunch of nice graphs and
tables covering all lambdas in Google's codebase. This is a pretty
simplistic analysis: lambdas are deduplicated syntactically, modulo
parameter names, and keyed by target type including generics. A more
subtle analysis might merge some more things -- e.g. Function.identity().
Some other interesting data points not in the PDF:
Among nongenerated files having any lambdas at all,
- 16.5% have at least one syntactic duplicate in them.
- The average number of lambdas with at least one syntactic duplicate is
0.24.
- The average number of synthetic methods you'd eliminate by
deduplicating within a file is 0.47.
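As a hedged illustration of what a "syntactic duplicate within a file" looks like, the sketch below declares two lambdas that share a target type and a body up to parameter naming. The class name, the field names, and the reliance on the compiler's lambda$ naming convention for synthetic methods are assumptions made for illustration, not part of the analysis. The count it prints depends on the compiler version and flags, so no expected value is shown.

```java
import java.lang.reflect.Method;
import java.util.function.Predicate;

public class DuplicateLambdas {
    // Same target type (Predicate<String>) and same body modulo parameter name.
    static final Predicate<String> NON_EMPTY_A = str -> !str.isEmpty();
    static final Predicate<String> NON_EMPTY_B = s -> !s.isEmpty();

    // Counts compiler-generated lambda body methods declared on this class.
    // Assumption: the compiler names them with a "lambda$" prefix (javac does).
    static int syntheticLambdaMethods() {
        int count = 0;
        for (Method m : DuplicateLambdas.class.getDeclaredMethods()) {
            if (m.isSynthetic() && m.getName().startsWith("lambda$")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(NON_EMPTY_A.test("x") + " " + NON_EMPTY_B.test(""));
        System.out.println("synthetic lambda methods: " + syntheticLambdaMethods());
    }
}
```

If the two bodies were compiled to a single shared method, the second synthetic method would be the one "eliminated" in the sense used above.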
The six most common (target type including generics, syntactic method body
modulo parameter naming) pairs across our entire codebase were:
(Runnable) () -> {} // 674
(Predicate<String>) str -> !str.isEmpty() // 640
(Function<String, String>) str -> str // 492
(Callable<Void>) () -> null // 259
(Predicate<String>) str -> !Strings.isNullOrEmpty(str) // 204
(Function<Long, Long>) x -> x // 177
(Worth mentioning explicitly: x -> x + 1 was a ways down, with only 56
occurrences for UnaryOperator<Integer> as the most common type.)
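For concreteness, here is a minimal sketch of the kind of grouping key described above: target type including generics, plus the body with parameter names replaced by positional placeholders. It is not the tool we actually ran; the Lambda class, the $0/$1 placeholder scheme, and the purely textual renaming are all hypothetical simplifications (a real analysis would work on the AST rather than on strings).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LambdaKeySketch {
    // Hypothetical model of one lambda occurrence in a source file.
    static final class Lambda {
        final String targetType;   // e.g. "Predicate<String>"
        final List<String> params; // e.g. ["str"]
        final String body;         // e.g. "!str.isEmpty()"

        Lambda(String targetType, List<String> params, String body) {
            this.targetType = targetType;
            this.params = params;
            this.body = body;
        }
    }

    // Key = target type + body with each parameter renamed to $0, $1, ...
    // The renaming is a naive whole-word textual substitution.
    static String key(Lambda l) {
        String body = l.body;
        for (int i = 0; i < l.params.size(); i++) {
            String wholeWord = "\\b" + Pattern.quote(l.params.get(i)) + "\\b";
            body = body.replaceAll(wholeWord, Matcher.quoteReplacement("$" + i));
        }
        return "(" + l.targetType + ") " + body.replaceAll("\\s+", " ").trim();
    }

    // Number of occurrences per key, in first-seen order.
    static Map<String, Integer> countByKey(List<Lambda> lambdas) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Lambda l : lambdas) {
            counts.merge(key(l), 1, Integer::sum);
        }
        return counts;
    }

    // Methods saved if every group of duplicates shared one method.
    static int eliminated(List<Lambda> lambdas) {
        return lambdas.size() - countByKey(lambdas).size();
    }

    public static void main(String[] args) {
        List<Lambda> file = Arrays.asList(
            new Lambda("Predicate<String>", Arrays.asList("str"), "!str.isEmpty()"),
            new Lambda("Predicate<String>", Arrays.asList("s"), "!s.isEmpty()"),
            new Lambda("Function<String, String>", Arrays.asList("str"), "str"),
            new Lambda("Function<Long, Long>", Arrays.asList("x"), "x"));
        countByKey(file).forEach((k, n) -> System.out.println(k + " // " + n));
        System.out.println("eliminated: " + eliminated(file));
    }
}
```

Because the target type is part of the key, the two identity functions in the sample stay separate, which is why Function<String, String> and Function<Long, Long> show up as distinct entries in the list above.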
Liam and I are still working on collecting information on method references
and on statefulness and serializability.
> ---------- Forwarded message ----------
> From: Brian Goetz <brian.goetz at oracle.com>
> Date: Fri, Mar 2, 2018 at 5:17 PM
> Subject: Re: deduplicating lambda methods
> To: Liam Miller-Cushon <cushon at google.com>, amber-dev at openjdk.java.net,
> Vicente-Arturo Romero-Zaldivar <vicente.romero at oracle.com>
>
> I know you have some good tools at Google for codebase statistics. Maybe
> you could pull together data on how often lambdas are duplicated within a
> source file, and of the duplicated lambdas, what percentage are stateless
> and non-serializable?