deduplicating lambda methods

Sun Mar 18 15:43:59 UTC 2018

Clearly that's outside the scope of the current project, so this is all 
speculation.

One of the tradeoffs in attempting to deduplicate across classes is that 
there's a coordination cost.  If class A captures a lambda, and class B 
captures the same lambda, there's some benefit to them doing so 
independently; doing otherwise requires a cache, coordination, lookups, 
locks, etc.  You'd have to show that the hit rate is good enough to make 
up for this.
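
To make the coordination cost concrete, here is a minimal sketch of 
what a cross-class cache could look like.  Everything in it 
(SharedLambdaCache, getOrCreate, hitRate, the string key) is 
hypothetical and made up for illustration; nothing like this exists in 
javac or the runtime today.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical process-wide cache of function objects (illustration only). */
final class SharedLambdaCache {
    // The shared map is the coordination point: every capture site in
    // every class now pays for a hash lookup and possible contention.
    private final ConcurrentHashMap<String, Object> cache = new ConcurrentHashMap<>();
    private final AtomicLong lookups = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    /** Returns the cached object for key, running factory only on a miss. */
    @SuppressWarnings("unchecked")
    <T> T getOrCreate(String key, Supplier<T> factory) {
        lookups.incrementAndGet();
        return (T) cache.computeIfAbsent(key, k -> {
            misses.incrementAndGet();
            return factory.get();
        });
    }

    /** Fraction of lookups that were served from the cache. */
    double hitRate() {
        long n = lookups.get();
        return n == 0 ? 0.0 : (double) (n - misses.get()) / n;
    }
}
```

The map lookup and the counters are overhead that an independent 
capture in class A or class B never pays, which is why the measured 
hit rate has to justify them.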

One place where the hit rate likely is good enough -- and we don't 
currently do this -- is in the string concatenation bootstrap.  We 
currently spin a class for each 
string concatenation site, even though cases like "string and int" are 
surely duplicated many times.  Caching common shapes here is probably a 
much bigger win than caching duplicated lambdas; there is likely to be 
far more duplication.
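
A rough sketch of caching by shape, assuming the shape is simply the 
MethodType of the concatenation.  StringConcatFactory.makeConcat is 
the real bootstrap method; the ConcatShapeCache class and its methods 
are hypothetical, and this only illustrates the idea from user code 
rather than showing how the JDK would do it internally.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatException;
import java.lang.invoke.StringConcatFactory;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache of concatenation handles, one per shape. */
final class ConcatShapeCache {
    // Key is the shape, e.g. (String,int)String; MethodType has value equality.
    private static final ConcurrentHashMap<MethodType, MethodHandle> BY_SHAPE =
            new ConcurrentHashMap<>();

    /** Returns the shared handle for a shape, linking it on first use. */
    static MethodHandle forShape(MethodType shape) {
        return BY_SHAPE.computeIfAbsent(shape, ConcatShapeCache::link);
    }

    // Runs the real bootstrap once per shape instead of once per site.
    private static MethodHandle link(MethodType shape) {
        try {
            return StringConcatFactory
                    .makeConcat(MethodHandles.lookup(), "concat", shape)
                    .getTarget();
        } catch (StringConcatException e) {
            throw new IllegalArgumentException("unsupported shape: " + shape, e);
        }
    }

    /** Convenience wrapper: concatenates args using the cached handle. */
    static String concat(MethodType shape, Object... args) {
        try {
            return (String) forShape(shape).invokeWithArguments(args);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

With this, every "string and int" site would share one handle: 
ConcatShapeCache.concat(MethodType.methodType(String.class, 
String.class, int.class), "n=", 42) yields "n=42", and a second 
request for the same shape returns the same MethodHandle.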

For the few lambdas and method refs that are out at the "used a million 
times in the same app" end of the spectrum, the winning move there is 
probably some sort of jlink plugin that identifies the most common ones, 
factors them into static final fields of a synthetic class, and replaces 
captures with field references. Doing this in the JDK, though, is 
probably the wrong move, as any list of "famous lambdas" is likely to be 
wrong for 95% of applications, so this is something you want to generate 
by analysis over a specific application.
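
To illustrate the shape of the result, here is roughly what the 
output of such a plugin might be equivalent to at the source level.  
The class and field names (CommonLambdasForApp, NON_NULL, TO_STRING, 
CaptureSites) are invented for this sketch; a real plugin would 
rewrite bytecode and pick the candidates from its own analysis of the 
application.

```java
import java.util.Objects;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical synthetic holder generated at link time for one application. */
final class CommonLambdasForApp {
    // Each heavily duplicated lambda or method ref is evaluated once, here.
    static final Predicate<Object> NON_NULL = Objects::nonNull;
    static final Function<Object, String> TO_STRING = Object::toString;

    private CommonLambdasForApp() {}
}

/** Hypothetical application code, before and after the rewrite. */
final class CaptureSites {
    // Before: this site has its own invokedynamic capture.
    static Predicate<Object> nonNullBefore() {
        return Objects::nonNull;
    }

    // After: the capture is replaced by a read of the shared static final field.
    static Predicate<Object> nonNullAfter() {
        return CommonLambdasForApp.NON_NULL;
    }
}
```

Every rewritten site then sees the same object, with no linkage work 
at the site itself; this is permitted because the language makes no 
promise about the identity of the object a lambda expression 
evaluates to.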

On 3/18/2018 11:17 AM, B. Blaser wrote:
> Hi Liam,
>
> On 18 March 2018 at 03:04, Liam Miller-Cushon <cushon at google.com> wrote:
>
> [...]
>
>> The updates to hashing are here:
>> http://cr.openjdk.java.net/~cushon/lambdadedup/webrev.03/
> Still looking at the stats [1], we see that the highest duplication
> frequency of more than 10e3 appears on *cross-file* lambdas but falls
> to 10e2 on *intra-file* lambdas and your patch suggests an
> *intra-class* de-duplication with an even lower frequency (note also
> that the JIT profile issue [2] is still unaddressed)...
>
> So, I'm wondering if a set of well defined JIT compliant global shapes
> (somewhere in 'java.base') would be more efficient in terms of class
> size reduction and performance?
>
> Bernard
>
> [1] https://drive.google.com/file/d/1abAR_bueU0Zxy4e9XfLVzVe2nwy_POQm/view?usp=sharing
> [2] http://mail.openjdk.java.net/pipermail/amber-dev/2018-March/002745.html