deduplicating lambda methods

Tue Mar 20 12:42:56 UTC 2018

On 03/19/2018 05:37 PM, Maurizio Cimadamore wrote:
>
>
>
> On 18/03/18 02:04, Liam Miller-Cushon wrote:
>> Hi Vicente,
>>
>> On Sat, Mar 17, 2018 at 11:44 AM Vicente Romero 
>> <vicente.romero at oracle.com <mailto:vicente.romero at oracle.com>> wrote:
>>
>>     overall looks good, but I think that a set of regression tests
>>     should be included along with the patch.
>>
>>
>> Definitely, I have some functional tests that I'll work on turning 
>> into a jtreg test.
>>
>> Do you have advice about how much test coverage is appropriate for 
>> tree comparison?
>> Getting to 100% branch coverage would require a lot of test cases. I 
>> could probably
>> generate some of them programatically. FWIW a certain amount of 
>> TreeDiffer was
>> generated and not hand-written, so any bugs are probably structural 
>> problems rather
>> than typos. (That doesn't lessen the need for tests, of 
>> course--there's still the risk of
>> adding typos when changes are made in the future.)
> How about a test harness that takes a source file, compiles, takes its 
> AST and then uses it as a 'golden' AST to compare it against other AST 
> obtained from:
>
> 1) compiling same source file with some random harmless alteration 
> (e.g. alpha renaming of identifiers)
> 2) compiling same source file with some structural alterations (e.g. 
> drop all bodies from nested blocks)
> 3) or, compile another user-defined alteration
>
> You could get both positive tests (compare AST against itself, or 
> against alpha-renamed trees), and negative tests (compare AST against 
> structural alterations).
>
> If we got good at generating such random alteration I bet we could 
> achieve quite good coverage by simply feeding already existing javac 
> test sources to the fuzzing machinery.

neat!

>
> Maurizio

Vicente

>>     at first glance it seemed to me that the hash function would
>>     return the same value for:
>>     (int i) -> i / 5; and (int i) -> 5 / i; I haven't test it but
>>     this was my impression.
>>
>>
>> The hash function currently traverses all nodes and hashes their 
>> tags, so for those two
>> examples it would see something like `[DIV, IDENT, LITERAL]` and 
>> `[DIV, LITERAL, IDENT]`,
>> which would not result in the same value.
>>
>> This does mean that `x + 42` and `y + 43` both hashed to the same 
>> value. I updated the
>> hash function to include literal values and symbols (with the same 
>> handling of method
>> parameters as the diff). With that change I am no longer seeing hash 
>> collisions that result in
>> unnecessary diffing. There are still cases where two lambda bodies 
>> hash to the same
>> value but have different target types, but that case doesn't require 
>> a tree diff (the check
>> that the target types are the same happens first).
>>
>> The updates to hashing are here:
>> http://cr.openjdk.java.net/~cushon/lambdadedup/webrev.03/ 
>> <http://cr.openjdk.java.net/%7Ecushon/lambdadedup/webrev.03/>
>