deduplicating lambda methods

Sun Mar 18 02:04:04 UTC 2018

Hi Vicente,

On Sat, Mar 17, 2018 at 11:44 AM Vicente Romero <vicente.romero at oracle.com>
wrote:

> overall looks good, but I think that a set of regression tests should be
> included along with the patch.
>

Definitely, I have some functional tests that I'll work on turning into a
jtreg test.

Do you have advice about how much test coverage is appropriate for tree
comparison?
Getting to 100% branch coverage would require a lot of test cases. I could
probably
generate some of them programatically. FWIW a certain amount of TreeDiffer
was
generated and not hand-written, so any bugs are probably structural
problems rather
than typos. (That doesn't lessen the need for tests, of course--there's
still the risk of
adding typos when changes are made in the future.)

> at first glance it seemed to me that the hash function would return the
> same value for:
> (int i) -> i / 5; and (int i) -> 5 / i; I haven't test it but this was my
> impression.
>

The hash function currently traverses all nodes and hashes their tags, so
for those two
examples it would see something like `[DIV, IDENT, LITERAL]` and `[DIV,
LITERAL, IDENT]`,
which would not result in the same value.

This does mean that `x + 42` and `y + 43` both hashed to the same value. I
updated the
hash function to include literal values and symbols (with the same handling
of method
parameters as the diff). With that change I am no longer seeing hash
collisions that result in
unnecessary diffing. There are still cases where two lambda bodies hash to
the same
value but have different target types, but that case doesn't require a tree
diff (the check
that the target types are the same happens first).

The updates to hashing are here:
http://cr.openjdk.java.net/~cushon/lambdadedup/webrev.03/