deduplicating lambda methods

Vicente Romero vicente.romero at oracle.com
Sat Mar 17 18:44:53 UTC 2018


Hi Liam,


On 03/16/2018 09:48 PM, Liam Miller-Cushon wrote:
> Hi,
>
> An updated version is at:
> http://cr.openjdk.java.net/~cushon/lambdadedup/webrev.02/ 
> <http://cr.openjdk.java.net/%7Ecushon/lambdadedup/webrev.02/>

overall looks good, but I think that a set of regression tests should be 
included along with the patch.

>
> On Fri, Mar 16, 2018 at 11:46 AM Maurizio Cimadamore 
> <maurizio.cimadamore at oracle.com 
> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>
>     Having a tree scanner accepting a visitor parameter would be good -
>     although can be done outside the scope of this patch.
>
>
> Sounds good, I'll add it to my list to investigate later.
>
>     Note that, in the meantime, it's rather easy to mimic the visitor
>     parameter idiom - just define a 'scan' method like this:
>
>
> Thanks! I switched to com.sun.tools.javac.comp.TreeScanner using your 
> approach.
>
> On Fri, Mar 16, 2018 at 12:20 PM Vicente Romero 
> <vicente.romero at oracle.com <mailto:vicente.romero at oracle.com>> wrote:
>
>     TreeHasher should skip
>     parenthesis of expressions, to obtain the same hash for, for example:
>     return (i + j); and return i + j;
>
>
> Done, thanks.
>
>     Also, as a bonus, consider constant expressions, we probably want:
>     return 5; and return 2 + 3; to be equal.
>
>     I think that these two use cases will need a "normalization" step
>     to be
>     applied to the tree and store in the map, set, only a normalized
>     version
>     of the tree. I understand that at least considering constants could be
>     out of the scope of this patch, so I can see it as a follow up
>     development.
>
>
> Are you envisioning this as something that would happen as part of tree
> comparision,

yes because the current effort on constant folding won't be rewriting 
the trees, at least not all of them only a small fraction of them. 
Although as I said there is no need to do this now I don't think that 
the constant folding project will make this task easier in the future. 
Unless we decide to rewrite all the trees holding a constant value.

> or something that would fall out of other work around constant
> folding? If support for comprehensive compile-time constant folding 
> happens,
> and it's performed prior to lambda deduplication, then your examples would
> be supported for free without the need for a more sophisticated diff.
> On Fri, Mar 16, 2018 at 1:07 PM Vicente Romero 
> <vicente.romero at oracle.com <mailto:vicente.romero at oracle.com>> wrote:
>
>     I could be wrong but it seems to me that TreeHasher generates too many
>     false positives as the hash number obtained won't have any topological
>     information about the tree. In other words, it seems to me that
>     the hash
>     generated will be the same regardless of the way the tree is
>     traversed.
>     I think that we are interested in obtaining a hash that is a
>     topological
>     descriptor of the tree. This will probably imply making TreeHasher a
>     full fledged visitor.
>
>
> The hash function needs more work. I am still planning to collect data
> on how often collisions are happening as part of quantifying the
> compile-time performance overhead.
>
> However, the collisions I am seeing right now are mostly from identifiers
> and literals, and not from differences in the topology. I'm having trouble
> thinking of uncontrived examples where the topology of two ASTs is
> different, but the result of traversing all nodes and recording their tags
> is the same. Do you think that's going to be common, and did you have
> examples in mind?

at first glance it seemed to me that the hash function would return the 
same value for:
(int i) -> i / 5; and (int i) -> 5 / i; I haven't test it but this was 
my impression.

Vicente


More information about the amber-dev mailing list