deduplicating lambda methods
Vicente Romero
vicente.romero at oracle.com
Sat Mar 17 18:44:53 UTC 2018
Hi Liam,
On 03/16/2018 09:48 PM, Liam Miller-Cushon wrote:
> Hi,
>
> An updated version is at:
> http://cr.openjdk.java.net/~cushon/lambdadedup/webrev.02/
> <http://cr.openjdk.java.net/%7Ecushon/lambdadedup/webrev.02/>
overall looks good, but I think that a set of regression tests should be
included along with the patch.
>
> On Fri, Mar 16, 2018 at 11:46 AM Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com
> <mailto:maurizio.cimadamore at oracle.com>> wrote:
>
> Having a tree scanner accepting a visitor parameter would be good -
> although can be done outside the scope of this patch.
>
>
> Sounds good, I'll add it to my list to investigate later.
>
> Note that, in the meantime, it's rather easy to mimic the visitor
> parameter idiom - just define a 'scan' method like this:
>
>
> Thanks! I switched to com.sun.tools.javac.comp.TreeScanner using your
> approach.
>
> On Fri, Mar 16, 2018 at 12:20 PM Vicente Romero
> <vicente.romero at oracle.com <mailto:vicente.romero at oracle.com>> wrote:
>
> TreeHasher should skip
> parenthesis of expressions, to obtain the same hash for, for example:
> return (i + j); and return i + j;
>
>
> Done, thanks.
>
> Also, as a bonus, consider constant expressions, we probably want:
> return 5; and return 2 + 3; to be equal.
>
> I think that these two use cases will need a "normalization" step
> to be
> applied to the tree and store in the map, set, only a normalized
> version
> of the tree. I understand that at least considering constants could be
> out of the scope of this patch, so I can see it as a follow up
> development.
>
>
> Are you envisioning this as something that would happen as part of tree
> comparision,
yes because the current effort on constant folding won't be rewriting
the trees, at least not all of them only a small fraction of them.
Although as I said there is no need to do this now I don't think that
the constant folding project will make this task easier in the future.
Unless we decide to rewrite all the trees holding a constant value.
> or something that would fall out of other work around constant
> folding? If support for comprehensive compile-time constant folding
> happens,
> and it's performed prior to lambda deduplication, then your examples would
> be supported for free without the need for a more sophisticated diff.
> On Fri, Mar 16, 2018 at 1:07 PM Vicente Romero
> <vicente.romero at oracle.com <mailto:vicente.romero at oracle.com>> wrote:
>
> I could be wrong but it seems to me that TreeHasher generates too many
> false positives as the hash number obtained won't have any topological
> information about the tree. In other words, it seems to me that
> the hash
> generated will be the same regardless of the way the tree is
> traversed.
> I think that we are interested in obtaining a hash that is a
> topological
> descriptor of the tree. This will probably imply making TreeHasher a
> full fledged visitor.
>
>
> The hash function needs more work. I am still planning to collect data
> on how often collisions are happening as part of quantifying the
> compile-time performance overhead.
>
> However, the collisions I am seeing right now are mostly from identifiers
> and literals, and not from differences in the topology. I'm having trouble
> thinking of uncontrived examples where the topology of two ASTs is
> different, but the result of traversing all nodes and recording their tags
> is the same. Do you think that's going to be common, and did you have
> examples in mind?
at first glance it seemed to me that the hash function would return the
same value for:
(int i) -> i / 5; and (int i) -> 5 / i; I haven't test it but this was
my impression.
Vicente
More information about the amber-dev
mailing list