[External] : Re: RFR: 8291769: Translation of switch with record patterns could be improved

Sun Aug 7 08:02:32 UTC 2022

> From: "jan lahoda" <jan.lahoda at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Jan Lahoda" <jlahoda at openjdk.org>, "Brian Goetz" <brian.goetz at oracle.com>,
> "compiler-dev" <compiler-dev at openjdk.org>
> Sent: Friday, August 5, 2022 11:54:05 PM
> Subject: Re: [External] : Re: RFR: 8291769: Translation of switch with record
> patterns could be improved

> On 05. 08. 22 22:52, [ mailto:forax at univ-mlv.fr | forax at univ-mlv.fr ] wrote:

>> [...]

>>>>> In any case, I'm a bit afraid that we could enhance a translation patch
>>>>> forever, always adding new cases that can be optimized. So, I think we
>>>>> need to draw a line somewhere where the patch provides a benefit while
>>>>> still being testable and reviewable. With a possible separate
>>>>> improvements, if they prove to be useful. It is possible I haven't drawn
>>>>> the line far enough, but looking at the patch, it seems complex enough.

>>>> I think the issue is that a list of pattern is not the right intermediary
>>>> structure to try to optimize the pattern matching, a decision tree is a better
>>>> intermediary representation (this is how pattern matching is usually optimized,
>>>> i've not invented something new :) )

>>>> Here is an example of such decision tree: [
>>>> https://urldefense.com/v3/__https://github.com/forax/switch-pattern-combinator__;!!ACWV5N9M2RV99hQ!JevAQE8IkRf4n-BMKA2sMTNFfJ5sAyJ1xopR2O9rq_45-xqaHOgirecR8E4zIGB_RgMb27G3tlcHEuGADjk$
>>>> |
>>>> https://urldefense.com/v3/__https://github.com/forax/switch-pattern-combinator__;!!ACWV5N9M2RV99hQ!JevAQE8IkRf4n-BMKA2sMTNFfJ5sAyJ1xopR2O9rq_45-xqaHOgirecR8E4zIGB_RgMb27G3tlcHEuGADjk$
>>>> ]

>>> I am not really opposed to having an intermediate structure, but I think
>>> the overall cost of the structure is likely to be significant, and it
>>> should provide significantly better code to offset that.

>> I agree.

>>> Looking at the page, I see:

>>> The advantage of a decision tree is that if two patterns have a common
>>> prefix, the code for the common prefix is generated only once.

>>> I wonder if there's a *conceptual* difference between what this PR is
>>> doing and what your code is doing. The PR here is trying to take common
>>> prefixes (of viable cases) and generate code for them only once. I don't
>>> think that not optimizing

>>> case A(String s) -> {}

>>> case A a -> {}

>>> is a conceptual difference, but if this is something that is seen as
>>> critical, I am fairly sure I can implement that without creating a brand
>>> new structure and code to convert the AST into it and back.

>> You need the intermediary structure when the common prefix are not subsequent

>> case A(B b) ->
>> case B(A a) ->
>> case A(C c) ->

>> Here the first and the last case shares a common prefix.

> Sorry, but I don't see why I can't do this particular thing without a
> specialized intermediate representation.
The patterns are a partial orders, because we can optimize the ones with a common prefix, we need a sort, a decision tree is equivalent to a sort. 
There are several advantages to use a decision tree instead of a sort, 
- it's easier to visualize 
- it mimics the the usual hand-written code (you don't repeat the instanceofs, you enclose them) 
- i believe that for the code generation, doing local decisions on each node is good enough, we do not need to propagate information in between nodes. 

> A bigger question, I think, is when it is safe to do so, and when not.
yes, not all re-ordering are safe, we need to keep some constraints. Fortunately, the ordering problems only arise when there are guards. 
The idea is that for each node of the decision tree, you should generate the code in that order, first for the record pattern, then the code for the guards then the code for the type pattern, in that order. 
If the guards are recorded in the same order as in the pattern matching (this is equivalent to say that the sort should be stable in term of guards), we should be ok. 

As you said, the current spec allows to freely mix type patterns and type pattern + guard pattern with the same type, this has to be fixed at the spec level. 
I think it's reasonable to introduce an order between the type pattern + guard pattern and the type pattern on the same type, because it's more readable and because we do not want users to care to much about the order of the patterns to get good performance. You should get good performance by default, and not have to crawle StackOverflow to learn the arcane of finding the "right order" for patterns. 

Rémi 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/compiler-dev/attachments/20220807/f31904fe/attachment.htm>