[External] : Re: Semantics of multiple patterns with the same prefix in a switch

Mon Feb 1 20:00:06 UTC 2021

Indeed.  (Which reinforces why I have been so irritatingly insistent on 
surprise-free refactorability in defining the abstract semantics of 
pattern matching.)

On 2/1/2021 2:42 PM, Guy Steele wrote:
>
>> On Feb 1, 2021, at 8:57 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
>>
>>
>>> I fully agree, whatever we decide, the order should be left-to-right.
>>>
>>> Given that the VMs/JITs are already able to share common sub expressions if a method can be inlined and there is no side effect, and that allowing optimizations to maybe occur is not WORA.
>>>
>>> I see two possible semantics,
>>>    - either the compiler and runtime metafactory do not share of the computation of the same pattern (the VM/JIT still can)
>>>    - or we allow the computation of patterns with the same prefix (to keep the left-to-right order) to be shared even if they are maybe side effects.
>>>      "With the same prefix" meaning that the patterns on the left that share the same surface syntax will share the same computation results.
>>>
>>> In the first case, we only rely on the VM/JIT magic, in the other case, we hope that no sane user will write a pattern method with side effect.
>>>
>>> So said differently, does the second proposed semantics make sense or is too clever ?
>> Let's posit that it is "clever" (maybe just enough, maybe too much), and evaluate the other side of the equation: what's the benefit?
>>
>> Clearly, if I have a 27-case switch statement:
>>
>>      case Foo(A a):
>>      case Foo(B b):
>>      ...
>>      case Foo(Z z):
>>      default:
>>
>> then there is some benefit.  And surely this will describe _some_ switch statements.  But, will it describe a lot of them?  Or are we more likely to see things like:
>>
>>      case Foo(A a):
>>      case Foo(B b):
>>      case Bar(C c):
>>      case Bar(D d):
>>      case Wooga(E e):
>>
>> in which "case" we will not see all that much benefit by optimizing away a stray invocation in the second and fourth cases.
>>
>> Where I think the lion's share of the optimization opportunity here is trying to prune cases based on the head type.  If, for example, we know that it's a Foo, then cases 3,4,5 can be known to not apply, and if we fail to match cases 1 or 2, we're done (or we fall into the default.)  I think exposing these optimizations -- which have the chance to reduce O(n) switch dispatch into O(1) -- is where the highest leverage is.
>>
>>
>> Given that, my proposal is:
>>   - Let's take as a baseline that we're going to do the dumb and predictable thing, for now
>>   - Work on our translation strategy and see how much we can squeeze out just based on the easy stuff
>>   - Evaluate how much incrementally the "clever head" strategy would buy;
>>   - Evaluate how much incrementally the "if the user does side-effects in a pattern, it's their own fault" approach would buy.
> And let’s also keep in mind that if we have the “dumb and predictable thing” as a baseline, then, after all, the programmer also has some control over the situation: if performance of
>
>      case Foo(A a):
>      case Foo(B b):
>      case Bar(C c):
>      case Bar(D d):
>      case Wooga(E e):
>
> is not satisfactory because it’s a Foo 99% of the time (that’s why the Foo cases were listed firs, after all?!), then the switch can easily be refactored (with IDE assistance?) to either
>
>      case Foo z:
> 	switch (z) {
>              case Foo(A a):
>              case Foo(B b):
>          }
>      case Bar z:
> 	switch (z) {
>              case Bar(C c):
>              case Bar(D d):
>          }
>      case Wooga z:
> 	switch (z) {
>              case Wooga(E e):
>          }
>
> (which works in the general case where patterns may have more than one argument) or
>
>      case Foo(var z):
> 	switch (z) {
>              case A a:
>              case B b:
>          }
>      case Bar(var z):
> 	switch (z) {
>              case C c:
>              case D d:
>          }
>      case Wooga(var z):
> 	switch (z) {
>              case E e:
>          }
>
> (which is more concise in this specific case), and in fact the three styles can be mixed if you want:
>
>      case Foo z:
> 	switch (z) {
>              case Foo(A a):
>              case Foo(B b1, B b2):
>          }
>      case Bar(var z):
> 	switch (z) {
>              case C c:
>              case D d:
>          }
>      case Wooga(E e):
>
> Thus the programmer can always micromanage the dispatch structure if performance is an issue.  And the performance model for the baseline approach is pretty straightforward and should be easy to understand.
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20210201/9c65c4f5/attachment-0001.htm>