Semantics of multiple patterns with the same prefix in a switch

Sun Jan 31 22:57:21 UTC 2021

TO match Box(Foo f):

First, we statically check for applicability.
  - First, we do an overload selection on Box deconstructors; assume we 
find Box(Shape s).
  - The match operand is Object, and the target type of the pattern 
Box(...) is Box.  So we first check to see if it is even possible to 
cast Object to Box, by the rules of cast conversion, and whether that 
cast would be unchecked.  (This would prevent us from saying `anInteger 
instanceof String s`, because casting Integer to String is known to fail 
at compile time.)
  - Assuming we pass the above, we then move on to the nested pattern.  
THe Box deconstructor yields, presumably, a Shape.  So we do the same 
applicability game for `Shape` and `Foo f`.
  - Assuming we pass the above, the match compiles.

As for runtime...

  - SInce the match operand is not already a Box, we first do 
`instanceof Box`.
  - Since the Box deconstructor is total on boxes, if that test passes, 
we run the Box deconstructor.  It yields Shape s.
  - Since the inner pattern is not "trivial", we take the Shape s, and 
match it to the inner pattern `Foo f`.  That's a non-total type pattern, 
so we test if `s instanceof Foo`.

I believe I specified a translation model in a recent mail that should 
spit out the translation, using LetExpr blocks.  This is the semantics 
of matching a _single_ pattern.

Your question, I believe, is whether in such a switch, we may, may not, 
or must, run the deconstructor twice when we get to the second Box(...) 
case.  We haven't yet written down the rules for this, though of course 
we've thought about it.  As you point out, if the deconstructor has a 
side-effect, the number of invocations would be visible.

I think there are two realistic options:

  - The compiler and runtime _cannot_ try to optimize away the redundant 
invocation.
  - The compiler and runtime _may_ try to optimize away redundant 
invocations, but there is no guarantee it does so.

The third option -- that we _must_ do so -- can be quickly seen to 
becomes unrealistic as patterns get more complicated and harder to 
analyze.  So the first question is do we just give up and guarantee the 
obvious execution, or do we give ourselves license to be clever, but 
less deterministic.

Note that nested translation you wrote down is _not_ the obvious one; it 
is already a "clever" one.  THe obvious if-else chain is:

     if (object instanaceof Box(Circle circle)) { ... }
     else if (object instanceof Box(Square square)) { ... }

which can be further unrolled to:

     if (object instanaceof Box(Shape s) && s instanceof Circle circle) 
{ ... }
     else if (object instanceof Box(Shape s) && s instanceof Square 
square) { ... }

The one that you suggest is already an optimization of the above.  And 
while I can see my way to _allowing_ compilers to perform that 
transformation, I am having a harder time _specifying_ a particular 
optimization that would guarantee a deterministic number of executions 
that properly normalizes over all pattern trees.  (For the simple cases, 
of course it is obvious.)

Here's an only slightly less trivial example.  Imagine the Foo 
deconstructor is Foo(Object).

     case Foo(Frog f, Baz(Integer i)): ...
     case Foo(Tadpole t, Baz(Float f)): ...

The two patterns have a lot in common, not just the outer Foo test, but 
the right Baz test.  There is a translation that minimizes the number of 
both Foo and Baz executions, but it's not necessarily the obvious 
transformation.

But, we're already in the weeds of translation details and 
optimization.  A much simpler question that has to be answered first is: 
are we even willing to say "side effects in patterns can't be relied on 
to happen with the obvious arity and order, and if you have them, good 
luck to you"? Similarly, if we have a nested pattern P(Q, R), do we 
promise that the Q and R tests happen left to right, or let the compiler 
do what it likes?

On 1/31/2021 4:59 PM, Remi Forax wrote:
> Just to be sure, when we have a switch like
>
>    Object object = ...
>    switch(object) {
>      case Box(Circle circle): ...
>      case Box(Square square): ...
>    }
>
> We have agreed that case Box(Foo ..) is equivalent to an instanceof + a call to the deconstructor,
> but i don't think we have agree what the equivalent code should be,
> either
>
>    if (object instanceof Box(var value)) {
>      if (value instanceof Circle circle) { ... }
>      else if (value instanceof Square square) { ... }
>    }
>    
> or
>
>    if ((object instanceof Box(var value)) && (value instanceof Circle circle)) { ... }
>    else if ((object instanceof Box(var value)) && (value instanceof Square square)) { ... }
>
>
> It's important because if Box is a class with a deconstructor, i can insert a side effect in it and see how many times the deconstructor of Box is called.
>
> Given that i don't see the point of calling the deconstructor twice, in my mind, the switch is equivalent to the code with nested ifs and not the one with plain ifs.
> But re-reading the last document sent by Brian, i'm not sure anymore.
>
> regards,
> Rémi
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20210131/decae3ef/attachment.htm>