PM design question: Scopes

Sat Nov 4 15:15:37 UTC 2017

This mail is mostly about constraints; suggestions for addressing them 
will come separately.

An existing roadblock on scopes is that the scope of a local declared in 
a switch today extends to the end of the switch block, even though it is 
definitely unassigned (DU) in most of the switch:

     switch (x) {
         case 1:
             int a = 3;
             break;
         case 2:
             use(a);  // error, not DA
             break;
         case 3:
             int a = 4; // error, a already in scope
     }

This was an unfortunate decision (that was part and parcel of the 
overly-literal copying of switch semantics from C), but something we 
have to at least minimally accommodate.
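
To make the status quo concrete, here is a minimal sketch in today's 
Java. The class and method names are made up for illustration. It shows 
the usual workaround (a brace-delimited block per case) next to the 
shared-scope behavior described above:

```java
final class SwitchScopeDemo {
    // Braces give each case its own block, so the name 'a' can be redeclared.
    static int perCaseBlocks(int x) {
        switch (x) {
            case 1: {
                int a = 3;
                return a;
            }
            case 3: {
                int a = 4; // legal: the earlier 'a' ended at its closing brace
                return a;
            }
            default:
                return -1;
        }
    }

    // Without braces, one declaration is in scope for the rest of the switch block.
    static int sharedScope(int x) {
        switch (x) {
            case 1:
                int a = 3;
                return a;
            case 2:
                a = 5; // 'a' is in scope here, but not DA until this assignment
                return a;
            default:
                return -1;
        }
    }
}
```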

However, as the number of locals in switches increases (each case may 
have several binding variables), this status quo will become more and 
more annoying; users will not want to do this:

     case Foo(var a): ... break;
     case Bar(var aa): ... break;
     case Baz(var aaa): ... break;

when it's "obvious" the various binding variables are disjoint. Users 
are going to want to be able to reuse binding variables, and reasonably 
so; they may even want to reuse the same name with a different type in 
different cases:

     case Integer n:
     case Long n:
     case Float n:

And again, this is reasonable.  So any solution we come up with should 
accommodate this desire.
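
For comparison, the cast-based if-else chain people write today already 
permits this kind of reuse, because each arm is its own block. A small 
sketch (the class and method names are illustrative only):

```java
final class ReuseNameDemo {
    // Each arm declares its own 'n', with a different type per arm.
    static String describe(Object o) {
        if (o instanceof Integer) {
            Integer n = (Integer) o;
            return "Integer " + (n + 1);
        } else if (o instanceof Long) {
            Long n = (Long) o;
            return "Long " + (n + 1L);
        } else if (o instanceof Float) {
            Float n = (Float) o;
            return "Float " + (n + 1.0f);
        }
        return "other";
    }
}
```

Whatever scoping rule we pick for switch should not make the switch form 
of this code less convenient than the chain above.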

Finally, there is the matter of unbalanced ifs.  I would really like to 
accommodate this use case:

     if (!(x matches Foo(var a)))
         throw new NotFooException();
     // use a

To accommodate this, at least for unbalanced ifs, the scope of binding 
variables declared in the conditional would have to extend to the end of 
the enclosing block, as if the unbalanced if were desugared into a balanced one:

     if (!(x matches Foo(var a)))
         throw new NotFooException();
     else {
         // rest of block goes here
         // use a
     }
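
As a point of reference, here is roughly how the same idiom looks in 
today's Java, where the binding has to be re-established by hand with a 
cast. Foo's single int field and both method names are assumptions made 
only for this sketch. The two methods behave identically, which is the 
equivalence the desugaring relies on:

```java
final class UnbalancedIfDemo {
    // Hypothetical stand-ins for the types named in the example above.
    static final class Foo {
        final int a;
        Foo(int a) { this.a = a; }
    }

    static final class NotFooException extends RuntimeException { }

    // Unbalanced form: the rest of the method acts as the implicit 'else'.
    static int unbalanced(Object x) {
        if (!(x instanceof Foo))
            throw new NotFooException();
        Foo foo = (Foo) x; // manual stand-in for the binding 'a'
        return foo.a;
    }

    // Balanced form: the rest of the block is moved into an explicit 'else'.
    static int balanced(Object x) {
        if (!(x instanceof Foo))
            throw new NotFooException();
        else {
            Foo foo = (Foo) x;
            return foo.a;
        }
    }
}
```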

OTOH, if we do this, then for pattern variables that are DU after the if 
(imagine we hadn't inverted the condition), they will be polluting the 
scope after the if, since they will be in scope but not DA, much like 
the existing rules regarding locals declared inside switches.
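
The pollution can be approximated in today's Java by hoisting the 
declaration, which is only an analogy for a binding whose scope is the 
whole block. The names below are illustrative:

```java
final class PollutionDemo {
    static String trimmed(Object o) {
        String s; // hoisted: in scope for the rest of the method body
        if (o instanceof String) {
            s = (String) o; // DA only on this branch
            return s.trim();
        }
        // Here 's' is in scope but DU. Reading it would be rejected, and so
        // would a fresh declaration such as:  String s = "other";
        return "not a string";
    }
}
```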

The flow-scoping rules (alluded to, but not fully written out in Gavin's 
note) are beautiful, and result in binding variables being in scope 
wherever they make sense to be (when they are DA), and not in scope 
where they are not, but I worry they are a bit too un-Java-ish (even 
though they are really just DA/DU in a separate guise.)  And the fact 
that they leave "scoping holes" is disturbing to some people.  There are 
a few ways to deal with this.

As an additional constraint, we'd like it to be the case that, if you 
refactor a switch into an if-else chain, the scopes of binding variables 
are as consistent as possible between the two different ways to say the 
same thing.

On 11/3/2017 6:44 AM, Gavin Bierman wrote:
>
>
>     Scopes
>
> Java has five constructs that introduce fresh variables into scope: 
> the local variable declaration statement, the for statement, the 
> try-with-resources statement, the catch block, and lambda expressions. 
> The first, the local variable declaration statement, introduces variables 
> that are in scope for the rest of the block in which they are declared. 
> The others introduce variables that are limited in their scope.
>
> The addition of pattern matching brings a new expression, |matches|, 
> and extends the |switch| statement. Both these constructs can now 
> introduce fresh (and, if the pattern match succeeds, definitely 
> assigned (DA)) variables. But the question is /what is the scope of 
> these ‘pattern’ variables/?
>
> Let us consider the pattern matching constructs in turn. First the 
> |switch| statement:
>
> |switch (o) { case int i: ... case .. }|
>
> What is the scope of the pattern variable |i|? There is a range of 
> options.
>
> 1.
>
>     The scope of the pattern variable is from the start of the switch
>     statement until the end of the enclosing block.
>
>     In this case the pattern variable is in scope but would be
>     definitely unassigned (DU) immediately after the switch statement.
>
>     switch (o) {
>         case int i :
>             ... // DA
>             ... // DA
>         case T t :
>             // i is in scope
>     }
>     ... // i is still in scope and DU
>
>   * *+ve* Simple
>   * *-ve* Can’t simply reuse a pattern variable in the same switch
>     statement (without some form of shadowing)
>   * *-ve* Pattern variable poisons the rest of the block
>
> 2.
>
>     The scope of the pattern variable extends only to the end of the
>     switch block.
>
>     In this case the pattern variable would be considered DA only for
>     the statements between the current case label and the subsequent
>     case labeled statement. For example:
>
>     switch (o) {
>         case int i :
>             ... // DA
>             ... // DA
>         case T t :
>             // i is in scope but not DA
>     }
>     ... // i not in scope
>
>   * *+ve* Simple
>   * *+ve* Pattern variables not poisoned in subsequent statements in
>     the rest of the block
>   * *+ve* Similar technique to |for| identifiers (not a new idea)
>   * *-ve* Can’t simply reuse a pattern variable in the same switch
>     statement (without some form of shadowing)
>
> 3.
>
>     The scope of the pattern variable extends only to the next case label.
>
>     switch (o) {
>         case int i :
>             ... // in scope and DA
>             ... // in scope and DA
>         case T i :
>             // int i not in scope, so can re-use
>     }
>     ... // i not in scope
>
>   * *+ve* Simple syntactic rule
>   * *+ve* Allows reuse of pattern variable in the same switch statement.
>   * *-ve* Doesn’t make sense for fallthrough
>
> *NOTE* This final point is important - supporting fallthrough impacts 
> on what solution we might choose for scoping of pattern variables. (We 
> could not support fallthrough and instead support OR patterns - a 
> further design dimension.)
>
> *ASIDE* If we support a |switch| /expression/, it seems clear that 
> scoping should be treated in the same way as it is for lambda expressions.
>
> The |matches| expression is unusual in that it is an /expression/ that 
> introduces a fresh variable. What is the scope of this variable? We 
> want it to be more than the expression itself, as we want the 
> following example code to be correct:
>
> |if (e matches String s) { System.out.println("It's a string - " + s); }|
>
> In other words, the variable introduced by the pattern needs to be in 
> scope for an enclosing IfThen statement.
>
> However, a |matches| expression could be nested within another 
> expression. It seems reasonable that the pattern variables are in 
> scope for at least the rest of the expression. For example:
>
> |(e matches String s || s.length() > 0) |
>
> Here the |s| should be in scope for the subexpression 
> |s.length| (although it is not DA). In contrast:
>
> |(e matches String s && s.length() > 0)|
>
> Here the |s| is both in scope and DA for the subexpression |s.length|.
>
> However, what about the following:
>
> |if (s.length() > 0 && e matches String s) { System.out.println(s); }|
>
> Given the idea that a pattern variable flows from the inside-out to 
> the enclosing statement, it would appear that |s| is in scope for the 
> subexpression |s.length|; although it is not DA. Unless we want scopes 
> to be non-contiguous, we will have to accept this rather odd situation 
> (consider where |s| shadows a field). [This appears to be what happens 
> in the current C# compiler.]
>
> Now let’s consider how far a pattern variable flows wrt its enclosing 
> statement. We have a range of options:
>
> 1.
>
>     The scope is both the statement that the match expression occurs
>     in and the rest of the block. In this scenario,
>
>     |if (o matches T t) { ... } else { ... }|
>
>     is treated as equivalent to the following pseudo-code (where
>     |match-and-bind| is a fictional pattern matching construct that
>     pattern-matches and binds to a variable that has already been
>     declared)
>
>     T t;
>     if (o match-and-bind t) {
>         // t in scope and DA
>     } else {
>         // t in scope and DU
>     }
>     // t in scope and DU
>
>     This is how the current C# compiler works (although the spec
>     describes the next option; so perhaps this is a bug).
>
> 2.
>
>     The scope is just the statement that the match expression occurs
>     in. In this scenario,
>
>     |if (o matches T t) { ... } else { } ...|
>
>     is treated as equivalent to the pseudo-code
>
>     {
>         T t;
>         if (o match-and-bind t) {
>             // t in scope and DA
>         } else {
>             // t in scope and DU
>             // thus declaration int t = 42; is not allowed.
>         }
>     }
>     // t not in scope
>     ...
>
> This restricted scope allows reuse of pattern variables, e.g.
>
> |if (o matches T x) { ... } if (o matches S x) { ... }|
>
> 3.
>
>     The scope of the pattern variable is determined by a flow analysis
>     of the enclosing statement. (It could be thought of as a
>     refinement of option 2.) This is currently implemented in the
>     prototype compiler. For example:
>
>     if (!!(o matches T t)) {
>         // t in scope
>     } else {
>         // t not in scope
>     }
>
>   * *+ve* Code will work in the presence of most refactorings
>   * *+ve* We have this code working already :-)
>   * *-ve* This is a break with the existing notion of scope as a
>     contiguous program fragment. A scope can now have holes in it.
>     Will users ever understand this? (Although they are /very/ similar
>     to the flow-based rules for DA/DU.)
>
> *ASIDE* Regardless of whether we opt for (2) or (3) we may consider a 
> further extension where we allow the scope to extend beyond the 
> current statement for the case of an unbalanced |if| statement. For 
> example
>
>     if (!(o matches T t)) {
>         return;
>     }
>     // t in scope
>     ...
>     return;
>
>   * *+ve* Supports a common idiom where else blocks are not needed
>   * *-ve* Yet further complication of notion of scope.
>
>
