PM design question: Scopes

Mon Nov 13 19:16:29 UTC 2017

I’m late to this discussion because I’ve been traveling.  But I do have a comment about Scopes and pattern matching (see bottom).

> On Nov 3, 2017, at 6:44 AM, Gavin Bierman <gavin.bierman at oracle.com> wrote:
> 
> Scopes
> 
> Java has five constructs that introduce fresh variables into scope: the local variable declaration statement, the for statement, the try-with-resources statement, the catch block, and lambda expressions. The first, the local variable declaration statement, introduces variables that are in scope for the rest of the block in which they are declared. The others introduce variables that are limited in their scope.
> 
> The addition of pattern matching brings a new expression, matches, and extends the switch statement. Both these constructs can now introduce fresh (and, if the pattern match succeeds, definitely assigned (DA)) variables. But the question is what is the scope of these ‘pattern’ variables?
> 
> Let us consider the pattern matching constructs in turn. First the switch statement:
> 
> switch (o) {
>     case int i: ...
>     case ..
> }
> What is the scope of the pattern variable i? There is a range of options.
> 
> 1. The scope of the pattern variable is from the start of the switch statement until the end of the enclosing block.
> 
> In this case the pattern variable is in scope but would be definitely unassigned (DU) immediately after the switch statement.
> 
> switch (o) {
>     case int i : ... // DA
>                  ... // DA
>     case T t :       // i is in scope 
> }
> ... // i is still in scope and DU
> +ve Simple
> -ve Can’t simply reuse a pattern variable in the same switch statement (without some form of shadowing)
> -ve Pattern variable poisons the rest of the block
> 
> 2. The scope of the pattern variable extends only to the end of the switch block.
> 
> In this case the pattern variable would be considered DA only for the statements between the current case label and the subsequent case labeled statement. For example:
> 
> switch (o) {
>     case int i : ... // DA
>                  ... // DA
>     case T t :       // i is in scope but not DA
> }
> ... // i not in scope
> +ve Simple
> +ve Pattern variables not poisoned in subsequent statements in the rest of the block
> +ve Similar to the scoping of for-statement identifiers (not a new idea)
> -ve Can’t simply reuse a pattern variable in the same switch statement (without some form of shadowing)
> 
> 3. The scope of the pattern variable extends only to the next case label.
> 
> switch (o) {
>     case int i : ... // in scope and DA
>                  ... // in scope and DA
>     case T i :       // int i not in scope, so can re-use
> }
> ... // i not in scope
> +ve Simple syntactic rule
> +ve Allows reuse of pattern variable in the same switch statement.
> -ve Doesn’t make sense for fallthrough
> NOTE This final point is important: supporting fallthrough affects which solution we might choose for scoping of pattern variables. (We could decline to support fallthrough and instead support OR patterns, a further design dimension.)
> 
> ASIDE Should we support a switch expression, it seems clear that scoping should be treated in the same way as it is for lambda expressions.
> 
> The matches expression is unusual in that it is an expression that introduces a fresh variable. What is the scope of this variable? We want it to be more than the expression itself, as we want the following example code to be correct:
> 
> if (e matches String s) {
>     System.out.println("It's a string - " + s);
> }
> In other words, the variable introduced by the pattern needs to be in scope for an enclosing IfThen statement.
> 
> However, a matches expression could be nested within another expression. It seems reasonable that the pattern variables are in scope for at least the rest of the expression. For example:
> 
> (e matches String s || s.length() > 0) 
> Here the s should be in scope for the subexpression s.length (although it is not DA). In contrast:
> 
> (e matches String s && s.length() > 0)
> Here the s is both in scope and DA for the subexpression s.length.
> 
> However, what about the following:
> 
> if (s.length() > 0 && e matches String s) {
>     System.out.println(s);
> }
> Given the idea that a pattern variable flows from the inside out to the enclosing statement, it would appear that s is in scope for the subexpression s.length, although it is not DA. Unless we want scopes to be non-contiguous, we will have to accept this rather odd situation (consider where s shadows a field). [This appears to be what happens in the current C# compiler.]
> 
> Now let’s consider how far a pattern variable flows with respect to its enclosing statement. We have a range of options:
> 
> (a) The scope is both the statement that the match expression occurs in and the rest of the block. In this scenario,
> 
> if (o matches T t) {
>     ... 
> } else {
>     ...
> }
> is treated as equivalent to the following pseudo-code (where match-and-bind is a fictional pattern matching construct that pattern-matches and binds to a variable that has already been declared)
> 
> T t;
> if (o match-and-bind t) {
>     // t in scope and DA
> } else {
>     // t in scope and DU
> }
> // t in scope and DU
> This is how the current C# compiler works (although the spec describes the next option; so perhaps this is a bug).
> 
> (b) The scope is just the statement that the match expression occurs in. In this scenario,
> 
> if (o matches T t) {
>     ...
> } else {
> 
> }
> ...
> is treated as equivalent to the pseudo-code
> 
> { T t;
>   if (o match-and-bind t) {
>       // t in scope and DA
>   } else {
>       // t in scope and DU
>       // thus declaration int t = 42; is not allowed.
>   }
> }
> // t not in scope
> ...
> This restricted scope allows reuse of pattern variables, e.g.
> 
> if (o matches T x) { ... }
> if (o matches S x) { ... }
> 
> (c) The scope of the pattern variable is determined by a flow analysis of the enclosing statement. (It could be thought of as a refinement of option (b).) This is currently implemented in the prototype compiler. For example:
> 
> if (!!(o matches T t)) {
>      // t in scope
> } else {
>      // t not in scope
> }
> +ve Code will work in the presence of most refactorings
> +ve We have this code working already :-)
> -ve This breaks the existing notion of a scope as a contiguous program fragment: a scope can now have holes in it. Will users ever understand this? (That said, the rules are very similar to the flow-based rules for DA/DU.)
> ASIDE Regardless of whether we opt for (b) or (c), we may consider a further extension in which the scope extends beyond the current statement in the case of an unbalanced if statement. For example:
> 
> ```
> if (!(o matches T t)) {
>     return;
> }
> // t in scope 
> ...
> return;
> ```
> +ve Supports a common idiom where else blocks are not needed
> -ve Yet further complication of the notion of scope.
> 
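For readers who want to try the unbalanced-if idiom from the quoted ASIDE on a current JDK, here is a minimal sketch. It assumes Java 16 or later and uses `instanceof` patterns as a stand-in for the `matches` expression under discussion; the class and method names are made up for illustration. The scoping it exercises corresponds to the flow-based option (c) together with the unbalanced-if extension.

```java
// Sketch only: Java 16+ instanceof patterns stand in for the proposed matches.
// FlowScopeSketch and its method names are hypothetical, chosen for illustration.
final class FlowScopeSketch {

    // Unbalanced if: the pattern variable is usable after the early return.
    static int lengthOrMinusOne(Object o) {
        if (!(o instanceof String s)) {
            return -1;          // s is not in scope on this path
        }
        return s.length();      // s is in scope and definitely assigned here
    }

    // Conjunction: s is in scope and DA on the right-hand side of &&.
    static boolean isNonEmptyString(Object e) {
        return e instanceof String s && s.length() > 0;
        // With || in place of &&, javac rejects the use of s on the right-hand side.
    }
}
```

Note that the `||` case differs from the quoted text, where s would be in scope but not DA; under flow-based scoping it is simply not in scope there.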

Here is a fourth possibility for `switch`:

4. The scope of a pattern variable bound by a `case` label extends only to the next case label.  Moreover, it is allowed to shadow a local variable declared earlier in the switch block.

    Fallthrough has a special treatment:  When falling through a case label, an implicit assignment is performed to every variable bound by the case label.

    Falling through a case label is permitted only if:
	(a) Every variable name bound by the case label is also definitely assigned after the statement, local variable declaration, or other case label that precedes it in the switch block.
	(b) The type of every variable bound by the case label is a type to which the type of the same variable—when regarded from just after the statement, local variable declaration, or other case label that precedes the case label in the switch block—can be assigned.

    The value implicitly assigned by fallthrough to a variable bound by the case label is the value of the variable of the same name as of just before the case label.  (This would have to be a piece of magic not expressible by a simple source-code-level rewriting.)

    Example:

	switch (o) {
		case Cons(int i, int j, T x):			// int i, int j, and T x are in scope and DA.
			int k = x.size();			// For backward compatibility, scope of int k extends to end of switch block
			String z = "baz";			// For backward compatibility, scope of String z extends to end of switch block
			// At this point int i, int j, T x, int k, and String z are in scope and DA.
			// As we fall through the next case label (which binds i, j, and k), we get
			//   implicit assignments of int i to int i, int j to long j, and int k to long k.
		case Cons(int i, long j, long k):		// int i, long j, and long k are in scope and DA;
									// previous int i, int j, and T x (bound by first case label) are not in scope;
									// int k is in scope but shadowed by long k;
									// String z is in scope and neither DA nor DU.
			…
	}

	+ve	Allows reuse of pattern variable in the same switch statement.
	+ve	Allows useful forms of fallthrough
	+ve	Compatible with previous treatment of fallthrough for case labels that bind no variables
	-ve	Introduces a mild (benign?) form of shadowing into the language
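The backward-compatibility remarks in the example rest on how a switch block already treats ordinary local variables. A minimal sketch in plain Java, with no pattern matching involved (the class and method names are made up for illustration):

```java
// Sketch of existing switch-block scoping; SwitchScopeToday is a hypothetical name.
final class SwitchScopeToday {

    static String describe(int code) {
        StringBuilder sb = new StringBuilder();
        switch (code) {
            case 1:
                int k = 10;     // scope of k runs to the end of the switch block
                sb.append("k=").append(k).append(';');
                // no break: control falls through the next case label
            case 2:
                // k is in scope here but not definitely assigned, because control
                // may arrive directly at this label; it must be assigned before use.
                k = 20;
                sb.append("k=").append(k);
                break;
            default:
                sb.append("other");
        }
        return sb.toString();
    }
}
```

Here `describe(1)` takes the fallthrough path and `describe(2)` enters at the second label, which is why k cannot be treated as DA after that label.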

An alternative is to require the types to match exactly for the implicit assignments rather than using assignment conversion.  This would be less flexible but perhaps also less confusing.
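To make the difference between the two rules concrete, here is a rough hand-written emulation of the implicit assignments in the example above. It is only a sketch: the fresh names i2, j2, and k2 are needed because Java source cannot rebind a name at a different type in place, the Cons class is a hypothetical stand-in with T fixed to String, and x.length() plays the role of x.size().

```java
// Sketch only: emulates the fallthrough assignments with renamed variables.
// FallthroughSketch and Cons are hypothetical, introduced for illustration.
final class FallthroughSketch {

    // Stand-in for the three-component Cons in the example, with T fixed to String.
    static final class Cons {
        final int i;
        final int j;
        final String x;
        Cons(int i, int j, String x) { this.i = i; this.j = j; this.x = x; }
    }

    static long fallThrough(Cons c) {
        // Bindings of the first case label, followed by its local declaration.
        int i = c.i;
        int j = c.j;
        String x = c.x;
        int k = x.length();

        // The implicit assignments performed when falling through the second label.
        int i2 = i;     // int to int: identity conversion
        long j2 = j;    // int to long: widening, permitted by assignment conversion
        long k2 = k;    // int to long: widening, permitted by assignment conversion
        // An exact-type rule would accept only the first of these three.
        // The reverse direction (int n = j2;) does not compile without a cast.

        return i2 + j2 + k2;
    }
}
```

Under the exact-match alternative, then, the fallthrough in the example would be rejected, since j and k change from int to long.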

—Guy