[patterns] reconsidering how the created variable name binds

Reinier Zwitserloot reinier at zwitserloot.com
Sat Dec 28 17:51:53 UTC 2019


An alternative proposal for how the variable declarations inherent in a
pattern matching 'instanceof'/switch on expression's type bind:

The local variable created by such a construct (so, the 's' in: 'x
instanceof String s') exists only within the lexical context within which
it is created _AND_ only if it is definitely assigned.

>From what I can tell by reading amber-spec-experts and amber-dev, right now
the 2 considered alternatives are those two aspects by themselves; not
combined.

So, either [A] as per lexical scoping rules of LocalDeclaration nodes,
similar to where the 'i' exists when writing 'for (int i = ... ;;) ', or
[B] everywhere that s would be 'definitively assigned', as in, anywhere
where the compiler can guarantee that x was in fact instanceof String, even
if that means it is _OUTSIDE_ the lexical scope where you do the instanceof
check.

The current draft spec chooses the [B] option. The [A] option seems to have
been considered but disregarded (see amber-spec-experts, Gavin Bierman -
https://mail.openjdk.java.net/pipermail/amber-spec-experts/2019-December/001837.html
)

I propose that both must be required instead.

Here is the clear downside to the [B]-only option (the current draft spec),
courtesy of Tagir Valeev (
https://twitter.com/tagir_valeev/status/1210431331332689920 ):

    static /*?final?*/ boolean FLAG = true;
    static String v = "field";
    public void test() {
        String obj = "Pattern match";
        if (!(obj instanceof String v)) {
            // This branch is never taken.
            while (FLAG) ; // endless
        }

        System.out.println(v);

this will print 'field' if the FLAG variable isn't final, but it prints
'Pattern match' if it is. And that is correct according to the draft spec –
that's what the [B] choice means. I take it we can all agree this is
unfortunate.

The reason Gavin gave for not going with [A]-only is that it makes this
code not compile:

    if (x instanceof Point p) {
    } else {
        if (y instanceof Point p) {}
    }

Because the second instanceof is re-using the name of an existing and
in-scope local variable declaration, and currently java doesn't let you
shadow a local with a local, and Gavin feels (probably correctly) that
allowing shadowing in such cases is itself confusing and a high-impact
change. It really messes with switch too, forcing you to think up a
different name for every case.

But, combine the two requirements and you suffer neither downside.

some examples:

 -- example 1 --

    if (!(obj instanceof String newDecl)) {
    } else {
        // newDecl is in lexical scope here, and definitely assigned, thus,
it exists.
    }

 -- example 2 --

    if (!(obj instanceof String newDecl)) {
        while (true) ;
    } else {
        // newDecl in scope
    }

   // ... but newDecl does not exist here; this is a break from draft spec.
It avoids the problem in Tagir's tweet.

 -- example 3 --

    if (!(obj instanceof String newDecl)) {
    }

newDecl exists nowhere. In fact, I propose that this code is a compiler
error, for the same reason that for (;;) int x = 5; is a compiler error:
local var decls that have zero locations where they exist aren't legal
(this is currently implemented in the JLS by explicitly mentioning that the
'statement' part of an if/do/while/for cannot be a LocalDeclaration,
separately for each such 'braces are optional' control construct. Thus,
adding this 'can't make obviously pointless new variables this way' rule
may take some effort in updating the JLS).

 -- example 4 --

    String formatted = switch (obj) {
        case Integer i -> /* i is definitely assigned here, thus bound */
String.format(...);
        case String s -> /* it's not definitely assigned here, thus i
doesn't exist here */ "";
        case Index i -> /* thus making this legal */ String.format(...);
    };

 -- example 5 --

    if (x instanceof Point p) {
        // p exists here
    } else {
        // but not here – lexically in scope but not DA.
        if (y instanceof Point p) {
            // thus, this is legal.
        }
    }

Potential issues with this proposal:

* Do I fully grasp the places where this construct is going to be used by
the java community? So far I can imagine this being used in `switch` as per
example 4, the straightforward: (if x instanceof Type t) { /* use t here */
}, and maybe in && chaining: return other instanceof Point p && p.x ==
this.x && p.y == this.y; and that's as far as I get. All such cases
continue to work as intended with this proposal.

* This presumably isn't where the planning around pattern matching is going
to end; later java versions will bring more features here, such as
destructing valhalla inline types, or even any type for which the compiler
can ascertain how to destruct them. Does this proposal get in the way of
such future features?

* Is it bending over backwards to prevent an 'academic' java puzzler? Like
many java puzzlers, the construct used in Tagir's FLAG puzzler is probably
unlikely to commonly show up except in academic 'look what hoops I'm making
the compiler jump through!' examples. Nevertheless, the puzzler in the
tweet does suggest a plausible source of real confusion to me, and I'd like
to investigate if it can be avoided entirely.

 --Reinier Zwitserloot


More information about the amber-dev mailing list