Yield as contextual keyword (was: Call for bikeshed -- break replacement in expression switch)

Mon May 20 14:47:57 UTC 2019

I would vote for option E, a real keyword: break-with.

Regards,
Manoj

From:	Guy Steele <guy.steele at oracle.com>
To:	Brian Goetz <brian.goetz at oracle.com>
Cc:	amber-spec-experts <amber-spec-experts at openjdk.java.net>
Date:	05/18/2019 12:11 AM
Subject:	Re: Yield as contextual keyword (was: Call for
            bikeshed -- break replacement in expression switch)
Sent by:	"amber-spec-experts"
            <amber-spec-experts-bounces at openjdk.java.net>

I (somewhat reluctantly, but with an appreciation for the pragmatics of the
situation) support option B.

—Guy

      On May 17, 2019, at 12:57 PM, Brian Goetz <brian.goetz at oracle.com>
      wrote:

      As was pointed out in Keyword Management for the Java Language (
      https://openjdk.java.net/jeps/8223002), contextual keywords are a
      compromise, and their compromises vary by lexical position.  Let’s
      take a more organized look at the costs and options for doing `yield`
      as a contextual keyword.

      But, before we do, let’s put this in context (heh): methods called
      yield() are rare (there’s only one in the JDK), and blocks on the RHS
      of an arrow-switch are rare, so we’re talking about the interaction
      of two corner cases.

      Let’s take the following example.

      class C {
        /* 1 */  void yield(int x) { }

        void m(int y) {
            /* 2 */  yield (1);
            /* 3 */  yield 1;

            int z = switch (y) {
                case 0 -> {
                    /* 4 */  yield (1);
                }
                case 1 -> {
                    /* 5 */  yield 1;
                }
                default -> 42;
            };
        }
      }

      First, requirements:

      For usage (1), this has to be a valid method declaration.

      For usage (2), this has to be a method invocation.

      For usage (3), this has to be some sort of compilation error.

      For usage (4), there is some discussion to be had.

      For usage (5), this has to be a yield statement.

      (1) is not problematic, as the yield-statement production is not in
      play at all when parsing method declarations.

      (3) is not problematic, as there is no ambiguity between
      method-invocation and yield-statement, and yield-statement is not
      allowed here.  (Even if the operand were an identifier, not a numeric
      literal, it would not be ambiguous with a local variable declaration,
      because `yield` will not be permitted as a type identifier.)

      (5) is not problematic, as there is no ambiguity between method
      invocation and yield-statement.

      Let’s talk about (2) and (4).

      Let’s assume the parser production only allows a yield statement
      inside a block on the RHS of an arrow-switch (and maybe some other
      contexts in the future, but not all blocks).  Let’s call these
      “switchy blocks” for clarity.  That means that (2) is as unambiguous
      as (3), and will be parsed as a method invocation.  So this is really
      all about (4).
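
To make the term concrete, here is a minimal sketch of a switchy block. It assumes a compiler that implements the yield statement the way it later shipped in Java 14. The class name SwitchyBlock and the method pick are made up for illustration and do not come from the proposal.

```java
// Illustrative only: SwitchyBlock and pick are hypothetical names.
class SwitchyBlock {
    static int pick(int y) {
        return switch (y) {
            // The braces after the arrow form a "switchy block": the one
            // place where the yield-statement production is in play.
            case 0 -> {
                int answer = 6 * 7;
                yield answer;   // unparenthesized operand, as in usage (5)
            }
            // A plain expression after the arrow needs no yield at all.
            default -> 0;
        };
    }

    public static void main(String[] args) {
        System.out.println(pick(0)); // prints 42
        System.out.println(pick(7)); // prints 0
    }
}
```

Only the block-bodied case needs a yield statement, which is why collisions with a yield() method are confined to this one context.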

      OPTION A: DISALLOW YIELD (E)
      ----------------------------

      In this option, we disallow yield statements whose argument is a
      parenthesized expression, instead parsing them as method invocations.
      Most such invocations will fail to compile, as there is unlikely to
      be a yield() method in scope.

      From a parser perspective, this is straightforward enough; we need an
      alternate Expression production which omits “parenthesized
      expression.”

      From a user perspective, I think this is likely to be a sharp edge:
      I would expect wanting a parenthesized operand to be more common
      than having a yield() method in scope.

      OPTION B: DISALLOW UNQUALIFIED INVOCATION
      -----------------------------------------

      From a parser perspective, this is similarly straightforward: inside
      a switchy block, give the rule `yield <expr>` a higher priority than
      method invocation.  The compiler can warn on this ambiguity, if we
      like.

      From a user perspective, users wanting to invoke yield() methods
      inside switchy blocks will need to qualify the receiver (Foo.yield(),
      this.yield(), etc).
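
The workaround of qualifying the receiver can be sketched as follows. This assumes a Java 14 or later compiler. The class QualifiedYield and its field total are hypothetical, and the yield(int) method here merely stands in for whatever user-defined yield() method might be in scope.

```java
// Illustrative only: QualifiedYield, total and m are hypothetical names.
class QualifiedYield {
    private int total = 0;

    // A user-defined method that happens to be called yield, as in usage (1).
    void yield(int x) {
        total += x;
    }

    int m(int y) {
        return switch (y) {
            case 0 -> {
                this.yield(5);  // qualified receiver: a method invocation
                yield total;    // yield statement: the value of the switch
            }
            default -> 42;
        };
    }

    public static void main(String[] args) {
        QualifiedYield q = new QualifiedYield();
        System.out.println(q.m(0)); // prints 5
        System.out.println(q.m(1)); // prints 42
    }
}
```

Because the invocation starts with `this.` rather than with `yield`, it never competes with the yield-statement rule. The same holds for a static call such as Thread.yield().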

      The cost is that a statement “yield (e)” parses to different things
      in different contexts; in a switchy block, it is a yield statement,
      the rest of the time, it is a method invocation.

      I think this is much less likely to cause user distress than Option
      A, because it is rare that there is an unqualified yield(x) method in
      scope.  (And, given every yield() method I can think of, you’d likely
      never call one from a switchy block anyway, as they are
      side-effectful and blocking.)  And in the case of a collision, there is
      a clear workaround if the user really wanted a method invocation, and
      the compiler can deliver a warning when there is actual ambiguity.
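
The difference between Options A and B for usages (2) through (5) can be summarized in a toy decision function. This is only a model of the outcomes described above, not parser code. The names YieldParseSketch, Option, Parse and classify are invented for the sketch, and it only considers statements of the form `yield <operand>;`.

```java
// Toy model of the parsing outcomes; all names here are hypothetical.
class YieldParseSketch {
    enum Option { A, B }
    enum Parse { YIELD_STATEMENT, METHOD_INVOCATION, ERROR }

    static Parse classify(Option option, boolean inSwitchyBlock,
                          boolean parenthesizedOperand) {
        if (!inSwitchyBlock) {
            // Outside a switchy block the yield-statement production is not
            // in play: `yield (1);` is a call (2), `yield 1;` is an error (3).
            return parenthesizedOperand ? Parse.METHOD_INVOCATION : Parse.ERROR;
        }
        if (!parenthesizedOperand) {
            // Usage (5): cannot be a method invocation, so no ambiguity.
            return Parse.YIELD_STATEMENT;
        }
        // Usage (4): the only case where the two options disagree.
        return option == Option.A ? Parse.METHOD_INVOCATION
                                  : Parse.YIELD_STATEMENT;
    }

    public static void main(String[] args) {
        for (Option o : Option.values()) {
            System.out.println("Option " + o
                + "  (2)=" + classify(o, false, true)
                + "  (3)=" + classify(o, false, false)
                + "  (4)=" + classify(o, true, true)
                + "  (5)=" + classify(o, true, false));
        }
    }
}
```

The two options agree everywhere except usage (4), which is the case the whole discussion turns on.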

      OPTION C: SYMBOL-DRIVEN PARSING
      -------------------------------

      In this option, the context-sensitivity of parsing includes a check
      for whether a `yield()` method is in scope.  I think we can rule this
      out as overly heroic; requiring the parser to consult the symbol
      table is asking a lot of compilers.

      OPTION D: BOTH WAYS
      -------------------

      In this option, we proceed as with Option A, but when we get to
      symbol analysis, if we are in a switchy block and there is no yield()
      method in scope, we rewrite the tree to be a yield statement instead.
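
The two-phase idea behind Option D can be sketched with a deliberately tiny tree model. Everything here is an assumption made for illustration: the node classes Invocation and YieldStmt, the attribute method, and the representation of scope as a set of method names. A real compiler's trees and symbol lookup are far richer.

```java
import java.util.Set;

// Toy model of "parse as invocation, rewrite later"; all names hypothetical.
class OptionDSketch {
    interface Stmt {}

    // What the parser produces for `yield (e);` when it proceeds as in Option A.
    static final class Invocation implements Stmt {
        final String method;
        final String argument;
        Invocation(String method, String argument) {
            this.method = method;
            this.argument = argument;
        }
    }

    // What the statement becomes if no yield() method turns out to be in scope.
    static final class YieldStmt implements Stmt {
        final String operand;
        YieldStmt(String operand) {
            this.operand = operand;
        }
    }

    // Symbol-analysis step: rewrite the tree only inside a switchy block and
    // only when there is no yield() method that the call could resolve to.
    static Stmt attribute(Stmt parsed, boolean inSwitchyBlock,
                          Set<String> methodsInScope) {
        if (parsed instanceof Invocation) {
            Invocation inv = (Invocation) parsed;
            boolean candidate = inv.method.equals("yield") && inSwitchyBlock;
            if (candidate && !methodsInScope.contains("yield")) {
                return new YieldStmt(inv.argument);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        Stmt parsed = new Invocation("yield", "1");
        Set<String> noMethods = Set.of();
        Set<String> withYield = Set.of("yield");
        System.out.println(attribute(parsed, true, noMethods) instanceof YieldStmt);
        System.out.println(attribute(parsed, true, withYield) instanceof Invocation);
    }
}
```

The sketch shows where the cost lies: the meaning of the statement is not settled until scope information is available, so the tree produced by the parser has to be replaced afterwards.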

      OPTION E: A REAL KEYWORD
      ------------------------

      The pain above is an artifact of choosing a contextual keyword; on
      the scale of contextual pain, this rates a “mild”, largely because
      true collisions are likely to be quite rare, and there is no backward
      compatibility concern.  So while choosing a real keyword (break-with)
      would be cleaner, I don’t think the users will like it as much.

      My opinions: I think C is pretty much a non-starter, and IMO B is
      measurably more attractive than A.  Option D is not as terrible as C
      but seems overly heroic, as we try to avoid tree-rewriting in
      attribution.  I don’t think the pain of either A or B merits reaching
      for E.