We need more keywords, captain!
Guy Steele
guy.steele at oracle.com
Thu Jan 17 11:23:28 UTC 2019
I am persuaded by your argument. I thought we should consider break-return, but I am now convinced that overall break-with is the better choice.
—Guy
> On Jan 17, 2019, at 9:45 AM, Remi Forax <forax at univ-mlv.fr> wrote:
>
> I think I prefer break-with.
> The problem with break-return is that people will write it as break return, without the hyphen. In my opinion, break return is too close to return when you read the code quickly, and a break return without a value means nothing, unlike a regular return.
>
> I like break-with because it's obvious that you have to say with what value you want to break, which is exactly the issue we have with the current break syntax.
>
> So I vote for break-with instead of break.
> As Brian said, the expression switch is currently a preview feature in Java 12, so we can still tweak the syntax a bit.
>
> Rémi
>
> ----- Original Message -----
>> From: "Guy Steele" <guy.steele at oracle.com>
>> To: "Brian Goetz" <brian.goetz at oracle.com>
>> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
>> Sent: Tuesday, January 8, 2019 18:23:36
>> Subject: Re: We need more keywords, captain!
>
>> Actually, even better than `break-with` would be `break-return`. It’s clearly a
>> kind of `break`, and also clearly a kind of `return`.
>>
>> I think maybe this application alone has won me over to the idea of hyphenated
>> keywords.
>>
>> (Then again, for this specific application we don’t even need the hyphen; we
>> could just write `break return v;`.)
>>
>> —Guy
>>
>>> On Jan 8, 2019, at 12:35 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
>>>
>>> When discussing this today at our compiler meeting, we realized a few more
>>> places where the lack of keywords produces distortions we don't even notice. In
>>> expression switch, we settled on `break value` as the way to provide a value
>>> for a switch expression when the shorthand (`case L -> e`) doesn't suffice, but
>>> this was painful for everyone. It's painful for users because there's now work
>>> required to disambiguate whether `break foo` is a labeled break or a value
>>> break; it was even more painful to specify, because a new form of abrupt
>>> completion had to be threaded through the spec.
>>>
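To make the ambiguity concrete, here is a minimal Java sketch. The class and method names are hypothetical and exist only for illustration. It relies on the fact that labels live in their own namespace, so a label and a local variable can share the name `foo`. In ordinary statement code `break foo;` is a labeled break; under the preview form described above, the same text inside a switch expression would instead supply the value of `foo`.

```java
public class BreakAmbiguity {
    // Returns the index of the first row containing a negative value, or -1.
    static int firstNegativeRow(int[][] rows) {
        int foo = -1;   // a local variable named foo
        foo:            // a label that is also named foo
        for (int r = 0; r < rows.length; r++) {
            for (int v : rows[r]) {
                if (v < 0) {
                    foo = r;    // assignment to the variable
                    break foo;  // labeled break: leaves the outer loop
                }
            }
        }
        // In a preview-style switch expression, `break foo;` would have
        // been read as "complete the switch with the value of foo".
        return foo;
    }
}
```

A reader (or a compiler) has to look at the surrounding context to know which of the two meanings applies, which is the disambiguation work mentioned above.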
>>> Being able to call this something like `break-with v` (or some other derived
>>> keyword) would have made this all a lot simpler. (BTW, we can still do this,
>>> since expression-switch is still in preview.)
>>>
>>> Moral of the story: even just a few minutes of brainstorming led us to several
>>> applications of this approach that we hadn't seen a few days ago.
>>>
>>>> On 1/8/2019 10:22 AM, Brian Goetz wrote:
>>>> This document proposes a possible move that will buy us some breathing room in
>>>> the perpetual problem where the keyword-management tail wags the
>>>> programming-model dog.
>>>>
>>>>
>>>> ## We need more keywords, captain!
>>>>
>>>> Java has a fixed set of _keywords_ (JLS 3.9) which are not allowed to
>>>> be used as identifiers. This set has remained quite stable over the
>>>> years (for good reason), with the exceptions of `assert` added in 1.4,
>>>> `enum` added in 5, and `_` added in 9. In addition, there are also
>>>> several _reserved identifiers_ (`true`, `false`, and `null`) which
>>>> behave almost like keywords.
>>>>
>>>> Over time, as the language evolves, language designers face a
>>>> challenge: the set of keywords imagined in version 1.0 is rarely
>>>> suitable for expressing all the things we might ever want our language
>>>> to express. We have several tools at our disposal for addressing this
>>>> problem:
>>>>
>>>> - Eminent domain. Take words that were previously identifiers, and
>>>> turn them into keywords, as we did with `assert` in 1.4.
>>>>
>>>> - Recycle. Repurpose an existing keyword for something that it was
>>>> never really meant for (such as using `default` for annotation
>>>> values or default methods).
>>>>
>>>> - Do without. Find a way to pick a syntax that doesn't require a
>>>> new keyword, such as using `@interface` for annotations instead of
>>>> `annotation` -- or don't do the feature at all.
>>>>
>>>> - Smoke and mirrors. Create the illusion of context-dependent
>>>> keywords through various linguistic heroics (restricted keywords,
>>>> reserved type names).
>>>>
>>>> In any given situation, all of these options are on the table -- but
>>>> most of the time, none of these options are very good. The lack of
>>>> reasonable options for extending the syntax of the language threatens
>>>> to become a significant impediment to language evolution.
>>>>
>>>> #### Why not "just" make new keywords?
>>>>
>>>> While it may be legal for us to declare `i` to be a keyword in a
>>>> future version of Java, this would likely break every program in the
>>>> world, since `i` is used so commonly as an identifier. (When the
>>>> `assert` keyword was added in 1.4, it broke every testing framework.)
>>>> The cost of remediating the effect of such incompatible changes varies
>>>> as well; invalidating a name choice for a local variable has a local
>>>> fix, but invalidating the name of a public type or an interface
>>>> method might well be fatal.
>>>>
>>>> Additionally, the keywords we're likely to want to reclaim are often
>>>> those that are popular as identifiers (e.g., `value`, `var`,
>>>> `method`), making such fatal collisions more likely. In some cases,
>>>> if the keyword candidate in question is sufficiently rarely used as an
>>>> identifier, we might still opt to take that source-compatibility hit
>>>> -- but names that are less likely to collide (e.g.,
>>>> `usually_but_not_always_final`) are likely not the ones we want in our
>>>> language. Realistically, this is unlikely to be a well we can go to
>>>> very often, and the bar must be very high.
>>>>
>>>> #### Why not "just" live with the keywords we have?
>>>>
>>>> Reusing keywords in multiple contexts has ample precedent in
>>>> programming languages, including Java. (For example, we (ab)use `final`
>>>> for "not mutable", "not overridable", and "not extensible".)
>>>> Sometimes, using an existing keyword in a new context is natural and
>>>> sensible, but usually it's not our first choice. Over time, as the
>>>> range of demands we place on our keyword set expands, this may well
>>>> descend into the ridiculous; no one wants to use `null final` as a way
>>>> of negating finality. (While one might think such things are too
>>>> ridiculous to consider, note that we received serious-seeming
>>>> suggestions during JEP 325 to use `new switch` to describe a switch
>>>> with different semantics. Presumably to be followed by `new new
>>>> switch` in ten years.)
>>>>
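As a small illustration of the three meanings of `final` mentioned above, here is a sketch in which the same keyword appears on a class, a field, a method, and a local variable. All names are hypothetical and chosen only for this example.

```java
public class FinalMeanings {
    // On a class: "not extensible" (no subclass may be declared).
    static final class Point {
        // On a field: assigned exactly once, in the constructor.
        private final int x;
        Point(int x) { this.x = x; }
        int x() { return x; }
    }

    static class Base {
        // On a method: "not overridable".
        final String name() { return "base"; }
    }

    static class Derived extends Base {
        // String name() { return "derived"; }  // would not compile
    }

    static String describe() {
        // On a local variable: cannot be reassigned after initialization.
        final Point p = new Point(3);
        return new Derived().name() + ":" + p.x();
    }
}
```

One keyword carries three or four related but distinct meanings here, and the reader has to infer which one applies from the syntactic position.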
>>>> Of course, one way to live without making new keywords is to stop
>>>> evolving the language entirely. While there are some who think this
>>>> is a fine idea, doing so because of the lack of available tokens would
>>>> be a silly reason. We are convinced that Java has a long life ahead of
>>>> it, and developers are excited about new features that enable them
>>>> to write more expressive and reliable code.
>>>>
>>>> #### Why not "just" make contextual keywords?
>>>>
>>>> At first glance, contextual keywords (and their friends, such as
>>>> reserved type identifiers) may appear to be a magic wand; they let us
>>>> create the illusion of adding new keywords without breaking existing
>>>> programs. But the positive track record of contextual keywords hides
>>>> a great deal of complexity and distortion.
>>>>
>>>> Each grammar position is its own story; contextual keywords that might
>>>> be used as modifiers (e.g., `readonly`) have different ambiguity
>>>> considerations than those that might be used in code (e.g., a `matches`
>>>> expression). The process of selecting a contextual keyword is not a
>>>> simple matter of adding it to the grammar; each one requires an
>>>> analysis of potential current and future interactions. Similarly,
>>>> each token we try to repurpose may have its own special
>>>> considerations; for example, we could justify the use of `var` as a
>>>> reserved type name because the naming conventions are so
>>>> broadly adhered to. Finally, the use of contextual keywords in
>>>> certain syntactic positions can create additional considerations for
>>>> extending the syntax later.
>>>>
>>>> Contextual keywords create complexity for specifications, compilers,
>>>> and IDEs. With one or two special cases, we can often deal well
>>>> enough, but if special cases were to become more pervasive, this would
>>>> likely result in more significant maintenance costs or bug tail. While
>>>> it is easy to dismiss this as “not my problem”, in reality, this is
>>>> everybody’s problem. IDEs often have to guess whether a use of a
>>>> contextual keyword is a keyword or an identifier, and they may not have
>>>> enough information to make a good guess until they have seen more input.
>>>> This results in worse highlighting, auto-completion, and
>>>> refactoring abilities, or worse. These problems quickly become
>>>> everyone's problems.
>>>>
>>>> So, while contextual keywords are one of the tools in our toolbox,
>>>> they should also be used sparingly.
>>>>
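The `var` case mentioned above gives a compact illustration of the guessing involved. The sketch below assumes Java 10 or later; the class and method names are hypothetical. Because `var` is a reserved type name rather than a keyword, it can still be used as a variable name, so the same token plays two roles within a few characters.

```java
public class VarContext {
    static int demo() {
        // First `var`: the reserved type name (infer the type, here int).
        // Second `var`: an ordinary local variable name.
        var var = 41;
        // Here `var` begins a statement but is an identifier, not a type.
        var = var + 1;
        return var;
    }
}
```

A tool reading this code cannot classify the token `var` without looking at what follows it, which is the kind of per-position analysis that becomes costly if such cases multiply.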
>>>> #### Why is this a problem?
>>>>
>>>> Aside from the obvious consequences of these problems (clunky syntax,
>>>> complexity, bugs), there is a more insidious hidden cost --
>>>> distortion. The accidental details of keyword management pose a
>>>> constant risk of distortion in language design.
>>>>
>>>> One could consider the choice to use `@interface` instead of
>>>> `annotation` for annotations to be a distortion; having a descriptive
>>>> name rather than a funky combination of punctuation and keyword would
>>>> surely have made it easier for people to become familiar with
>>>> annotations.
>>>>
>>>> In another example, the set of modifiers (`public`, `private`,
>>>> `static`, `final`, etc) is not complete; there is no way to say “not
>>>> final” or “not static”. This, in turn, means that we cannot create
>>>> features where variables or classes are `final` by default, or members
>>>> are `static` by default, because there’s no way to denote the desire
>>>> to opt out of it. While there may be reasons to justify a locally
>>>> suboptimal default anyway (such as global consistency), we want to
>>>> make these choices deliberately, not have them made for us by the
>>>> accidental details of keyword management. Choosing to leave out a
>>>> feature for reasons of simplicity is fine; leaving it out because we
>>>> don't have a way to denote the obvious semantics is not.
>>>>
>>>> It may not be obvious from the outside, but this is a constant problem
>>>> in evolving the language, and an ongoing tax that we all pay, directly
>>>> or indirectly.
>>>>
>>>> ## We need a new source of keyword candidates
>>>>
>>>> Every time we confront this problem, the overwhelming tendency is to
>>>> punt and pick one of the bad options, because the problem only comes
>>>> along every once in a while. But, with the features in the pipeline, I
>>>> expect it will continue to come along with some frequency, and I’d
>>>> rather get ahead of it. Given that all of these current options are
>>>> problematic, and there is not even a least-problematic move that
>>>> applies across all situations, my inclination is to try to expand the
>>>> set of lexical forms that can be used as keywords.
>>>>
>>>> As a not-serious example, take the convention that we’ve used for
>>>> experimental features, where we prefix provisional keywords in
>>>> prototypes with two underscores, as we did with `__ByValue` in the
>>>> Valhalla prototype. (We commonly do this in feature proposals and
>>>> prototypes, mostly to signify “this keyword is a placeholder for a
>>>> syntax decision to be made later”, but also because it permits a
>>>> simple implementation that is unlikely to collide with existing code.)
>>>> We could, for example, carve out the space of identifiers that begin
>>>> with underscore as being reserved for keywords. Of course, this isn’t
>>>> so pretty, and it also means we'd have a mix of underscore and
>>>> non-underscore keywords, so it’s not a serious suggestion, as much as
>>>> an example of the sort of move we are looking for.
>>>>
>>>> But I do have a serious suggestion: allow _hyphenated_ keywords where
>>>> one or more of the terms are already keywords or reserved identifiers.
>>>> Unlike restricted keywords, this creates much less trouble for
>>>> parsing, as (for example) `non-null` cannot be confused for a
>>>> subtraction expression, and the lexer can always tell with fixed
>>>> lookahead whether `a-b` is three tokens or one. This gives us a lot
>>>> more room for creating new, less-conflicting keywords. And these new
>>>> keywords are likely to be good names, too, as many of the missing
>>>> concepts we want to add describe their relationship to existing
>>>> language constructs -- such as `non-null`.
>>>>
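The lexing claim above can be sketched as a toy tokenizer. This is only an illustration under stated assumptions, not how javac works: the keyword set is hypothetical, only two-part keywords are handled, and the lookahead is bounded to one hyphen plus one following word. When the joined text is a known hyphenated keyword, it becomes one token; otherwise the hyphen is left as an operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class HyphenLexer {
    // Hypothetical hyphenated keywords, chosen only for illustration.
    static final Set<String> HYPHENATED =
            Set.of("non-null", "non-final", "break-with");

    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isJavaIdentifierStart(c)) {
                int end = wordEnd(src, i);
                // Bounded lookahead: exactly one '-' and one more word.
                if (end + 1 < src.length() && src.charAt(end) == '-'
                        && Character.isJavaIdentifierStart(src.charAt(end + 1))) {
                    int end2 = wordEnd(src, end + 1);
                    if (HYPHENATED.contains(src.substring(i, end2))) {
                        end = end2; // emit the whole thing as one keyword
                    }
                }
                tokens.add(src.substring(i, end));
                i = end;
            } else {
                tokens.add(String.valueOf(c)); // single-character operator
                i++;
            }
        }
        return tokens;
    }

    // Returns the index just past the word that starts at `start`.
    static int wordEnd(String src, int start) {
        int end = start + 1;
        while (end < src.length()
                && Character.isJavaIdentifierPart(src.charAt(end))) {
            end++;
        }
        return end;
    }
}
```

With this sketch, `tokenize("a-b")` yields the three tokens `a`, `-`, `b`, while `tokenize("non-null x")` yields the two tokens `non-null` and `x`. Because at least one term of each hyphenated keyword is already reserved, no existing subtraction expression can spell one of them.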
>>>> Here are some examples where this approach might yield credible
>>>> candidates. (Note: none of these are being proposed here; this is
>>>> merely an illustrative list of examples of how this mechanism could
>>>> form keywords that might, in some particular possible future, be
>>>> useful and better than the alternatives we have now.)
>>>>
>>>> - `non-null`
>>>> - `non-final`
>>>> - `package-private` (the default accessibility for class members, currently not
>>>> denotable)
>>>> - `public-read` (publicly readable, privately writable)
>>>> - `null-checked`
>>>> - `type-static` (a concept needed in Valhalla, which is static relative to a
>>>> particular specialization of a class, rather than the class itself)
>>>> - `default-value`
>>>> - `eventually-final` (what the `@Stable` annotation currently suggests)
>>>> - `semi-final` (an alternative to `sealed`)
>>>> - `exhaustive-switch` (opting into exhaustiveness checking for statement
>>>> switches)
>>>> - `enum-class`, `annotation-class`, `record-class` (we might have chosen these
>>>> as an alternative to `enum` and `@interface`, had we had the option)
>>>> - `this-class` (to describe the class literal for the current class)
>>>> - `this-return` (a common request is a way to mark a setter or builder method
>>>> as returning its receiver)
>>>>
>>>> (Again, the point is not to debate the merits of any of these specific
>>>> examples; the point is merely to illustrate what we might be able to do
>>>> with such a mechanism.)
>>>>
>>>> Having this as an option doesn't mean we can't also use the other
>>>> approaches when they are suitable; it just means we have more, and
>>>> likely less fraught, options with which to make better decisions.
>>>>
>>>> There are likely to be other lexical schemes by which new keywords can
>>>> be created without impinging on existing code; this one seems credible
>>>> and reasonably parsable by both machines and humans.
>>>>
>>>> #### "But that's ugly"
>>>>
>>>> Invariably, some percentage of readers will have an immediate and
>>>> visceral reaction to this idea. Let's stipulate for the record that
>>>> some people will find this ugly. (At least, at first. Many such
>>>> reactions are possibly-transient (see what I did there?) responses
>>>> to unfamiliarity.)
>>>>
>>>>