From brian.goetz at oracle.com  Tue Sep  6 21:11:43 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 6 Sep 2022 17:11:43 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
Message-ID: <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>

We dropped this out of the record patterns JEP, but I think it is time 
to revisit this.

The concept of array patterns was pretty straightforward; they mimic the 
nesting and exhaustiveness rules of record patterns, they are just a 
different sort of container for nested patterns.  And they have an
obvious duality with array creation expressions.

The main open question here was how we distinguish between "match an
array of length exactly N" (where there are N nested patterns) and
"match an array of length at least N".  We toyed with the idea of a
"..." indicator to mean "more elements", but this felt a little forced 
and opened new questions.

It later occurred to me that there is another place to nest a pattern in 
an array pattern -- to match (and bind) the length. In the following, 
assume for sake of exposition that "_" is the "any" pattern (matches 
everything, binds nothing) and that we have some way to denote a 
constant pattern, which I'll denote here with a constant literal.

There is an obvious place to put this (optional) pattern: in between the 
brackets.  So:

    case String[1] { P }:
                ^ a constant pattern

would match string arrays of length 1 whose sole element matches P.  And

    case String[] { P, Q }

would match string arrays of length exactly 2, whose first two elements
match P and Q respectively.  (If the length pattern is not specified, we
infer a constant pattern whose constant is equal to the length of the 
nested pattern list.)

Matching a target to `String[L] { P0, .., Pn }` means

    x instanceof String[] arr
        && arr.length matches L
        && arr.length >= n + 1
        && arr[0] matches P0
        && arr[1] matches P1
        ...
        && arr[n] matches Pn
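
For anyone who wants to poke at the expansion above in today's Java, here 
is a minimal hand-written sketch.  It says nothing about the eventual 
syntax or translation; the class ArrayPatternSketch, the method matches, 
and the helpers exactly and anyLength are names invented for 
illustration, and plain predicates stand in for the length pattern L and 
the element patterns P0..Pn (so nothing is bound).

```java
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.Predicate;

// Hand-written stand-in for matching `String[L] { P0, ..., Pn }`.
// Predicates play the role of patterns; this sketch binds nothing.
final class ArrayPatternSketch {

    static boolean matches(Object target,
                           IntPredicate lengthPattern,
                           List<Predicate<String>> elementPatterns) {
        // x instanceof String[] arr
        if (!(target instanceof String[] arr)) {
            return false;
        }
        // arr.length matches L
        if (!lengthPattern.test(arr.length)) {
            return false;
        }
        // the array must hold at least as many elements as nested patterns
        if (arr.length < elementPatterns.size()) {
            return false;
        }
        // arr[i] matches Pi, in order
        for (int i = 0; i < elementPatterns.size(); i++) {
            if (!elementPatterns.get(i).test(arr[i])) {
                return false;
            }
        }
        return true;
    }

    // Length pattern inferred for empty brackets: exactly as many
    // elements as there are nested patterns.
    static IntPredicate exactly(int n) {
        return len -> len == n;
    }

    // The "any" pattern `_` in the length position.
    static IntPredicate anyLength() {
        return len -> true;
    }
}
```

With this, matches(arr, anyLength(), List.of(p, q)) plays the role of 
`String[_] { P, Q }` (a prefix match), while passing exactly(2) instead 
plays the role of `String[] { P, Q }` (an exact match).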

More examples:

    case String[int len] { P }

would match string arrays of length >= 1 whose first element matches P,
and further binds the array length to `len`.

    case String[_] { P, Q }

would match string arrays of any length whose first two elements match P
and Q.

    case String[3] { }
                ^constant pattern

matches all string arrays of length 3.


This is a more principled way to do it, because the length is a part of 
the array and deserves a chance to match via nested patterns, just as 
with the elements, and it avoids trying to give "..." a new meaning.

The downside is that it might be confusing at first (though people will 
learn quickly enough) how to distinguish between an exact match and a 
prefix match.


On 1/5/2021 1:48 PM, Brian Goetz wrote:
> As we get into the next round of pattern matching, I'd like to 
> opportunistically attach another sub-feature: array patterns.  (This
> also bears on the question of "how would varargs patterns work", which 
> I'll address below, though they might come later.)
>
> ## Array Patterns
>
> If we want to create a new array, we do so with an array construction 
> expression:
>
>     new String[] { "a", "b" }
>
> Since each form of aggregation should have its dual in destructuring, 
> the natural way to represent an array pattern (h/t to AlanM for 
> suggesting this) is:
>
>     if (arr instanceof String[] { var a, var b }) { ... }
>
> Here, the applicability test is: "are you an instance of String[],
> with length = 2", and if so, we cast to String[], extract the two
> elements, and match them to the nested patterns `var a` and `var b`.
> This is the natural analogue of deconstruction patterns for arrays,
> complete with nesting.
>
> Since an array can have more elements, we likely need a way to say
> "length >= 2" rather than simply "length == 2".  There are multiple
> syntactic ways to get there; for now I'm going to write
>
>     if (arr instanceof String[] { var a, var b, ... })
>
> to indicate "more".  The "..." matches zero or more elements and binds
> nothing.
>
> <digression>
> People are immediately going to ask "can I bind something to the 
> remainder"; I think this is mostly an "attractive distraction", and 
> would prefer to not have this dominate the discussion.
> </digression>
>
> Here's an example from the JDK that could use this effectively:
>
> String[] limits = limitString.split(":");
> try {
>     switch (limits.length) {
>         case 2: {
>             if (!limits[1].equals("*"))
>                 setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>         }
>         case 1: {
>             if (!limits[0].equals("*"))
>                 setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>         }
>     }
> }
> catch(NumberFormatException ex) {
>     setMultilineLimit(MultilineLimit.DEPTH, -1);
>     setMultilineLimit(MultilineLimit.LENGTH, -1);
> }
>
> becomes (eventually)
>
>     switch (limitString.split(":")) {
>         case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>         case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>         default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>     }
>
> Note how not only does this become more compact, but the unchecked 
> "NumberFormatException" is folded into the match, rather than being a 
> separate concern.
>
>
> ## Varargs patterns
>
> Having array patterns offers us a natural way to interpret 
> deconstruction patterns for varargs records.  Assume we have:
>
>     void m(X... xs) { }
>
> Then a varargs invocation
>
>     m(a, b, c)
>
> is really sugar for
>
>     m(new X[] { a, b, c })
>
> So the dual of a varargs invocation, a varargs match, is really a
> match to an array pattern.  So for a record
>
>     record R(X... xs) { }
>
> a varargs match:
>
>     case R(var a, var b, var c):
>
> is really sugar for an array match:
>
>     case R(X[] { var a, var b, var c }):
>
> And similarly, we can use our "more arity" indicator:
>
>     case R(var a, var b, var c, ...):
>
> to indicate that there are at least three elements.
>
>

From amaembo at gmail.com  Wed Sep  7 14:10:15 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Wed, 7 Sep 2022 16:10:15 +0200
Subject: Array patterns (and varargs patterns)
In-Reply-To: <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
Message-ID: <CAE+3fjZywJgk874aT1TMzAjn6rVMkEUuvH0xuuKf3ihjDn9GZQ@mail.gmail.com>

Hello!

Honestly, to me this whole feature looks not very important. It's a
rare case in modern Java applications that business logic operates
with arrays directly. They are mostly used in low-level system code
where performance matters more than code elegance. Custom defined
named patterns for lists would be much more useful. Moreover, if named
patterns are supported, then array deconstruction could be implemented
in a library, without complicating the language specification (like `x
instanceof Arrays.of(String first, String next, String last)`).

With best regards,
Tagir Valeev.


From brian.goetz at oracle.com  Wed Sep  7 14:32:34 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 7 Sep 2022 10:32:34 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <CAE+3fjZywJgk874aT1TMzAjn6rVMkEUuvH0xuuKf3ihjDn9GZQ@mail.gmail.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <CAE+3fjZywJgk874aT1TMzAjn6rVMkEUuvH0xuuKf3ihjDn9GZQ@mail.gmail.com>
Message-ID: <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>

I understand where this sentiment comes from.  But the motivation is
somewhat more indirect than "people are falling over themselves to 
deconstruct arrays today".

Because deconstruction is the dual of aggregation, it is desirable for 
each of the forms of aggregation -- constructors, factories, etc -- to 
have pattern counterparts.  Not doing so creates asymmetries that make
the whole thing seem more ad-hoc.  Many of the "not as important"
pattern features we're working on now are in the realm of "completing"
the feature.

More importantly, array patterns are how we fully support varargs in 
records.  If we have a varargs record:

    record VA(String... strings) { }

we can construct it with a varargs invocation

    new VA("a", "b")

which is sugar for

    new VA(new String[] { "a", "b" })

But we cannot yet deconstruct it with:

    case VA(var a, var b)

and analogously, for a varargs record, the above is sugar for

    case VA(String[] { var a, var b })

So it is not just about arrays.
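
To make the asymmetry concrete, here is a rough sketch of what one has to 
write by hand today to get the effect of `case VA(var a, var b)`.  The 
record VA is the one from the text above; the class VarargsMatchSketch 
and the method describe are made up for illustration, and the exact 
length check reflects the "no length pattern means exact length" reading, 
which is an assumption about where the design lands.

```java
// The varargs record from the discussion above.
record VA(String... strings) { }

final class VarargsMatchSketch {

    // Hand-expanded equivalent of `case VA(var a, var b)`:
    // a VA whose component array has exactly two elements.
    static String describe(Object o) {
        if (o instanceof VA va && va.strings().length == 2) {
            String a = va.strings()[0];
            String b = va.strings()[1];
            return a + "," + b;
        }
        return "no match";
    }
}
```

Construction gets the varargs sugar (new VA("a", "b")), but the match 
side has to spell out the instanceof test, the length check, and the two 
element reads explicitly.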

I agree that named patterns are more useful, and we are working on them 
too.  But they are also a bigger feature (bringing in overload
selection, reflection, translation, etc), so they will take longer.
Whereas array patterns are really a remix of things we've already worked
out -- nested patterns, exhaustiveness, etc.  In any case I would like
to avoid leaving a trail of unfinished work, so cleaning up the loose 
ends on basic patterns first seems preferable before adding bigger new 
pattern features.

> Hello!
>
> Honestly, to me this whole feature looks not very important. It's a
> rare case in modern Java applications that business logic operates
> with arrays directly. They are mostly used in low-level system code
> where performance matters more than code elegance. Custom defined
> named patterns for lists would be much more useful. Moreover, if named
> patterns are supported, then array deconstruction could be implemented
> in a library, without complicating the language specification (like `x
> instanceof Arrays.of(String first, String next, String last)`).

I'm not sure how this Arrays.of pattern is going to work, unless we're 
willing to have overloads for every arity up to, say, 22? Otherwise, we 
need varargs, and varargs is sugar for an array pattern.



From brian.goetz at oracle.com  Wed Sep  7 17:41:33 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 7 Sep 2022 13:41:33 -0400
Subject: Unnamed variables and match-all patterns
Message-ID: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>

We've gone around and around a few times on "unnamed variables" 
(underscore), starting with JEP 302 (Lambda Leftovers).  We reclaimed
the underscore token in Java 9 with the intention of using it for
unnamed variables and "any" patterns.  Along the way, we ran into some
hiccups, and it has sat on the shelf for a while.  Let's take it down,
dust it off, and see if we have any more clarity than before.

There are three syntactic productions in which we might want to use 
underscore as a "don't care" indicator:

 - Unnamed variables.  Here, underscore stands in for a variable name.
We can declare a local variable, catch formal, pattern variable, etc.,
whose name is `_`, which has the effect of entering no new names in
scope.  It becomes an "initialize-only" variable.

    try { ... }
    catch (FooException _) { throw new BarException("foo"); }

 - Partial inference.  Here, underscore stands in for a type name.
Today, we can infer type variables for generic method invocations and
constructor invocations, but it is all-or-nothing.  Being able to denote
"infer this type" would allow us to do partial inference:

    foo.<String, _>m(...)

 - "Any" patterns.  Here, underscore is a pattern, which matches
everything, and binds nothing.

    case Foo(var s, _): ...

We don't have to do all of these; right now we're not considering 
partial inference, but the other two are reasonable options.  Unnamed
variables have been a long-standing request; any patterns will likely be 
a common request soon as well.
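
As a reminder of the status quo that motivates unnamed variables, here is 
a small sketch of three places where the language, as of this writing, 
forces us to invent a name we never read.  The class UnusedNamesToday and 
its members are hypothetical; the comments mark where `_` would go if the 
feature is adopted in the form discussed here.

```java
import java.util.function.BiFunction;

final class UnusedNamesToday {

    // Catch formal: the exception object is never read, but needs a name.
    static int parseOrDefault(String s, int fallback) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException ignored) {   // would be `_`
            return fallback;
        }
    }

    // Foreach induction variable: only the number of iterations matters.
    static int count(Iterable<?> items) {
        int n = 0;
        for (Object unused : items) {               // would be `_`
            n++;
        }
        return n;
    }

    // Lambda formal: the second argument is deliberately ignored.
    static final BiFunction<String, Integer, String> FIRST =
            (s, unusedIndex) -> s;                  // would be `(s, _) -> s`
}
```

In each case the placeholder name is pure noise for the reader and a 
standing invitation for "unused variable" warnings from static analysis.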

For a match-all pattern, there is little to say other than "_" is one of 
the alternatives of the Pattern production, it is applicable to all 
types, it is unconditional on all types, and it has no bindings.  The
specification already has a concept of "any" patterns; this is just 
making it denotable.


I think there is little controversy about using unnamed local variables 
(local variable declaration statements, catch formals, foreach induction 
variables, resources in try-with-resources) and unnamed lambda 
parameters.  What is common to all of these is that these are _pure
implementation details_, where the author has elected to not give a name
to a variable that is entirely implementation-facing.  This seems
eminently reasonable.  Unnamed parameters can help eliminate errors by
capturing design assumptions and make life easier for static analysis 
tools that like to point out unused variables.

Where we stumble is on method parameters, because method parameter names 
serve two masters -- the implementation (as the declaration of a 
variable) and the API (as part of the specification of what the method 
does.)  Among other things, we like to document the semantics of method
parameters in Javadoc with the `@param` tag, but doing so requires a
name (or inventing a new Javadoc mechanism like `@param #4`, likely a
loser.)  Secondarily, sometimes parameter names are retained in the
MethodParameters attribute, though that attribute (JVMS 4.7.24) already 
supports parameters without names by using a zero CP index.
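
For reference, this is roughly how the presence or absence of a parameter 
name already surfaces through core reflection.  The sketch uses the 
existing java.lang.reflect.Parameter methods isNamePresent() and 
getName(); the class ParameterNamesSketch and the method setLimit are 
invented for the example, and what gets reported depends on whether the 
class was compiled with `javac -parameters`.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;

final class ParameterNamesSketch {

    // Hypothetical method whose parameter names we inspect.
    static void setLimit(String kind, int value) { }

    // Returns each parameter's reported name, marking synthesized ones.
    static String[] reportedNames() {
        Method m;
        try {
            m = ParameterNamesSketch.class
                    .getDeclaredMethod("setLimit", String.class, int.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
        Parameter[] params = m.getParameters();
        String[] out = new String[params.length];
        for (int i = 0; i < params.length; i++) {
            Parameter p = params[i];
            // isNamePresent() is true only when the class file records a
            // name for this parameter; otherwise getName() synthesizes
            // a name of the form argN.
            out[i] = p.isNamePresent()
                    ? p.getName()
                    : p.getName() + " (synthesized)";
        }
        return out;
    }
}
```

So tools that read parameter names reflectively already have to cope 
with names being absent; an unnamed parameter would presumably look like 
the "name not present" case to them, though that is my reading rather 
than something the JVMS text above settles.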

With `var`, we drew a clear line of "implementation only" -- you can't 
infer a method return type, even for a private method, you can only use 
it for local variables and lambda formals.  This has been pretty
successful.

We've explored a number of intermediate points on the spectrum with 
varying degrees of stability:

 A) Implementation only -- local variables, catch formals, for-loop
induction variables, TWR resources, pattern variables, lambda formals
 B) "A++", where we add in method parameters of anonymous classes
 C) Adding in method parameters _for non-initial declarations_ -- allow
unnamed parameters only for methods that override a method from a
supertype, ensuring that there is a real specification of what the
parameters mean.
 D) Anything goes, any method parameter can be unnamed, throwing
specification to the wind.

A is a stable point, and has the advantage of mostly lining up with 
where we can use `var`.  But users will surely grumble that they can't
use it for implementations of methods from supertypes.  As this feature
request predates lambdas and patterns, giving it to lambdas and patterns 
but not ordinary methods might feel a bit mean.

The motivation for B is obvious -- to support smooth refactoring between 
lambdas and inner classes -- but is not a very stable point, as one will 
immediately ask "what about refactoring to named classes".

C feels attractive, though there would surely be complaints too; it 
excludes constructors and static methods (which might sometimes want 
unnamed parameters when a parameter is no longer used, but stays around 
for binary compatibility), and even some initial declarations.  But,
these cases are likely to be somewhat more rare, so I don't object to
leaving these aside.  The main concern is that this might feel
arbitrary.  There is also the possibility for some confusion; it is not
obvious what it means when you override a method that already has an
unnamed parameter.  Can you give it a name and use it?  It is a little
weird that the lack of name applies only to the implementation of the
method, but somehow bleeds into the specification.  There is also some
impact on Javadoc, as well as lingering concerns that there are other 
shoes to drop other than Javadoc and MethodParameters.

D is also stable, but feels like it makes the language less safe, by 
making some methods unspecifiable.  On the other hand, the people who
might use it for initial declarations, static methods, etc, are also the 
sort of people who probably don't write specification anyway (otherwise 
they would realize that they are depriving their callers of useful 
information.)

In (C), Javadoc could insert an `@implNote` that says something like 
"this implementation ignores the value of parameters <x> and <y> from 
declaring method Foo::bar".  In (D), it could say "ignores its 3rd and
4th parameter", or insert synthetic @param tags for parameters whose 
name is something like "<unnamed>".

Past discussions seemed to gravitate toward either A or D, which are 
also the simplest / most stable points.  I guess it becomes a question
of getting over the "makes the language less safe" concerns.

Regardless, I'd like to see if we can quantify the "lingering concerns 
about other shoes to drop."

From forax at univ-mlv.fr  Wed Sep  7 21:43:42 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 7 Sep 2022 23:43:42 +0200 (CEST)
Subject: Unnamed variables and match-all patterns
In-Reply-To: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
Message-ID: <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, September 7, 2022 7:41:33 PM
> Subject: Unnamed variables and match-all patterns

> We've gone around and around a few times on "unnamed variables" (underscore),
> starting with JEP 302 (Lambda Leftovers). We reclaimed the underscore token in
> Java 9 with the intention of using it for unnamed variables and "any" patterns.
> Along the way, we ran into some hiccups, and it has sat on the shelf for a
> while. Let's take it down, dust it off, and see if we have any more clarity
> than before.

> There are three syntactic productions in which we might want to use underscore
> as a "don't care" indicator:

> - Unnamed variables. Here, underscore stands in for a variable name. When we
> declare a local variable, catch formal, pattern variable, etc, whose name is
> `_`, which has the effect of entering no new names in scope. It becomes an
> "initialize-only" variable.

> try { ... }
> catch (FooException _) { throw new BarException("foo"); }

> - Partial inference. Here, underscore stands in for a type name. Today, we can
> infer type variables for generic method invocations and constructor
> invocations, but it is all-or-nothing. Being able to denote "infer this type"
> would allow us to do partial inference:

> foo.<String, _>m(...)

> - "Any" patterns. Here, underscore is a pattern, which matches everything, and
> binds nothing.

> case Foo(var s, _): ...

> We don't have to do all of these; right now we're not considering partial
> inference, but the other two are reasonable options. Unnamed variables have
> been a long-standing request; any patterns will likely be a common request soon
> as well.

> For a match-all pattern, there is little to say other than "_" is one of the
> alternatives of the Pattern production, it is applicable to all types, it is
> unconditional on all types, and it has no bindings. The specification already
> has a concept of "any" patterns; this is just making it denotable.

> I think there is little controversy about using unnamed local variables (local
> variable declaration statements, catch formals, foreach induction variables,
> resources in try-with-resources) and unnamed lambda parameters. What is common
> to all of these is that these are _pure implementation details_, where the
> author has elected to not give a name to a variable that is entirely
> implementation-facing. This seems eminently reasonable. Unnamed parameters can
> help eliminate errors by capturing design assumptions and make life easier for
> static analysis tools that like to point out unused variables.

> Where we stumble is on method parameters, because method parameter names serve
> two masters -- the implementation (as the declaration of a variable) and the
> API (as part of the specification of what the method does.) Among other things,
> we like to document the semantics of method parameters in Javadoc with the
> `@param` tag, but doing so requires a name (or inventing a new Javadoc
> mechanism like `@param #4`, likely a loser.) Secondarily, sometimes parameter
> names are retained in the MethodParameters attribute, though that attribute
> (JVMS 4.7.24) already supports parameters without names by using a zero CP
> index.

> With `var`, we drew a clear line of "implementation only" -- you can't infer a
> method return type, even for a private method, you can only use it for local
> variables and lambda formals. This has been pretty successful.

> We've explored a number of intermediate points on the spectrum with varying
> degrees of stability:

> A) Implementation only -- local variables, catch formals, for-loop induction
> variables, TWR resources, pattern variables, lambda formals
> B) "A++", where we add in method parameters of anonymous classes
> C) Adding in method parameters _for non-initial declarations_ -- allow unnamed
> parameters only for methods that override a method from a supertype, ensuring
> that there is a real specification of what the parameters mean.
> D) Anything goes, any method parameter can be unnamed, throwing specification to
> the wind.

> A is a stable point, and has the advantage of mostly lining up with where we can
> use `var`. But users will surely grumble that they can't use it for
> implementations of methods from supertypes. As this feature request predates
> lambdas and patterns, giving it to lambdas and patterns but not ordinary
> methods might feel a bit mean.

> The motivation for B is obvious -- to support smooth refactoring between lambdas
> and inner classes -- but is not a very stable point, as one will immediately
> ask "what about refactoring to named classes".

> C feels attractive, though there would surely be complaints too; it excludes
> constructors and static methods (which might sometimes want unnamed parameters
> when a parameter is no longer used, but stays around for binary compatibility),
> and even some initial declarations. But, these cases are likely to be somewhat
> more rare, so I don't object to leaving these aside. The main concern is that
> this might feel arbitrary. There is also the possibility for some confusion; it
> is not obvious what it means when you override a method that already has an
> unnamed parameter. Can you give it a name and use it? It is a little weird that
> the lack of name applies only to the implementation of the method, but somehow
> bleeds into the specification. There is also some impact on Javadoc, as well as
> lingering concerns that there are other shoes to drop other than Javadoc and
> MethodParameters.

> D is also stable, but feels like it makes the language less safe, by making some
> methods unspecifiable. On the other hand, the people who might use it for
> initial declarations, static methods, etc, are also the sort of people who
> probably don't write specification anyway (otherwise they would realize that
> they are depriving their callers of useful information.)

> In (C), Javadoc could insert an `@implNote` that says something like "this
> implementation ignores the value of parameters <x> and <y> from declaring
> method Foo::bar". In (D), it could say "ignores its 3rd and 4th parameter", or
> insert synthetic @param tags for parameters whose name is something like
> "<unnamed>".

> Past discussions seemed to gravitate toward either A or D, which are also the
> simplest / most stable points. I guess it becomes a question of getting over
> the "makes the language less safe" concerns.

> Regardless, I'd like to see if we can quantify the "lingering concerns about
> other shoes to drop."

There is a C-bis, where '_' is allowed for private methods, but that's not important.

As a teacher, I vote for A: APIs should be documented, and giving a good name to a parameter is usually the first step.

Rémi

From forax at univ-mlv.fr  Wed Sep  7 22:01:04 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 8 Sep 2022 00:01:04 +0200 (CEST)
Subject: Array patterns (and varargs patterns)
In-Reply-To: <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <CAE+3fjZywJgk874aT1TMzAjn6rVMkEUuvH0xuuKf3ihjDn9GZQ@mail.gmail.com>
 <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
Message-ID: <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Tagir Valeev" <amaembo at gmail.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, September 7, 2022 4:32:34 PM
> Subject: Re: Array patterns (and varargs patterns)

> I understand where this sentiment comes from. But the motivation is
> somewhat more indirect than "people are falling over themselves to
> deconstruct arrays today".
> 
> Because deconstruction is the dual of aggregation, it is desirable for
> each of the forms of aggregation -- constructors, factories, etc -- to
> have pattern counterparts. Not doing so creates asymmetries that make
> the whole thing seem more ad-hoc. Many of the "not as important"
> pattern features we're working on now, are in the realm of "completing"
> the feature.
> 
> More importantly, array patterns are how we fully support varargs in
> records. If we have a varargs record:
> 
>     record VA(String... strings) { }
> 
> we can construct it with a varargs invocation
> 
>     new VA("a", "b")
> 
> which is sugar for
> 
>     new VA(new String[] { "a", "b" })
> 
> But we cannot yet deconstruct it with:
> 
>     case VA(var a, var b)
> 
> and analogously, for a varargs record, the above is sugar for
> 
>     case VA(String[] { var a, var b })
> 
> So it is not just about arrays.
> 
> I agree that named patterns are more useful, and we are working on them
> too. But they are also a bigger feature (bringing in overload
> selection, reflection, translation, etc), so they will take longer.
> Whereas array patterns are really a remix of things we've already worked
> out -- nested patterns, exhaustiveness, etc. In any case I would like
> to avoid leaving a trail of unfinished work, so cleaning up the loose
> ends on basic patterns first seems preferable before adding bigger new
> pattern features.
> 
>> Hello!
>>
>> Honestly, to me this whole feature looks not very important. It's a
>> rare case in modern Java applications that business logic operates
>> with arrays directly. They are mostly used in low-level system code
>> where performance matters more than code elegance. Custom defined
>> named patterns for lists would be much more useful. Moreover, if named
>> patterns are supported, then array deconstruction could be implemented
>> in a library, without complicating the language specification (like `x
>> instanceof Arrays.of(String first, String next, String last)`).
> 
> I'm not sure how this Arrays.of pattern is going to work, unless we're
> willing to have overloads for every arity up to, say, 22? Otherwise, we
> need varargs, and varargs is sugar for an array pattern.

For me, Arrays.of() is a named pattern with a vararg list of bindings, no?
So I agree with Tagir, let's figure out how named patterns work first.

I also see other reasons not to specify the array pattern now:
- records with varargs are quite rare, so people are not desperately in need of the corresponding pattern;
- the deconstruction of collections / map pattern is also a dependency of the array pattern. It will be sad if the array pattern and the List pattern do not have the same way to specify the length/size (specifying the length of the array pattern inside the [] seems too ad hoc, but maybe I'm wrong).

Rémi

> 
>>
>> With best regards,
>> Tagir Valeev.
>>
>> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>>> We dropped this out of the record patterns JEP, but I think it is time to
>>> revisit this.
>>>
>>> The concept of array patterns was pretty straightforward; they mimic the nesting
>>> and exhaustiveness rules of record patterns, they are just a different sort of
>>> container for nested patterns.  And they have an obvious duality with array
>>> creation expressions.
>>>
>>> The main open question here was how we distinguish between "match an array of
>>> length exactly N" (where there are N nested patterns) and "match an array of
>>> length at least N".  We toyed with the idea of a "..." indicator to mean "more
>>> elements", but this felt a little forced and opened new questions.
>>>
>>> It later occurred to me that there is another place to nest a pattern in an
>>> array pattern -- to match (and bind) the length.  In the following, assume for
>>> sake of exposition that "_" is the "any" pattern (matches everything, binds
>>> nothing) and that we have some way to denote a constant pattern, which I'll
>>> denote here with a constant literal.
>>>
>>> There is an obvious place to put this (optional) pattern: in between the
>>> brackets.  So:
>>>
>>>      case String[1] { P }:
>>>                  ^ a constant pattern
>>>
>>> would match string arrays of length 1 whose sole element matches P.  And
>>>
>>>      case String[] { P, Q }
>>>
>>> would match string arrays of length exactly 2, whose first two elements match P
>>> and Q respectively.  (If the length pattern is not specified, we infer a
>>> constant pattern whose constant is equal to the length of the nested pattern
>>> list.)
>>>
>>> Matching a target to `String[L] { P0, .., Pn }` means
>>>
>>>      x instanceof String[] arr
>>>          && arr.length matches L
>>>          && arr.length > n
>>>          && arr[0] matches P0
>>>          && arr[1] matches P1
>>>          ...
>>>          && arr[n] matches Pn
>>>
>>> More examples:
>>>
>>>      case String[int len] { P }
>>>
>>> would match string arrays of length >= 1 whose first element matches P, and
>>> further binds the array length to `len`.
>>>
>>>      case String[_] { P, Q }
>>>
>>> would match string arrays of any length whose first two elements match P and Q.
>>>
>>>      case String[3] { }
>>>                  ^constant pattern
>>>
>>> matches all string arrays of length 3.
>>>
>>>
>>> This is a more principled way to do it, because the length is a part of the
>>> array and deserves a chance to match via nested patterns, just as with the
>>> elements, and it avoids trying to give "..." a new meaning.
>>>
>>> The downside is that it might be confusing at first (though people will learn
>>> quickly enough) how to distinguish between an exact match and a prefix match.
>>>
>>>
>>>
>>>
>>> On 1/5/2021 1:48 PM, Brian Goetz wrote:
>>>
>>> As we get into the next round of pattern matching, I'd like to opportunistically
>>> attach another sub-feature: array patterns.  (This also bears on the question
>>> of "how would varargs patterns work", which I'll address below, though they
>>> might come later.)
>>>
>>> ## Array Patterns
>>>
>>> If we want to create a new array, we do so with an array construction
>>> expression:
>>>
>>>      new String[] { "a", "b" }
>>>
>>> Since each form of aggregation should have its dual in destructuring, the
>>> natural way to represent an array pattern (h/t to AlanM for suggesting this)
>>> is:
>>>
>>>      if (arr instanceof String[] { var a, var b }) { ... }
>>>
>>> Here, the applicability test is: "are you an instanceof of String[], with length
>>> = 2", and if so, we cast to String[], extract the two elements, and match them
>>> to the nested patterns `var a` and `var b`.   This is the natural analogue of
>>> deconstruction patterns for arrays, complete with nesting.
>>>
>>> Since an array can have more elements, we likely need a way to say "length >= 2"
>>> rather than simply "length == 2".  There are multiple syntactic ways to get
>>> there, for now I'm going to write
>>>
>>>      if (arr instanceof String[] { var a, var b, ... })
>>>
>>> to indicate "more".  The "..." matches zero or more elements and binds nothing.
>>>
>>> <digression>
>>> People are immediately going to ask "can I bind something to the remainder"; I
>>> think this is mostly an "attractive distraction", and would prefer to not have
>>> this dominate the discussion.
>>> </digression>
>>>
>>> Here's an example from the JDK that could use this effectively:
>>>
>>> String[] limits = limitString.split(":");
>>> try {
>>>      switch (limits.length) {
>>>          case 2: {
>>>              if (!limits[1].equals("*"))
>>>                  setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>>>          }
>>>          case 1: {
>>>              if (!limits[0].equals("*"))
>>>                  setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>>>          }
>>>      }
>>> }
>>> catch(NumberFormatException ex) {
>>>      setMultilineLimit(MultilineLimit.DEPTH, -1);
>>>      setMultilineLimit(MultilineLimit.LENGTH, -1);
>>> }
>>>
>>> becomes (eventually)
>>>
>>>      switch (limitString.split(":")) {
>>>          case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>>>          case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>>>          default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>>>      }
>>>
>>> Note how not only does this become more compact, but the unchecked
>>> "NumberFormatException" is folded into the match, rather than being a separate
>>> concern.
>>>
>>>
>>> ## Varargs patterns
>>>
>>> Having array patterns offers us a natural way to interpret deconstruction
>>> patterns for varargs records.  Assume we have:
>>>
>>>      void m(X... xs) { }
>>>
>>> Then a varargs invocation
>>>
>>>      m(a, b, c)
>>>
>>> is really sugar for
>>>
>>>      m(new X[] { a, b, c })
>>>
>>> So the dual of a varargs invocation, a varargs match, is really a match to an
>>> array pattern.  So for a record
>>>
>>>      record R(X... xs) { }
>>>
>>> a varargs match:
>>>
>>>      case R(var a, var b, var c):
>>>
>>> is really sugar for an array match:
>>>
>>>      case R(X[] { var a, var b, var c }):
>>>
>>> And similarly, we can use our "more arity" indicator:
>>>
>>>      case R(var a, var b, var c, ...):
>>>
>>> to indicate that there are at least three elements.
>>>
>>>
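The matching rule quoted above for `String[L] { P0, .., Pn }` can be emulated by hand in today's Java, which may make the exact-match versus prefix-match distinction easier to see. This is only a sketch under stated assumptions: `ArrayPatternSketch`, `exactly`, `anyLength` and `matches` are hypothetical names, the nested patterns are modelled as plain predicates (so nothing is bound), and the length bound is read as "at least as many elements as there are nested patterns".

```java
import java.util.function.IntPredicate;
import java.util.function.Predicate;

// Hypothetical helper, not part of any proposal: emulates String[L] { P0, .., Pk }.
final class ArrayPatternSketch {

    /** Length pattern for a constant, as in String[1] { ... }. */
    static IntPredicate exactly(int n) {
        return len -> len == n;
    }

    /** Length pattern for `_`: matches any length. */
    static IntPredicate anyLength() {
        return len -> true;
    }

    @SafeVarargs
    static boolean matches(Object target, IntPredicate lengthPattern,
                           Predicate<String>... elementPatterns) {
        if (!(target instanceof String[])) {
            return false;                      // x instanceof String[] arr
        }
        String[] arr = (String[]) target;
        if (!lengthPattern.test(arr.length)) {
            return false;                      // arr.length matches L
        }
        if (arr.length < elementPatterns.length) {
            return false;                      // enough elements for every nested pattern
        }
        for (int i = 0; i < elementPatterns.length; i++) {
            if (!elementPatterns[i].test(arr[i])) {
                return false;                  // arr[i] matches Pi
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[] abc = { "a", "b", "c" };
        // String[_] { "a", "b" }: prefix match, any length
        System.out.println(matches(abc, anyLength(), "a"::equals, "b"::equals)); // true
        // String[] { "a", "b" }: omitted length is inferred as the constant 2
        System.out.println(matches(abc, exactly(2), "a"::equals, "b"::equals));  // false
        // String[3] { }: any string array of length 3
        System.out.println(matches(abc, exactly(3)));                            // true
    }
}
```

The only thing that changes between the three calls in `main` is the length pattern, which is the point of putting it between the brackets.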

From brian.goetz at oracle.com  Wed Sep  7 22:13:57 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 7 Sep 2022 18:13:57 -0400
Subject: Unnamed variables and match-all patterns
In-Reply-To: <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
 <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
Message-ID: <cf99168c-156b-dd74-de6b-2e77c4f5c580@oracle.com>


>
> As a teacher, i vote for A, APIs should be documented, giving a good 
> name to a parameter is usually the first step.
>

I'm willing to consider starting with A, though I think we should admit
that the most likely reaction if we do that is "you idiots got it wrong
again, we waited 25 years for underscore, and you don't even let us do
it in the most obvious places." So I don't think "do A and never do
anything about method parameters" is going to fly, though it is
potentially a reasonable incremental step on the way there to get people
used to unnamed things.
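For concreteness, here is a small sketch of the implementation-only positions that option A covers, written in today's Java where every unused variable still needs a placeholder name. The class and method names (`UnnamedCandidates`, `parseOrDefault`, `count`, `FIRST`) are made up for illustration, and the comments show how each site might read under the proposal; none of this is settled syntax.

```java
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical examples of variables that are pure implementation details.
final class UnnamedCandidates {

    // Catch formal that is never read.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException ignored) { // proposal: catch (NumberFormatException _)
            return fallback;
        }
    }

    // Foreach induction variable that is never read.
    static int count(Iterable<?> items) {
        int n = 0;
        for (Object unused : items) {             // proposal: for (Object _ : items)
            n++;
        }
        return n;
    }

    // Lambda parameter that is never read.
    static final BiFunction<String, Integer, String> FIRST =
            (s, unused) -> s;                     // proposal: (s, _) -> s

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", -1));   // 42
        System.out.println(parseOrDefault("oops", -1)); // -1
        System.out.println(count(List.of("x", "y")));   // 2
        System.out.println(FIRST.apply("kept", 7));     // kept
    }
}
```

None of these names appear in any API or Javadoc, which is why this set is the uncontroversial one.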

From brian.goetz at oracle.com  Wed Sep  7 22:15:04 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 7 Sep 2022 18:15:04 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <CAE+3fjZywJgk874aT1TMzAjn6rVMkEUuvH0xuuKf3ihjDn9GZQ@mail.gmail.com>
 <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
 <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
Message-ID: <d87c0168-5a10-7c4e-7026-4d9ee04ab024@oracle.com>


> For me, Arrays.of() is a named pattern with a vararg list of bindings, no?

It's a named pattern, but to work, it would need varargs patterns -- and
array patterns are the underpinnings of varargs, just as array creation
is the underpinning of varargs invocation. We're not going to do
varargs patterns differently than we do varargs invocation, just to
avoid doing array patterns -- that would be silly.
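To make the asymmetry concrete, here is a minimal sketch in current Java (records and `instanceof` patterns, so Java 16 or later is assumed). Construction gets the varargs sugar; deconstruction has to be spelled out by hand. `VarargsRecordSketch` and `describe` are hypothetical names, and `describe` is just a hand-written stand-in for what `case VA(var a, var b)` would mean.

```java
import java.util.Arrays;

final class VarargsRecordSketch {

    // The varargs record used as the running example in this thread.
    record VA(String... strings) { }

    // Hand-written equivalent of matching VA(String[] { var a, var b }):
    // test the type, test the length, then extract the elements.
    static String describe(Object o) {
        if (o instanceof VA va && va.strings().length == 2) {
            String a = va.strings()[0];
            String b = va.strings()[1];
            return a + "," + b;
        }
        return "no match";
    }

    public static void main(String[] args) {
        VA sugared = new VA("a", "b");                    // varargs invocation
        VA explicit = new VA(new String[] { "a", "b" });  // what it is sugar for
        System.out.println(Arrays.equals(sugared.strings(), explicit.strings())); // true
        System.out.println(describe(sugared));            // a,b
        System.out.println(describe(new VA("a")));        // no match
    }
}
```

The length test and the indexing in `describe` are exactly the parts an array pattern would absorb.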

> I also see other reasons not to specify the array pattern now:
> - records with varargs are quite rare, so people are not desperately in need of the corresponding pattern;
> - the deconstruction of collections / map pattern is also a dependency of the array pattern. It will be sad if the array pattern and the List pattern do not have the same way to specify the length/size (specifying the length of the array pattern inside the [] seems too ad hoc, but maybe I'm wrong).

As I've said, the fact that people are not desperate for this yet
(though obviously you and Tagir want varargs patterns, so there is some
demand for it) is not the primary reason to do this now. The symmetry
between aggregation and deconstruction is very, very, very important to
people understanding properly how pattern matching fits into the
language. I am trying to button up the sources of asymmetry in the
patterns we have before moving on to cool new patterns. Otherwise we
leave a trail of accidental complexity behind us, where certain things
are reversible and others are not, for no apparent reason. (Primitives
in type patterns are in this category too, and we'll be returning to
them very soon.)

So I'm not going to hold up the discussion of named patterns for array
patterns (I'm working on a document for named patterns too, but it's much
longer), but I'm also not going to hold up array patterns until we get
named patterns done either. I want to close up the holes in what we've
already built before laying the next layer.

(As to List and Map patterns, these will have to be co-designed with
List and Map literals, which will likely require some additional
groundwork. They're a ways away; we're building a tower layer by layer.)



From guy.steele at oracle.com  Thu Sep  8 01:35:31 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 8 Sep 2022 01:35:31 +0000
Subject: Unnamed variables and match-all patterns
In-Reply-To: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
Message-ID: <A1117F78-35F7-49AB-AD7D-74712829F5F5@oracle.com>


On Sep 7, 2022, at 1:41 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
. . .

Where we stumble is on method parameters, because method parameter names serve two masters -- the implementation (as the declaration of a variable) and the API (as part of the specification of what the method does.)  Among other things, we like to document the semantics of method parameters in Javadoc with the `@param` tag, but doing so requires a name

And a general language-design pattern is that if you discover a single language feature is serving two masters, consider splitting it into two features, one to serve each master (and then perhaps continue to allow the old feature, explaining it in terms of the new, more general features).

In this case, a single feature (method parameter name) provides both a name for the implementation and a name for the API. So, consider having a way to provide two names. Common Lisp has been doing this for its keyword parameters for almost four decades:

(defun foo (&key ((:color c) white) ((:angle a) 0))
   ... c ... a ...)

(foo :color black :angle 45)

So the names :color and :angle are part of the API, and the names c and a are the variable names that are actually bound for use in the body.

When you write

(defun baz (&key (color white) (angle 0))
  ... color ... angle)

it is by definition an abbreviation for

(defun baz (&key ((:color color) white) ((:angle angle) 0))
  ... color ... angle)

So you don't have to write out two names in the common case where you actually do want them to be "the same".

----

So in Java we could pick some crazy syntax to allow specifying two names for a method parameter, the API name and the implementation (bound variable) name:

int colorHack(int red=>r, int green=>g, int blue=>b, int fromIndex=>from, int toIndex=>to) {
// Here the names `r`, `g`, `b`, `from`, and `to` are in scope.
}

and then if you really want to ignore a parameter:

int colorBlindHack(int red=>_, int green=>_, int blue=>_, int fromIndex=>from, int toIndex=>to) {
// Here the names `from` and `to` are in scope.
}

Not sure we want to go in that direction, but we should at least consider it.

-- Guy


From john.r.rose at oracle.com  Thu Sep  8 04:10:02 2022
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 07 Sep 2022 21:10:02 -0700
Subject: Unnamed variables and match-all patterns
In-Reply-To: <A1117F78-35F7-49AB-AD7D-74712829F5F5@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
 <A1117F78-35F7-49AB-AD7D-74712829F5F5@oracle.com>
Message-ID: <89034F8F-F4B2-4F29-AC3F-57D34F9A8B6B@oracle.com>


On 7 Sep 2022, at 18:35, Guy Steele wrote:

> On Sep 7, 2022, at 1:41 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> . . .
>
> Where we stumble is on method parameters, because method parameter 
> names serve two masters --
> the implementation (as the declaration of a variable) and the API (as 
> part of the specification of what the method does.)  Among other 
> things, we like to document the semantics of method parameters in 
> Javadoc with the `@param` tag, but doing so requires a name
>
> And a general language-design pattern is that if you discover a single 
> language feature is serving two masters, consider splitting it into 
> two features, one to serve each master (and then perhaps continue to 
> allow the old feature, explaining it in terms of the new, more general 
> features).
>
> In this case, a single feature (method parameter name) provides both a 
> name for the implementation and a name for the API. So, consider 
> having a way to provide two names. Common Lisp has been doing this for 
> its keyword parameters for almost four decades:
>
> (defun foo (&key ((:color c) white) ((:angle a) 0))
>    ... c ... a ...)
>
> (foo :color black :angle 45)
>
> So the names :color and :angle are part of the API, and the names c 
> and a are the variable names that are actually bound for use in the 
> body.
>
> When you write
>
> (defun baz (&key (color white) (angle 0))
>   ... color ... angle)
>
> it is by definition an abbreviation for
>
> (defun baz (&key ((:color color) white) ((:angle angle) 0))
>   ... color ... angle)
>
> So you don't have to write out two names in the common case where 
> you actually do want them to be "the same".
>
> ----
>
> So in Java we could pick some crazy syntax to allow specifying two 
> names for a method parameter, the API name and the implementation 
> (bound variable) name:
>
> int colorHack(int red=>r, int green=>g, int blue=>b, int 
> fromIndex=>from, int toIndex=>to) {
> // Here the names `r`, `g`, `b`, `from`, and `to` are in scope.
> }
>
> and then if you really want to ignore a parameter:
>
> int colorBlindHack(int red=>_, int green=>_, int blue=>_, int 
> fromIndex=>from, int toIndex=>to) {
> // Here the names `from` and `to` are in scope.
> }
>
> Not sure we want to go in that direction, but we should at least 
> consider it.
>
> -- Guy

As it happens, today I also cited Lisp argument syntax practice to 
Brian, on the subject of array patterns.  (As in, one bit of prior art 
for sequence matching is Common Lisp req/opt/key args?)

This is another bit of prior art from the same rather deep wellspring.

There is a proposed syntax which allows a single value to have two 
names, one of which is a binding, and that is the pattern-let syntax.

Perhaps a variation of pattern-let could make sense in parameter 
declarations.  (As many kinds of patterns might eventually be useful in 
parameter position.)  In this case, the binding is inside the pattern, 
such as `String s`, and the let-part is after an equals sign, `let 
String s = expr`.  (Bikeshed still to be painted here.)  Aligning with 
the need for a double declaration, we could say that the `expr` part is 
the formal and external name of the parameter, and the `s` part is the 
local and internal name of the binding.  So:

int colorBlindHack(let int _ = red, let int _ = green, let int b = blue, ...) ...

Huh.  Looks too close to optional arguments for comfort.  And how would 
you combine it with optional arguments?

int colorBlindHack(let int _ = red = 0, let int _ = green = 0, let int b = blue, ...) ...

Oh well, it was a thought.

If we ever do go with keyword-based calling conventions in Java, then 
there will be significant pressure for such double names, as there was 
in Common Lisp.  Until then, the double naming seems to me to be a 
corner-case feature.

Adding immutable structs (value classes) into the language does, in 
fact, increase the need for keyword-based conventions, so that you can 
*update* an immutable instance by combining a pre-existing instance with 
one or more field values to update.  (I like to call such a factory 
method a "reconstructor" because of its similarity to a constructor, 
which disregards any previous state and sets *all* the fields.)  If we 
add reconstructors to the language, there is new pressure for 
keyword-based calling conventions, and after that, there is pressure for 
the "double naming" of parameters.
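
To make the "reconstructor" idea concrete, here is a minimal sketch of what has to be written by hand today. The `Color` record and its `withRed`/`withGreen`/`withBlue` methods are hypothetical names chosen for illustration; nothing here is proposed syntax, and a real reconstructor feature would presumably remove this boilerplate.

```java
// Hypothetical illustration: hand-written "reconstructors" on a record.
// Each method combines an existing instance with one new component value.
record Color(int red, int green, int blue) {
    Color withRed(int red)     { return new Color(red, green, blue); }
    Color withGreen(int green) { return new Color(red, green, blue); }
    Color withBlue(int blue)   { return new Color(red, green, blue); }
}

final class ReconstructorDemo {
    public static void main(String[] args) {
        Color base = new Color(10, 20, 30);
        // Update two components; the untouched one is carried over.
        Color updated = base.withGreen(200).withBlue(0);
        System.out.println(base);
        System.out.println(updated);
    }
}
```

Note how the method name has to encode the component name (`withGreen`); a keyword-based calling convention would let one method cover every subset of components.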

-- John

From john.r.rose at oracle.com  Thu Sep  8 04:16:48 2022
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 07 Sep 2022 21:16:48 -0700
Subject: Unnamed variables and match-all patterns
In-Reply-To: <A1117F78-35F7-49AB-AD7D-74712829F5F5@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
 <A1117F78-35F7-49AB-AD7D-74712829F5F5@oracle.com>
Message-ID: <0A448D4F-AB8F-4CEF-BF12-356A816F8CA6@oracle.com>

> Aligning with the need for a double declaration, we could say that the `expr` part is the formal and external name of the parameter, and the `s` part is the local and internal name of the binding.  So:
>
> int colorBlindHack(let int _ = red, let int _ = green, let int b = blue, ...) ...
>
> Huh.  Looks too close to optional arguments for comfort.  And how would you combine it with optional arguments?

P.S.  (That is, Painting Shed.)  If we allowed Java the label-like syntax adopted by some languages for externally named keyword arguments it might look like this:

int colorBlindHack(red: int _, green: int _, blue: int, ...)

The last argument keyworded as "blue" is bound to the name "blue" in the absence of other indication; the other choices being `blue: int _` for ignored argument and `blue: int b` for a different local name.

int colorHack(red: int, green: int, blue: int, ...)   //keyword arguments

(The "L: FOO" syntax is already a thing in Java, see?)

So many bikesheds, so little time...

From john.r.rose at oracle.com  Thu Sep  8 04:41:22 2022
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 07 Sep 2022 21:41:22 -0700
Subject: Array patterns (and varargs patterns)
In-Reply-To: <3b623438-81c9-f888-07bf-55d231db3240@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <CAE+3fjYprDiOmnz_ec-jKs=7F_ZMEqzsEQKZOCuHgXBDqS1Xqw@mail.gmail.com>
 <3b623438-81c9-f888-07bf-55d231db3240@oracle.com>
Message-ID: <0DAD71CE-C638-4DDC-A6E8-C61EF082C0B9@oracle.com>

On 7 Jan 2021, at 6:18, Brian Goetz wrote:

> ... Varargs patterns will build on it (as shown in the mail); if and 
> when Java ever gets collection literals, there will be corresponding 
> collection patterns too.  I think the path to streamlining this is 
> not to try and simplify the syntax of the primitive, but move upwards 
> to higher-level patterns.

OTOH if patterns (like `switch ((O)x) case P v:` or `let P v = (O)x`) 
are the duals of assignment (like `x = v` or `O x = v`), then we are 
within our moral rights to make a pattern dualization of the venerable 
Java syntax `T[] x = {a,b,c}`, which is sugar for `T[] x = new 
T[]{a,b,c}`.  The sugar allows you to take the second `T[]` (and the 
`new`) for a typeful context (`T[] x`).

So without the sugar we get something like:

T[] a = ...;
switch (a) { case new T[]{a,b,c}: }

(The `new` from `new T[]{a,b,c}` is dropped because `new` doesn't 
appear in patterns.)

But with the same sugar, but dualized, we get:

T[] a = ...;
switch (a) { case {a,b,c}: }

In other words, when the pattern target is already an array, there is no 
need for the ceremony of repeating the array type, as with normal array 
declarations.

Likewise:

T[][] a2d = ...;
switch (a2d) { case {{a,b},{c,d}}: }

I think this is what Tagir expected, and I think it is a reasonable 
"penciling out" of the basic moves of the game we are playing here.

Moving on to varargs, the context of a method call marked varargs allows 
elision not only of the `new T[]` in `new T[]{a,b,c}` but also the 
braces, so you can equally say `f(a,b,c)` or `f(new T[]{a,b,c})`.

(But not `f({a,b,c})`.  So, we don't get `f2d({{a,b},{c,d}})` by 
analogy with nested array initializers.  Whatever.)
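
For reference, the expression-side sugar that the patterns above would dualize can be exercised in today's Java. This is only a sketch of existing behavior; `count` and `rows` are hypothetical helper names, and none of the pattern syntax discussed here is involved.

```java
import java.util.Arrays;

final class ArraySugarDemo {
    // Varargs method: callers may pass the elements directly or an explicit array.
    static int count(String... items) { return items.length; }

    // Ordinary method taking a two-dimensional array.
    static int rows(String[][] grid) { return grid.length; }

    public static void main(String[] args) {
        String[] sugared  = {"a", "b", "c"};               // initializer sugar
        String[] explicit = new String[] {"a", "b", "c"};  // desugared form
        System.out.println(Arrays.equals(sugared, explicit));

        String[][] grid = {{"a", "b"}, {"c", "d"}};        // nested initializers are fine in a declaration
        System.out.println(rows(grid));

        System.out.println(count("a", "b", "c"));                    // varargs: no `new`, no braces
        System.out.println(count(new String[] {"a", "b", "c"}));     // explicit array also accepted
        // count({"a", "b", "c"});  // does not compile: a bare initializer is not an expression
        System.out.println(rows(new String[][] {{"a", "b"}, {"c", "d"}}));
    }
}
```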

If a pattern-method can take a pattern-flavored argument, and perhaps a 
varargs argument to boot, it's pretty clear that additional moves 
could follow quickly:

pattern f(pattern T[] a) { ... }
pattern f2d(pattern T[][] a) { ... }
pattern fv(pattern T... a) { ... }
// extra "pattern" keyword on parameters for emphasis?

switch (x) {
   case f({a,b,c}): ...   // omit `new T[]` b/c type
   case f2d({{d,e,f},{g}}): ...   // omit `new T[][]` b/c type
   case fv(h,i,j,k): ...   // omit `new T[]` and braces b/c varargs
}


From amaembo at gmail.com  Thu Sep  8 06:47:26 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 8 Sep 2022 08:47:26 +0200
Subject: Unnamed variables and match-all patterns
In-Reply-To: <cf99168c-156b-dd74-de6b-2e77c4f5c580@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
 <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
 <cf99168c-156b-dd74-de6b-2e77c4f5c580@oracle.com>
Message-ID: <CAE+3fja8Ws7B+DkWmu1-Jcne5Av+mkCSuEur-4iK4KDAQoayOw@mail.gmail.com>

Hello!

On Thu, 8 Sep 2022 at 00:14, Brian Goetz <brian.goetz at oracle.com> wrote:

>
> >
> > As a teacher, i vote for A, APIs should be documented, giving a good
> > name to a parameter is usually the first step.
> >
>
> I'm willing to consider starting with A, though I think we should admit
> that the most likely reaction if we do that is "you idiots got it wrong
> again, we waited 25 years for underscore, and you don't even let us do
> it in the most obvious places."  So I don't think "do A and never do
> anything about method parameters" is going to fly, though it is
> potentially a reasonable incremental step on the way there to get people
> used to unnamed things.
>

I'm not sure it's so critical. To me, the main source of frustration is the
necessity to think up a name that I won't use anyway. The second source is
the fact that the code becomes noticeably longer when it includes unused
names. Both problems are not so important for method parameters:

- If you override or implement a method, any IDE just copies the names from
the super-method for you, so you don't need to think.
- A method declaration is already quite verbose. It contains the @Override
annotation, modifiers, the types of all parameters, and the return type,
all explicitly spelled out, and all of them could be quite long. Probably
other annotations, a throws clause, and Javadoc too. Saving a few chars
there would not help much. On the other hand, a declaration doesn't contain
logic, so people rarely stare at it trying to understand what's going on.

Another problem, namely polluting namespace with an unused name, stays, but
I believe it's not so important. It may be confusing if you want to reuse
the name of super-method parameter for another purpose, so occupying it by
default has its advantages.

That said, I'm also for A. It's simple and well defined. It's in line
with the LVTI philosophy and will already be very helpful without adding
confusion and strange corner cases.

With best regards,
Tagir Valeev.


From brian.goetz at oracle.com  Thu Sep  8 12:22:22 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 8 Sep 2022 08:22:22 -0400
Subject: Unnamed variables and match-all patterns
In-Reply-To: <CAE+3fja8Ws7B+DkWmu1-Jcne5Av+mkCSuEur-4iK4KDAQoayOw@mail.gmail.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
 <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
 <cf99168c-156b-dd74-de6b-2e77c4f5c580@oracle.com>
 <CAE+3fja8Ws7B+DkWmu1-Jcne5Av+mkCSuEur-4iK4KDAQoayOw@mail.gmail.com>
Message-ID: <4ae3daa5-af90-6690-730d-5e41ad591547@oracle.com>


>
> I'm not sure it's so critical. To me, the main source of frustration 
> is the necessity to think up a name that I won't use anyway. The 
> second source is the fact that the code becomes noticeably longer when 
> it includes unused names. Both problems are not so important for 
> method parameters:
>
> - If you override or implement a method, any IDE just copies the names 
> from the super-method for you, so you don't need to think.
> - A method declaration is already quite verbose. It contains the 
> @Override annotation, modifiers, the types of all parameters, and the 
> return type, all explicitly spelled out, and all of them could be quite 
> long. Probably other annotations, a throws clause, and Javadoc too. 
> Saving a few chars there would not help much. On the other hand, a 
> declaration doesn't contain logic, so people rarely stare at it trying 
> to understand what's going on.

For the people who complain about this, I don't think it's about saving 
a few characters in the declaration, as much as satisfying static 
analysis that complains about unused parameters.  But I suspect that 
many of these have already become lambdas (this happened most commonly 
with anonymous classes previously).  So I'm willing to do the experiment 
of A first and see if we need to take the next step.


From guy.steele at oracle.com  Thu Sep  8 15:13:28 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 8 Sep 2022 15:13:28 +0000
Subject: Unnamed variables and match-all patterns
In-Reply-To: <0A448D4F-AB8F-4CEF-BF12-356A816F8CA6@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
 <A1117F78-35F7-49AB-AD7D-74712829F5F5@oracle.com>
 <0A448D4F-AB8F-4CEF-BF12-356A816F8CA6@oracle.com>
Message-ID: <710344EE-A52C-42FA-BA80-F29C0562F002@oracle.com>


> On Sep 8, 2022, at 12:16 AM, John Rose <john.r.rose at oracle.com> wrote:
> 
>> Aligning with the need for a double declaration, we could say that the `expr` part is the formal and external name of the parameter, and the `s` part is the local and internal name of the binding.  So:
>> 
>> int colorBlindHack(let int _ = red, let int _ = green, let int b = blue, ...) ...
>> 
>> Huh.  Looks too close to optional arguments for comfort.  And how would you combine it with optional arguments?
> 
> P.S.  (That is, Painting Shed.)  If we allowed Java the label-like syntax adopted by some languages for externally named keyword arguments it might look like this:
> 
> int colorBlindHack(red: int _, green: int _, blue: int, ...)
> 
> The last argument keyworded as "blue" is bound to the name "blue" in the absence of other indication; the other choices being `blue: int _` for ignored argument and `blue: int b` for a different local name.
> 
> int colorHack(red: int, green: int, blue: int, ...)   //keyword arguments
> 
> (The "L: FOO" syntax is already a thing in Java, see?)
> 
> So many bikesheds, so little time...

Wowww -- this is one of the bikiest sheds I have seen in a long time. I am very impressed. Completely consistent to those who know the history, therefore very appealing! But also, alas, with the promise of totally confusing newcomers as to whether Java parameter declaration syntax is C-like or Pascal-like. :-)

--Guy


From brian.goetz at oracle.com  Thu Sep  8 16:53:21 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 8 Sep 2022 12:53:21 -0400
Subject: Primitives in instanceof and patterns
Message-ID: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>

Earlier in the year we talked about primitive type patterns.  Let me
summarize the past discussion, what I think the right direction is, and
why this is (yet another) "finishing up the job" task for basic patterns
that, if left undone, will be a sharp edge.

Prior to record patterns, we didn't support primitive type patterns at
all.  With records, we now support primitive type patterns as nested
patterns, but they are very limited; they are only applicable to exactly
their own type.

The motivation for "finishing" primitive type patterns is the same as
discussed earlier this week with array patterns -- if pattern matching is
the dual of aggregation, we want to avoid gratuitous asymmetries that let
you put things together but not take them apart.

Currently, we can assign a `String` to an `Object`, and recover the
`String` with a pattern match:

    Object o = "Bob";
    if (o instanceof String s) { println("Hi Bob"); }

Analogously, we can assign an `int` to a `long`:

    long n = 0;

but we cannot yet recover the int with a pattern match:

    if (n instanceof int i) { ... } // error, pattern `int i` not applicable to `long`

To fill out some more of the asymmetries around records if we don't
finish the job: given

    record R(int i) { }

we can construct it with

    new R(anInt)     // no adaptation
    new R(aShort)    // widening
    new R(anInteger) // unboxing

but yet cannot deconstruct it the same way:

    case R(int i)     // OK
    case R(short s)   // nope
    case R(Integer i) // nope

It would be a gratuitous asymmetry that we can use pattern matching to
recover from reference widening, but not from primitive widening.  While
many of the arguments against doing primitive type patterns now were of
the form "let's keep things simple", I believe that the simpler solution
is actually to _finish the job_, because this minimizes asymmetries and
potholes that users would otherwise have to maintain a mental catalog of.

Our earlier explorations started (incorrectly, as it turned out), with
assignment context.  This direction gave us a good push in the right
direction, but turned out to not be the right answer.  A more careful
reading of JLS Ch5 convinced me that the answer lies not in assignment
conversion, but _cast conversion_.

#### Stepping back: instanceof

The right place to start is actually not patterns, but `instanceof`.  If
we start here, and listen carefully to the specification, it leads us to
the correct answer.

Today, `instanceof` works only for reference types.  Accordingly, most
people view `instanceof` as "the subtyping operator" -- because that's the
only question we can currently ask it.  We almost never see `instanceof`
on its own; it is nearly always followed by a cast to the same type.
Similarly, we rarely see a cast on its own; it is nearly always preceded
by an `instanceof` for the same type.

There's a reason these two operations travel together: casting is, in
general, unsafe; we can try to cast an `Object` reference to a `String`,
but if the reference refers to another type, the cast will fail.  So to
make casting safe, we precede it with an `instanceof` test.  The semantics
of `instanceof` and casting align such that `instanceof` is the
precondition test for safe casting.

 > instanceof is the precondition for safe casting

Asking `instanceof T` means "if I cast this to T, would I like the
answer?"  Obviously CCE is an unlikable answer; `instanceof` further
adopts the opinion that casting `null` would also be an unlikable answer,
because while the cast would succeed, you can't do anything useful with
the result.

Currently, `instanceof` is only defined on reference types, and on this
domain coincides with subtyping.  On the other hand, casting is defined
between primitive types (widening, narrowing), and between primitive and
reference types (boxing, unboxing).  Some casts involving primitives yield
"better" results than others; casting `0` to `byte` results in no loss of
information, since `0` is representable as a byte, but casting `500` to
`byte` succeeds but loses information because the higher order bits are
discarded.

If we characterize some casts as "lossy" and others as "exact" -- where
lossy means discarding useful information -- we can extend the "safe
casting precondition" meaning of `instanceof` to primitive operands and
types in the obvious way -- "would casting this expression to this type
succeed without error and without information loss."  If the type of the
expression is not castable to the type we are asking about, we know the
cast cannot succeed and reject the `instanceof` test at compile time.

Defining which casts are lossy and which are exact is fairly
straightforward; we can appeal to the concept already in the JLS of
"representable in the range of a type."  For some pairs of types, casting
is always exact (e.g., casting `int` to `long` is always exact); we call
these "unconditionally exact".  For other pairs of types, some values can
be cast exactly and others cannot.

Defining which casts are exact gives us a simple and precise semantics
for `x instanceof T`: whether `x` can be cast exactly to `T`.  Similarly,
if the static type of `x` is not castable to `T`, then the corresponding
`instanceof` question is rejected statically.  The answers are not
surprising:

 - Boxing is always exact;
 - Unboxing is exact for all non-null values;
 - Reference widening is always exact;
 - Reference narrowing is exact if the dynamic type of the target
   expression is a subtype of the target type;
 - Primitive widening and narrowing are exact if the target expression can
   be represented in the range of the target type.
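
As a rough illustration of the integral cases in the list above, the "exact" test can be emulated in today's Java with a narrowing cast followed by a comparison. This is only a sketch: the helper names `isExactByte`, `isExactInt`, and `isExactUnbox` are hypothetical, and the comparison trick is only claimed here for integral-to-integral narrowing, not for floating point.

```java
final class ExactCasts {
    // int -> byte is exact when the value survives the narrowing cast unchanged,
    // i.e. when it is representable in the range of byte.
    static boolean isExactByte(int n) { return n == (byte) n; }

    // long -> int, same idea: the (int) result is widened back to long for the comparison.
    static boolean isExactInt(long n) { return n == (int) n; }

    // Unboxing is exact for every non-null value.
    static boolean isExactUnbox(Integer boxed) { return boxed != null; }

    public static void main(String[] args) {
        System.out.println((byte) 0);           // no information lost
        System.out.println((byte) 500);         // high-order bits discarded
        System.out.println(isExactByte(0));
        System.out.println(isExactByte(500));
        System.out.println(isExactInt(1L << 40));
    }
}
```

Here `(byte) 500` evaluates to `-12`, because only the low eight bits of 500 are kept, which is the kind of information loss an "exact" cast rules out.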

#### Primitive type patterns

It is a short hop from `instanceof` to patterns (including primitive type
patterns, and reference type patterns applied to primitive types), which
can be defined entirely in terms of cast conversion and exactness:

 - A type pattern `T t` is applicable to a target of type `S` if `S` is
   cast-convertible to `T`;
 - A type pattern `T t` matches a target `x` if `x` can be cast exactly to
   `T`;
 - A type pattern `T t` is unconditional at type `S` if casting from `S`
   to `T` is unconditionally exact;
 - A type pattern `T t` dominates a type pattern `S s` (or a record pattern
   `S(...)`) if `T t` would be unconditional on `S`.

While the rules for casting are complex, primitive patterns add no new
complexity; there are no new conversions or conversion contexts.  If we see:

    switch (a) {
        case T t: ...
    }

we know the case matches if `a` can be cast exactly to `T`, and the
pattern is unconditional if _all_ values of `a`'s type can be cast exactly
to `T`.  Note that none of this is specific to primitives; we derive the
semantics of _all_ type patterns from the enhanced definition of casting.

Now, our record deconstruction examples work symmetrically to construction:

    case R(int i)     // OK
    case R(short s)   // test if `i` is in the range of `short`
    case R(Integer i) // box `i` to `Integer`
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-spec-experts/attachments/20220908/e04bba25/attachment-0001.htm>

From emcmanus at google.com  Thu Sep  8 17:53:55 2022
From: emcmanus at google.com (=?UTF-8?Q?=C3=89amonn_McManus?=)
Date: Thu, 8 Sep 2022 10:53:55 -0700
Subject: Primitives in instanceof and patterns
In-Reply-To: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
Message-ID: <CAChqX89jk3v10B2q4kA6xHJE9uf2Y7ZBYEgb_859YXxzzx4HNQ@mail.gmail.com>

This makes a lot of sense. I'm wondering, though, how it works with float
and double. I don't see the answer by looking at JLS Ch5. Are the following
expressions legal, and if so which ones are true?

   1. 2.0 instanceof int
   2. 2.5 instanceof int
   3. Double.MAX_VALUE instanceof float
   4. Math.PI instanceof float
   5. Double.NaN instanceof float
   6. Double.NaN instanceof double

Intuitively, I think I would expect that if t is an expression of primitive
type T, and U is also a primitive type, then t instanceof U is true iff t
== (T) (U) t. (Perhaps with a variant of == that is reflexive even for NaN,
or perhaps not.) That would make (1) true, (2,3,4) false, and (5,6) who
knows.
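
The round-trip intuition above can be checked directly in today's Java. This is a sketch of that intuition only, not of whatever the final rule turns out to be; the class and method names are hypothetical. The plain `==` versions treat NaN as unequal to itself, and the last method shows one possible reflexive variant using `Double.compare`.

```java
final class RoundTripDemo {
    // t instanceof U approximated as: t == (T) (U) t, with T = double.
    static boolean viaInt(double t)    { return t == (double) (int) t; }
    static boolean viaFloat(double t)  { return t == (double) (float) t; }
    static boolean viaDouble(double t) { return t == (double) t; }

    // Variant of the float check in which NaN compares equal to NaN.
    static boolean viaFloatReflexive(double t) {
        return Double.compare(t, (double) (float) t) == 0;
    }

    public static void main(String[] args) {
        System.out.println("1. " + viaInt(2.0));
        System.out.println("2. " + viaInt(2.5));
        System.out.println("3. " + viaFloat(Double.MAX_VALUE));
        System.out.println("4. " + viaFloat(Math.PI));
        System.out.println("5. " + viaFloat(Double.NaN));
        System.out.println("6. " + viaDouble(Double.NaN));
        System.out.println("5 (reflexive). " + viaFloatReflexive(Double.NaN));
    }
}
```

With plain `==`, only (1) comes out true; the reflexive variant flips (5) to true.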


From brian.goetz at oracle.com  Thu Sep  8 20:03:45 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 8 Sep 2022 16:03:45 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <CAChqX89jk3v10B2q4kA6xHJE9uf2Y7ZBYEgb_859YXxzzx4HNQ@mail.gmail.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <CAChqX89jk3v10B2q4kA6xHJE9uf2Y7ZBYEgb_859YXxzzx4HNQ@mail.gmail.com>
Message-ID: <601c9f99-da45-5990-21c0-baf3731f19db@oracle.com>

Sigh, floating point.  Yes, this is the most difficult corner of this work.

Bear in mind that you're really asking questions about cast conversion.  
Additionally, we have to define which of these casts are "lossy" vs 
which are "information preserving" (exact), which for most numbers is 
straightforward but for weird floating point values might be harder.  
So let's first see what happens when we cast your examples:

jshell> (int) 2.0f
$1 ==> 2

jshell> (int) 2.5f
$2 ==> 2

jshell> (float) Double.MAX_VALUE
$3 ==> Infinity

jshell> (float) Math.PI
$4 ==> 3.1415927

jshell> (float) Double.NaN
$5 ==> NaN

jshell> (double) Double.NaN
$6 ==> NaN

Clearly #2 (casting 2.5 to int) is lossy, and therefore would not be 
exact, so we can cross that one off the list.  Similarly for 
Double.MAX_VALUE to float.  It's also pretty clear that Math.PI -- which 
is merely an alias for 3.14159265358979323846 -- is also not 
representable in the range of float.

So that leaves:

    2.0 instanceof int
    Double.NaN instanceof float
    Double.NaN instanceof double

The first one is an exact cast; this can be specified in a number of 
ways, including "sufficient number of low order bits are zero", or 
"representable in the range of the target type", or others.  The binary 
representation of 2.0f is 01000000000000000000000000000000, which 
exactly encodes an integer.  Again, this is not a rule about 
`instanceof`; this is derived from casting (though we do have to define 
what exactness means.)
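
The bit pattern quoted above can be reproduced with the standard `Float.floatToIntBits` call. A small sketch follows; `bits32` is a hypothetical helper that pads the binary string to 32 digits, and `String.repeat` assumes Java 11 or later.

```java
final class FloatBitsDemo {
    // Render the IEEE 754 bit pattern of a float as exactly 32 binary digits.
    static String bits32(float f) {
        String raw = Integer.toBinaryString(Float.floatToIntBits(f));
        return "0".repeat(32 - raw.length()) + raw;
    }

    public static void main(String[] args) {
        // Sign bit 0, exponent 10000000, and an all-zero fraction field.
        System.out.println(bits32(2.0f));
        // By contrast, 2.5f has a nonzero fraction bit.
        System.out.println(bits32(2.5f));
    }
}
```

An all-zero fraction field with a non-negative unbiased exponent is one way to read "sufficient number of low order bits are zero" for this particular value.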

For NaN, negative zero, and infinity, whether various conversions are 
exact or not may require a somewhat ad-hoc decision, but I think the 
intuitive answer for NaN is that Float.NaN <--> Double.NaN is exact, 
and similarly for Float.Inf <--> Double.Inf.  There will be an argument 
about -0.0 and 0.0.

> Intuitively, I think I would expect that if t is an expression of 
> primitive type T, and U is also a primitive type, then t instanceof 
> U is true iff t == (T) (U) t. (Perhaps with a variant of == that is 
> reflexive even for NaN, or perhaps not.) That would make (1) true, 
> (2,3,4) false, and (5,6) who knows.

This is a good intuition, and is true for some types, and almost true 
for others, but it falls into some holes.  In particular, with some 
conversions, `(U) t` is lossy but `(T) (U) t` is also lossy in an 
exactly compensating way.
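
One case that appears to fall into such a hole is `int` to `float` and back at `Integer.MAX_VALUE`; this is an illustration chosen here, not necessarily the conversion the paragraph above has in mind. The class and method names in the sketch are hypothetical.

```java
final class CompensatingLossDemo {
    // The round-trip test from the quoted intuition, with T = int and U = float.
    static boolean roundTrips(int t) { return t == (int) (float) t; }

    public static void main(String[] args) {
        int t = Integer.MAX_VALUE;   // 2^31 - 1, needs 31 significant bits
        float u = (float) t;         // float has a 24-bit significand, so this rounds up to 2^31
        int back = (int) u;          // 2^31 is out of int range, so the cast saturates to Integer.MAX_VALUE

        System.out.println((long) u);        // shows the float really holds 2^31
        System.out.println(back == t);       // yet the round trip compares equal
        System.out.println(roundTrips(16_777_217)); // 2^24 + 1: here the round trip does detect the loss
    }
}
```

So the round trip reports "equal" at `Integer.MAX_VALUE` even though the intermediate `float` holds a different number, which is why exactness has to be defined on the conversion itself rather than on the round trip.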

> This makes a lot of sense. I'm wondering, though, how it works with 
> float and double. I don't see the answer by looking at JLS Ch5. Are 
> the following expressions legal, and if so which ones are true?
>
>  1. 2.0 instanceof int
>  2. 2.5 instanceof int
>  3. Double.MAX_VALUE instanceof float
>  4. Math.PI instanceof float
>  5. Double.NaN instanceof float
>  6. Double.NaN instanceof double
>
> Intuitively, I think I would expect that if t is an expression of 
> primitive type T, and U is also a primitive type, then t instanceof 
> U is true iff t == (T) (U) t. (Perhaps with a variant of == that is 
> reflexive even for NaN, or perhaps not.) That would make (1) true, 
> (2,3,4) false, and (5,6) who knows.
>
> On Thu, 8 Sept 2022 at 09:53, Brian Goetz <brian.goetz at oracle.com> wrote:
>
>     Earlier in the year we talked about primitive type patterns.  Let me
>     summarize the past discussion, what I think the right direction is,
>     and why this is (yet another) "finishing up the job" task for basic
>     patterns that, if left undone, will be a sharp edge.
>
>     Prior to record patterns, we didn't support primitive type patterns
>     at all.  With records, we now support primitive type patterns as
>     nested patterns, but they are very limited; they are only applicable
>     to exactly their own type.
>
>     The motivation for "finishing" primitive type patterns is the same as
>     discussed earlier this week with array patterns -- if pattern matching
>     is the dual of aggregation, we want to avoid gratuitous asymmetries
>     that let you put things together but not take them apart.
>
>     Currently, we can assign a `String` to an `Object`, and recover the
>     `String` with a pattern match:
>
>         Object o = "Bob";
>         if (o instanceof String s) { println("Hi Bob"); }
>
>     Analogously, we can assign an `int` to a `long`:
>
>         long n = 0;
>
>     but we cannot yet recover the int with a pattern match:
>
>         if (n instanceof int i) { ... } // error, pattern `int i` not applicable to `long`
>
>     To fill out some more of the asymmetries around records if we don't
>     finish the job: given
>
>         record R(int i) { }
>
>     we can construct it with
>
>         new R(anInt)     // no adaptation
>         new R(aShort)    // widening
>         new R(anInteger) // unboxing
>
>     but yet cannot deconstruct it the same way:
>
>         case R(int i)     // OK
>         case R(short s)   // nope
>         case R(Integer i) // nope
>
>     It would be a gratuitous asymmetry that we can use pattern matching
>     to recover from reference widening, but not from primitive widening.
>     While many of the arguments against doing primitive type patterns now
>     were of the form "let's keep things simple", I believe that the
>     simpler solution is actually to _finish the job_, because this
>     minimizes asymmetries and potholes that users would otherwise have to
>     maintain a mental catalog of.
>
>     Our earlier explorations started (incorrectly, as it turned out), with
>     assignment context.  This direction gave us a good push in the right
>     direction, but turned out to not be the right answer.  A more careful
>     reading of JLS Ch5 convinced me that the answer lies not in assignment
>     conversion, but _cast conversion_.
>
>     #### Stepping back: instanceof
>
>     The right place to start is actually not patterns, but `instanceof`.
>     If we start here, and listen carefully to the specification, it leads
>     us to the correct answer.
>
>     Today, `instanceof` works only for reference types.  Accordingly, most
>     people view `instanceof` as "the subtyping operator" -- because that's
>     the only question we can currently ask it.  We almost never see
>     `instanceof` on its own; it is nearly always followed by a cast to the
>     same type.  Similarly, we rarely see a cast on its own; it is nearly
>     always preceded by an `instanceof` for the same type.
>
>     There's a reason these two operations travel together: casting is, in
>     general, unsafe; we can try to cast an `Object` reference to a
>     `String`, but if the reference refers to another type, the cast will
>     fail.  So to make casting safe, we precede it with an `instanceof`
>     test.  The semantics of `instanceof` and casting align such that
>     `instanceof` is the precondition test for safe casting.
>
>     > instanceof is the precondition for safe casting
>
>     Asking `instanceof T` means "if I cast this to T, would I like the
>     answer."  Obviously CCE is an unlikable answer; `instanceof` further
>     adopts the opinion that casting `null` would also be an unlikable
>     answer, because while the cast would succeed, you can't do anything
>     useful with the result.
>
>     Currently, `instanceof` is only defined on reference types, and on
>     this domain coincides with subtyping.  On the other hand, casting is
>     defined between primitive types (widening, narrowing), and between
>     primitive and reference types (boxing, unboxing).  Some casts
>     involving primitives yield "better" results than others; casting `0`
>     to `byte` results in no loss of information, since `0` is
>     representable as a byte, but casting `500` to `byte` succeeds but
>     loses information because the higher order bits are discarded.
>
>     If we characterize some casts as "lossy" and others as "exact" --
>     where lossy means discarding useful information -- we can extend the
>     "safe casting precondition" meaning of `instanceof` to primitive
>     operands and types in the obvious way -- "would casting this
>     expression to this type succeed without error and without information
>     loss."  If the type of the expression is not castable to the type we
>     are asking about, we know the cast cannot succeed and reject the
>     `instanceof` test at compile time.
>
>     Defining which casts are lossy and which are exact is fairly
>     straightforward; we can appeal to the concept already in the JLS of
>     "representable in the range of a type."  For some pairs of types,
>     casting is always exact (e.g., casting `int` to `long` is always
>     exact); we call these "unconditionally exact".  For other pairs of
>     types, some values can be cast exactly and others cannot.
>
>     Defining which casts are exact gives us a simple and precise semantics
>     for `x instanceof T`: whether `x` can be cast exactly to `T`.
>     Similarly, if the static type of `x` is not castable to `T`, then the
>     corresponding `instanceof` question is rejected statically.  The
>     answers are not surprising:
>
>      - Boxing is always exact;
>      - Unboxing is exact for all non-null values;
>      - Reference widening is always exact;
>      - Reference narrowing is exact if the type of the target expression
>        is a subtype of the target type;
>      - Primitive widening and narrowing are exact if the target expression
>        can be represented in the range of the target type.
>
>     #### Primitive type patterns
>
>     It is a short hop from `instanceof` to patterns (including primitive
>     type patterns, and reference type patterns applied to primitive
>     types), which can be defined entirely in terms of cast conversion and
>     exactness:
>
>      - A type pattern `T t` is applicable to a target of type `S` if `S`
>        is cast-convertible to `T`;
>      - A type pattern `T t` matches a target `x` if `x` can be cast
>        exactly to `T`;
>      - A type pattern `T t` is unconditional at type `S` if casting from
>        `S` to `T` is unconditionally exact;
>      - A type pattern `T t` dominates a type pattern `S s` (or a record
>        pattern `S(...)`) if `T t` would be unconditional on `S`.
>
>     While the rules for casting are complex, primitive patterns add no new
>     complexity; there are no new conversions or conversion contexts.  If
>     we see:
>
>         switch (a) {
>             case T t: ...
>         }
>
>     we know the case matches if `a` can be cast exactly to `T`, and the
>     pattern is unconditional if _all_ values of `a`'s type can be cast
>     exactly to `T`.  Note that none of this is specific to primitives; we
>     derive the semantics of _all_ type patterns from the enhanced
>     definition of casting.
>
>     Now, our record deconstruction examples work symmetrically to
>     construction:
>
>         case R(int i)     // OK
>         case R(short s)   // test if `i` is in the range of `short`
>         case R(Integer i) // box `i` to `Integer`
>
>

From alex.buckley at oracle.com  Thu Sep  8 22:32:42 2022
From: alex.buckley at oracle.com (Alex Buckley)
Date: Thu, 8 Sep 2022 15:32:42 -0700
Subject: Unnamed variables and match-all patterns
In-Reply-To: <4ae3daa5-af90-6690-730d-5e41ad591547@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
 <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
 <cf99168c-156b-dd74-de6b-2e77c4f5c580@oracle.com>
 <CAE+3fja8Ws7B+DkWmu1-Jcne5Av+mkCSuEur-4iK4KDAQoayOw@mail.gmail.com>
 <4ae3daa5-af90-6690-730d-5e41ad591547@oracle.com>
Message-ID: <db44f856-57d4-f6c1-732c-89415e14286c@oracle.com>

On 9/8/2022 5:22 AM, Brian Goetz wrote:
> For the people who complain about this, I don't think it's about saving 
> a few characters in the declaration, as much as satisfying static 
> analysis that complains about unused parameters.  But I suspect that 
> many of these have already become lambdas (this happened most commonly 
> with anonymous classes previously).? So I'm willing to do the experiment 
> of A first and see if we need to take the next step.

A longstanding request is for method parameters that are implicitly 
final, so that static analysis can point out dumb assignments to them in 
the method body. I suspect a lot of requestors are actually thinking 
about constructor parameters, where useless self-assignment (`firstName 
= firstName;`) is a tripwire for Java beginners. Of course, record 
classes sidestep the assignment boilerplate completely, but being able 
to denote a constructor parameter as unusable, and therefore unused, and 
therefore not contributory to the state of an object, feels like it has 
some utility. This speaks to keeping an open mind about D, even if A is 
the first step.

Alex

From forax at univ-mlv.fr  Fri Sep  9 15:35:44 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 9 Sep 2022 17:35:44 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
Message-ID: <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Thursday, September 8, 2022 6:53:21 PM
> Subject: Primitives in instanceof and patterns

> Earlier in the year we talked about primitive type patterns. Let me summarize
> the past discussion, what I think the right direction is, and why this is (yet
> another) "finishing up the job" task for basic patterns that, if left undone,
> will be a sharp edge.

> Prior to record patterns, we didn't support primitive type patterns at all. With
> records, we now support primitive type patterns as nested patterns, but they are
> very limited; they are only applicable to exactly their own type.

> The motivation for "finishing" primitive type patterns is the same as discussed
> earlier this week with array patterns -- if pattern matching is the dual of
> aggregation, we want to avoid gratuitous asymmetries that let you put things
> together but not take them apart.

> Currently, we can assign a `String` to an `Object`, and recover the `String`
> with a pattern match:

> Object o = "Bob";
> if (o instanceof String s) { println("Hi Bob"); }

> Analogously, we can assign an `int` to a `long`:

> long n = 0;

> but we cannot yet recover the int with a pattern match:

> if (n instanceof int i) { ... } // error, pattern `int i` not applicable to
> `long`

> To fill out some more of the asymmetries around records if we don't finish the
> job: given

> record R(int i) { }

> we can construct it with

> new R(anInt) // no adaptation
> new R(aShort) // widening
> new R(anInteger) // unboxing

> but yet cannot deconstruct it the same way:

> case R(int i) // OK
> case R(short s) // nope
> case R(Integer i) // nope

> It would be a gratuitous asymmetry that we can use pattern matching to recover
> from
> reference widening, but not from primitive widening. While many of the
> arguments against doing primitive type patterns now were of the form "let's keep
> things simple", I believe that the simpler solution is actually to _finish the
> job_, because this minimizes asymmetries and potholes that users would otherwise
> have to maintain a mental catalog of.

> Our earlier explorations started (incorrectly, as it turned out), with
> assignment context. This direction gave us a good push in the right direction,
> but turned out to not be the right answer. A more careful reading of JLS Ch5
> convinced me that the answer lies not in assignment conversion, but _cast
> conversion_.

> #### Stepping back: instanceof

> The right place to start is actually not patterns, but `instanceof`. If we
> start here, and listen carefully to the specification, it leads us to the
> correct answer.

> Today, `instanceof` works only for reference types. Accordingly, most people
> view `instanceof` as "the subtyping operator" -- because that's the only
> question we can currently ask it. We almost never see `instanceof` on its own;
> it is nearly always followed by a cast to the same type. Similarly, we rarely
> see a cast on its own; it is nearly always preceded by an `instanceof` for the
> same type.

> There's a reason these two operations travel together: casting is, in general,
> unsafe; we can try to cast an `Object` reference to a `String`, but if the
> reference refers to another type, the cast will fail. So to make casting safe,
> we precede it with an `instanceof` test. The semantics of `instanceof` and
> casting align such that `instanceof` is the precondition test for safe casting.

> > instanceof is the precondition for safe casting

> Asking `instanceof T` means "if I cast this to T, would I like the answer."
> Obviously CCE is an unlikable answer; `instanceof` further adopts the opinion
> that casting `null` would also be an unlikable answer, because while the cast
> would succeed, you can't do anything useful with the result.

> Currently, `instanceof` is only defined on reference types, and on this domain
> coincides with subtyping. On the other hand, casting is defined between
> primitive types (widening, narrowing), and between primitive and reference types
> (boxing, unboxing). Some casts involving primitives yield "better" results than
> others; casting `0` to `byte` results in no loss of information, since `0` is
> representable as a byte, but casting `500` to `byte` succeeds but loses
> information because the higher order bits are discarded.

> If we characterize some casts as "lossy" and others as "exact" -- where lossy
> means discarding useful information -- we can extend the "safe casting
> precondition" meaning of `instanceof` to primitive operands and types in the
> obvious way -- "would casting this expression to this type succeed without error
> and without information loss." If the type of the expression is not castable to
> the type we are asking about, we know the cast cannot succeed and reject the
> `instanceof` test at compile time.

> Defining which casts are lossy and which are exact is fairly straightforward; we
> can appeal to the concept already in the JLS of "representable in the range of a
> type." For some pairs of types, casting is always exact (e.g., casting `int` to
> `long` is always exact); we call these "unconditionally exact". For other pairs
> of types, some values can be cast exactly and others cannot.

> Defining which casts are exact gives us a simple and precise semantics for `x
> instanceof T`: whether `x` can be cast exactly to `T`. Similarly, if the static
> type of `x` is not castable to `T`, then the corresponding `instanceof` question
> is rejected statically. The answers are not surprising:

> - Boxing is always exact;
> - Unboxing is exact for all non-null values;
> - Reference widening is always exact;
> - Reference narrowing is exact if the type of the target expression is a
> subtype of the target type;
> - Primitive widening and narrowing are exact if the target expression can be
> represented in the range of the target type.

> #### Primitive type patterns

> It is a short hop from `instanceof` to patterns (including primitive type
> patterns, and reference type patterns applied to primitive types), which can be
> defined entirely in terms of cast conversion and exactness:

> - A type pattern `T t` is applicable to a target of type `S` if `S` is
> cast-convertible to `T`;
> - A type pattern `T t` matches a target `x` if `x` can be cast exactly to `T`;
> - A type pattern `T t` is unconditional at type `S` if casting from `S` to `T`
> is unconditionally exact;
> - A type pattern `T t` dominates a type pattern `S s` (or a record pattern
> `S(...)`) if `T t` would be unconditional on `S`.

> While the rules for casting are complex, primitive patterns add no new
> complexity; there are no new conversions or conversion contexts. If we see:

> switch (a) {
> case T t: ...
> }

> we know the case matches if `a` can be cast exactly to `T`, and the pattern is
> unconditional if _all_ values of `a`'s type can be cast exactly to `T`. Note
> that none of this is specific to primitives; we derive the semantics of _all_
> type patterns from the enhanced definition of casting.

> Now, our record deconstruction examples work symmetrically to construction:

> case R(int i) // OK
> case R(short s) // test if `i` is in the range of `short`
> case R(Integer i) // box `i` to `Integer`

I think we have to be careful with your notion of dual here. A record canonical constructor and a deconstructing pattern are dual, but that is a special case because the deconstructing pattern always matches; once you introduce patterns that may or may not match, there is no duality anymore. 

The primitive pattern you propose is clearly not the dual of the cast conversions, because the casting conversions are verified by the compiler while some of the primitive patterns you propose are checked at runtime. 

As an example, if there is a method declared like this 
static void m(int i) { ... } 

and this method is called with a short, 
short s = ... 
m(s); 

there is an implicit conversion from short to int, and if the first parameter of m is not compatible a compiler error occurs. 

If you compare with the corresponding pattern 
int i = ... 
switch(i) { 
case short s -> ... 
} 

The semantics you propose is not to emit a compile error, but to check at runtime whether the value "i" is between Short.MIN_VALUE and Short.MAX_VALUE. 

So there is perhaps a syntactic duality, but clearly there is no semantic duality. 
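
For reference, the runtime test in question can be written by hand in current Java roughly as follows. This is only a sketch of the check being discussed; the method names are hypothetical and it says nothing about how a compiler would actually translate `case short s`.

```java
final class ShortRangeCheck {
    // Explicit range test: is the int value representable as a short?
    static boolean fitsInShort(int i) {
        return i >= Short.MIN_VALUE && i <= Short.MAX_VALUE;
    }

    // Equivalent formulation: narrow, widen back, and compare.
    static boolean fitsInShortViaCast(int i) {
        return (short) i == i;
    }

    public static void main(String[] args) {
        int[] samples = { 0, 32767, 32768, -32768, -32769 };
        for (int i : samples) {
            System.out.println(i + " fits in short: " + fitsInShort(i));
        }
    }
}
```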

Moreover, the semantics you propose is not aligned with the concept of data-oriented programming, which says that the data are more important than the code, so we should try to raise a compile error when the data changes, to help the developer change the code. 

If we take a simple example 
record Point(int x, int y) { } 
Point point = ... 
switch(point) { 
case Point(int i, int j) -> ... 
... 
} 

Let's say now that we change Point to use longs 
record Point(long x, long y) { } 

With the semantics you propose, the code still compiles, but the pattern is now transformed into a partial pattern that will not match all Points, only the ones with x and y between Integer.MIN_VALUE and Integer.MAX_VALUE. 

I believe this is exactly what Stephen Colebourne was complaining about when we discussed the previous iteration of this spec: the semantics of the primitive pattern changes depending on the definition of the data. 

Tagir's remark about array patterns also applies here: having a named pattern like Short.asShort() makes the semantics far clearer, because it disambiguates between a pattern that requests a conversion and a pattern that does a conversion because the data definition has changed. 

And I'm still worried that we are muddying the waters here: instanceof is about instances and subtyping relationships (hence the name); extending it to cover non-instance / primitive values is very confusing. 

regards, 
Rémi 

From forax at univ-mlv.fr  Fri Sep  9 16:09:03 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 9 Sep 2022 18:09:03 +0200 (CEST)
Subject: Array patterns (and varargs patterns)
In-Reply-To: <d87c0168-5a10-7c4e-7026-4d9ee04ab024@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <CAE+3fjZywJgk874aT1TMzAjn6rVMkEUuvH0xuuKf3ihjDn9GZQ@mail.gmail.com>
 <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
 <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
 <d87c0168-5a10-7c4e-7026-4d9ee04ab024@oracle.com>
Message-ID: <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Tagir Valeev" <amaembo at gmail.com>, "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Thursday, September 8, 2022 12:15:04 AM
> Subject: Re: Array patterns (and varargs patterns)

>> For me, Arrays.of() is a named pattern with a vararg list of bindings, no ?
> 
> It's a named pattern, but to work, it would need varargs patterns -- and
> array patterns are the underpinnings of varargs, just as array creation
> is the underpinning of varargs invocation.  We're not going to do
> varargs patterns differently than we do varargs invocation, just to
> avoid doing array patterns -- that would be silly.

Here we want to extract the values into bindings/variables, and that is not what varargs does: varargs takes a bunch of values on the stack and puts them into an array.
Here we want the opposite operation of varargs, the spread (or splat) operator, which takes the arguments from an array (or a collection?) and puts them on the stack. 

If we have the pattern method Arrays.of()

static <T> pattern (T...) of(T[] array) {  // here it's a varargs
  ...
}

and we call it using a named pattern
  switch(array) {
    case Arrays.of(/* insert a syntax here */) -> ...

the syntax should extract some/all values of the array into one or several bindings.

If we are in Caml, we have the :: operator to separate the first element from the rest
  switch(array) {
    case Arrays.of(String first :: String[] rest) -> ...

If we are in JavaScript, we have the spread operator (notice that the ... is before the type)
  switch(array) {
    case Arrays.of(String first, ... String[] rest) -> ...

So the varargs is on the declaration side; on the pattern side we need a new spread operator, so I think that adding an array pattern now is not a good idea.
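
To illustrate the two directions with what Java offers today (a sketch only; `first` and `rest` are hypothetical stand-ins, since Java has no spread operator): the varargs call packs loose values into an array, and taking the array apart has to be written by hand.

```java
import java.util.Arrays;

final class VarargsVersusSpread {
    // Varargs: the call site packs loose arguments into a String[].
    static String[] pack(String... xs) {
        return xs;
    }

    // Hand-written stand-ins for a "spread" on the matching side (hypothetical helpers).
    static String first(String[] array) {
        return array[0];
    }

    static String[] rest(String[] array) {
        return Arrays.copyOfRange(array, 1, array.length);
    }

    public static void main(String[] args) {
        String[] packed = pack("a", "b", "c");            // values -> array
        System.out.println(Arrays.toString(packed));
        System.out.println(first(packed));                // array -> first value
        System.out.println(Arrays.toString(rest(packed))); // array -> remaining values
    }
}
```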

regards,
Rémi

>>
>>>> With best regards,
>>>> Tagir Valeev.
>>>>
>>>> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>>>>> We dropped this out of the record patterns JEP, but I think it is time to
>>>>> revisit this.
>>>>>
>>>>> The concept of array patterns was pretty straightforward; they mimic the nesting
>>>>> and exhaustiveness rules of record patterns, they are just a different sort of
>>>>> container for nested patterns.  And they have an obvious duality with array
>>>>> creation expressions.
>>>>>
>>>>> The main open question here was how we distinguish between "match an array of
>>>>> length exactly N" (where there are N nested patterns) and "match an array of
>>>>> length at least N".  We toyed with the idea of a "..." indicator to mean "more
>>>>> elements", but this felt a little forced and opened new questions.
>>>>>
>>>>> It later occurred to me that there is another place to nest a pattern in an
>>>>> array pattern -- to match (and bind) the length.  In the following, assume for
>>>>> sake of exposition that "_" is the "any" pattern (matches everything, binds
>>>>> nothing) and that we have some way to denote a constant pattern, which I'll
>>>>> denote here with a constant literal.
>>>>>
>>>>> There is an obvious place to put this (optional) pattern: in between the
>>>>> brackets.  So:
>>>>>
>>>>>       case String[1] { P }:
>>>>>                   ^ a constant pattern
>>>>>
>>>>> would match string arrays of length 1 whose sole element matches P.  And
>>>>>
>>>>>       case String[] { P, Q }
>>>>>
>>>>> would match string arrays of length exactly 2, whose first two elements match P
>>>>> and Q respectively.  (If the length pattern is not specified, we infer a
>>>>> constant pattern whose constant is equal to the length of the nested pattern
>>>>> list.)
>>>>>
>>>>> Matching a target to `String[L] { P0, .., Pn }` means
>>>>>
>>>>>       x instanceof String[] arr
>>>>>           && arr.length matches L
>>>>>           && arr.length >= n
>>>>>           && arr[0] matches P0
>>>>>           && arr[1] matches P1
>>>>>           ...
>>>>>           && arr[n] matches Pn
>>>>>
>>>>> More examples:
>>>>>
>>>>>       case String[int len] { P }
>>>>>
>>>>> would match string arrays of length >= 1 whose first element matches P, and
>>>>> further binds the array length to `len`.
>>>>>
>>>>>       case String[_] { P, Q }
>>>>>
>>>>> would match string arrays of any length whose first two elements match P and Q.
>>>>>
>>>>>       case String[3] { }
>>>>>                   ^constant pattern
>>>>>
>>>>> matches all string arrays of length 3.
>>>>>
>>>>>
>>>>> This is a more principled way to do it, because the length is a part of the
>>>>> array and deserves a chance to match via nested patterns, just as with the
>>>>> elements, and it avoids trying to give "..." a new meaning.
>>>>>
>>>>> The downside is that it might be confusing at first (though people will learn
>>>>> quickly enough) how to distinguish between an exact match and a prefix match.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 1/5/2021 1:48 PM, Brian Goetz wrote:
>>>>>
>>>>> As we get into the next round of pattern matching, I'd like to opportunistically
>>>>> attach another sub-feature: array patterns.  (This also bears on the question
>>>>> of "how would varargs patterns work", which I'll address below, though they
>>>>> might come later.)
>>>>>
>>>>> ## Array Patterns
>>>>>
>>>>> If we want to create a new array, we do so with an array construction
>>>>> expression:
>>>>>
>>>>>       new String[] { "a", "b" }
>>>>>
>>>>> Since each form of aggregation should have its dual in destructuring, the
>>>>> natural way to represent an array pattern (h/t to AlanM for suggesting this)
>>>>> is:
>>>>>
>>>>>       if (arr instanceof String[] { var a, var b }) { ... }
>>>>>
>>>>> Here, the applicability test is: "are you an instance of String[], with length
>>>>> = 2", and if so, we cast to String[], extract the two elements, and match them
>>>>> to the nested patterns `var a` and `var b`.   This is the natural analogue of
>>>>> deconstruction patterns for arrays, complete with nesting.
>>>>>
>>>>> Since an array can have more elements, we likely need a way to say "length >= 2"
>>>>> rather than simply "length == 2".  There are multiple syntactic ways to get
>>>>> there, for now I'm going to write
>>>>>
>>>>>       if (arr instanceof String[] { var a, var b, ... })
>>>>>
>>>>> to indicate "more".  The "..." matches zero or more elements and binds nothing.
>>>>>
>>>>> <digression>
>>>>> People are immediately going to ask "can I bind something to the remainder"; I
>>>>> think this is mostly an "attractive distraction", and would prefer to not have
>>>>> this dominate the discussion.
>>>>> </digression>
>>>>>
>>>>> Here's an example from the JDK that could use this effectively:
>>>>>
>>>>> String[] limits = limitString.split(":");
>>>>> try {
>>>>>       switch (limits.length) {
>>>>>           case 2: {
>>>>>               if (!limits[1].equals("*"))
>>>>>                   setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>>>>>           }
>>>>>           case 1: {
>>>>>               if (!limits[0].equals("*"))
>>>>>                   setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>>>>>           }
>>>>>       }
>>>>> }
>>>>> catch(NumberFormatException ex) {
>>>>>       setMultilineLimit(MultilineLimit.DEPTH, -1);
>>>>>       setMultilineLimit(MultilineLimit.LENGTH, -1);
>>>>> }
>>>>>
>>>>> becomes (eventually)
>>>>>
>>>>>       switch (limitString.split(":")) {
>>>>>           case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>>>>>           case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>>>>>           default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>>>>>       }
>>>>>
>>>>> Note how not only does this become more compact, but the unchecked
>>>>> "NumberFormatException" is folded into the match, rather than being a separate
>>>>> concern.
>>>>>
>>>>>
>>>>> ## Varargs patterns
>>>>>
>>>>> Having array patterns offers us a natural way to interpret deconstruction
>>>>> patterns for varargs records.  Assume we have:
>>>>>
>>>>>       void m(X... xs) { }
>>>>>
>>>>> Then a varargs invocation
>>>>>
>>>>>       m(a, b, c)
>>>>>
>>>>> is really sugar for
>>>>>
>>>>>       m(new X[] { a, b, c })
>>>>>
>>>>> So the dual of a varargs invocation, a varargs match, is really a match to an
>>>>> array pattern.  So for a record
>>>>>
>>>>>       record R(X... xs) { }
>>>>>
>>>>> a varargs match:
>>>>>
>>>>>       case R(var a, var b, var c):
>>>>>
>>>>> is really sugar for an array match:
>>>>>
>>>>>       case R(X[] { var a, var b, var c }):
>>>>>
>>>>> And similarly, we can use our "more arity" indicator:
>>>>>
>>>>>       case R(var a, var b, var c, ...):
>>>>>
>>>>> to indicate that there are at least three elements.
>>>>>

From brian.goetz at oracle.com  Fri Sep  9 18:07:41 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 14:07:41 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
Message-ID: <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>


>
> The semantics you propose is not to emit a compile error, but to check 
> at runtime whether the value "i" is between Short.MIN_VALUE and 
> Short.MAX_VALUE.
>
> So there is perhaps a syntactic duality, but clearly there is no 
> semantic duality.

Of course there is a semantic duality here.  Specifically, `int` and 
`short` are related by an _embedding-projection pair_.  Briefly: given 
two sets A and B (think "B" for "bigger"), an approximation metric on B 
(a complete partial ordering), and a pair of functions `e : A -> B` and 
`p : B -> A`, they form an e-p pair if (a) p . e is the identity 
function (dot is compose), and (b) e . p produces an approximation of 
the input (according to the metric).

The details are not critical here (though this algebraic structure shows 
up everywhere in our work if you look closely), but the point remains: 
there is an algebraic duality here.  Yes, when going in one direction 
(the embedding), no runtime test is needed; when going in the other 
direction (the projection), which may be lossy, a runtime test is 
needed.  Just like with `instanceof String` / `case String s` today.
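
To make the e-p pair concrete for `short` and `int`, here is a minimal 
sketch in today's Java.  The class and method names (`EpPairDemo`, 
`embed`, `project`, `isExact`) are hypothetical and only for 
illustration; the narrowing cast stands in for the projection, and its 
wraparound behavior is not claimed to be an "approximation" in the 
formal sense above.

```java
// Sketch only: EpPairDemo and its method names are hypothetical, not part of any proposal.
final class EpPairDemo {
    // e : short -> int, the embedding (a widening primitive conversion; never loses information).
    static int embed(short s) {
        return s;
    }

    // p : int -> short, the projection (a narrowing primitive conversion; may discard high bits).
    static short project(int i) {
        return (short) i;
    }

    // The runtime test a pattern such as `i instanceof short s` would need:
    // the projection is exact precisely when embedding its result gives back the input.
    static boolean isExact(int i) {
        return embed(project(i)) == i;
    }

    public static void main(String[] args) {
        // Law (a): p . e is the identity on every short value.
        for (int v = Short.MIN_VALUE; v <= Short.MAX_VALUE; v++) {
            short s = (short) v;
            if (project(embed(s)) != s) {
                throw new AssertionError("p . e is not the identity at " + s);
            }
        }
        System.out.println(isExact(1234));    // in range, so the projection is exact
        System.out.println(isExact(100_000)); // out of range, so the projection is lossy
    }
}
```

The embedding needs no test at all, while the projection needs exactly 
one comparison; that asymmetry is the "runtime test in one direction" 
described above.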

Anyway, I don't think you're saying what you really mean.  Let's not get 
caught up in silly arguments about what "dual" means; that won't be 
helpful.

> Moreover, the semantics you propose is not aligned with the concept of 
> data oriented programming which says that the data are more important 
> than the code so that we should try to raise a compile error when the 
> data changed to help the developer to change the code.
>
> If we take a simple example
>   record Point(int x, int y) { }
>   Point point = ...
>   switch(point) {
>    case Point(int i, int j) -> ...
>    ...
>   }
>
> let's say now that we change Point to use longs
>   record Point(long x, long y) { }
>
> With the semantics you propose, the code still compiles but the pattern 
> is now transformed to a partial pattern that will not match all Points 
> but only the ones with x and y in between Integer.MIN_VALUE and 
> Integer.MAX_VALUE.

This is an extraneous argument; if you change the declaration of Point 
to take two Strings, of course all the use sites will change their 
meaning.  Maybe they'll still compile but mean something else, maybe 
they will be errors.  Patterns are not special here; the semantics of 
nearly all language features (assignment, arithmetic, etc) will change 
when you change the type of the underlying arguments.  That the meaning 
of patterns changes also when you change the types involved is just more 
of the same.

> I believe this is exactly what Stephen Colebourne was complaining about 
> when we discussed the previous iteration of this spec: the semantics of 
> the primitive pattern changes depending on the definition of the data.

I think what Stephen didn't like is that there is no syntactic 
difference between a total and partial pattern at the use site.  And I 
get why that made him uncomfortable; it's a valid concern, and one could 
imagine designing the language so that total and partial patterns look 
different.  This is one of the tradeoffs we have made; I do still think 
we picked a good one.

> The remark of Tagir about array pattern also works here, having a 
> named pattern like Short.asShort() makes the semantics far clearer 
> because it disambiguates between a pattern that requests a conversion 
> and a pattern that does a conversion because the data definition has 
> changed.

If the language didn't support primitive widening in assignment / method 
invocation context (as is the case in Go), and instead said "use 
Integer::toLong (or Long::fromInteger) to convert int -> long", then 
yes, the natural duality would be to also represent these as named 
patterns; then conversions in both directions are mediated by API 
points, total in one direction, partial in the other.  But that's not 
the language we have!  The language we have allows us to provide an int 
where a long is needed, and the language does the needful.  Pattern 
matching allows us to recover whether a value came from a certain type, 
even after we've lost the static type information.  Just as we can 
recover the String-ness here:

     Object o = "Foo";
     if (o instanceof String s) { ... }

because reference type patterns are willing to conditionally reverse 
reference widening, all the same arguments apply to

     long n = 3;
     if (n instanceof int i) { ... }

And not allowing this makes the language *more* complicated, because now 
some conversions are reversible and some are not, for ad-hoc reasons 
that no one will be able to understand.  Can you offer any compelling 
reason why we should be able to recover the String-ness of `o` after a 
widening, but not the int-ness of `n` after a widening?
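
The parallel between the two recoveries can be sketched with library 
types in current Java.  This is only an illustration under stated 
assumptions: `RecoverDemo`, `asString` and `asInt` are hypothetical 
names, and the `n == (int) n` comparison is one plausible way to express 
the range check that `n instanceof int i` would imply.

```java
import java.util.Optional;
import java.util.OptionalInt;

// Sketch only: RecoverDemo, asString and asInt are hypothetical names.
final class RecoverDemo {
    // Conditionally reverses a reference widening (String -> Object).
    static Optional<String> asString(Object o) {
        return (o instanceof String) ? Optional.of((String) o) : Optional.empty();
    }

    // Conditionally reverses a primitive widening (int -> long):
    // succeeds only when narrowing and widening again reproduces n.
    static OptionalInt asInt(long n) {
        return (n == (int) n) ? OptionalInt.of((int) n) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        Object o = "Foo";
        long n = 3;
        System.out.println(asString(o).isPresent());    // the String-ness of o is recoverable
        System.out.println(asInt(n).isPresent());        // so is the int-ness of n
        System.out.println(asInt(1L << 40).isPresent()); // this long never came from an int
    }
}
```

Both helpers have the same shape: a partial function that undoes a 
total conversion, which is the symmetry the question above appeals to.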

> And I'm still worried that we are muddying the water here, instanceof is 
> about instance and subtyping relationship (hence the name), 
> extending it to cover non-instance / primitive value is very confusing.

Sorry, this is a cheap rhetorical trick; declaring words to mean what 
you want them to mean, and then pointing to that meaning as a way to 
close the argument.

Yes, saying "instanceof T is about subtyping" is a useful mental model 
*when the only types you can apply it to are those related by inclusion 
polymorphism*.  But the restriction of instanceof to reference types is 
arbitrary (and we've already decided to allow patterns in instanceof, 
which are surely not mere subtyping.)

Regardless, a better way to think about `instanceof` is that it is the 
precondition for "would a cast to this type be safe and useful."  In the 
world where we restrict to reference types, the two notions coincide.  
But the safe-cast-precondition is clearly more general (this is like the 
difference between defining the function 2^n on Z, vs on R or C; of 
course they have to agree at the integers, but the continuous 
exponential function is far more useful than the discrete one.)  
Moreover, the general mental model is just as simple: how do you know a 
cast is safe?  Ask instanceof.  What does safe mean?  No error or 
material loss of precision.
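
As a rough illustration of "no material loss of precision", the checks 
below ask whether two existing Java casts would round-trip.  The names 
(`SafeCastDemo`, `fitsInFloat`, `fitsInInt`) are hypothetical, and the 
exact rules a real feature would adopt (for example, how `-0.0` is 
treated) are an assumption of this sketch rather than anything stated 
in this thread.

```java
// Sketch only: SafeCastDemo and its methods are hypothetical helpers.
final class SafeCastDemo {
    // Is `(float) i` lossless?  float has a 24-bit significand, so large ints get rounded.
    // Comparing through long avoids the saturating float -> int cast hiding the loss
    // at Integer.MAX_VALUE.
    static boolean fitsInFloat(int i) {
        return (long) (float) i == (long) i;
    }

    // Is `(int) d` lossless?  Double.compare distinguishes NaN and -0.0 from 0.0,
    // so neither is treated as exactly representable as an int here (an assumption).
    static boolean fitsInInt(double d) {
        return Double.compare(d, (double) (int) d) == 0;
    }

    public static void main(String[] args) {
        System.out.println(fitsInFloat(16_777_216)); // 2^24 is representable in float
        System.out.println(fitsInFloat(16_777_217)); // 2^24 + 1 is rounded by the cast
        System.out.println(fitsInInt(3.0));          // an integral value in int range
        System.out.println(fitsInInt(3.5));          // the fractional part would be lost
    }
}
```

In both cases the cast itself never throws; the precondition only 
reports whether the cast would silently change the value.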

A more reasonable way to state this objection would be: "most users 
believe that `instanceof` is purely about subtyping, and it will take 
some work to bring them around to a more general interpretation, how are 
we going to do that?"


Jumping up a level, you're throwing a lot of arguments at the wall that 
mostly come down to "I don't like this feature, so let me try and 
undermine it."  That's not a particularly helpful way to go about this, 
and none of the arguments so far have been very compelling (nor are they 
new from the last time we went around on it.)  I get that you would like 
pattern matching to have a more "surface" role in the language; that's a 
valid opinion.  But I would also like you to try harder to understand 
what we're trying to achieve and why we're pushing it deeper, and to 
respond to the substance of the proposal rather than just saying "YAGNI".

(I strongly encourage everyone to re-read JLS Ch5, and to meditate on 
*why* we have the particular conversions in the contexts we have.  
They're complex, but not arbitrary; if you listen closely to the 
specification, it sometimes whispers to you.)

From brian.goetz at oracle.com  Fri Sep  9 18:29:37 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 14:29:37 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <CAE+3fjZywJgk874aT1TMzAjn6rVMkEUuvH0xuuKf3ihjDn9GZQ@mail.gmail.com>
 <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
 <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
 <d87c0168-5a10-7c4e-7026-4d9ee04ab024@oracle.com>
 <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>
Message-ID: <7a7f81f9-8b6b-635c-fe54-4605610570c6@oracle.com>

Again, look for the embedding-projection pairs.  The sets involved are 
T^n and T[].  The array creation operator is an embedding from T^n to 
T[]; the missing dual is the projection from T[] to T^k (for specific 
k.)  Projections are partial (or lossy), so these are patterns rather 
than total functions.  The dual of packing an array from a list of 
expressions is unpacking the elements into a list of variables.

When I pack an array:

     String[] ss = new String[] { "Hi", "Bob" };

this has a similar feel to

     Object o = "Bob";

in that we've thrown away some static typing information (in the former, 
that the array has length two.)  But this information is retained 
dynamically, and we can recover it with a runtime test. Asking

     if (o instanceof String s) { ... }

is asking "was the last assignment to `o` from a String".  Asking

     if (ss instanceof String[] { var a, var b }) { ... }

is asking "was the last assignment to ss a String[] with two elements" 
(and similar for other configurations of the nested patterns.)  In both 
cases, we are asking the same generalized question: could this { object, 
array } have come from an assignment / creation expression that has a 
certain shape.
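
For readers who want to see the runtime test spelled out, the following 
is a hand-written approximation, in current Java, of what the proposed 
`o instanceof String[] { var a, var b }` would check.  The array pattern 
syntax itself is not valid Java today; `ArrayShapeDemo` and `describe` 
are hypothetical names, and the exact-length check reflects the 
"exactly N" reading discussed in this thread.

```java
// Sketch only: a manual expansion of the proposed array pattern, not the real translation.
final class ArrayShapeDemo {
    static String describe(Object o) {
        // Step 1: recover the array type (the dual of widening String[] to Object).
        if (o instanceof String[]) {
            String[] arr = (String[]) o;
            // Step 2: recover the length that the creation expression fixed.
            if (arr.length == 2) {
                // Step 3: bind the elements, as `var a, var b` would.
                String a = arr[0];
                String b = arr[1];
                return "pair(" + a + ", " + b + ")";
            }
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new String[] { "Hi", "Bob" }));
        System.out.println(describe(new String[] { "Hi" }));
        System.out.println(describe("Bob"));
    }
}
```

Each step undoes one piece of information that the creation expression 
`new String[] { "Hi", "Bob" }` supplied: the element type, the length, 
and the elements themselves.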

I get it; you don't find this feature compelling.  You've said that 
already, and now we're just going in circles.  Your mail reads to me 
like "it's a bad idea because I think it's a bad idea."  Yes, other 
languages approach this in different ways; Caml deconstructs into (head, 
tail) because its fundamental data structure is a cons list. That makes 
sense given how the language works.  Java works differently, so 
transplanting from Caml or Javascript is not always going to be a good 
answer.  Remember the pattern mantra: each aggregation idiom in the 
language should have a corresponding form of deconstruction pattern.  
Constructors have deconstruction patterns; factory methods will 
eventually have named static patterns; if we add collection literals, 
there will be collection patterns, etc.  If an aggregation form lacks a 
corresponding dual, this turns into an asymmetry which in turn means 
*destructuring cannot compose the same way aggregation composes*.  This 
is bad!  Arrays have their own special form of aggregation (array 
creation expression); array patterns are the corresponding destructuring.

I encourage you to re-read 
https://openjdk.org/projects/amber/design-notes/patterns/pattern-match-object-model 
, and the "red ball" API examples, to see what I mean.  This is about 
composability, not about whether any given form of pattern "pays its 
weight."


So again, please try harder to engage with _why do we think this is 
important_, and the specifics of what has been proposed, rather than 
just waving the YAGNI stick.  There's a bigger picture here.

>>> For me, Arrays.of() is a named pattern with a vararg list of bindings, no ?
>> It's a named pattern, but to work, it would need varargs patterns -- and
>> array patterns are the underpinnings of varargs, just as array creation
>> is the underpinning of varargs invocation.  We're not going to do
>> varargs patterns differently than we do varargs invocation, just to
>> avoid doing array patterns -- that would be silly.
> Here we want to extract the value into bindings/variables, that is not what the varargs does, the varargs  takes a bunch of value on stack and put them into an array.
> Here we want the opposite operation of a varargs, the spread (or splat) operator that takes the argument from an array (or a collection ?) and put them on the stack.
>
> If we have the pattern method Arrays.of()
>
> static <T> pattern (T...) of(T[] array) {  // here it's a varargs
>    ...
> }
>
> and we call it using a named pattern
>    switch(array) {
>      case Arrays.of(/* insert a syntax here */) -> ...
>
> the syntax should extract some/all values of the array into one or several bindings.
>
> If we are in Caml, we have the :: operator to separate the first element from the rest
>    switch(array) {
>      case Arrays.of(String first :: String[] rest) -> ...
>
> If we are in JavaScript, we have the spread operator (notice that the ... is before the type)
>    switch(array) {
>      case Arrays.of(String first, ... String[] rest) -> ...
>
> So the varargs is at the declaration side, at the pattern side we need a new operator spread, so i think that adding an array pattern now is not a good idea.
>
> regards,
> Rémi
>
>>>>> With best regards,
>>>>> Tagir Valeev.
>>>>>
>>>>> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>>>>>> We dropped this out of the record patterns JEP, but I think it is time to
>>>>>> revisit this.
>>>>>>
>>>>>> The concept of array patterns was pretty straightforward; they mimic the nesting
>>>>>> and exhaustiveness rules of record patterns, they are just a different sort of
>>>>>> container for nested patterns.  And they have an obvious duality with array
>>>>>> creation expressions.
>>>>>>
>>>>>> The main open question here was how we distinguish between "match an array of
>>>>>> length exactly N" (where there are N nested patterns) and "match an array of
>>>>>> length at least N".  We toyed with the idea of a "..." indicator to mean "more
>>>>>> elements", but this felt a little forced and opened new questions.
>>>>>>
>>>>>> It later occurred to me that there is another place to nest a pattern in an
>>>>>> array pattern -- to match (and bind) the length.  In the following, assume for
>>>>>> sake of exposition that "_" is the "any" pattern (matches everything, binds
>>>>>> nothing) and that we have some way to denote a constant pattern, which I'll
>>>>>> denote here with a constant literal.
>>>>>>
>>>>>> There is an obvious place to put this (optional) pattern: in between the
>>>>>> brackets.  So:
>>>>>>
>>>>>>        case String[1] { P }:
>>>>>>                    ^ a constant pattern
>>>>>>
>>>>>> would match string arrays of length 1 whose sole element matches P.  And
>>>>>>
>>>>>>        case String[] { P, Q }
>>>>>>
>>>>>> would match string arrays of length exactly 2, whose first two elements match P
>>>>>> and Q respectively.  (If the length pattern is not specified, we infer a
>>>>>> constant pattern whose constant is equal to the length of the nested pattern
>>>>>> list.)
>>>>>>
>>>>>> Matching a target to `String[L] { P0, .., Pn }` means
>>>>>>
>>>>>>        x instanceof String[] arr
>>>>>>            && arr.length matches L
>>>>>>            && arr.length >= n
>>>>>>            && arr[0] matches P0
>>>>>>            && arr[1] matches P1
>>>>>>            ...
>>>>>>            && arr[n] matches Pn
>>>>>>
>>>>>> More examples:
>>>>>>
>>>>>>        case String[int len] { P }
>>>>>>
>>>>>> would match string arrays of length >= 1 whose first element matches P, and
>>>>>> further binds the array length to `len`.
>>>>>>
>>>>>>        case String[_] { P, Q }
>>>>>>
>>>>>> would match string arrays of any length whose first two elements match P and Q.
>>>>>>
>>>>>>        case String[3] { }
>>>>>>                    ^constant pattern
>>>>>>
>>>>>> matches all string arrays of length 3.
>>>>>>
>>>>>>
>>>>>> This is a more principled way to do it, because the length is a part of the
>>>>>> array and deserves a chance to match via nested patterns, just as with the
>>>>>> elements, and it avoid trying to give "..." a new meaning.
>>>>>>
>>>>>> The downside is that it might be confusing at first (though people will learn
>>>>>> quickly enough) how to distinguish between an exact match and a prefix match.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 1/5/2021 1:48 PM, Brian Goetz wrote:
>>>>>>
>>>>>> As we get into the next round of pattern matching, I'd like to opportunistically
>>>>>> attach another sub-feature: array patterns.  (This also bears on the question
>>>>>> of "how would varargs patterns work", which I'll address below, though they
>>>>>> might come later.)
>>>>>>
>>>>>> ## Array Patterns
>>>>>>
>>>>>> If we want to create a new array, we do so with an array construction
>>>>>> expression:
>>>>>>
>>>>>>        new String[] { "a", "b" }
>>>>>>
>>>>>> Since each form of aggregation should have its dual in destructuring, the
>>>>>> natural way to represent an array pattern (h/t to AlanM for suggesting this)
>>>>>> is:
>>>>>>
>>>>>>        if (arr instanceof String[] { var a, var b }) { ... }
>>>>>>
>>>>>> Here, the applicability test is: "are you an instanceof of String[], with length
>>>>>> = 2", and if so, we cast to String[], extract the two elements, and match them
>>>>>> to the nested patterns `var a` and `var b`.   This is the natural analogue of
>>>>>> deconstruction patterns for arrays, complete with nesting.
>>>>>>
>>>>>> Since an array can have more elements, we likely need a way to say "length >= 2"
>>>>>> rather than simply "length == 2".  There are multiple syntactic ways to get
>>>>>> there, for now I'm going to write
>>>>>>
>>>>>>        if (arr instanceof String[] { var a, var b, ... })
>>>>>>
>>>>>> to indicate "more".  The "..." matches zero or more elements and binds nothing.
>>>>>>
>>>>>> <digression>
>>>>>> People are immediately going to ask "can I bind something to the remainder"; I
>>>>>> think this is mostly an "attractive distraction", and would prefer to not have
>>>>>> this dominate the discussion.
>>>>>> </digression>
>>>>>>
>>>>>> Here's an example from the JDK that could use this effectively:
>>>>>>
>>>>>> String[] limits = limitString.split(":");
>>>>>> try {
>>>>>>        switch (limits.length) {
>>>>>>            case 2: {
>>>>>>                if (!limits[1].equals("*"))
>>>>>>                    setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>>>>>>            }
>>>>>>            case 1: {
>>>>>>                if (!limits[0].equals("*"))
>>>>>>                    setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>>>>>>            }
>>>>>>        }
>>>>>> }
>>>>>> catch(NumberFormatException ex) {
>>>>>>        setMultilineLimit(MultilineLimit.DEPTH, -1);
>>>>>>        setMultilineLimit(MultilineLimit.LENGTH, -1);
>>>>>> }
>>>>>>
>>>>>> becomes (eventually)
>>>>>>
>>>>>>        switch (limitString.split(":")) {
>>>>>>            case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>>>>>>            case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>>>>>>            default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>>>>>>        }
>>>>>>
>>>>>> Note how not only does this become more compact, but the unchecked
>>>>>> "NumberFormatException" is folded into the match, rather than being a separate
>>>>>> concern.
>>>>>>
>>>>>>
>>>>>> ## Varargs patterns
>>>>>>
>>>>>> Having array patterns offers us a natural way to interpret deconstruction
>>>>>> patterns for varargs records.  Assume we have:
>>>>>>
>>>>>>        void m(X... xs) { }
>>>>>>
>>>>>> Then a varargs invocation
>>>>>>
>>>>>>        m(a, b, c)
>>>>>>
>>>>>> is really sugar for
>>>>>>
>>>>>>        m(new X[] { a, b, c })
>>>>>>
>>>>>> So the dual of a varargs invocation, a varargs match, is really a match to an
>>>>>> array pattern.  So for a record
>>>>>>
>>>>>>        record R(X... xs) { }
>>>>>>
>>>>>> a varargs match:
>>>>>>
>>>>>>        case R(var a, var b, var c):
>>>>>>
>>>>>> is really sugar for an array match:
>>>>>>
>>>>>>        case R(X[] { var a, var b, var c }):
>>>>>>
>>>>>> And similarly, we can use our "more arity" indicator:
>>>>>>
>>>>>>        case R(var a, var b, var c, ...):
>>>>>>
>>>>>> to indicate that there are at least three elements.
>>>>>>


From brian.goetz at oracle.com  Fri Sep  9 20:06:34 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 16:06:34 -0400
Subject: What does instanceof mean (was: Primitives in instanceof and patterns)
In-Reply-To: <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
Message-ID: <ca889381-f827-d2f3-48b0-24d943e93a85@oracle.com>

As mentioned, it is a common mental model that "Instanceof is the 
subtype operator", as Remi claims here:

On 9/9/2022 11:35 AM, Remi Forax wrote:
> And I'm still worried that we are muddying the water here, instanceof is 
> about instance and subtyping relationship (hence the name)

We will surely elicit some "who moved my cheese" responses when 
generalizing from "subtyping" to "safe casting precondition" (though the 
two coincide given the restrictions on instanceof today.)  The question 
is largely a pedagogical one; how do we help people see that "subtyping 
operator" is merely a convenient description of what instanceof has done 
to date?

We made a choice to lump rather than split by having `instanceof` take a 
pattern on the RHS as well as a type.  This choice is not without its 
challenges (mostly, the confusion around whether patterns match null or 
not), but it also illustrates that people can get over a narrow view of 
what `instanceof` "means", since this has generally not been a problem 
to date.  The leap from "reference types only" to "all types" is a 
smaller one, though it appeals to a broader view of polymorphism.

When restricted to reference types, `instanceof` is a question about 
subtyping relative to inclusion polymorphism.  When we bring in 
primitive widening/narrowing, we are appealing to coercion polymorphism 
too -- "can this value be coerced to this type without getting 
mangled."  When we bring in boxing and unboxing, we appeal to another 
form of coercion.  (Both forms are covered under existing conversion 
rules.)  I suspect this direction is not likely to be helpful, since 
these terms are not particularly widely used in the Java community.

Positioning "instanceof TYPE" as the "precondition for safe casting to 
TYPE" seems a pretty simple leap to me, since "instanceof TYPE" is 
basically never seen in the wild when not immediately followed by a cast 
to the same type.? Which makes sense; casting is risky (unless the type 
pairs involved are known to be related in a certain way), and instanceof 
is how you avoid casting surprises.

Is there a better way to explain this?

From brian.goetz at oracle.com  Fri Sep  9 20:12:21 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 16:12:21 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
Message-ID: <74b9d5f1-ed98-203c-ca35-a3f8ce67897b@oracle.com>

I have a question about your example.  I'm not trying to be clever and 
play "whatabout", I'm looking for a straight answer why you think the 
two cases are different.

>
> If we take a simple example
>   record Point(int x, int y) { }
>   Point point = ...
>   switch(point) {
>    case Point(int i, int j) -> ...
>    ...
>   }
>
> let's say now that we change Point to use longs
>   record Point(long x, long y) { }
>
> With the semantics you propose, the code still compiles but the pattern 
> is now transformed to a partial pattern that will not match all Points 
> but only the ones with x and y in between Integer.MIN_VALUE and 
> Integer.MAX_VALUE.

The same is true when I start with

     record Foo(String s) { ... }

and later change it to

     record Foo(Object s) { ... }

(both are incompatible changes, but we won't dwell on that.)

My question is: why does it not bother you that use-site patterns like 
`Foo(String s)` are reinterpreted as partial after the String -> Object 
change, but the analogous change with int -> long bothers you so much 
that you'd use it to argue against being able to ask whether a long is 
also an int?

You obviously think that these two examples are radically different.  
Can you explain why?  Is it anything more than "that's the way it's 
always been"?
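
The reference-type half of this comparison can be tried in released 
Java.  The sketch below assumes Java 16 or later (records and type 
patterns in `instanceof`); `PartialPatternDemo` and `describe` are 
hypothetical names.  It spells out the nested test by hand; on Java 21 
the same check can be written as the record pattern 
`foo instanceof Foo(String s)`.

```java
// Sketch only: shows that a String pattern becomes partial once the component is Object.
final class PartialPatternDemo {
    // Originally `record Foo(String s) { }`; the component type has been widened.
    record Foo(Object s) { }

    static String describe(Foo foo) {
        // Against the original declaration this test could never fail;
        // against Foo(Object s) it silently becomes a runtime check.
        if (foo.s() instanceof String s) {
            return "string of length " + s.length();
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Foo("abc")));
        System.out.println(describe(new Foo(42)));
    }
}
```

The use site compiles unchanged after the widening of the component, 
yet no longer matches every `Foo`, which is the same effect described 
for `Point(int i, int j)` after the change to `long`.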

From john.r.rose at oracle.com  Fri Sep  9 21:32:04 2022
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 09 Sep 2022 14:32:04 -0700
Subject: Primitives in instanceof and patterns
In-Reply-To: <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
Message-ID: <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>

On 9 Sep 2022, at 11:07, Brian Goetz wrote:
> ... Regardless, a better way to think about `instanceof` is that it is 
> the precondition for "would a cast to this type be safe and useful."  
> In the world where we restrict to reference types, the two notions 
> coincide.

And, in the future world where every value (except possibly `null`) is 
an *instance*, the two notions will coincide again, without the 
restriction to reference types.  We are taking reasonable incremental 
steps toward that world here, IMO.

> But the safe-cast-precondition is clearly more general (this is like 
> the difference between defining the function 2^n on Z, vs on R or C; 
> of course they have to agree at the integers, but the continuous 
> exponential function is far more useful than the discrete one.)  
> Moreover, the general mental model is just as simple: how do you know 
> a cast is safe?  Ask instanceof.  What does safe mean?  No error or 
> material loss of precision.

And (to pile on a bit here), the casts you are speaking of here, Brian, 
*are the casts we have in Java*, not some idealized or restricted or 
cleaned up cast.  So we have to deal with the oddities of primitive 
value conversion.

The payoff from dealing with this is that the meaning of patterns is 
derived systematically from the meaning of casts (and other 
conversions).  That is hugely desirable, because it means a very complex 
new feature is firmly anchored to existing features.  Getting this kind 
of thing right preserves and extends Java's role as a world-class 
programming language.

> A more reasonable way to state this objection would be: "most users 
> believe that `instanceof` is purely about subtyping, and it will take 
> some work to bring them around to a more general interpretation, how 
> are we going to do that?"

This is subjective and esthetic, but I think two thoughts help here 
(with teaching and rationale):  First, everything (except `null`) is an 
instance, or will eventually be.  Second, subtyping in Java includes the 
murky rules for primitive typing.

Those specific rules more or less systematically determine how casts 
work.  They should also systematically determine (in the same way) how 
patterns work.  After all, casts and patterns are (and very much should 
be!) mirror image counterparts of each other, or dance partners holding 
hands.

(I visualize such things as boxes on the whiteboard with reversible 
arrows between them.  You could say "category" if you like.  Brian 
likes to say "dual", and I took linear algebra too, but I doubt most 
folks took the trouble in that class to be curious about exactly what a 
"dual space" really is all about.)

Rather than extending the language we wish we had, we are extending the 
one we *do* have, and that means aligning even the murky parts of casts 
with pattern behavior.

In the end, I don't think it's very murky at all in practice, except 
of course for the outraged theoretical purist (who lives in each of us).  
There is certainly *no new murk*.  IMO what Brian is showing works out 
surprisingly well, so kudos to him for following his nose to a design 
with liveable details. This success also IMO demonstrates the foresight 
of the original authors and current maintainers of the spec, even in the 
"murky" parts of primitive value conversions.

-- John

From guy.steele at oracle.com  Fri Sep  9 23:13:40 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 9 Sep 2022 23:13:40 +0000
Subject: Primitives in instanceof and patterns
In-Reply-To: <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
Message-ID: <85335BD6-F037-4EFB-90D1-EB8F0642295C@oracle.com>


On Sep 9, 2022, at 5:32 PM, John Rose <john.r.rose at oracle.com> wrote:

Well said, but I cannot resist observing:
> 
> ...There is certainly no new murk.

To be precise, if you think there is more murk than before, take comfort that it is merely the dual of the existing murk. :-)


From john.r.rose at oracle.com  Fri Sep  9 23:25:10 2022
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 09 Sep 2022 16:25:10 -0700
Subject: Primitives in instanceof and patterns
In-Reply-To: <85335BD6-F037-4EFB-90D1-EB8F0642295C@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
 <85335BD6-F037-4EFB-90D1-EB8F0642295C@oracle.com>
Message-ID: <B26C597A-D16B-47F7-9D5F-5C6187A35568@oracle.com>

On 9 Sep 2022, at 16:13, Guy Steele wrote:

> On Sep 9, 2022, at 5:32 PM, John Rose <john.r.rose at oracle.com> wrote:
>
> Well said, but I cannot resist observing:
>>
>> ...There is certainly no new murk.
>
> To be precise, if you think there is more murk than before, take comfort that it is merely the dual of the existing murk. :-)

(How much is that murk in the mirror?)

From brian.goetz at oracle.com  Sat Sep 10 00:16:15 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 20:16:15 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
Message-ID: <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>

John pulled a nice Jedi-mind-trick on me, and pointed out that we 
actually have two creation expressions for arrays:

     new Foo[n]
     new Foo[] { a0, .., an }

and that if we are dualizing, then we should have these two patterns:

     new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
     new Foo[P]                 // matches arrays whose length matches P

but that neither

     new Foo[] { P, Q, ... }   // previous suggestion
nor
     new Foo[L] { P, Q }       // current suggestion

correspond to either of those, which suggests that we may have 
prematurely optimized the pattern form.  The rational consequence of 
this observation is to do

     new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N

now (which is also the basis of varargs patterns), and once we have 
constant patterns (which are kind of required for the second form to be 
all that useful), come back for `Foo[P]`.


On 9/6/2022 5:11 PM, Brian Goetz wrote:
> We dropped this out of the record patterns JEP, but I think it is time 
> to revisit this.
>
> The concept of array patterns was pretty straightforward; they mimic
> the nesting and exhaustiveness rules of record patterns, they are just
> a different sort of container for nested patterns.  And they have an
> obvious duality with array creation expressions.
>
> The main open question here was how we distinguish between "match an
> array of length exactly N" (where there are N nested patterns) and
> "match an array of length at least N".  We toyed with the idea of a
> "..." indicator to mean "more elements", but this felt a little forced
> and opened new questions.
>
> It later occurred to me that there is another place to nest a pattern
> in an array pattern -- to match (and bind) the length.  In the
> following, assume for sake of exposition that "_" is the "any" pattern
> (matches everything, binds nothing) and that we have some way to
> denote a constant pattern, which I'll denote here with a constant
> literal.
>
> There is an obvious place to put this (optional) pattern: in between
> the brackets.  So:
>
>     case String[1] { P }:
>                 ^ a constant pattern
>
> would match string arrays of length 1 whose sole element matches P.  And
>
>     case String[] { P, Q }
>
> would match string arrays of length exactly 2, whose first two
> elements match P and Q respectively.  (If the length pattern is not
> specified, we infer a constant pattern whose constant is equal to the
> length of the nested pattern list.)
>
> Matching a target to `String[L] { P0, .., Pn }` means
>
>     x instanceof String[] arr
>         && arr.length matches L
>         && arr.length >= n
>         && arr[0] matches P0
>         && arr[1] matches P1
>         ...
>         && arr[n] matches Pn
>
> More examples:
>
>     case String[int len] { P }
>
> would match string arrays of length >= 1 whose first element matches
> P, and further binds the array length to `len`.
>
>     case String[_] { P, Q }
>
> would match string arrays of any length whose first two elements match
> P and Q.
>
>     case String[3] { }
>                 ^ constant pattern
>
> matches all string arrays of length 3.
>
>
> This is a more principled way to do it, because the length is a part
> of the array and deserves a chance to match via nested patterns, just
> as with the elements, and it avoids trying to give "..." a new meaning.
>
> The downside is that it might be confusing at first (though people
> will learn quickly enough) how to distinguish between an exact match
> and a prefix match.
>
>
>
>
> On 1/5/2021 1:48 PM, Brian Goetz wrote:
>> As we get into the next round of pattern matching, I'd like to
>> opportunistically attach another sub-feature: array patterns.  (This
>> also bears on the question of "how would varargs patterns work",
>> which I'll address below, though they might come later.)
>>
>> ## Array Patterns
>>
>> If we want to create a new array, we do so with an array construction
>> expression:
>>
>>     new String[] { "a", "b" }
>>
>> Since each form of aggregation should have its dual in destructuring,
>> the natural way to represent an array pattern (h/t to AlanM for
>> suggesting this) is:
>>
>>     if (arr instanceof String[] { var a, var b }) { ... }
>>
>> Here, the applicability test is: "are you an instance of String[],
>> with length = 2", and if so, we cast to String[], extract the two
>> elements, and match them to the nested patterns `var a` and
>> `var b`.  This is the natural analogue of deconstruction patterns for
>> arrays, complete with nesting.
>>
>> Since an array can have more elements, we likely need a way to say
>> "length >= 2" rather than simply "length == 2".  There are multiple
>> syntactic ways to get there; for now I'm going to write
>>
>>     if (arr instanceof String[] { var a, var b, ... })
>>
>> to indicate "more".  The "..." matches zero or more elements and
>> binds nothing.
>>
>> <digression>
>> People are immediately going to ask "can I bind something to the 
>> remainder"; I think this is mostly an "attractive distraction", and 
>> would prefer to not have this dominate the discussion.
>> </digression>
>>
>> Here's an example from the JDK that could use this effectively:
>>
>> String[] limits = limitString.split(":");
>> try {
>>     switch (limits.length) {
>>         case 2: {
>>             if (!limits[1].equals("*"))
>>                 setMultilineLimit(MultilineLimit.DEPTH,
>>                     Integer.parseInt(limits[1]));
>>         }
>>         case 1: {
>>             if (!limits[0].equals("*"))
>>                 setMultilineLimit(MultilineLimit.LENGTH,
>>                     Integer.parseInt(limits[0]));
>>         }
>>     }
>> }
>> catch(NumberFormatException ex) {
>>     setMultilineLimit(MultilineLimit.DEPTH, -1);
>>     setMultilineLimit(MultilineLimit.LENGTH, -1);
>> }
>>
>> becomes (eventually)
>>
>> switch (limitString.split(":")) {
>>     case String[] { var _, Integer.parseInt(var i) } ->
>>         setMultilineLimit(DEPTH, i);
>>     case String[] { Integer.parseInt(var i) } ->
>>         setMultilineLimit(LENGTH, i);
>>     default -> { setMultilineLimit(DEPTH, -1);
>>         setMultilineLimit(LENGTH, -1); }
>> }
>>
>> Note how not only does this become more compact, but the unchecked 
>> "NumberFormatException" is folded into the match, rather than being a 
>> separate concern.
>>
>>
>> ## Varargs patterns
>>
>> Having array patterns offers us a natural way to interpret
>> deconstruction patterns for varargs records.  Assume we have:
>>
>>     void m(X... xs) { }
>>
>> Then a varargs invocation
>>
>>     m(a, b, c)
>>
>> is really sugar for
>>
>>     m(new X[] { a, b, c })
>>
>> So the dual of a varargs invocation, a varargs match, is really a
>> match to an array pattern.  So for a record
>>
>>     record R(X... xs) { }
>>
>> a varargs match:
>>
>>     case R(var a, var b, var c):
>>
>> is really sugar for an array match:
>>
>>     case R(X[] { var a, var b, var c }):
>>
>> And similarly, we can use our "more arity" indicator:
>>
>>     case R(var a, var b, var c, ...):
>>
>> to indicate that there are at least three elements.
>>
>>
>

From john.r.rose at oracle.com  Sat Sep 10 00:34:40 2022
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 09 Sep 2022 17:34:40 -0700
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <CAE+3fjZywJgk874aT1TMzAjn6rVMkEUuvH0xuuKf3ihjDn9GZQ@mail.gmail.com>
 <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
 <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
 <d87c0168-5a10-7c4e-7026-4d9ee04ab024@oracle.com>
 <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>
Message-ID: <E4F39FCB-3599-46C9-99CC-968F46D7B0DF@oracle.com>

On 9 Sep 2022, at 9:09, forax at univ-mlv.fr wrote:

> Here we want to extract the values into bindings/variables, which is not
> what varargs does: varargs takes a bunch of values on the stack
> and puts them into an array.
> Here we want the opposite operation of varargs, the spread (or
> splat) operator, which takes the arguments from an array (or a
> collection?) and puts them on the stack.

You are right that Brian's proposal is not at its heart varargs; it is
*array patterns*, just as *array construction* is not equivalent to
varargs, only a precursor to it.

I think we need to get array patterns right first.  Then we can move to
whatever a fuller conception of varargs might look like "in the dual
mirror".

In Brian's architecture of patterns, every aggregator is matched as
cleanly as possible with its dual pattern (which reverses data flows).

There are actually two array construction expressions in Java today.
(We could extend them with more varargs-flavored features to do
slice/splat/spread/splice/whatever, but we don't have them today!)
The older expression takes a length and produces an array holding only
default values.  The slightly-less-old expression takes an initializer
list *and refuses to take a length* and produces an initialized array,
correctly sized.
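
For reference, the two creation expressions look like this in current Java. This is only an illustrative sketch; the class and method names are made up for the example.

```java
// Hypothetical class name; the two expressions themselves are standard Java.
final class ArrayCreationForms {

    // Form 1: a length and no initializer. Every element starts at the
    // default value for the element type (0 for int).
    static int[] sized(int n) {
        return new int[n];
    }

    // Form 2: an initializer list and no length. The length is taken from
    // the number of initializers.
    static int[] initialized() {
        return new int[] { 10, 20, 30 };
    }

    // Mixing the two, as in `new int[3] { 10, 20, 30 }`, is rejected by the
    // compiler, which is the "refuses to take a length" point above.

    public static void main(String[] args) {
        System.out.println(sized(3).length);
        System.out.println(initialized().length);
    }
}
```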

The most conservative application of Brian's design principles would
create, I think, *two distinct array patterns*, one for each kind of
expression.

Can the two patterns be merged?  Yeah, maybe, but at the cost of
disturbing the correspondence with array aggregation.

And it may be that some of the tricky questions about varargs go away
if we restrict ourselves to just the two kinds of basic patterns that
derive directly from today's array creation expressions.

Remember, patterns compose.  If you have that rare need for both length
and contents, use two patterns combined on the same array.  There's
always a way to do that.

If you want *some of the content* of an array to match a pattern, use a
don't-care pattern.  If you want length-polymorphism and element
subpatterns (a match of one pattern to many lengths, with elements
sprinkled around somehow) then we are beyond the bounds of today's
exercise, aren't we?

>
> If we have the pattern method Arrays.of()
>
> static <T> pattern (T...) of(T[] array) {  // here it's a varargs
>   ...
> }
>
> and we call it using a named pattern
>   switch(array) {
>     case Arrays.of(/* insert a syntax here */) -> ...
>
> the syntax should extract some/all values of the array into one or 
> several bindings.

We'll get there.  Just not quite yet.  One step at a time.

I think it would be really neat to be able to "slice out"
multi-element chunks of an array and bind them to pattern variables.
Lisp folks have been enjoying this sort of thing basically forever.  And
*ignoring a range of elements* in a pattern is exactly equivalent to
slicing them out and binding them to a don't-care pattern, right?

Confronted with such a feature, having thought about Brian's
principles, I think I would at the same time expect that there would be
a dual array *construction* expression which would do the
*mirror-opposite*.  That is, it would "splice in" multi-element
chunks, into a newly created array.  The Lisp folks sometimes use the
same notation for both splicing and slicing.  (I'm thinking of
backquote-comma-atsign, with and without some kind of pattern-bind.)

Under Brian's design principles, which I whole-heartedly agree with, I
guess a slogan for array patterns might be:  No slicing without
splicing!
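
Java has no syntax for either operation today, so a rough picture of the pair has to go through library calls. The sketch below uses `Arrays.copyOfRange`, `Arrays.copyOf` and `System.arraycopy`; the class and method names `SliceSplice`, `slice` and `splice` are hypothetical, chosen only to mirror the slogan.

```java
import java.util.Arrays;

// Hypothetical helper; only the java.util.Arrays and System calls are real API.
final class SliceSplice {

    // "Slice out": copy elements [from, to) of src into a new array.
    static String[] slice(String[] src, int from, int to) {
        return Arrays.copyOfRange(src, from, to);
    }

    // "Splice in": build a new array from head, one extra element, then tail.
    static String[] splice(String[] head, String middle, String[] tail) {
        // copyOf pads the new array with nulls up to the requested length.
        String[] out = Arrays.copyOf(head, head.length + 1 + tail.length);
        out[head.length] = middle;
        System.arraycopy(tail, 0, out, head.length + 1, tail.length);
        return out;
    }

    public static void main(String[] args) {
        String[] all = { "a", "b", "x", "d" };
        String[] rebuilt = splice(slice(all, 0, 2), "c", slice(all, 3, 4));
        System.out.println(Arrays.toString(rebuilt));
    }
}
```

A slicing pattern and a splicing expression would be the language-level duals of these two helpers.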

From john.r.rose at oracle.com  Sat Sep 10 00:45:16 2022
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 09 Sep 2022 17:45:16 -0700
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
Message-ID: <1B9FFA3E-4FF3-45ED-B1FB-3BD6AD60D325@oracle.com>

I was practicing that trick all morning!

I agree that `Foo[P]` can be saved for later.

In case it wasn't clear in my previous message, I also think that
splicey stuff like `new Foo[]{ ...as, b, c, ...ds, e }` and the
corresponding slicey patterns can *also* be saved for later.

In fact, the slice/splice stuff seems like it is best situated in a
larger design exercise for "collection literals", whatever those are.
Basically, that would be where Lisp's backquote-comma gets inherited
by Java.

On 9 Sep 2022, at 17:16, Brian Goetz wrote:

> John pulled a nice Jedi-mind-trick on me, and pointed out that we 
> actually have two creation expressions for arrays:
>
>     new Foo[n]
>     new Foo[] { a0, .., an }
>
> and that if we are dualizing, then we should have these two patterns:
>
>     new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
>     new Foo[P]                 // matches arrays whose length matches P
>
> but that neither
>
>     new Foo[] { P, Q, ... }   // previous suggestion
> nor
>     new Foo[L] { P, Q }       // current suggestion
>
> corresponds to either of those, which suggests that we may have
> prematurely optimized the pattern form.  The rational consequence of
> this observation is to do
>
>     new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
>
> now (which is also the basis of varargs patterns), and once we have 
> constant patterns (which are kind of required for the second form to 
> be all that useful), come back for `Foo[P]`.
>


From forax at univ-mlv.fr  Sat Sep 10 08:57:50 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 10 Sep 2022 10:57:50 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <74b9d5f1-ed98-203c-ca35-a3f8ce67897b@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <74b9d5f1-ed98-203c-ca35-a3f8ce67897b@oracle.com>
Message-ID: <1698220860.2333623.1662800270577.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Friday, September 9, 2022 10:12:21 PM
> Subject: Re: Primitives in instanceof and patterns

> I have a question about your example. I'm not trying to be clever and play
> "whatabout", I'm looking for a straight answer why you think the two cases are
> different.

>> If we take a simple example
>> record Point(int x, int y) { }
>> Point point = ...
>> switch(point) {
>> case Point(int i, int j) -> ...
>> ...
>> }

>> Let's say now that we change Point to use longs
>> record Point(long x, long y) { }

>> With the semantics you propose, the code still compiles but the pattern is now
>> transformed to a partial pattern that will not match all Points but only the
>> ones with x and y between Integer.MIN_VALUE and Integer.MAX_VALUE.

> The same is true when I start with

> record Foo(String s) { ... }

> and later change it to

> record Foo(Object s) { ... }

> (both are incompatible changes, but we won't dwell on that.)

> My question is: why does it not bother you that use-site patterns like
> `Foo(String s)` are reinterpreted as partial after the String -> Object change,
> but the analogous change with int -> long bothers you so much that you'd use it
> to argue against being able to ask whether a long is also an int?

> You obviously think that these two examples are radically different. Can you
> explain why? Is it anything more than "that's the way it's always been"?

I think I was in a little over my head with this example.

I had forgotten that in both cases the pattern goes from being total to being partial, so the enclosing switch is no longer exhaustive and thus no longer compiles.

Rémi

From forax at univ-mlv.fr  Sat Sep 10 08:58:01 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 10 Sep 2022 10:58:01 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
Message-ID: <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Friday, September 9, 2022 8:07:41 PM
> Subject: Re: Primitives in instanceof and patterns

>> The semantics you propose is not to emit a compile error but at runtime to check
>> if the value "i" is between Short.MIN_VALUE and Short.MAX_VALUE.

>> So there is perhaps a syntactic duality but clearly there is no semantics
>> duality.

> Of course there is a semantic duality here. Specifically, `int` and `short` are
> related by an _embedding-projection pair_. Briefly: given two sets A and B
> (think "B" for "bigger"), an approximation metric on B (a complete partial
> ordering), and a pair of functions `e : A -> B` and `p : B -> A`, they form an
> e-p pair if (a) p . e is the identity function (dot is compose), and (b) e . p
> produces an approximation of the input (according to the metric.)

> The details are not critical here (though this algebraic structure shows up
> everywhere in our work if you look closely), but the point remains: there is an
> algebraic duality here. Yes, when going in one direction, no runtime tests are
> needed; when going in the other direction, because it may be lossy in one
> direction, a runtime test is needed in that direction. Just like with
> `instanceof String` / `case String s` today.

> Anyway, I don't think you're saying what you really mean. Let's not get caught
> up in silly arguments about what "dual" means; that won't be helpful.
I do not disagree that a dual exists; I disagree that the semantics you propose is a dual (or a good dual, if you prefer).
A cast on a primitive type applies the same operation to all possible values; the semantics you propose for checking whether an integer is a short does not apply the same operation to all values.
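
To pin down what is being compared, here is a small sketch for `int` and `short`. It assumes the reading of the proposal given in this thread, namely that the primitive type pattern succeeds exactly when narrowing and then widening gives the original value back; the class and method names are hypothetical.

```java
// Hypothetical names throughout; this is a reading of the argument, not spec text.
final class EmbeddingProjection {

    // e : short -> int. Widening is total and loses nothing.
    static int embed(short s) {
        return s;
    }

    // p : int -> short. The narrowing cast applies the same operation to every
    // value: it keeps the low 16 bits, so out-of-range inputs wrap around.
    static short project(int i) {
        return (short) i;
    }

    // The test a primitive type pattern such as `i instanceof short s` is
    // described as performing: does projecting and re-embedding give i back?
    static boolean roundTrips(int i) {
        return embed(project(i)) == i;
    }

    public static void main(String[] args) {
        System.out.println(project(embed((short) -5))); // p . e on a short
        System.out.println(project(70_000));            // narrowing an out-of-range int
        System.out.println(roundTrips(70_000));         // the round-trip test on it
    }
}
```

The cast in `project` never fails, while `roundTrips` treats in-range and out-of-range values differently, which is the asymmetry at issue.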

The Java 19 semantics of the type pattern with a primitive type is a better dual, in my opinion.

The idea that the semantics of a primitive type pattern has to be "useful" is a trap. 

[...] 

>> I believe this is exactly what Stephen Colebourne was complaining about when we
>> discussed the previous iteration of this spec: the semantics of the primitive
>> pattern changes depending on the definition of the data.

> I think what Stephen didn't like is that there is no syntactic difference
> between a total and partial pattern at the use site. And I get why that made
> him uncomfortable; it's a valid concern, and one could imagine designing the
> language so that total and partial patterns look different. This is one of the
> tradeoffs we have made; I do still think we picked a good one.

>> The remark of Tagir about array patterns also works here: having a named pattern
>> like Short.asShort() makes the semantics far clearer because it disambiguates
>> between a pattern that requests a conversion and a pattern that does a
>> conversion because the data definition has changed.

> If the language didn't support primitive widening in assignment / method
> invocation context (like Golang does), and instead said "use Integer::toLong
> (or Long::fromInteger) to convert int -> long", then yes, the natural duality
> would be to also represent these as named patterns; then conversions in both
> directions are mediated by API points, total in one direction, partial in the
> other. But that's not the language we have! The language we have allows us to
> provide an int where a long is needed, and the language does the needful.
> Pattern matching allows us to recover whether a value came from a certain type,
> even after we've lost the static type information. Just as we can recover the
> String-ness here:

> Object o = "Foo";
> if (o instanceof String s) { ... }

> because reference type patterns are willing to conditionally reverse reference
> widening, all the same arguments apply to

> long n = 3;
> if (n instanceof int i) { ... }

> And not allowing this makes the language *more* complicated, because now some
> conversions are reversible and some are not, for ad-hoc reasons that no one
> will be able to understand. Can you offer any compelling reason why we should
> be able to recover the String-ness of `o` after a widening, but not the
> int-ness of `n` after a widening?
In the case of instanceof, the type is not lost, because every instance keeps a reference to its class at runtime; a long does not keep a secret class saying it's really an integer in disguise.

>> And I'm still worried that we are muddying the waters here: instanceof is about
>> instances and subtyping relationships (hence the name); extending it to cover
>> non-instance / primitive values is very confusing.

> Sorry, this is a cheap rhetorical trick; declaring words to mean what you want
> them to mean, and then pointing to that meaning as a way to close the argument.

> Yes, saying "instanceof T is about subtyping" is a useful mental model *when the
> only types you can apply it to are those related by inclusion polymorphism*.
> But the restriction of instanceof to reference types is arbitrary (and we've
> already decided to allow patterns in instanceof, which are surely not mere
> subtyping.)

> Regardless, a better way to think about `instanceof` is that it is the
> precondition for "would a cast to this type be safe and useful." In the world
> where we restrict to reference types, the two notions coincide. But the
> safe-cast-precondition is clearly more general (this is like the difference
> between defining the function 2^n on Z, vs on R or C; of course they have to
> agree at the integers, but the continuous exponential function is far more
> useful than the discrete one.) Moreover, the general mental model is just as
> simple: how do you know a cast is safe? Ask instanceof. What does safe mean? No
> error or material loss of precision.
[...] 

"would a cast to this type be safe and useful." 

I think you are overstating how useful a pattern that does a range check is.
There is no method in the JDK that takes an int and converts it to a short if it is in the right range, or throws an exception otherwise.
It seems a better fit for a named pattern than for the "default behavior" of the type pattern.
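
Since named patterns do not exist yet, the closest thing in today's Java is an ordinary method returning an `Optional`. The sketch below is only a stand-in for that idea: `NamedShortConversion.asShort` is a hypothetical helper, not a JDK method and not the `Short.asShort()` pattern mentioned earlier in the thread.

```java
import java.util.Optional;

// Hypothetical helper; it only approximates what a named pattern could express.
final class NamedShortConversion {

    // Returns the value as a short when it is in range, and an empty Optional
    // otherwise. The range check is explicit at the call site through the
    // method name, rather than implied by a type pattern.
    static Optional<Short> asShort(int value) {
        if (value < Short.MIN_VALUE || value > Short.MAX_VALUE) {
            return Optional.empty();
        }
        return Optional.of((short) value);
    }

    public static void main(String[] args) {
        System.out.println(asShort(42));      // in range
        System.out.println(asShort(40_000));  // out of range
    }
}
```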

I believe that defining a range check as a primitive type pattern is too clever an idea.

[...] 

> (I strongly encourage everyone to re-read JLS Ch5, and to meditate on *why* we
> have the particular conversions in the contexts we have. They're complex, but
> not arbitrary; if you listen closely to the specification, it sometimes
> whispers to you.)

I don't disagree that users may want what you call the dual of a cast to a primitive; I disagree that it has to come as a type pattern and not as a named pattern.

Rémi

From forax at univ-mlv.fr  Sat Sep 10 08:58:15 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 10 Sep 2022 10:58:15 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
Message-ID: <909015874.2333799.1662800295791.JavaMail.zimbra@u-pem.fr>

> From: "John Rose" <john.r.rose at oracle.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>
> Cc: "Remi Forax" <forax at univ-mlv.fr>, "amber-spec-experts"
> <amber-spec-experts at openjdk.java.net>
> Sent: Friday, September 9, 2022 11:32:04 PM
> Subject: Re: Primitives in instanceof and patterns

> On 9 Sep 2022, at 11:07, Brian Goetz wrote:

>> ... Regardless, a better way to think about `instanceof` is that it is the
>> precondition for "would a cast to this type be safe and useful." In the world
>> where we restrict to reference types, the two notions coincide.
> And, in the future world where every value (except possibly null) is an
> instance, the two notions will coincide again, without the restriction to
> reference types. We are taking reasonable incremental steps toward that world
> here, IMO.

>> But the safe-cast-precondition is clearly more general (this is like the
>> difference between defining the function 2^n on Z, vs on R or C; of course they
>> have to agree at the integers, but the continuous exponential function is far
>> more useful than the discrete one.) Moreover, the general mental model is just
>> as simple: how do you know a cast is safe? Ask instanceof. What does safe mean?
>> No error or material loss of precision.
> And (to pile on a bit here), the casts you are speaking of here, Brian, are the
> casts we have in Java, not some idealized or restricted or cleaned up cast. So
> we have to deal with the oddities of primitive value conversion.

> The payoff from dealing with this is that the meaning of patterns is derived
> systematically from the meaning of casts (and other conversions). That is
> hugely desirable, because it means a very complex new feature is firmly
> anchored to existing features. Getting this kind of thing right preserves and
> extends Java's role as a world-class programming language.

>> A more reasonable way to state this objection would be: "most users believe that
>> `instanceof` is purely about subtyping, and it will take some work to bring
>> them around to a more general interpretation, how are we going to do that?"
> This is subjective and esthetic, but I think two thoughts help here (with
> teaching and rationale): First, everything (except null) is an instance, or
> will eventually be. Second, subtyping in Java includes the murky rules for
> primitive typing.

> Those specific rules more or less systematically determine how casts work. They
> should also systematically determine (in the same way) how patterns work. After
> all, casts and patterns are (and very much should be!) mirror image
> counterparts of each other, or dance partners holding hands.

> (I visualize such things as boxes on the whiteboard with reversible arrows
> between them. You could say "category" if you like. Brian likes to say "dual",
> and I took linear algebra too, but I doubt most folks took the trouble in that
> class to be curious about exactly what a "dual space" really is all about.)

> Rather than extending the language we wish we had, we are extending the one we
> do have, and that means aligning even the murky parts of casts with pattern
> behavior.

> In the end, I don't think it's very murky at all in practice, except of course
> for the outraged theoretical purist (who lives in each of us). There is
> certainly no new murk. IMO what Brian is showing works out surprisingly well,
> so kudos to him for following his nose to a design with liveable details. This
> success also IMO demonstrates the foresight of the original authors and current
> maintainers of the spec, even in the "murky" parts of primitive value
> conversions.

> -- John

At some point in the future, we may want to change what an instanceof means; I think we can all agree on that.

I would prefer to be on the safe side when we ask ourselves how exactly to retrofit primitive types to value classes.

I'm not against changing what a type pattern is, but it should be done in concert with changing the other rules (overriding rules especially) and the retrofitting of primitive types to value classes.

Rémi

From forax at univ-mlv.fr  Sat Sep 10 09:48:20 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 10 Sep 2022 11:48:20 +0200 (CEST)
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
Message-ID: <2111267205.2346604.1662803300417.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Saturday, September 10, 2022 2:16:15 AM
> Subject: Re: Array patterns (and varargs patterns)

> John pulled a nice Jedi-mind-trick on me, and pointed out that we actually have
> two creation expressions for arrays:

> new Foo[n]
> new Foo[] { a0, .., an }

> and that if we are dualizing, then we should have these two patterns:

> new Foo[] { P0, ..., Pn } // matches arrays of exactly length N
> new Foo[P] // matches arrays whose length matches P

> but that neither

> new Foo[] { P, Q, ... } // previous suggestion
> nor
> new Foo[L] { P, Q } // current suggestion

> correspond to either of those, which suggests that we may have prematurely
> optimized the pattern form. The rational consequence of this observation is to
> do

> new Foo[] { P0, ..., Pn } // matches arrays of exactly length N

> now (which is also the basis of varargs patterns), and once we have constant
> patterns (which are kind of required for the second form to be all that
> useful), come back for `Foo[P]`.
I like this proposal; it offers a clean separation between the array pattern and a future spread pattern (or whatever we end up calling it).

Rémi

> On 9/6/2022 5:11 PM, Brian Goetz wrote:

>> We dropped this out of the record patterns JEP, but I think it is time to
>> revisit this.

>> The concept of array patterns was pretty straightforward; they mimic the nesting
>> and exhaustiveness rules of record patterns, they are just a different sort of
>> container for nested patterns. And they have an obvious duality with array
>> creation expressions.

>> The main open question here was how we distinguish between "match an array of
>> length exactly N" (where there are N nested patterns) and "match an array of
>> length at least N". We toyed with the idea of a "..." indicator to mean "more
>> elements", but this felt a little forced and opened new questions.

>> It later occurred to me that there is another place to nest a pattern in an
>> array pattern -- to match (and bind) the length. In the following, assume for
>> sake of exposition that "_" is the "any" pattern (matches everything, binds
>> nothing) and that we have some way to denote a constant pattern, which I'll
>> denote here with a constant literal.

>> There is an obvious place to put this (optional) pattern: in between the
>> brackets. So:

>>     case String[1] { P }:
>>                 ^ a constant pattern

>> would match string arrays of length 1 whose sole element matches P. And

>> case String[] { P, Q }

>> would match string arrays of length exactly 2, whose first two elements match P
>> and Q respectively. (If the length pattern is not specified, we infer a
>> constant pattern whose constant is equal to the length of the nested pattern
>> list.)

>> Matching a target to `String[L] { P0, .., Pn }` means

>>     x instanceof String[] arr
>>         && arr.length matches L
>>         && arr.length >= n
>>         && arr[0] matches P0
>>         && arr[1] matches P1
>>         ...
>>         && arr[n] matches Pn

>> More examples:

>> case String[int len] { P }

>> would match string arrays of length >= 1 whose first element matches P, and
>> further binds the array length to `len`.

>> case String[_] { P, Q }

>> would match string arrays of any length whose first two elements match P and Q.

>>     case String[3] { }
>>                 ^ constant pattern

>> matches all string arrays of length 3.

>> This is a more principled way to do it, because the length is a part of the
>> array and deserves a chance to match via nested patterns, just as with the
>> elements, and it avoids trying to give "..." a new meaning.

>> The downside is that it might be confusing at first (though people will learn
>> quickly enough) how to distinguish between an exact match and a prefix match.

>> On 1/5/2021 1:48 PM, Brian Goetz wrote:

>>> As we get into the next round of pattern matching, I'd like to opportunistically
>>> attach another sub-feature: array patterns. (This also bears on the question of
>>> "how would varargs patterns work", which I'll address below, though they might
>>> come later.)

>>> ## Array Patterns

>>> If we want to create a new array, we do so with an array construction
>>> expression:

>>> new String[] { "a", "b" }

>>> Since each form of aggregation should have its dual in destructuring, the
>>> natural way to represent an array pattern (h/t to AlanM for suggesting this)
>>> is:

>>> if (arr instanceof String[] { var a, var b }) { ... }

>>> Here, the applicability test is: "are you an instanceof of String[], with length
>>> = 2", and if so, we cast to String[], extract the two elements, and match them
>>> to the nested patterns `var a` and `var b`. This is the natural analogue of
>>> deconstruction patterns for arrays, complete with nesting.

>>> Since an array can have more elements, we likely need a way to say "length >= 2"
>>> rather than simply "length == 2". There are multiple syntactic ways to get
>>> there, for now I'm going to write

>>> if (arr instanceof String[] { var a, var b, ... })

>>> to indicate "more". The "..." matches zero or more elements and binds nothing.

>>> <digression>
>>> People are immediately going to ask "can I bind something to the remainder"; I
>>> think this is mostly an "attractive distraction", and would prefer to not have
>>> this dominate the discussion.
>>> </digression>

>>> Here's an example from the JDK that could use this effectively:

>>> String[] limits = limitString.split(":");
>>> try {
>>>     switch (limits.length) {
>>>         case 2: {
>>>             if (!limits[1].equals("*"))
>>>                 setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>>>         }
>>>         case 1: {
>>>             if (!limits[0].equals("*"))
>>>                 setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>>>         }
>>>     }
>>> }
>>> catch(NumberFormatException ex) {
>>>     setMultilineLimit(MultilineLimit.DEPTH, -1);
>>>     setMultilineLimit(MultilineLimit.LENGTH, -1);
>>> }

>>> becomes (eventually)

>>> switch (limitString.split(":")) {
>>>     case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>>>     case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>>>     default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>>> }

>>> Note how not only does this become more compact, but the unchecked
>>> "NumberFormatException" is folded into the match, rather than being a separate
>>> concern.

>>> ## Varargs patterns

>>> Having array patterns offers us a natural way to interpret deconstruction
>>> patterns for varargs records. Assume we have:

>>> void m(X... xs) { }

>>> Then a varargs invocation

>>> m(a, b, c)

>>> is really sugar for

>>> m(new X[] { a, b, c })

>>> So the dual of a varargs invocation, a varargs match, is really a match to an
>>> array pattern. So for a record

>>> record R(X... xs) { }

>>> a varargs match:

>>> case R(var a, var b, var c):

>>> is really sugar for an array match:

>>> case R(X[] { var a, var b, var c }):

>>> And similarly, we can use our "more arity" indicator:

>>> case R(var a, var b, var c, ...):

>>> to indicate that there are at least three elements.
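
For readers who want the tests described in the quoted proposal spelled out in Java that compiles today, here is a minimal sketch. The array pattern syntax is only a proposal, so this is hand-written code that approximates what an exact-length match and an "at least N" match would check. The class and method names (ArrayPatternSketch, matchExactlyTwo, matchAtLeastTwo, count) are hypothetical and appear nowhere in the thread. It assumes Java 16 or later for `instanceof` type patterns.

```java
// Hypothetical names throughout; this approximates the proposed semantics by hand.
final class ArrayPatternSketch {

    // Roughly what `o instanceof String[] { var a, var b }` is described as testing:
    // the target is a String[] whose length is exactly 2.
    static String matchExactlyTwo(Object o) {
        if (o instanceof String[] arr && arr.length == 2) {
            String a = arr[0];
            String b = arr[1];
            return a + "," + b;
        }
        return null; // no match
    }

    // Roughly what the "more elements" form is described as testing:
    // the target is a String[] whose length is at least 2; extra elements are ignored.
    static String matchAtLeastTwo(Object o) {
        if (o instanceof String[] arr && arr.length >= 2) {
            return arr[0] + "," + arr[1];
        }
        return null; // no match
    }

    // A varargs method: count("a", "b", "c") is sugar for count(new String[] { "a", "b", "c" }).
    static int count(String... xs) {
        return xs.length;
    }

    public static void main(String[] args) {
        System.out.println(matchExactlyTwo(new String[] { "a", "b" }));
        System.out.println(matchExactlyTwo(new String[] { "a", "b", "c" }));
        System.out.println(matchAtLeastTwo(new String[] { "a", "b", "c" }));
        System.out.println(count("a", "b", "c") == count(new String[] { "a", "b", "c" }));
    }
}
```

The only difference between the two helpers is `==` versus `>=` on the length, which is the distinction the thread is trying to find a syntax for.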

From forax at univ-mlv.fr  Sat Sep 10 09:50:00 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 10 Sep 2022 11:50:00 +0200 (CEST)
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1B9FFA3E-4FF3-45ED-B1FB-3BD6AD60D325@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
 <1B9FFA3E-4FF3-45ED-B1FB-3BD6AD60D325@oracle.com>
Message-ID: <1484184530.2347295.1662803400260.JavaMail.zimbra@u-pem.fr>

> From: "John Rose" <john.r.rose at oracle.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Saturday, September 10, 2022 2:45:16 AM
> Subject: Re: Array patterns (and varargs patterns)

> I was practicing that trick all morning!

> I agree that Foo[P] can be saved for later.

> In case it wasn't clear in my previous message, I also think that splicey stuff
> like new Foo[]{ ...as, b, c, ...ds, e } and the corresponding slicey patterns
> can also be saved for later.

> In fact, the slice/splice stuff seems like it is best situated in a larger
> design exercise for "collection literals", whatever those are. Basically, that
> would be where Lisp's backquote-comma gets inherited by Java.

yes,
Rémi

> On 9 Sep 2022, at 17:16, Brian Goetz wrote:

>> John pulled a nice Jedi-mind-trick on me, and pointed out that we actually have
>> two creation expressions for arrays:

>> new Foo[n]
>> new Foo[] { a0, .., an }

>> and that if we are dualizing, then we should have these two patterns:

>> new Foo[] { P0, ..., Pn } // matches arrays of exactly length N
>> new Foo[P] // matches arrays whose length matches P

>> but that neither

>> new Foo[] { P, Q, ... } // previous suggestion
>> nor
>> new Foo[L] { P, Q } // current suggestion

>> correspond to either of those, which suggests that we may have prematurely
>> optimized the pattern form. The rational consequence of this observation is to
>> do

>> new Foo[] { P0, ..., Pn } // matches arrays of exactly length N

>> now (which is also the basis of varargs patterns), and once we have constant
>> patterns (which are kind of required for the second form to be all that
>> useful), come back for `Foo[P]`.

From brian.goetz at oracle.com  Sat Sep 10 14:00:38 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 10 Sep 2022 10:00:38 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <2111267205.2346604.1662803300417.JavaMail.zimbra@u-pem.fr>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <ee49d374-13cb-c832-8d9c-5d5679e0af9f@oracle.com>
 <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
 <2111267205.2346604.1662803300417.JavaMail.zimbra@u-pem.fr>
Message-ID: <242ed758-cbcd-1f06-eaf1-23c2402b42d7@oracle.com>

Obvious correction: the `new` in the pattern examples was a cut and 
paste error, patterns don't say `new`.

On 9/10/2022 5:48 AM, Remi Forax wrote:
>
>
> ------------------------------------------------------------------------
>
>     *From: *"Brian Goetz" <brian.goetz at oracle.com>
>     *To: *"amber-spec-experts" <amber-spec-experts at openjdk.java.net>
>     *Sent: *Saturday, September 10, 2022 2:16:15 AM
>     *Subject: *Re: Array patterns (and varargs patterns)
>
>     John pulled a nice Jedi-mind-trick on me, and pointed out that we
>     actually have two creation expressions for arrays:
>
>         new Foo[n]
>         new Foo[] { a0, .., an }
>
>     and that if we are dualizing, then we should have these two patterns:
>
>         new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
>         new Foo[P]                 // matches arrays whose length matches P
>
>     but that neither
>
>         new Foo[] { P, Q, ... }   // previous suggestion
>     nor
>         new Foo[L] { P, Q }       // current suggestion
>
>     correspond to either of those, which suggests that we may have
>     prematurely optimized the pattern form.  The rational consequence
>     of this observation is to do
>
>         new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
>
>     now (which is also the basis of varargs patterns), and once we
>     have constant patterns (which are kind of required for the second
>     form to be all that useful), come back for `Foo[P]`.
>
>
> I like this proposal, it offers a clean separation between the array
> pattern and a future spread pattern (or whatever we end up calling it).
>
> Rémi

From brian.goetz at oracle.com  Sat Sep 10 14:01:23 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 10 Sep 2022 10:01:23 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
Message-ID: <a9f23c4e-b251-82ea-d283-569278792950@oracle.com>

> I think you are overstating how useful a pattern that does a range check is.

I think you're falling into the trap of examining each conversion and
asking "would I want a pattern to do this."  That's a recipe for more
complexity because we'll end up with another ad-hoc,
not-like-anything-else construct (which is what the Java 19 primitive
type pattern semantics are).  It's not about "is a range check useful"
(though it is); it's about "is casting to/from primitives safely useful".

> I'm not against changing what a type pattern is but it should be done 
> in concert with changing the other rules (overriding rules especially) 
> and the retrofitting of primitive types to value classes.

It's not about "changing other rules", it's about aligning to them. We're
aligning to cast conversion here.  When we have named patterns, we will
have to define overload selection for patterns; again, this should just
be the existing overload selection with "arrows reversed", which means
we want boxing for patterns to also be "boxing with arrows reversed"
(otherwise it doesn't compose). The language we have now is telling us
how patterns should work; we should listen.


From brian.goetz at oracle.com  Sat Sep 10 20:04:50 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 10 Sep 2022 16:04:50 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <a9f23c4e-b251-82ea-d283-569278792950@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
 <a9f23c4e-b251-82ea-d283-569278792950@oracle.com>
Message-ID: <26d98589-409f-ce80-911d-91c472cade29@oracle.com>


>> I'm not against changing what a type pattern is but it should be done 
>> in concert with changing the other rules (overriding rules 
>> especially) and the retrofitting of primitive types to value classes.
>
> It's not about "changing other rules", it's about aligning to them.
> We're aligning to cast conversion here.  When we have named patterns,
> we will have to define overload selection for patterns; again, this
> should just be the existing overload selection with "arrows reversed",
> which means we want boxing for patterns to also be "boxing with arrows
> reversed" (otherwise it doesn't compose). The language we have now is
> telling us how patterns should work; we should listen.
>

Just in case it's not clear:

 - instanceof T means "is it safe to cast to T"
 - non-unconditional type patterns (those that do not resolve to any
patterns) mean `instanceof T`

And this is true for primitive and reference types, primitive and
reference targets.  These aren't special new rules for primitive type
patterns; this is extending instanceof to mean "is it safe to cast" and
then defining type patterns purely in terms of instanceof.

Other useful things follow too:

 - for types S and T, if all values of S are instanceof T, then `T t`
is unconditional on S (no distinction between primitive and reference types)
 - with the same condition, `T t` dominates `S s`
 - if S is cast-convertible (modulo unchecked conversions) to T, then
`T t` is applicable to S (no distinction between primitive and reference
types)

If you read the new spec (Aggelos will post a draft soon), you'll see
that we hardly even mention primitive types in the section on type
patterns.  We just define type patterns in terms of exact casts, and
same for instanceof.  Pretty much all the new text is "what is an exact
cast".
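
To illustrate what "exact cast" could mean for a few conversions, here is a minimal sketch in Java that compiles today. It is not the specification text. The class and method names (ExactCastSketch, isExactByte, isExactUnbox, isExactLong) are hypothetical, and the round-trip comparison is just one straightforward way to check that a value is representable in the target type.

```java
// Hypothetical helpers approximating "would this cast succeed without information loss?"
final class ExactCastSketch {

    // int -> byte is a narrowing cast: it is exact only when the value
    // survives the round trip, i.e. it is representable in the range of byte.
    static boolean isExactByte(int value) {
        return (byte) value == value;
    }

    // Integer -> int is unboxing: exact for every non-null value.
    static boolean isExactUnbox(Integer boxed) {
        return boxed != null;
    }

    // int -> long is a widening cast that never loses information,
    // so it is "unconditionally exact" and needs no run-time test.
    static boolean isExactLong(int value) {
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isExactByte(0));    // 0 is representable as a byte
        System.out.println(isExactByte(500));  // 500 is not; the cast would discard high-order bits
        System.out.println(isExactUnbox(null));
        System.out.println(isExactLong(Integer.MAX_VALUE));
    }
}
```

Under this reading, `x instanceof byte` for an `int` x would behave like `isExactByte(x)`, and a pattern whose check is the constant `true` is the unconditional case.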

From forax at univ-mlv.fr  Sun Sep 11 10:19:13 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 11 Sep 2022 12:19:13 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
Message-ID: <583895871.2568221.1662891553338.JavaMail.zimbra@u-pem.fr>

I found a way to explain clearly why a reference type pattern and a primitive type pattern are different.

Let's suppose that the code compiles (to avoid the issues of separate compilation);
unlike a reference type pattern, the code executed for a primitive type pattern is a function of *both* the declared type and the pattern type.

For example, if I have code like this, I have no idea what code is executed for case Foo(int i) without going to the declaration of Foo, which is usually not collocated with the switch itself.

Foo foo = ...
switch (foo) {
case Foo(int i) -> {}
case Foo(double d) -> {}
}

Here, if Foo is declared like this, record Foo(long l) { }, or like that, record Foo(double d) { }, the semantics are different.
There is no such problem with a reference type: if the first pattern of the switch is case Foo(String s) -> {}, we know that it is always a subtyping check.

This is different from a cast because the cast is collocated with the expression it applies to, so the semantics are fairly obvious.
long l = ...
int i = (int) l;

Rémi

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Thursday, September 8, 2022 6:53:21 PM
> Subject: Primitives in instanceof and patterns

> Earlier in the year we talked about primitive type patterns. Let me summarize
> the past discussion, what I think the right direction is, and why this is (yet
> another) "finishing up the job" task for basic patterns that, if left undone,
> will be a sharp edge.

> Prior to record patterns, we didn't support primitive type patterns at all. With
> records, we now support primitive type patterns as nested patterns, but they are
> very limited; they are only applicable to exactly their own type.

> The motivation for "finishing" primitive type patterns is the same as discussed
> earlier this week with array patterns -- if pattern matching is the dual of
> aggregation, we want to avoid gratuitous asymmetries that let you put things
> together but not take them apart.

> Currently, we can assign a `String` to an `Object`, and recover the `String`
> with a pattern match:

> Object o = "Bob";
> if (o instanceof String s) { println("Hi Bob"); }

> Analogously, we can assign an `int` to a `long`:

> long n = 0;

> but we cannot yet recover the int with a pattern match:

> if (n instanceof int i) { ... } // error, pattern `int i` not applicable to `long`

> To fill out some more of the asymmetries around records if we don't finish the
> job: given

> record R(int i) { }

> we can construct it with

> new R(anInt) // no adaptation
> new R(aShort) // widening
> new R(anInteger) // unboxing

> but yet cannot deconstruct it the same way:

> case R(int i) // OK
> case R(short s) // nope
> case R(Integer i) // nope

> It would be a gratuitous asymmetry that we can use pattern matching to recover
> from reference widening, but not from primitive widening. While many of the
> arguments against doing primitive type patterns now were of the form "let's keep
> things simple", I believe that the simpler solution is actually to _finish the
> job_, because this minimizes asymmetries and potholes that users would otherwise
> have to maintain a mental catalog of.

> Our earlier explorations started (incorrectly, as it turned out), with
> assignment context. This direction gave us a good push in the right direction,
> but turned out to not be the right answer. A more careful reading of JLS Ch5
> convinced me that the answer lies not in assignment conversion, but _cast
> conversion_.

> #### Stepping back: instanceof

> The right place to start is actually not patterns, but `instanceof`. If we
> start here, and listen carefully to the specification, it leads us to the
> correct answer.

> Today, `instanceof` works only for reference types. Accordingly, most people
> view `instanceof` as "the subtyping operator" -- because that's the only
> question we can currently ask it. We almost never see `instanceof` on its own;
> it is nearly always followed by a cast to the same type. Similarly, we rarely
> see a cast on its own; it is nearly always preceded by an `instanceof` for the
> same type.

> There's a reason these two operations travel together: casting is, in general,
> unsafe; we can try to cast an `Object` reference to a `String`, but if the
> reference refers to another type, the cast will fail. So to make casting safe,
> we precede it with an `instanceof` test. The semantics of `instanceof` and
> casting align such that `instanceof` is the precondition test for safe casting.

> > instanceof is the precondition for safe casting

> Asking `instanceof T` means "if I cast this to T, would I like the answer."
> Obviously CCE is an unlikable answer; `instanceof` further adopts the opinion
> that casting `null` would also be an unlikable answer, because while the cast
> would succeed, you can't do anything useful with the result.

> Currently, `instanceof` is only defined on reference types, and on this domain
> coincides with subtyping. On the other hand, casting is defined between
> primitive types (widening, narrowing), and between primitive and reference types
> (boxing, unboxing). Some casts involving primitives yield "better" results than
> others; casting `0` to `byte` results in no loss of information, since `0` is
> representable as a byte, but casting `500` to `byte` succeeds but loses
> information because the higher order bits are discarded.

> If we characterize some casts as "lossy" and others as "exact" -- where lossy
> means discarding useful information -- we can extend the "safe casting
> precondition" meaning of `instanceof` to primitive operands and types in the
> obvious way -- "would casting this expression to this type succeed without error
> and without information loss." If the type of the expression is not castable to
> the type we are asking about, we know the cast cannot succeed and reject the
> `instanceof` test at compile time.

> Defining which casts are lossy and which are exact is fairly straightforward; we
> can appeal to the concept already in the JLS of "representable in the range of a
> type." For some pairs of types, casting is always exact (e.g., casting `int` to
> `long` is always exact); we call these "unconditionally exact". For other pairs
> of types, some values can be cast exactly and others cannot.

> Defining which casts are exact gives us a simple and precise semantics for `x
> instanceof T`: whether `x` can be cast exactly to `T`. Similarly, if the static
> type of `x` is not castable to `T`, then the corresponding `instanceof` question
> is rejected statically. The answers are not surprising:

> - Boxing is always exact;
> - Unboxing is exact for all non-null values;
> - Reference widening is always exact;
> - Reference narrowing is exact if the type of the target expression is a
> subtype of the target type;
> - Primitive widening and narrowing are exact if the target expression can be
> represented in the range of the target type.

> #### Primitive type patterns

> It is a short hop from `instanceof` to patterns (including primitive type
> patterns, and reference type patterns applied to primitive types), which can be
> defined entirely in terms of cast conversion and exactness:

> - A type pattern `T t` is applicable to a target of type `S` if `S` is
> cast-convertible to `T`;
> - A type pattern `T t` matches a target `x` if `x` can be cast exactly to `T`;
> - A type pattern `T t` is unconditional at type `S` if casting from `S` to `T`
> is unconditionally exact;
> - A type pattern `T t` dominates a type pattern `S s` (or a record pattern
> `S(...)`) if `T t` would be unconditional on `S`.

> While the rules for casting are complex, primitive patterns add no new
> complexity; there are no new conversions or conversion contexts. If we see:

> switch (a) {
> case T t: ...
> }

> we know the case matches if `a` can be cast exactly to `T`, and the pattern is
> unconditional if _all_ values of `a`'s type can be cast exactly to `T`. Note
> that none of this is specific to primitives; we derive the semantics of _all_
> type patterns from the enhanced definition of casting.

> Now, our record deconstruction examples work symmetrically to construction:

> case R(int i) // OK
> case R(short s) // test if `i` is in the range of `short`
> case R(Integer i) // box `i` to `Integer`
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-spec-experts/attachments/20220911/2e8b30e3/attachment-0001.htm>

From forax at univ-mlv.fr  Sun Sep 11 10:19:16 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sun, 11 Sep 2022 12:19:16 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <a9f23c4e-b251-82ea-d283-569278792950@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
 <a9f23c4e-b251-82ea-d283-569278792950@oracle.com>
Message-ID: <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Saturday, September 10, 2022 4:01:23 PM
> Subject: Re: Primitives in instanceof and patterns

>> I think you are overstating how useful a pattern that does a range check is.
> 
> I think you're falling into the trap of examining each conversion and
> asking "would I want a pattern to do this."  That's a recipe for more
> complexity because we'll end up with another ad-hoc,
> not-like-anything-else construct (which is what the Java 19 primitive
> type pattern semantics are).  It's not about "is a range check useful"
> (though it is); it's about "is casting to/from primitives safely useful".

Given that only primitive widening casts are safe, allowing only primitive widening is another way to answer the question of what a primitive type pattern is.
You are proposing a semantics using range checks; that's the problem.

> 
>> I'm not against changing what a type pattern is but it should be done
>> in concert with changing the other rules (overriding rules especially)
>> and the retrofitting of primitive types to value classes.
> 
> It's not about "changing other rules", it's about aligning to them. We're
> aligning to cast conversion here.  When we have named patterns, we will
> have to define overload selection for patterns; again, this should just
> be the existing overload selection with "arrows reversed", which means
> we want boxing for patterns to also be "boxing with arrows reversed"
> (otherwise it doesn't compose). The language we have now is telling us
> how patterns should work; we should listen.

As an example, instanceof rules and the rules about overriding methods are intimately linked: asking if a method overrides another is equivalent to asking if their function types are subtypes.

If int instanceof double is allowed, then B::m should override A::m
  class A {
    int m() { ... }
  }
  class B extends A {
    @Override
    double m() { ... }
  }

This is what I meant by changing other rules.

Rémi

From brian.goetz at oracle.com  Sun Sep 11 14:48:04 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 11 Sep 2022 10:48:04 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
 <a9f23c4e-b251-82ea-d283-569278792950@oracle.com>
 <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>
Message-ID: <ecdc3f34-d901-b8cc-442c-61c21135e953@oracle.com>


>>
>> I think you're falling into the trap of examining each conversion and
>> asking "would I want a pattern to do this."
> Given that only primitive widening casts are safe, allowing only primitive widening is another way to answer the question of what a primitive type pattern is.
> You are proposing a semantics using range checks; that's the problem.

So, substitute "reference" for "primitive" in this argument, and you
will see how silly it is: "since only reference widening is safe,
allowing only reference widening would be 'another answer to what a
reference type pattern is.'"  But that would also be a useless
semantics.  You're caught up on "range checks", but that's not the
important thing here.  Casting is the important thing.

> As an example, instanceof rules and the rules about overriding methods
> are intimately linked: asking if a method overrides another is
> equivalent to asking if their function types are subtypes.
> If int instanceof double is allowed, then B::m should override A::m
>    class A {
>      int m() { ... }
>    }
>    class B extends A {
>      @Override
>      double m() { ... }
>    }
>
> This is what i meant by changing other rules.

Another cute argument, but no.  Covariant overriding is linked to
*subtyping*.  Instanceof *happens to coincide* with subtyping right now
(given its ad-hoc restrictions), but the causality goes the other way.
(Casting also appeals to subtyping, through reference widening
conversions.)  But this argument is like starting with "all men are
mortal" and "Socrates is a man" and concluding "All men are Socrates."

We can talk about whether it would be wise to align the definition of
covariant overrides with conversions other than reference widening (and
this will likely come up again in Valhalla anyway), but this is by no
means a forced move, and not tied to generalizing the semantics of
instanceof.

> I found a way to explain clearly why a reference type pattern and a 
> primitive type pattern are different.
>
> Let suppose that the code compiles (to avoid the issues of the 
> separate compilation),
> unlike a reference type pattern, the code executed for a primitive 
> type pattern is a function of *both* the declared type and the pattern 
> type.

So (a) untrue -- what code we execute for a reference type pattern does
depend on the static types -- we may or may not generate an `instanceof`
instruction, depending on whether the pattern is unconditional.  (The
same is true for a cast; some casts are no-ops and generate no code.)
And (b), so what?  We're asking "would it be safe to cast x to T".
Depending on the types X and T, we will have different code for the
casting, so why is it unreasonable to have different code for asking
whether it is castable?

>
> By example, if i have a code like this, i've no idea what code is 
> executed for case Foo(int i) without having to go to the declaration 
> of Foo which is usually not collocated with the switch itself.
>
>    Foo foo = ...
>    switch (foo) {
>       case Foo(int i) -> {}
>       case Foo(double d) -> {}
>    }

Sigh, this argument again?  We've been through this extensively the
first time around, with reference types, where you "had no idea what
this code means" without looking at the declaration of the pattern.
(Then, it was partiality and totality.)  I get that you didn't like that
total and partial patterns don't look syntactically different, and that
ship has sailed.  But this is the same argument warmed over.

"What code will be executed" is irrelevant; what is relevant is the 
semantics.  Assuming a single deconstruction pattern for Foo, the first
case asks "can the Foo's component be cast safely to int, and if so,
please cast it for me".  It doesn't matter what code we use to answer
that question or do the cast -- could be a narrowing, could be an
unboxing, whatever.

You see the same thing today without patterns:

    var x = foo.getFoo();
    int i = (int) x;

x could be a long, an int, an Integer, etc, but you don't know unless
you look at the definition of getFoo().  And you have "no idea what code
will be executed."  Sure, but so what?  You asked for a cast to int.
The language validated that x is castable to int, and does what needs to
be done, which might be nothing, or a widening, or a narrowing with
truncation, or an unboxing, or some combination.
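
To make the point about `(int) x` concrete, here is a small sketch in today's Java. The class and method names are made up for illustration; each method shows what the same cast does for one possible static type of `x`.

```java
final class CastSketch {
    // The cast expression is textually identical in every method;
    // the conversion performed depends only on the static type of x.

    // long -> int: narrowing primitive conversion, high bits are dropped
    static int fromLong(long x) { return (int) x; }

    // Integer -> int: unboxing conversion
    static int fromInteger(Integer x) { return (int) x; }

    // short -> int: widening primitive conversion, never loses information
    static int fromShort(short x) { return (int) x; }

    // double -> int: narrowing primitive conversion, rounds toward zero
    static int fromDouble(double x) { return (int) x; }

    public static void main(String[] args) {
        System.out.println(fromLong(1L << 32));            // truncated
        System.out.println(fromInteger(Integer.valueOf(7)));
        System.out.println(fromShort((short) -5));
        System.out.println(fromDouble(-3.9));
    }
}
```

A reader of the call site `int i = (int) x;` cannot tell which of these four applies without knowing the type of `x`, which is the same situation as with a nested primitive type pattern.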

(When we get to overloading deconstruction patterns, we'll have all the 
same issues as we have with overloading methods today -- it is not 
obvious looking only at the call site, which overload is called, and 
therefore which conversions are applied to arguments or returns.)

As a reminder, here's what a nested pattern means:

    x matches P(Q) === x matches P(var q) && q matches Q

Understanding what is going to happen involves understanding the type of
`q`.  I get that you didn't like that choice, and that's your right, but
it's not OK to keep bringing it up as if it's a new thing.


I think I actually understand your concern here, which has nothing to do
with the dozen or so bogus examples and explanations you've tossed out
so far.  It is that cast conversion is complicated, and you would like
pattern matching to be "simple", and so pulling in the muck of cast
conversion into pattern matching feels to you like an unforced error.
Right?  (And if so, perhaps you could have just said that, instead of
throwing random arguments at the wall?)

I also would like to hear from more people in this discussion, and I 
don't think the style of discourse we've fallen into (again) is 
conducive to that.


From brian.goetz at oracle.com  Mon Sep 12 19:36:09 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 12 Sep 2022 15:36:09 -0400
Subject: Knocking off two more vestiges of legacy switch
Message-ID: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>

The work on primitive patterns continues to yield fruit; it points us to
a principled way of dealing with _constant patterns_, both as nested
patterns, and to redefining constant case labels as simple patterns.  It
also points us to a way to bring the missing three types into the realm
of switch (since now switch is usable at every type _but_ these): float,
double, and boolean.  While I'm not in a hurry to prioritize this
immediately, I wanted to connect the dots to how primitive type patterns
lay the foundation for these two vestiges of legacy switch.  (The
remaining vestige, not yet dealt with, is that legacy statement switches
are not exhaustive.  We'd like a path back to uniformity there as well,
but this is likely a longer road.)

**Constant patterns.**  In early explorations (e.g., "Pattern Matching
Semantics"), we struggled with the meaning of constant patterns,
specifically with conversions in the absence of a sharp type for the
match target.  The exploration of that document treated boxing
conversions but not other conversions, which would have created a
gratuitously new conversion context.  This was one of several reasons we
deferred constant patterns.

The current status is that constant case labels (e.g., `case 3`) are
permitted (a) only in the presence of a compatible operand type and (b)
are not patterns.  This has led to some accidental complexity in
specifying switch, since we can have a mix of pattern and non-pattern
labels, and it means we can't use constants as nested patterns.  (We've
also not yet integrated enum cases into the exhaustiveness analysis in
the presence of a sealed type that permits an enum type.)  Ret-conning
all case labels as patterns seems attractive if we can make the
semantics clear, as not only does it bring more uniformity, but it means
we can use them as nested patterns, not just at the top level of the
switch.  More composition.

The recent work on `instanceof` involving primitives offers a clear and
principled meaning to `0` as a pattern; given a constant `c` of type
`C`, treat

    x matches c

as meaning

    x matches C alpha && alpha eq c

where `eq` is a suitable comparison predicate for the type C (== for
integral types and enums, .equals() for String, and something irritating
for floating point.)  This gives us a solid basis for interpreting
something like `case 3L`; we match if the target would match `long
alpha` and `alpha == 3L`.  No new rules; all conversions are handled
through the type pattern for the static type of the constant in
question.  Not coincidentally, the rules for primitive type patterns
support the implicit conversions allowed in today's switches on `short`,
`byte`, and `char`, which are allowed to use `int` labels, preserving
the meaning of existing code while we generalize what switch means.
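
The desugaring `x matches C alpha && alpha eq c` can be emulated by hand in today's Java. The sketch below is only an illustration: the class and method names are invented, and it assumes that a primitive type pattern matches exactly when the conversion to the pattern type loses no information (the "range check" reading discussed elsewhere in this thread).

```java
final class ConstantPatternSketch {
    // Emulates `x matches int alpha` for a long target (an assumption:
    // the pattern matches only if narrowing to int round-trips).
    static boolean matchesIntPattern(long x) {
        return (int) x == x;
    }

    // Emulates `x matches 3` for a long target:
    //     x matches int alpha && alpha == 3
    static boolean longMatchesInt3(long x) {
        return matchesIntPattern(x) && (int) x == 3;
    }

    // Emulates `x matches 3L` for an int target: int -> long is a
    // widening, so `long alpha` always matches and only `==` remains.
    static boolean intMatchesLong3(int x) {
        long alpha = x;
        return alpha == 3L;
    }

    // Emulates `case 3` in a switch on byte: byte -> int is a widening,
    // which is why today's int labels keep their meaning.
    static boolean byteMatchesInt3(byte b) {
        int alpha = b;
        return alpha == 3;
    }
}
```

Note that `longMatchesInt3((1L << 32) + 3)` is false under this reading: the value would truncate to 3, but the type pattern rejects it before the comparison is ever made.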

The other attributes of patterns -- applicability, exhaustiveness, and 
dominance -- are also easy:

 - a constant pattern for `c : C` is applicable to S if a type pattern
for `C` is applicable to S.
 - a type pattern for T dominates a constant pattern for `c : C` if the
type pattern for T dominates a type pattern for C.
 - constant patterns are never exhaustive.

No new rules; just appeal to type patterns.

**Switch on float, double, and boolean.**  Switches on floating point
were left out for the obvious reason -- it just isn't that useful, and
it would have introduced new complexity into the specification of
switch.  Similarly, boolean was left out because we have "if"
statements.  In the original world, where you could switch on only five
types, this was a sensible compromise.  We later added in String and
enum types, which were sensible additions.  But now we move into a
world where we can switch on every type _except_ float, double, and
boolean -- and this no longer seems sensible.  It still may not be
something people will use often, but a key driver of the redesign of
switch has been refactorability, and we currently don't have a story for
refactoring

    record R(float f) { }

    switch (r) {
        case R(0f): ...
        case R(1f): ...
    }

to

    switch (r) {
        case R rr:
            switch (rr.f()) {
                case 0f: ...
                case 1f: ...
            }
    }

because we don't have switches on float.  By retconning constant case
labels as patterns, we don't have to define new semantics for switching
on these types or for constant labels of these types, we only have to
remove the restrictions about what types you can switch on.
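
For reference, here is roughly what the inner switch has to look like today, plus a hint at why the `eq` predicate for floating point is awkward. This is a sketch with invented names; the two comparison helpers are just the two obvious candidates, not a statement of which one a constant pattern would use.

```java
final class FloatSwitchSketch {
    record R(float f) { }

    // Candidate 1: the == operator. NaN is not equal to itself,
    // and 0f compares equal to -0f.
    static boolean eqByOperator(float a, float b) {
        return a == b;
    }

    // Candidate 2: Float.compare. NaN compares equal to NaN,
    // and 0f is distinguished from -0f.
    static boolean eqByCompare(float a, float b) {
        return Float.compare(a, b) == 0;
    }

    // Today's stand-in for `switch (rr.f()) { case 0f: ... case 1f: ... }`:
    // an if-chain, since float is not a permitted switch type.
    static String classify(R r) {
        float f = r.f();
        if (f == 0f) return "zero";
        if (f == 1f) return "one";
        return "other";
    }
}
```

The two helpers disagree exactly on NaN and signed zero, so whichever predicate is chosen for float and double constants has to take a position on those cases.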

**Denoting constant patterns.**  One of the remaining questions is how
we denote constant patterns.  This is a bit of a bikeshed, which we can
come back to when we're ready to move forward.  For purposes of
exposition we'll use the constant literal here.

**Closing a compositional asymmetry.**  In the "Patterns in the Java
Object Model" document, we called attention to a glaring problem in API
design, where it becomes nearly impossible to use the same sort of
composition for taking apart objects that we use for putting them
together.  As an example, suppose we compose an `Optional<Shape>` as
follows:

    Optional<Shape> os = Optional.of(Shape.redBall(1));

Here, we have static factories for both Optional and Shape, they don't 
know about each other, but we can compose them just fine. Today, if we 
want to reverse that -- ask whether an `Optional<Shape>` contains a red 
ball of size 1, we have to do something awful and error prone:

    Shape s = os.orElse(null);
    boolean isRedUnitBall = s != null
                           && s.isBall()
                           && (s.color() == RED)
                           && s.size() == 1;
    if (isRedUnitBall) { ... }

These code snippets look nothing alike, making reversal harder and more 
error-prone, and it gets worse the deeper you compose. With 
destructuring patterns, this gets much better and more like the creation 
expression:

    if (os instanceof Optional.of(Shape.redBall(var size))
        && size == 1) { ... }

but that `&& size == 1` was a pesky asymmetry.  With constant patterns
(modulo syntax), we can complete the transformation:

    if (os instanceof Optional.of(Shape.redBall(1))) { ... }

and destructuring looks just like the aggregation.
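
For anyone who wants to run the "today" version, here is a self-contained sketch. The thread never defines `Shape`, so the record, the `Color` enum, and the `redBall` factory below are hypothetical stand-ins with just enough surface to make the example compile.

```java
import java.util.Optional;

final class UnitBallSketch {
    enum Color { RED, BLUE }

    // Hypothetical Shape: a flat record with an isBall flag.
    record Shape(boolean isBall, Color color, int size) {
        // Static factory, composed with Optional.of(...) at the use site.
        static Shape redBall(int size) {
            return new Shape(true, Color.RED, size);
        }
    }

    // Construction composes in one expression ...
    static Optional<Shape> make(int size) {
        return Optional.of(Shape.redBall(size));
    }

    // ... while taking it apart needs a null check and three separate tests.
    static boolean isRedUnitBall(Optional<Shape> os) {
        Shape s = os.orElse(null);
        return s != null
            && s.isBall()
            && s.color() == Color.RED
            && s.size() == 1;
    }
}
```

Every condition in `isRedUnitBall` corresponds to something that `Optional.of(Shape.redBall(1))` says in a single nested expression, which is the asymmetry that nested constant patterns are meant to remove.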

**Bonus round: the last (?) vestige.**  Currently, we allow statement
switches on legacy switch types (integers, their boxes, strings, and
enums) with all constant labels to be partial, and require all other
switches to be total.  Patching this hole is harder, since there is lots
of legacy code today that depends on this partiality.  There are a few
things we can do to pave the way forward here:

 - Allow `default -> ;` in addition to `default -> { }`, since people
seem to have a hard time discovering the latter.
 - Issue a warning when a legacy switch construct is not exhaustive.
This can start as a lint warning, move up to a regular warning over
time, then a mandatory (unsuppressable) warning.  Maybe in a decade it
can become an error, but we can start paving the way sooner.


From amalloy at google.com  Mon Sep 12 20:20:05 2022
From: amalloy at google.com (Alan Malloy)
Date: Mon, 12 Sep 2022 13:20:05 -0700
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
Message-ID: <CAC1Wh8EjF+yW=28Cv7SZ2oW0OcwmnNy8cPk9Hhy4rj23PhT7DQ@mail.gmail.com>

It's nice to see this: I think it helps some with the previous discussion
on the list about "why do we want instanceof for primitives?" The point
isn't that we expect anyone to use instanceof for primitives often, but
that conversions for primitives in patterns are an important part of fixing
up the asymmetries switch is still stuck with.


From forax at univ-mlv.fr  Mon Sep 12 22:28:58 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 13 Sep 2022 00:28:58 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <ecdc3f34-d901-b8cc-442c-61c21135e953@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
 <a9f23c4e-b251-82ea-d283-569278792950@oracle.com>
 <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>
 <ecdc3f34-d901-b8cc-442c-61c21135e953@oracle.com>
Message-ID: <1080674635.3449120.1663021738258.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Sunday, September 11, 2022 4:48:04 PM
> Subject: Re: Primitives in instanceof and patterns

>>>
>>> I think you're falling into the trap of examining each conversion and
>>> asking "would I want a pattern to do this."
>> Given that only primitive widening casts are safe, allowing only primitive
>> widening is another way to answer to the question what a primitive type pattern
>> is.
>> You are proposing a semantics using range checks, that's the problem.
> 
> So, substitute "reference" for "primitive" in this argument, and you
> will see how silly it is: "since only reference widening is safe,
> allowing only reference widening would be 'another answer to what a
> reference type pattern is.'"  But that would also be a useless
> semantic.  You're caught up on "range checks", but that's not the
> important thing here.  Casting is the important thing.

In fact, primitive widening is not a good idea, see my answer about the constant pattern.

> 
>> As an example, instanceof rules and the rules about overriding methods
>> are intimately linked, asking if a method override another is
>> equivalent to asking if their function types are a subtypes.
>> if int instanceof double is allowed, then B::m should override A::m
>>    class A {
>>      int m() { ... }
>>    }
>>    class B extends A {
>>      @Override
>>      double m() { ... }
>>    }
>>
>> This is what i meant by changing other rules.
> 
> Another cute argument, but no.  Covariant overriding is linked to
> *subtyping*.  Instanceof *happens to coincide* with subtyping right now
> (given its ad-hoc restrictions), but the causality goes the other way.
> (Casting also appeals to subtyping, through reference widening
> conversions.)  But this argument is like starting with "all men are
> mortal" and "Socrates is a man" and concluding "All men are Socrates."
> 
> We can talk about whether it would be wise to align the definition of
> covariant overrides with conversions other than reference widening (and
> will likely come up again in Valhalla anyway), but this is by no means a
> forced move, and not tied to generalizing the semantics of instanceof.

If we can avoid ten different semantics for casting, patterns, overriding, etc., I think it's a win.

Valhalla is another can of worms, because you are prematurely assigning a semantics to instanceof int, so Valhalla cannot retcon instanceof int to instanceof Qjava/lang/Integer; even if, unlike the primitive type int, Qjava/lang/Integer; is an object.

There is another mismatch: int.class.isInstance(o) and o instanceof int are not aligned anymore.

> 
>> I found a way to explain clearly why a reference type pattern and a
>> primitive type pattern are different.
>>
>> Let suppose that the code compiles (to avoid the issues of the
>> separate compilation),
>> unlike a reference type pattern, the code executed for a primitive
>> type pattern is a function of *both* the declared type and the pattern
>> type.
> 
> So (a) untrue -- what code we execute for a reference type pattern does
> depend on the static types -- we may or may not generate an `instanceof`
> instruction, depending on whether the pattern is unconditional.  (The
> same is true for a cast; some casts are no-ops and generate no code.)

Please take a look at the examples: in both cases, if the code compiles, it means that the first pattern is conditional and the second unconditional.

> And (b), so what?  We're asking "would it be safe to cast x to T".
> Depending on the types X and T, we will have different code for the
> casting, so why is it unreasonable to have different code for asking
> whether it is castable?

see below

> 
>>
>> By example, if i have a code like this, i've no idea what code is
>> executed for case Foo(int i) without having to go to the declaration
>> of Foo which is usually not collocated with the switch itself.
>>
>>    Foo foo = ...
>>    switch (foo) {
>>       case Foo(int i) -> {}
>>       case Foo(double d) -> {}
>>    }
> 
> Sigh, this argument again?  We've been through this extensively the
> first time around, with reference types, where you "had no idea what
> this code means" without looking at the declaration of the pattern.
> (Then, it was partiality and totality.)  I get that you didn't like that
> total and partial patterns don't look syntactically different, and that
> ship has sailed.  But this is the same argument warmed over.

Nope, please take a look at the example: in both cases, if the code compiles, it means that the first pattern is conditional and the second unconditional.

> 
> "What code will be executed" is irrelevant; what is relevant is the
> semantics.  Assuming a single deconstruction pattern for Foo, the first
> case asks "can the Foo's component be cast safely to int, and if so,
> please cast it for me".  It doesn't matter what code we use to answer
> that question or do the cast -- could be a narrowing, could be an
> unboxing, whatever.

It matters because the first pattern is conditional, so it's important to know the condition, at least when you debug.

> 
> You see the same thing today without patterns:
> 
>     var x = foo.getFoo();
>     int i = (int) x;
> 
> x could be a long, an int, an Integer, etc, but you don't know unless
> you look at the definition of getFoo().  And you have "no idea what code
> will be executed."  Sure, but so what?  You asked for a cast to int.
> The language validated that x is castable to int, and does what needs to
> be done, which might be nothing, or a widening, or a narrowing with
> truncation, or an unboxing, or some combination.

There is a big difference between 

  var x = foo.getFoo();
  if (x instanceof int) { ... }

and the code above when you are reading the code.

The issue with the semantics you propose is that the pattern expresses a condition but the condition is hidden.

With a cast there is no condition; it is always executed.

> 
> (When we get to overloading deconstruction patterns, we'll have all the
> same issues as we have with overloading methods today -- it is not
> obvious looking only at the call site, which overload is called, and
> therefore which conversions are applied to arguments or returns.)

We do not need overloading of patterns!
I repeat.
We do not need overloading of patterns!

We need overloaded constructors because if the canonical constructor takes 3 arguments and we want a constructor with two, we have to provide a value.
In the case of pattern methods / deconstructors, we can match the three arguments but with an '_' for the argument we want to drop.


> 
> As a reminder, here's what a nested pattern means:
> 
>     x matches P(Q) === x matches P(var q) && q matches Q
> 
> Understanding what is going to happen involves understanding the type of
> `q`.  I get that you didn't like that choice, and that's your right, but
> it's not OK to keep bringing it up as if it's a new thing.

If the switch is exhaustive, the patterns below will usually (it's not fully true) help.
  switch(x) {
    case P(Q q) ->
    case P(R r) ->
  }  

The last one is unconditional, so the first pattern does an `r instanceof Q q`.

> 
> 
> I think I actually understand your concern here, which has nothing to do
> with the dozen or so bogus examples and explanations you've tossed out
> so far.  It is that cast conversion is complicated, and you would like
> pattern matching to be "simple", and so pulling in the muck of cast
> conversion into pattern matching feels to you like an unforced error.
> Right?  (And if so, perhaps you could have just said that, instead of
> throwing random arguments at the wall?)

Nope,
let me recapitulate.

1) having a primitive pattern do a range check is useless because it is rare that you want a range check + cast in real life.
  How many people have written code like this?

   int i = ...
   if (i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE) {
     byte b = (byte) i;
     ...
   }

  It's useful when you write a bytecode generator without using an existing library, ok, but how many people write a bytecode generator?
  It should not be the default behavior for the primitive type pattern.

2) It's also useless because there is no need to have it as a pattern, when you can use a cast in the following expression
   Person person = ...
   switch(person) {
     // instead of
     // case Person(double age) -> foo(age);
     // one can write
     case Person(int age) -> foo(age);  // widening cast
   }

3) when you read a conditional primitive pattern, you have no idea what the underlying operation is until you go to the declaration (unlike the code just above).


4) if we change the type pattern to be not just about subtyping, we should revisit the JLS to avoid having too many different semantics.

Rémi

From forax at univ-mlv.fr  Mon Sep 12 22:29:03 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 13 Sep 2022 00:29:03 +0200 (CEST)
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
Message-ID: <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Monday, September 12, 2022 9:36:09 PM
> Subject: Knocking off two more vestiges of legacy switch

> The work on primitive patterns continues to yield fruit; it points us to a
> principled way of dealing with _constant patterns_, both as nested patterns,
> and to redefining constant case labels as simple patterns. It also points us to
> a way to bring the missing three types into the realm of switch (since now
> switch is usable at every type _but_ these): float, double, and boolean. While
> I'm not in a hurry to prioritize this immediately, I wanted to connect the dots
> to how primitive type patterns lay the foundation for these two vestiges of
> legacy switch. (The remaining vestige, not yet dealt with, is that legacy
> statement switches are not exhaustive. We'd like a path back to uniformity
> there as well, but this is likely a longer road.)

> **Constant patterns.** In early explorations (e.g., "Pattern Matching
> Semantics"), we struggled with the meaning of constant patterns, specifically
> with conversions in the absence of a sharp type for the match target. The
> exploration of that document treated boxing conversions but not other
> conversions, which would have created a gratuitously new conversion context.
> This was one of several reasons we deferred constant patterns.

> The current status is that constant case labels (e.g., `case 3`) are permitted
> (a) only in the presence of a compatible operand type and (b) are not patterns.
> This has led to some accidental complexity in specifying switch, since we can
> have a mix of pattern and non-pattern labels, and it means we can't use
> constants as nested patterns. (We've also not yet integrated enum cases into
> the exhaustiveness analysis in the presence of a sealed type that permits an
> enum type.) Ret-conning all case labels as patterns seems attractive if we can
> make the semantics clear, as not only does it bring more uniformity, but it
> means we can use them as nested patterns, not just at the top level of the
> switch. More composition.

> The recent work on `instanceof` involving primitives offers a clear and
> principled meaning to `0` as a pattern; given a constant `c` of type `C`, treat

> x matches c

> as meaning

> x matches C alpha && alpha eq c

> where `eq` is a suitable comparison predicate for the type C (== for integral
> types and enums, .equals() for String, and something irritating for floating
> point.) This gives us a solid basis for interpreting something like `case 3L`;
> we match if the target would match `long alpha` and `alpha == 3L`. No new
> rules; all conversions are handled through the type pattern for the static type
> of the constant in question. Not coincidentally, the rules for primitive type
> patterns support the implicit conversions allowed in today's switches on
> `short`, `byte`, and `char`, which are allowed to use `int` labels, preserving
> the meaning of existing code while we generalize what switch means.

> The other attributes of patterns -- applicability, exhaustiveness, and dominance
> -- are also easy:

> - a constant pattern for `c : C` is applicable to S if a type pattern for `C` is
> applicable to S.
> - a type pattern for T dominates a constant pattern for `c : C` if the type
> pattern for T dominates a type pattern for C.
> - constant patterns are never exhaustive.

> No new rules; just appeal to type patterns.
It shows that the semantics you propose for the primitive type pattern is not the right one. 

Currently, code like this does not compile:

  byte b = ...
  switch(b) {
    case 200 -> ....
  }

because 200 is not a short which is great because otherwise at runtime it will never be reached. 

But if we apply the rules above + your definition of the primitive pattern, the code above will happily compile because it is equivalent to 

  byte b = ...
  switch(b) {
    case short s when s == 200 -> ....
  }

Moreover, I think R(true) and R(false) should be exhaustive. It's not a big deal, because you can rewrite it as R(true) and R (or R(_)), but I think that R(true) and R(false) is more readable.

> **Switch on float, double, and boolean.** Switches on floating point were left
> out for the obvious reason -- it just isn't that useful, and it would have
> introduced new complexity into the specification of switch. Similarly, boolean
> was left out because we have "if" statements. In the original world, where you
> could switch on only five types, this was a sensible compromise. We later added
> in String and enum types, which were sensible additions. But now we move into a
> world where we can switch on every type _except_ float, double, and boolean --
> and this no longer seems sensible. It still may not be something people will use
> often, but a key driver of the redesign of switch has been refactorability, and
> we currently don't have a story for refactoring

> record R(float f) { }

> switch (r) {
> case R(0f): ...
> case R(1f): ...
> }

> to

> switch (r) {
> case R rr:
> switch (rr.f()) {
> case 0f: ...
> case 1f: ...
> }
> }

> because we don't have switches on float. By retconning constant case labels as
> patterns, we don't have to define new semantics for switching on these types or
> for constant labels of these types, we only have to remove the restrictions
> about what types you can switch on.

> **Denoting constant patterns.** One of the remaining questions is how we denote
> constant patterns. This is a bit of a bikeshed, which we can come back to when
> we're ready to move forward. For purposes of exposition we'll use the constant
> literal here.
This is what Haskell does and what Caml doesn't; at some point we will have to pick a side.

> **Closing a compositional asymmetry.** In the "Patterns in the Java Object
> Model" document, we called attention to a glaring problem in API design, where
> it becomes nearly impossible to use the same sort of composition for taking
> apart objects that we use for putting them together. As an example, suppose we
> compose an `Optional<Shape>` as follows:

> Optional<Shape> os = Optional.of(Shape.redBall(1));

> Here, we have static factories for both Optional and Shape, they don't know
> about each other, but we can compose them just fine. Today, if we want to
> reverse that -- ask whether an `Optional<Shape>` contains a red ball of size 1,
> we have to do something awful and error prone:

> Shape s = os.orElse(null);
> boolean isRedUnitBall = s != null
> && s.isBall()
> && (s.color() == RED)
> && s.size() == 1;
> if (isRedUnitBall) { ... }

> These code snippets look nothing alike, making reversal harder and more
> error-prone, and it gets worse the deeper you compose. With destructuring
> patterns, this gets much better and more like the creation expression:

> if (os instanceof Optional.of(Shape.redBall(var size))
> && size == 1) { ... }

> but that `&& size == 1` was a pesky asymmetry. With constant patterns (modulo
> syntax), we can complete the transformation:

> if (os instanceof Optional.of(Shape.redBall(1))) { ... }

> and destructuring looks just like the aggregation.
I agree, it's quite sad that we have to support float and double but as you said composition is more important. 

> **Bonus round: the last (?) vestige.** Currently, we allow statement switches on
> legacy switch types (integers, their boxes, strings, and enums) with all
> constant labels to be partial, and require all other switches to be total.
> Patching this hole is harder, since there is lots of legacy code today that
> depends on this partiality. There are a few things we can do to pave the way
> forward here:

> - Allow `default -> ;` in addition to `default -> { }`, since people seem to
> have a hard time discovering the latter.
We should also fix that for lambdas. The lambda syntax and the case arrow syntax are currently not aligned: `() -> throw ...` is not legal while `case ... -> throw ...` is, which troubles a lot of my students (I also introduce the switch syntax before the lambda, so the lambda seems less powerful).

> - Issue a warning when a legacy switch construct is not exhaustive. This can
> start as a lint warning, move up to a regular warning over time, then a
> mandatory (unsuppressable) warning. Maybe in a decade it can become an error,
> but we can start paving the way sooner.

I agree with a switch warning if all the IDEs stop fixing the warning by adding a `default` when the type switched upon is sealed. 

Rémi

From brian.goetz at oracle.com  Mon Sep 12 22:48:42 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 12 Sep 2022 18:48:42 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <1080674635.3449120.1663021738258.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
 <a9f23c4e-b251-82ea-d283-569278792950@oracle.com>
 <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>
 <ecdc3f34-d901-b8cc-442c-61c21135e953@oracle.com>
 <1080674635.3449120.1663021738258.JavaMail.zimbra@u-pem.fr>
Message-ID: <27437ad6-7a87-580f-a593-9866f1ee8af5@oracle.com>


>> (When we get to overloading deconstruction patterns, we'll have all the
>> same issues as we have with overloading methods today -- it is not
>> obvious looking only at the call site, which overload is called, and
>> therefore which conversions are applied to arguments or returns.)
> We do not need overloading of patterns !
> I repeat.
> We do not need overloading of patterns !

You would be incorrect about that.

Deconstruction patterns are the dual of constructors. Pairing a 
constructor (or factory) with a deconstruction (or static) pattern forms 
an embedding-projection pair, which is what drives, e.g., the `with` 
construct. This is a very powerful relationship. Constructors can be 
overloaded; saying "but deconstructors can't" is just a gratuitous 
restriction, and it undermines the role of patterns as the dual of 
ctors/methods.

At the risk of repeating myself, it is clear that you just want pattern 
matching to be a much smaller feature. That's a valid opinion, but 
please stop with the "not a good idea" / "not needed" / "useless" / 
"YAGNI" at every turn. There's a big story here; you can agree with it 
or not, but please, please, please stop trying to talk down every part 
of the story because it seems "too big" to you. If you don't understand 
the big story, ask (constructive) questions. But please, please, stop 
trying to YAGNI away everything. It's not helpful.

> Nope,
> let me recapitulate.
>
> 1) having a primitive pattern doing a range check is useless because this is rare that you want to do a range check + cast in real life,
>    How many people have written a code like this
>
>     int i = ...
>     if (i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE) {
>       byte b = (byte) i;
>       ...
>     }
>
>    It's useful when you write a bytecode generator without using an existing library, ok, but how many write a bytecode generator ?
>    It should not be the default behavior for the primitive type pattern.
>
> 2) It's also useless because there is no need to have it as a pattern, when you can use a cast in the following expression
>     Person person = ...
>     switch(person) {
>       // instead of
>       // case Person(double age) -> foo(age);
>       // one can write
>       case Person(int age) -> foo(age);  // widening cast
>     }
>
> 3) when you read a conditional primitive patterns, you have no idea what is the underlying operation until you go to the declaration (unlike the code just above).
>
>
> 4) if we change the type pattern to be not just about subtyping, we should revisit the JLS to avoid to have too many different semantics.
>

Thanks for stating your concerns succinctly. (Some of this is just 
subjective "I want patterns to be a smaller feature"; some is 
disagreement with decisions that are already made.)


From brian.goetz at oracle.com  Mon Sep 12 22:57:40 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 12 Sep 2022 18:57:40 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
Message-ID: <fc173a17-09a6-961d-afc9-adefdfab5fe5@oracle.com>


>
> It shows that the semantics you propose for the primitive type pattern 
> is not the right one.
>
> Currently, a code like this does not compile
>   byte b = ...
>   switch(b) {
>     case 200 -> ....
>   }

Thanks, that's a good catch -- we currently do more type checking than a 
strict interpretation of this story for constant patterns provides. But 
this can be addressed by additional compile-time type checking for 
constant patterns.

But this would be a critique of _constant patterns_, not of primitive 
type patterns (and easily addressed.)

>
> because 200 is not a short which is great because otherwise at runtime 
> it will never be reached.

I think you mean "not a byte"?

>
> But if we apply the rules above + your definition of the primitive 
> pattern, the code above will happily compile because it is equivalent to
>
>   byte b = ...
>   switch(b) {
>     case short s when s == 200 -> ....
>   }

I think you mean "case int s when s == 200"?
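
To make that desugaring concrete, here is a minimal sketch in today's Java of how `case 200` against a `byte` target would behave under the "constants are patterns" reading, before any additional compile-time check is applied. The class and method names are hypothetical, and the widening assignment stands in for the `int alpha` type pattern.

```java
public class ConstantPatternSketch {
    // Emulates `b matches int alpha && alpha == constant` for a byte target.
    // The byte -> int widening always succeeds, so only the comparison can fail.
    static boolean matchesIntConstant(byte b, int constant) {
        int alpha = b;
        return alpha == constant;
    }

    // Returns true if at least one byte value would match the given int constant.
    static boolean reachable(int constant) {
        for (int v = Byte.MIN_VALUE; v <= Byte.MAX_VALUE; v++) {
            if (matchesIntConstant((byte) v, constant)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(reachable(100)); // true
        System.out.println(reachable(200)); // false
    }
}
```

No byte value reaches the `200` arm, which is why an extra compile-time check would be needed to keep rejecting it.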

>
> Moreover, i think R(true) and R(false) should be exhaustive, it's not 
> a big deal because you can rewrite it R(true) and R (or R(_)) but i 
> think that R(true) and R(false) is more readable.

Agree, that's in the plan. Booleans are like enums, so true/false 
covers boolean, and therefore R(true) and R(false) covers R(boolean).
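
The enum analogy can be seen in today's Java, where a switch expression over an enum is already accepted as exhaustive without a `default`. This is only a sketch: the `Flag` enum, the record `R`, and the `describe` method are made-up stand-ins for `boolean` and a switch over `R(boolean)`.

```java
public class BooleanLikeEnumSketch {
    // Hypothetical two-valued enum standing in for boolean.
    enum Flag { TRUE, FALSE }

    // Hypothetical record standing in for R(boolean).
    record R(Flag flag) { }

    // Both constants are listed, so the compiler accepts the switch
    // expression as exhaustive without a default arm.
    static String describe(R r) {
        return switch (r.flag()) {
            case TRUE -> "on";
            case FALSE -> "off";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new R(Flag.TRUE)));
        System.out.println(describe(new R(Flag.FALSE)));
    }
}
```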

>
> I agree, it's quite sad that we have to support float and double but 
> as you said composition is more important.

It would have been unfortunate if we had to add these as special cases 
for switch. But with primitive type patterns plus "constants are 
patterns" then this falls out trivially without additional 
specification; all we have to do is _remove_ the existing restriction.

>
>
>     **Bonus round: the last (?) vestige.** Currently, we allow
>     statement switches on legacy switch types (integers, their boxes,
>     strings, and enums) with all constant labels to be partial, and
>     require all other switches to be total. Patching this hole is
>     harder, since there is lots of legacy code today that depends on
>     this partiality. There are a few things we can do to pave the way
>     forward here:
>
>     - Allow `default -> ;` in addition to `default -> { }`, since
>     people seem to have a hard time discovering the latter.
>
>
> we should also fix that for lambdas, the fact that the lambda syntax 
> and the case arrow syntax are not aligned currently ; `() -> throw 
> ...`is not legal while `case ... -> throw ...` is, is something that 
> trouble a lot of my student (i also introduce the switch syntax before 
> the lambda, so the lambda seems less powerful).

Good thought.

>     - Issue a warning when a legacy switch construct is not
>     exhaustive. This can start as a lint warning, move up to a
>     regular warning over time, then a mandatory (unsuppressable)
>     warning. Maybe in a decade it can become an error, but we can
>     start paving the way sooner.
>
>
> I agree with a switch warning if all the IDEs stop fixing the warning 
> by adding a `default` when the type switched upon is sealed.

Right, I think over some time, IDEs will fix all the occurrences and 
then it is less disruptive to tighten.


From john.r.rose at oracle.com  Mon Sep 12 22:58:47 2022
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 12 Sep 2022 15:58:47 -0700
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
Message-ID: <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>

It's too harsh to say your example shows the semantics are just wrong.

I think they are right, but possibly incomplete.  The exclusion of case 
200 is the job of dead code detection logic in the language, the same 
kind of logic that also reports an error on `"foo" instanceof List`.

Then there are the old murky rules that allow an integral constant like 
100 to assign to `byte` only because 100 fits in the byte range while 
200 does not.  The duals of those rules will surely speak to the 
restriction of `case 200:` matching a byte.
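
For reference, the existing assignment rule can be shown in a few lines. This is a sketch; `fitsInByte` is a hypothetical helper that mirrors the compile-time representability check and is not an existing API.

```java
public class ConstantNarrowingSketch {
    // Mirrors the check applied to an int constant assigned to a byte:
    // the value must be representable in the byte range.
    static boolean fitsInByte(int constant) {
        return constant >= Byte.MIN_VALUE && constant <= Byte.MAX_VALUE;
    }

    public static void main(String[] args) {
        byte ok = 100;            // compiles: the constant 100 fits in byte
        // byte bad = 200;        // rejected by javac: possible lossy conversion from int to byte
        byte forced = (byte) 200; // an explicit cast compiles; the value wraps to -56

        System.out.println(ok + " " + forced);   // 100 -56
        System.out.println(fitsInByte(100));     // true
        System.out.println(fitsInByte(200));     // false
    }
}
```

A dual rule for patterns would presumably reject `case 200` on a `byte` target for the same reason the commented-out assignment is rejected.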

On 12 Sep 2022, at 15:29, Remi Forax wrote:

>> No new rules; just appeal to type patterns.

> It shows that the semantics you propose for the primitive type pattern 
> is not the right one.
>
> Currently, a code like this does not compile
> byte b = ...
> switch(b) {
> case 200 -> ....
> }
>
> because 200 is not a short which is great because otherwise at runtime 
> it will never be reached.
>
> But if we apply the rules above + your definition of the primitive 
> pattern, the code above will happily compile because it is equivalent 
> to
>
> byte b = ...
> switch(b) {
> case short s when s == 200 -> ....
> }

From forax at univ-mlv.fr  Tue Sep 13 14:06:17 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 13 Sep 2022 16:06:17 +0200 (CEST)
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
Message-ID: <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>

> From: "John Rose" <john.r.rose at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts"
> <amber-spec-experts at openjdk.java.net>
> Sent: Tuesday, September 13, 2022 12:58:47 AM
> Subject: Re: Knocking off two more vestiges of legacy switch

> It's too harsh to say your example shows the semantics are just wrong.

Yes, it's more that there are inconsistencies.

> I think they are right, but possibly incomplete. The exclusion of case 200 is
> the job of dead code detection logic in the language, the same kind of logic
> that also reports an error on "foo" instanceof List .

> Then there are the old murky rules that allow an integral constant like 100 to
> assign to byte only because 100 fits in the byte range while 200 does not. The
> duals of those rules will surely speak to the restriction of case 200: matching
> a byte.

The problem with that approach is that the semantics of constant patterns and the semantics of primitive type patterns will not be aligned, 
so if you have both patterns in a switch, users will spot the inconsistency.

something like

  byte b = ...
  switch(b) {
    case 200 -> ... // does not compile, incompatible types between byte and int
    case int i -> ... // ok, compiles
  }

So I agree that we should have primitive type patterns, but instead of using the casting rules as the model, the current rules, complemented with boolean, long, float and double, seem a better fit.

Compared to what Brian proposed, it means all primitive patterns are unconditional, apart from unboxing if the pattern is not total (the same way a reference type pattern works with null).

Rémi

> On 12 Sep 2022, at 15:29, Remi Forax wrote:

>>> No new rules; just appeal to type patterns.
>> It shows that the semantics you propose for the primitive type pattern is not
>> the right one.

>> Currently, a code like this does not compile
>> byte b = ...
>> switch(b) {
>> case 200 -> ....
>> }

>> because 200 is not a short which is great because otherwise at runtime it will
>> never be reached.

>> But if we apply the rules above + your definition of the primitive pattern, the
>> code above will happily compile because it is equivalent to

>> byte b = ...
>> switch(b) {
>> case short s when s == 200 -> ....
>> }

From heidinga at redhat.com  Tue Sep 13 14:13:48 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Tue, 13 Sep 2022 10:13:48 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
 <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>
Message-ID: <CAJq4Gi7g++ttjJXm2=bz3ASqGKyLMFv2HM6r_u38BGc5-3RAKA@mail.gmail.com>

On Tue, Sep 13, 2022 at 10:08 AM <forax at univ-mlv.fr> wrote:

>
>
> ------------------------------
>
> *From: *"John Rose" <john.r.rose at oracle.com>
> *To: *"Remi Forax" <forax at univ-mlv.fr>
> *Cc: *"Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts" <
> amber-spec-experts at openjdk.java.net>
> *Sent: *Tuesday, September 13, 2022 12:58:47 AM
> *Subject: *Re: Knocking off two more vestiges of legacy switch
>
> It's too harsh to say your example shows the semantics are just wrong.
>
>
> yes, it's more than there is inconsistencies
>
> I think they are right, but possibly incomplete. The exclusion of case 200
> is the job of dead code detection logic in the language, the same kind of
> logic that also reports an error on "foo" instanceof List.
>
> Then there are the old murky rules that allow an integral constant like
> 100 to assign to byte only because 100 fits in the byte range while 200
> does not. The duals of those rules will surely speak to the restriction of case
> 200: matching a byte.
>
>
> The problem with that approach is that the semantics of constant patterns
> and the semantics of primitive type patterns will be not aligned,
> so if you have both pattern in a switch, users will spot the inconsistency.
>
> something like
>   byte b = ...
>   switch(b) {
>     case 200 ->  ... // does not compile, incompatible types between byte
> and int
>     case int i -> ... // ok, compiles
>   }
>

I've been following along on this discussion and I'm not sure what the
inconsistency here is.  Remi, can you clarify?

As a developer, the semantics here are intuitive - I can't have a (signed)
byte that matches 200 so as John said earlier, it's clearly dead code.  On
the other hand, bytes can always be converted to an int so it makes sense
that the `case int i` both compiles and matches to the byte.  Can you
expand on why users would find that confusing?

--Dan


>
> So i agree that we should have primitive type patterns but instead of
> using the casting rules as model, the actual rules complemented with
> boolean, long, float and double seems a better fit.
>
> Compared to what Brian proposed, it means all primitive patterns are
> unconditional apart unboxing if the pattern is not total (the same way
> reference type pattern works with null).
>
> Rémi
>
> On 12 Sep 2022, at 15:29, Remi Forax wrote:
>
> No new rules; just appeal to type patterns.
>
> It shows that the semantics you propose for the primitive type pattern is
> not the right one.
>
> Currently, a code like this does not compile
> byte b = ...
> switch(b) {
> case 200 -> ....
> }
>
> because 200 is not a short which is great because otherwise at runtime it
> will never be reached.
>
> But if we apply the rules above + your definition of the primitive
> pattern, the code above will happily compile because it is equivalent to
>
> byte b = ...
> switch(b) {
> case short s when s == 200 -> ....
> }
>
>
>

From forax at univ-mlv.fr  Tue Sep 13 14:51:47 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 13 Sep 2022 16:51:47 +0200 (CEST)
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <CAJq4Gi7g++ttjJXm2=bz3ASqGKyLMFv2HM6r_u38BGc5-3RAKA@mail.gmail.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
 <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>
 <CAJq4Gi7g++ttjJXm2=bz3ASqGKyLMFv2HM6r_u38BGc5-3RAKA@mail.gmail.com>
Message-ID: <1636267230.4107746.1663080707641.JavaMail.zimbra@u-pem.fr>

> From: "Dan Heidinga" <heidinga at redhat.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "John Rose" <john.r.rose at oracle.com>, "Brian Goetz"
> <brian.goetz at oracle.com>, "amber-spec-experts"
> <amber-spec-experts at openjdk.java.net>
> Sent: Tuesday, September 13, 2022 4:13:48 PM
> Subject: Re: Knocking off two more vestiges of legacy switch

> On Tue, Sep 13, 2022 at 10:08 AM <forax at univ-mlv.fr> wrote:

>>> From: "John Rose" <john.r.rose at oracle.com>
>>> To: "Remi Forax" <forax at univ-mlv.fr>
>>> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts"
>>> <amber-spec-experts at openjdk.java.net>
>>> Sent: Tuesday, September 13, 2022 12:58:47 AM
>>> Subject: Re: Knocking off two more vestiges of legacy switch

>>> It's too harsh to say your example shows the semantics are just wrong.

>> yes, it's more than there is inconsistencies

>>> I think they are right, but possibly incomplete. The exclusion of case 200 is
>>> the job of dead code detection logic in the language, the same kind of logic
>>> that also reports an error on "foo" instanceof List .

>>> Then there are the old murky rules that allow an integral constant like 100 to
>>> assign to byte only because 100 fits in the byte range while 200 does not. The
>>> duals of those rules will surely speak to the restriction of case 200: matching
>>> a byte.

>> The problem with that approach is that the semantics of constant patterns and
>> the semantics of primitive type patterns will be not aligned,
>> so if you have both pattern in a switch, users will spot the inconsistency.

>> something like
>> byte b = ...
>> switch(b) {
>> case 200 -> ... // does not compile, incompatible types between byte and int
>> case int i -> ... // ok, compiles
>> }

> I've been following along on this discussion and I'm not sure what the
> inconsistency here is. Remi, can you clarify?

> As a developer, the semantics here are intuitive - I can't have a (signed) byte
> that matches 200 so as John said earlier, it's clearly dead code. On the other
> hand, bytes can always be converted to an int so it makes sense that the `case
> int i` both compiles and matches to the byte. Can you expand on why users would
> find that confusing?

The error message from javac says the types are incompatible.

> --Dan

Rémi

>> So i agree that we should have primitive type patterns but instead of using the
>> casting rules as model, the actual rules complemented with boolean, long, float
>> and double seems a better fit.

>> Compared to what Brian proposed, it means all primitive patterns are
>> unconditional apart unboxing if the pattern is not total (the same way
>> reference type pattern works with null).

>> Rémi

>>> On 12 Sep 2022, at 15:29, Remi Forax wrote:

>>>>> No new rules; just appeal to type patterns.
>>>> It shows that the semantics you propose for the primitive type pattern is not
>>>> the right one.

>>>> Currently, a code like this does not compile
>>>> byte b = ...
>>>> switch(b) {
>>>> case 200 -> ....
>>>> }

>>>> because 200 is not a short which is great because otherwise at runtime it will
>>>> never be reached.

>>>> But if we apply the rules above + your definition of the primitive pattern, the
>>>> code above will happily compile because it is equivalent to

>>>> byte b = ...
>>>> switch(b) {
>>>> case short s when s == 200 -> ....
>>>> }

From heidinga at redhat.com  Tue Sep 13 15:31:19 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Tue, 13 Sep 2022 11:31:19 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <1636267230.4107746.1663080707641.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
 <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>
 <CAJq4Gi7g++ttjJXm2=bz3ASqGKyLMFv2HM6r_u38BGc5-3RAKA@mail.gmail.com>
 <1636267230.4107746.1663080707641.JavaMail.zimbra@u-pem.fr>
Message-ID: <CAJq4Gi4aWpTFbegJcpiRLcPgwh+y0qK=f1dEunBMYMwhNZSQ3g@mail.gmail.com>

On Tue, Sep 13, 2022 at 11:01 AM <forax at univ-mlv.fr> wrote:

>
>
> ------------------------------
>
> *From: *"Dan Heidinga" <heidinga at redhat.com>
> *To: *"Remi Forax" <forax at univ-mlv.fr>
> *Cc: *"John Rose" <john.r.rose at oracle.com>, "Brian Goetz" <
> brian.goetz at oracle.com>, "amber-spec-experts" <
> amber-spec-experts at openjdk.java.net>
> *Sent: *Tuesday, September 13, 2022 4:13:48 PM
> *Subject: *Re: Knocking off two more vestiges of legacy switch
>
>
>
> On Tue, Sep 13, 2022 at 10:08 AM <forax at univ-mlv.fr> wrote:
>
>>
>>
>> ------------------------------
>>
>> *From: *"John Rose" <john.r.rose at oracle.com>
>> *To: *"Remi Forax" <forax at univ-mlv.fr>
>> *Cc: *"Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts" <
>> amber-spec-experts at openjdk.java.net>
>> *Sent: *Tuesday, September 13, 2022 12:58:47 AM
>> *Subject: *Re: Knocking off two more vestiges of legacy switch
>>
>> It's too harsh to say your example shows the semantics are just wrong.
>>
>>
>> yes, it's more than there is inconsistencies
>>
>> I think they are right, but possibly incomplete. The exclusion of case
>> 200 is the job of dead code detection logic in the language, the same kind
>> of logic that also reports an error on "foo" instanceof List.
>>
>> Then there are the old murky rules that allow an integral constant like
>> 100 to assign to byte only because 100 fits in the byte range while 200
>> does not. The duals of those rules will surely speak to the restriction of case
>> 200: matching a byte.
>>
>>
>> The problem with that approach is that the semantics of constant patterns
>> and the semantics of primitive type patterns will be not aligned,
>> so if you have both pattern in a switch, users will spot the
>> inconsistency.
>>
>> something like
>>   byte b = ...
>>   switch(b) {
>>     case 200 ->  ... // does not compile, incompatible types between byte
>> and int
>>     case int i -> ... // ok, compiles
>>   }
>>
>
> I've been following along on this discussion and I'm not sure what the
> inconsistency here is.  Remi, can you clarify?
>
> As a developer, the semantics here are intuitive - I can't have a (signed)
> byte that matches 200 so as John said earlier, it's clearly dead code.  On
> the other hand, bytes can always be converted to an int so it makes sense
> that the `case int i` both compiles and matches to the byte.  Can you
> expand on why users would find that confusing?
>
>
> The error messages of javac says the types are incompatible.
>
>
Ok.  So the concern is with the error messages produced by javac?  That
seems fixable but also a separate issue from whether the semantics being
proposed are a good path forward.

And at least jshell is quite clear in the message it produces for similar
code today so this may be a non-issue.

jshell> byte b = 200
|  Error:
|  incompatible types: possible lossy conversion from int to byte
|  byte b = 200;
|           ^-^


--Dan


>
> --Dan
>
>
> Rémi
>
>
>
>>
>> So i agree that we should have primitive type patterns but instead of
>> using the casting rules as model, the actual rules complemented with
>> boolean, long, float and double seems a better fit.
>>
>> Compared to what Brian proposed, it means all primitive patterns are
>> unconditional apart unboxing if the pattern is not total (the same way
>> reference type pattern works with null).
>>
>> Rémi
>>
>> On 12 Sep 2022, at 15:29, Remi Forax wrote:
>>
>> No new rules; just appeal to type patterns.
>>
>> It shows that the semantics you propose for the primitive type pattern is
>> not the right one.
>>
>> Currently, a code like this does not compile
>> byte b = ...
>> switch(b) {
>> case 200 -> ....
>> }
>>
>> because 200 is not a short which is great because otherwise at runtime it
>> will never be reached.
>>
>> But if we apply the rules above + your definition of the primitive
>> pattern, the code above will happily compile because it is equivalent to
>>
>> byte b = ...
>> switch(b) {
>> case short s when s == 200 -> ....
>> }
>>
>>
>

From joe.darcy at oracle.com  Tue Sep 13 16:48:15 2022
From: joe.darcy at oracle.com (Joe Darcy)
Date: Tue, 13 Sep 2022 09:48:15 -0700
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
Message-ID: <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com>


On 9/12/2022 3:29 PM, Remi Forax wrote:
>
>
> ------------------------------------------------------------------------
>
>     *From: *"Brian Goetz" <brian.goetz at oracle.com>
>     *To: *"amber-spec-experts" <amber-spec-experts at openjdk.java.net>
>     *Sent: *Monday, September 12, 2022 9:36:09 PM
>     *Subject: *Knocking off two more vestiges of legacy switch
>
>
[snip]
>
> I agree, it's quite sad that we have to support float and double but 
> as you said composition is more important.
>
It is common for math library methods to have a preamble to screen out 
special values (infinities, NaN, 0.0, 1.0, etc.).

This would be a reasonable use of a switch on float/double.

-Joe

From brian.goetz at oracle.com  Tue Sep 13 16:55:49 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 13 Sep 2022 12:55:49 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com>
Message-ID: <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>


> It is common for math library methods to have a preamble to screen out 
> special values (infinities, NaN, 0.0, 1.0, etc.).
>
> This would be a reasonable use of a switch on float/double switch.
>
>

Which raises some questions (again) of the semantics of constant 
patterns for exotic floating point values, especially (again) negative zero.

From forax at univ-mlv.fr  Tue Sep 13 16:59:40 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 13 Sep 2022 18:59:40 +0200 (CEST)
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com>
 <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
Message-ID: <1086150907.4171803.1663088380655.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "joe darcy" <joe.darcy at oracle.com>, "Amber Expert Group Observers" <amber-spec-observers at openjdk.org>, "Remi Forax"
> <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Tuesday, September 13, 2022 6:55:49 PM
> Subject: Re: Knocking off two more vestiges of legacy switch

>> It is common for math library methods to have a preamble to screen out
>> special values (infinities, NaN, 0.0, 1.0, etc.).
>>
>> This would be a reasonable use of a switch on float/double switch.
>>
>>
> 
> Which raises some questions (again) of the semantics of constant
> patterns for exotic floating point values, especially (again) negative zero.

You mean, do we use == or Float.equals()/Double.equals() ?

I would vote for the latter, as with records.
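
A minimal sketch of where the two choices diverge, using only standard
java.lang.Float behavior; the class and method names are made up for
illustration:

```java
// Sketch: how == and Float.equals differ on the "exotic" values in question.
final class FloatEqualityDemo {
    // Primitive comparison follows IEEE 754: NaN != NaN, and 0.0 == -0.0.
    static boolean primitiveEq(float a, float b) {
        return a == b;
    }

    // Float.equals compares the floatToIntBits representations instead,
    // so NaN equals NaN and 0.0 does not equal -0.0.
    static boolean boxedEq(float a, float b) {
        return Float.valueOf(a).equals(Float.valueOf(b));
    }

    public static void main(String[] args) {
        System.out.println(primitiveEq(Float.NaN, Float.NaN)); // false
        System.out.println(boxedEq(Float.NaN, Float.NaN));     // true
        System.out.println(primitiveEq(0.0f, -0.0f));          // true
        System.out.println(boxedEq(0.0f, -0.0f));              // false
    }
}
```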

Rémi

From joe.darcy at oracle.com  Tue Sep 13 17:07:45 2022
From: joe.darcy at oracle.com (Joe Darcy)
Date: Tue, 13 Sep 2022 10:07:45 -0700
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com>
 <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
Message-ID: <f0ce4889-f9f1-99fa-f5a4-410d04ce6249@oracle.com>

On 9/13/2022 9:55 AM, Brian Goetz wrote:
>
>> It is common for math library methods to have a preamble to screen 
>> out special values (infinities, NaN, 0.0, 1.0, etc.).
>>
>> This would be a reasonable use of a switch on float/double switch.
>>
>>
>
> Which raises some questions (again) of the semantics of constant 
> patterns for exotic floating point values, especially (again) negative 
> zero.


In a switching context, I think there is a stronger case for 
distinguishing between +0.0 and -0.0. The operational semantics I'd 
recommend are to desugar, say a float switch, to an int switch on the 
Float.floatToIntBits mapping of the float case labels. 
Float.floatToIntBits, as opposed to Float.floatToRawIntBits, normalizes 
all NaN representations to a single value.
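
As a rough sketch of that desugaring written by hand (this is not what
javac emits; the class name, method name, and returned labels are
assumptions for illustration), the case labels become the int bit
patterns of the float constants:

```java
// Sketch: a float "switch" expressed as an int switch over floatToIntBits.
final class FloatSwitchSketch {
    static String classify(float f) {
        // floatToIntBits collapses every NaN to the canonical 0x7fc00000,
        // while +0.0 and -0.0 keep their distinct bit patterns.
        return switch (Float.floatToIntBits(f)) {
            case 0x00000000 -> "+0.0";
            case 0x80000000 -> "-0.0";
            case 0x7fc00000 -> "NaN";
            case 0x7f800000 -> "+Infinity";
            case 0xff800000 -> "-Infinity";
            default -> "ordinary";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(0.0f));
        System.out.println(classify(-0.0f));
        // A NaN with a non-canonical payload still lands on the NaN label.
        System.out.println(classify(Float.intBitsToFloat(0x7fc00001)));
    }
}
```

With Float.floatToRawIntBits in place of Float.floatToIntBits, a NaN
with a non-canonical payload could miss the NaN label and fall through
to the default.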

-Joe

From forax at univ-mlv.fr  Tue Sep 13 17:31:03 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 13 Sep 2022 19:31:03 +0200 (CEST)
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <CAJq4Gi7g++ttjJXm2=bz3ASqGKyLMFv2HM6r_u38BGc5-3RAKA@mail.gmail.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
 <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>
 <CAJq4Gi7g++ttjJXm2=bz3ASqGKyLMFv2HM6r_u38BGc5-3RAKA@mail.gmail.com>
Message-ID: <1054001531.4176794.1663090263206.JavaMail.zimbra@u-pem.fr>

> From: "Dan Heidinga" <heidinga at redhat.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "John Rose" <john.r.rose at oracle.com>, "Brian Goetz"
> <brian.goetz at oracle.com>, "amber-spec-experts"
> <amber-spec-experts at openjdk.java.net>
> Sent: Tuesday, September 13, 2022 4:13:48 PM
> Subject: Re: Knocking off two more vestiges of legacy switch

> On Tue, Sep 13, 2022 at 10:08 AM <forax at univ-mlv.fr> wrote:

>>> From: "John Rose" <john.r.rose at oracle.com>
>>> To: "Remi Forax" <forax at univ-mlv.fr>
>>> Cc: "Brian Goetz" <brian.goetz at oracle.com>, "amber-spec-experts"
>>> <amber-spec-experts at openjdk.java.net>
>>> Sent: Tuesday, September 13, 2022 12:58:47 AM
>>> Subject: Re: Knocking off two more vestiges of legacy switch

>>> It's too harsh to say your example shows the semantics are just wrong.

>> yes, it's more that there are inconsistencies

>>> I think they are right, but possibly incomplete. The exclusion of case 200 is
>>> the job of dead code detection logic in the language, the same kind of logic
>>> that also reports an error on "foo" instanceof List .

>>> Then there are the old murky rules that allow an integral constant like 100 to
>>> assign to byte only because 100 fits in the byte range while 200 does not. The
>>> duals of those rules will surely speak to the restriction of case 200: matching
>>> a byte.

>> The problem with that approach is that the semantics of constant patterns and
>> the semantics of primitive type patterns will be not aligned,
>> so if you have both pattern in a switch, users will spot the inconsistency.

>> something like
>> byte b = ...
>> switch(b) {
>> case 200 -> ... // does not compile, incompatible types between byte and int
>> case int i -> ... // ok, compiles
>> }

> I've been following along on this discussion and I'm not sure what the
> inconsistency here is. Remi, can you clarify?

> As a developer, the semantics here are intuitive - I can't have a (signed) byte
> that matches 200 so as John said earlier, it's clearly dead code. On the other
> hand, bytes can always be converted to an int so it makes sense that the `case
> int i` both compiles and matches to the byte. Can you expand on why users would
> find that confusing?

So my main concern remains that
String s = ... 
switch(s) { 
case Comparable<?> c -> ... 
case Object o -> ... 
} 

and 
long l = ... 
switch(l) { 
case float f -> ... 
case double d -> ... 
} 

behave differently. 

> --Dan

Rémi

>> So i agree that we should have primitive type patterns but instead of using the
>> casting rules as model, the actual rules complemented with boolean, long, float
>> and double seems a better fit.

>> Compared to what Brian proposed, it means all primitive patterns are
>> unconditional apart unboxing if the pattern is not total (the same way
>> reference type pattern works with null).

>> Rémi

>>> On 12 Sep 2022, at 15:29, Remi Forax wrote:

>>>>> No new rules; just appeal to type patterns.
>>>> It shows that the semantics you propose for the primitive type pattern is not
>>>> the right one.

>>>> Currently, a code like this does not compile
>>>> byte b = ...
>>>> switch(b) {
>>>> case 200 -> ....
>>>> }

>>>> because 200 is not a short which is great because otherwise at runtime it will
>>>> never be reached.

>>>> But if we apply the rules above + your definition of the primitive pattern, the
>>>> code above will happily compile because it is equivalent to

>>>> byte b = ...
>>>> switch(b) {
>>>> case short s when s == 200 -> ....
>>>> }

From brian.goetz at oracle.com  Tue Sep 13 17:31:38 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 13 Sep 2022 13:31:38 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <f0ce4889-f9f1-99fa-f5a4-410d04ce6249@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com>
 <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
 <f0ce4889-f9f1-99fa-f5a4-410d04ce6249@oracle.com>
Message-ID: <c69d381c-a446-afef-1987-cbb3a5a462f0@oracle.com>


>> Which raises some questions (again) of the semantics of constant 
>> patterns for exotic floating point values, especially (again) 
>> negative zero.
>
>
> In a switching context, I think there is a stronger case for 
> distinguishing between +0.0 and -0.0. The operational semantics I'd 
> recommend are to desugar, say a float switch, to an int switch on the 
> Float.floatToIntBits mapping of the float case labels. 
> Float.floatToIntBits, as opposed to Float.floatToRawIntBits, 
> normalized all NaN representations to a single value.
>

This sounds right to me, but it's not just about switch -- this would 
have to be the case for all constant patterns, such as

    if (x instanceof FloatHolder(Float.NaN)) { ... }

But I think your argument still applies here as well.

From brian.goetz at oracle.com  Tue Sep 13 18:14:52 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 13 Sep 2022 14:14:52 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <27437ad6-7a87-580f-a593-9866f1ee8af5@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <cb9764c1-f1ec-ee08-9e85-30d56cbe5f21@oracle.com>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
 <a9f23c4e-b251-82ea-d283-569278792950@oracle.com>
 <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>
 <ecdc3f34-d901-b8cc-442c-61c21135e953@oracle.com>
 <1080674635.3449120.1663021738258.JavaMail.zimbra@u-pem.fr>
 <27437ad6-7a87-580f-a593-9866f1ee8af5@oracle.com>
Message-ID: <7fce011b-f87d-07d8-6986-379e9755a2a7@oracle.com>

I'm going to try and address these points *for the benefit of everyone 
else*.  (Note to Remi only: this is not an invitation to continue the 
back and forth, as doing so would likely be unconstructive unless you 
have something either (a) radically new that no one has thought of yet 
and/or (b) something that is so obviously right and compelling that I 
will immediately weep with embarrassment for how wrong I was.  That's 
the bar at this point.  I get that you hate this feature.  You've made 
that manifestly clear.  But unless you have some significantly new light 
to shed on it, it is unconstructive to just keep banging this drum, and 
you are creating an environment where others feel less comfortable 
sharing their thoughts, which is unacceptable.)

> 1) having a primitive pattern doing a range check is useless because 
> this is rare that you want to do a range check + cast in real life,
>    How many people have written a code like this
>
>     int i = ...
>     if (i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE) {
>       byte b = (byte) i;
>       ...
>     }
>
>    It's useful when you write a bytecode generator without using an 
> existing library, ok, but how many write a bytecode generator ?
>    It should not be the default behavior for the primitive type pattern.

This argument stems from a misunderstanding of what we are trying to 
accomplish here.  Yes, it is correct that `case byte b` is not something 
everyone will use (I have written this many times, though I admit this 
is probably unusual.)  But that's not the point of this exercise; the 
point of the exercise is uniformity, in part because the lack of 
uniformity is complexity, and in part we want to offer new semantic 
symmetries that programmers can count on.  You are trying to tinker at 
the margins, asking if each conversion carries its weight; that's a 
recipe for creating new, ad-hoc complexity surface.  Sometimes that's 
the right move, and sometimes it is unavoidable, but there is such an 
obviously correct interpretation of primitive instanceof here -- "would 
a cast to this type be safe" -- that it would be an unforced error to 
opt for the ad-hoc complexity just because you can't imagine using it 
that often.

If I have a record:

    record R(int x) { }

I can construct it with

    new R(aShort)

but under the strict semantics of primitive type patterns, I cannot 
deconstruct it with

    case R(short s) { }

which would ask: "could this record have come from a constructor 
invocation `new R(s)`".  And this is gratuitously different than the 
corresponding case with reference widening:

    record S(Object o) { }

    S s = new S("foo");
    if (s instanceof S(String ss)) { ... }

Further, I take objection to your continued characterization of this as 
a "range check", as this is a mischaracterization as well as minimizing 
what is going on.  Casting subsumes boxing and unboxing as well as 
widening and narrowing, so a more correct characterization would be 
"could I cast this without loss or error to a short".  Which applies not 
only to wider and narrower types, but to types like Short and Object.  
Just like `instanceof` for reference types, which asks whether the type 
could be cast to another type.  And without creating a new context for 
what is allowable.
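
For the narrowing part of that question only, a hand-written sketch of
"could I cast this without loss" might look like the following. The
helper names are hypothetical, and this is an approximation of the
proposed pattern semantics, not the specification of them; boxing and
unboxing are not covered.

```java
// Sketch: "would a narrowing cast be lossless?" checked by round-tripping.
final class ExactCastSketch {
    // True when narrowing to byte and widening back yields the same int.
    static boolean fitsByte(int i) {
        return i == (byte) i;
    }

    // Same check for short.
    static boolean fitsShort(int i) {
        return i == (short) i;
    }

    public static void main(String[] args) {
        System.out.println(fitsByte(100));  // true
        System.out.println(fitsByte(200));  // false: (byte) 200 is -56
        System.out.println(fitsShort(200)); // true
    }
}
```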

Not only is the term "useless" unconstructive, but it is not even the 
right measure.  The bar here is not "would people use it a lot."  We're 
making the language simpler by making it more uniform. To say "let's 
gratuitously knock some of the boxes out of the cast matrix because I 
can't imagine using them" only makes the language more complicated.

> 2) It's also useless because there is no need to have it as a pattern, 
> when you can use a cast in the following expression
>     Person person = ...
>     switch(person) {
>       // instead of
>       // case Person(double age) -> foo(age);
>       // one can write
>       case Person(int age) -> foo(age);  // widening cast
>     }

Same argument (also you got your example backwards).  I get that you 
think it's fine to have to do this, but it is yet another gratuitous 
asymmetry between aggregation and destructuring that confuses people 
about how destructuring works.  Why can you pass an int or a double to 
`new Person`, but could only take a `double` out?  Whereas with 
Object/String, you could take either out?

Again, this is gratuitous complexity, which I think is rooted in your 
unwillingness to let go of "instanceof means subtype."  Sorry, it 
doesn't any more (but it means something that generalizes it.)

> 3) when you read a conditional primitive patterns, you have no idea 
> what is the underlying operation until you go to the declaration 
> (unlike the code just above).

This is the same complaint you had in the past about partial and total 
nested patterns.  As I've said, I understand why you find it 
uncomfortable ("action at a distance"), but we evaluated the pros and 
cons extensively already, and we made our decision.  There's no reason 
to reopen it here, nor are the considerations any different in this case.

> 4) if we change the type pattern to be not just about subtyping, we 
> should revisit the JLS to avoid to have too many different semantics.

This is FUD, implying that we are going to have to reexamine 
everything.  I don't buy it.  Many of the things that lean on subtyping 
today are just ... subtyping.  And the things that have conversions 
involving primitives already lean on conversions and contexts.

By way of concrete example, you raised the question about covariant 
overrides.? Which was a good example, and which I appreciate, but I wish 
you would have raised it differently.

A constructive way to raise this would be: "Do we also want to reexamine 
covariant overrides to use castability (or some other criteria) rather 
than subtyping?"

An unconstructive way to raise this would be: "This feature is bad, look 
at the problems you are creating for covariant overrides, everything 
will have to be reexamined."


From heidinga at redhat.com  Tue Sep 13 18:42:23 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Tue, 13 Sep 2022 14:42:23 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <1054001531.4176794.1663090263206.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
 <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>
 <CAJq4Gi7g++ttjJXm2=bz3ASqGKyLMFv2HM6r_u38BGc5-3RAKA@mail.gmail.com>
 <1054001531.4176794.1663090263206.JavaMail.zimbra@u-pem.fr>
Message-ID: <CAJq4Gi7+x86tat7LBq522U0rbo8yDggmuWifsdabaf1SSRowXg@mail.gmail.com>

<snip>

>
> So my main concern stay that
>   String s = ...
>   switch(s) {
>     case Comparable<?> c -> ... //Dan: matches here as String implements
> Comparable (this case is total on "s" so no further matching)
>     case Object o -> ...
>   }
>
> and
>   long l = ...
>   switch(l) {
>     case float f -> ...  //Dan: matches here if l is convertable to a float
>     case double d -> ... //Dan: otherwise matches here
>   }
>
> behave differently.
>
>
In each case, we're finding the switch case that the value is compatible
with.  Another way to say it is the value is convertable to... or castable
to.  Can you expand on what you mean by "behave differently"?

I'm still working on reading through the "big picture" presentation in [0]
so if there's a particular section there that you think is relevant, I can
re-read that first.  It might be useful for both of us to re-read it and
see how this example fits with the bigger picture being proposed for
pattern matching.

--Dan

[0]
https://openjdk.org/projects/amber/design-notes/patterns/pattern-match-object-model

From forax at univ-mlv.fr  Tue Sep 13 19:15:38 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 13 Sep 2022 21:15:38 +0200 (CEST)
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <CAJq4Gi7+x86tat7LBq522U0rbo8yDggmuWifsdabaf1SSRowXg@mail.gmail.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
 <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>
 <CAJq4Gi7g++ttjJXm2=bz3ASqGKyLMFv2HM6r_u38BGc5-3RAKA@mail.gmail.com>
 <1054001531.4176794.1663090263206.JavaMail.zimbra@u-pem.fr>
 <CAJq4Gi7+x86tat7LBq522U0rbo8yDggmuWifsdabaf1SSRowXg@mail.gmail.com>
Message-ID: <460160711.4208276.1663096538708.JavaMail.zimbra@u-pem.fr>

> From: "Dan Heidinga" <heidinga at redhat.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "John Rose" <john.r.rose at oracle.com>, "Brian Goetz"
> <brian.goetz at oracle.com>, "amber-spec-experts"
> <amber-spec-experts at openjdk.java.net>
> Sent: Tuesday, September 13, 2022 8:42:23 PM
> Subject: Re: Knocking off two more vestiges of legacy switch

> <snip>

>> So my main concern stay that
>> String s = ...
>> switch(s) {
>> case Comparable<?> c -> ... //Dan: matches here as String implements Comparable
>> (this case is total on "s" so no further matching)
>> case Object o -> ...
>> }

>> and
>> long l = ...
>> switch(l) {
>> case float f -> ... //Dan: matches here if l is convertable to a float
>> case double d -> ... //Dan: otherwise matches here
>> }

>> behave differently.

> In each case, we're finding the switch case that the value is compatible with.
> Another way to say it is the value is convertable to... or castable to. Can you
> expand on what you mean by "behave differently"?

In the first example, both type patterns are total, so it does not compile because both patterns will match all Strings.
In the second example, if we follow the semantics proposed by Brian, the first pattern is partial and is equivalent to if (l == (long) (float) l) { float f = l; ... } and the second pattern is total.
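
The round-trip test can be tried directly; this is a minimal sketch with
a hypothetical helper name, not the specified semantics. One caveat to
keep in mind: the cast back to long saturates, so Long.MAX_VALUE passes
this naive check even though its float image is 2^63.

```java
// Sketch: does a long survive a round trip through float?
final class LongToFloatSketch {
    // Naive round-trip check as written in the message above.
    static boolean roundTrips(long l) {
        return l == (long) (float) l;
    }

    public static void main(String[] args) {
        // float has a 24-bit significand, so 2^24 is exact but 2^24 + 1 is not.
        System.out.println(roundTrips(1L << 24));       // true
        System.out.println(roundTrips((1L << 24) + 1)); // false
        // Saturating cast: a false positive of the naive check.
        System.out.println(roundTrips(Long.MAX_VALUE)); // true
    }
}
```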

> I'm still working on reading through the "big picture" presentation in [0] so if
> there's a particular section there that you think is relevant, I can re-read
> that first. It might be useful for both of us to re-read it and see how this
> example fits with the bigger picture being proposed for pattern matching.

This document gives you a nice overview of the problems, but some parts are outdated; the following spec corresponds to the semantics for Java 19
https://cr.openjdk.java.net/~gbierman/jep427+405/jep427+405-20220601/specs/patterns-switch-record-patterns-jls.html#jls-15.28 

The proposed semantics of the primitive pattern is described here 
https://mail.openjdk.org/pipermail/amber-spec-experts/2022-September/003497.html 
and here 
https://mail.openjdk.org/pipermail/amber-spec-experts/2022-September/003499.html 

> --Dan

> [0]
> https://openjdk.org/projects/amber/design-notes/patterns/pattern-match-object-model

Rémi

From guy.steele at oracle.com  Wed Sep 14 04:05:45 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 14 Sep 2022 04:05:45 +0000
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <f0ce4889-f9f1-99fa-f5a4-410d04ce6249@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com>
 <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
 <f0ce4889-f9f1-99fa-f5a4-410d04ce6249@oracle.com>
Message-ID: <E275ADC3-668C-43F1-BCA9-17A3D6D52682@oracle.com>

+1 on this suggestion. I believe it is the only approach that could make switch on floats at all useful, and it would be very useful, as Joe says, for expressing special cases in math libraries clearly.

--Guy

On Sep 13, 2022, at 1:07 PM, Joe Darcy <joe.darcy at oracle.com> wrote:

On 9/13/2022 9:55 AM, Brian Goetz wrote:

It is common for math library methods to have a preamble to screen out special values (infinities, NaN, 0.0, 1.0, etc.).

This would be a reasonable use of a switch on float/double switch.


Which raises some questions (again) of the semantics of constant patterns for exotic floating point values, especially (again) negative zero.


In a switching context, I think there is a stronger case for distinguishing between +0.0 and -0.0. The operational semantics I'd recommend are to desugar, say a float switch, to an int switch on the Float.floatToIntBits mapping of the float case labels. Float.floatToIntBits, as opposed to Float.floatToRawIntBits, normalized all NaN representations to a single value.

-Joe


From amaembo at gmail.com  Sat Sep 17 18:06:09 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Sat, 17 Sep 2022 20:06:09 +0200
Subject: [enhanced-switches] My experience of converting old switches to new
 ones
Message-ID: <CAE+3fjajCJ7KtR6kjUDzP__zwitt42F9+QGn0NRjEPqNZfHYuA@mail.gmail.com>

Hello!

Our codebase was updated recently from Java 11 to Java 17 level, and
we started gradually using new Java features. Recently, I converted in
a semi-automated manner ~1000 of the old switches to new ones (either
statements or expressions). Here are my thoughts about this. Probably
somebody will find them interesting.

1. Knowing that the switch never falls through is really relieving. So
if you see an arrow after the first case, you immediately know that
this is a 'simple switch' (not doing fallthrough). If you see colon,
you start looking more precisely: probably something fancy is done in
this switch (otherwise, it's likely that automated refactoring would
be suggested to use an arrow). So the arrow basically separates simple
switches and complex ones.

2. I really have a desire to use expression switches, even when it
requires some code repetition. E.g., a common pattern:

if (cond) {
  switch(val) {
    case A -> return "a";
    case B -> return "b";
  }
}
return "default";

I tend to convert it to

if (cond) {
  return switch(val) {
    case A -> "a";
    case B -> "b";
    default -> "default";
  };
}
return "default";

There's a cost of repeating the "default" expression. However, we also
have a benefit. Now, we know that under the condition we always return
no matter what.

Unfortunately, sometimes the default expression can be non-trivial
(e.g., return super.blahblah(all, my, parameters, passed)). In this
case, I'm reluctant to duplicate it many times. It's quite possible
that in fact this is an impossible case, and it's written there just
because something should be written. However, this requires a deeper
understanding of the original code.

3. It's also sad that signaling about impossible cases is quite long.
`default -> assert false;` is not accepted for obvious reasons.
Writing every time `default -> throw new
IllegalStateException("Unexpected value of "+selectorValue);` is very
verbose and distracts from the actual code. Probably some syntactic
sugar to assert that we covered all possible values in a
non-exhaustive switch would be nice (like `default impossible;` or
whatever). E.g., I observed the following (arguably strange) pattern:

switch((cond1() ? 0 : 1) + (cond2() ? 0 : 2)) {
case 0 -> ...
case 1 -> ...
case 2 -> ...
case 3 -> ...
default -> throw new AssertionError("cannot reach here");
}

4. I really enjoyed exhaustive switch expressions over enums. I
removed probably a hundred redundant default branches. In switch
statements, people do different things when they are forced to write
default, even though they covered all the enum values:
a. throw something (throw new IllegalStateException(), throw new
AssertionError(), etc.)
b. return something simple (return "", return null, etc.)
c. assert+return something simple: assert false; return null;
d. questionable: join return branch with the last case (case LAST:default: ...)
e. more dangerous: omit the explicit last case and use default instead
of it, while it's clear that default actually handles the non-mentioned case.

Luckily, none of these are necessary anymore if you can use switch
expressions. I even forcibly push down switch expressions inside
something (e.g., a call), just to be able to use it. E.g.:

switch(MY_ENUM) {
  case A -> setSomething("a");
  case B -> setSomething("b");
  case C -> setSomething("c");
  default -> throw new IllegalStateException(
      "impossible; all values are covered");
}

Can be nicely converted to

setSomething(switch(MY_ENUM) {
  case A -> "a";
  case B -> "b";
  case C -> "c";
});

Unfortunately, this is not always the case. Sometimes, you cannot use
switch expression at all, and in this case, inability to specify
exhaustiveness is really annoying. We need total switch statements.

5. At first, I thought that switch expressions are best for return
values, assignment rvalues and variable declaration initializers, but
in other contexts they are too verbose and may make things more
complex than necessary. However, I started liking using them as the
last argument of the call. E.g., before:

switch(x) {
case "a":return wrap(getA());
case "b":return wrap(getB());
case "c":return wrap(getC());
default:throw new IllegalArgumentException();
}

after:

return wrap(switch(x) {
  case "a" -> getA();
  case "b" -> getB();
  case "c" -> getC();
  default -> throw new IllegalArgumentException();
});

If you don't have tail arguments after switch, then you don't lose the
context, and you immediately know that every non-exceptional return
value is wrapped. It's also possible to extract such a switch into a
separate local variable, but even without extraction it reads nicely.

It's also ok to use switch expressions inside other switch
expressions. Especially useful in double-dispatch enum methods (e.g.,
some kind of lattice operations):

enum Item {
BOTTOM, A, B, AB, TOP;
Item join(Item other) {
  return switch(this) {
    case TOP -> this;
    case BOTTOM -> other;
    case A -> switch(other) {
        case A, AB, TOP -> other;
        case B -> AB;
        case BOTTOM -> this;
      };
    case B -> switch(other) {
        case B, AB, TOP -> other;
        case A -> AB;
        case BOTTOM -> this;
      };
    case AB -> switch(other) {
        case TOP -> other;
        case A, B, AB, BOTTOM -> this;
      };
  };
}
}
Reads much better than tons of returns before. Also, thanks to
exhaustiveness checks, you know that every single case is covered.

6. I started to like yield. In some cases, only a couple of branches
of a long switch that returns from every branch have some complex
intermediate computations or conditional branches. In this case, it's
still better to convert it to switch expression, and replace some
returns with yields. And even if every single branch is complex, using
switch expression + yield may make code more clear. E.g., it may
clearly show that the purpose of the whole switch is to assign a value
to the same variable, though computation of variable value in every
branch could be complex.

Also, it can be implicitly assumed that even complex switch
expressions with yields don't produce side-effects. Of course, this is
not controlled by a compiler but it would be a bad practice to produce
them, so there could be an agreement between the team. In this case,
reading the code is simplified a lot. If you see `var something =
switch(...) {...}`, you immediately know that regardless of the switch
complexity, we just calculate the value for `something`, so we can
skip the whole thing if we are not interested in details. If you see a
switch statement, you are less sure whether every single branch does
only this.
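
As a small illustration of the point about `yield`, here is a minimal
sketch. The class and method names (`YieldDemo`, `describe`) and the
status codes are made up for the example; only one branch needs an
intermediate computation, and the whole switch still visibly produces a
single value for `label`.

```java
class YieldDemo {
    // The switch as a whole computes one value; only the default branch
    // needs a block, and it hands its result back with yield.
    static String describe(int code) {
        String label = switch (code) {
            case 200 -> "ok";
            case 404 -> "not found";
            default -> {
                // Intermediate computation local to this branch.
                String family = code >= 500 ? "server" : "other";
                yield family + " (" + code + ")";
            }
        };
        return label;
    }

    public static void main(String[] args) {
        System.out.println(describe(200));
        System.out.println(describe(503));
    }
}
```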

7. I really miss `case null`. I saw many switches these days, during
my conversion quest. And it happens quite often in our codebase that
the null case is handled separately before the switch (often the same
as 'default', but sometimes not). In the IntelliJ codebase, we really
use nulls extensively, even though some people may think that it's a
bad idea. It's good that we will have `case null` in future.

8. I also miss `case default`. It's strange, but I often see old
switches where `default:` is joined with other cases. Probably more
often with strings, less often with enums. Something like:

switch(valueFromConfig) {
  case "increase": increase(); break;
  case "decrease": decrease(); break;
  case "enable": enable(); break;
  // "disable" is a documented value and we explicitly process it
  case "disable":
  // something unknown, but we still want to fall back to the
  // default value, which is "disable"
  default:
    disable(); break;
}

With Java 17 enhanced switches, we should either delete `case
"disable"`, or duplicate the branch. If we delete, it will not be so
clear anymore that this value is especially processed as "official"
value. In the future, I could use `case "disable", default ->
disable();` which would solve the issue.

9. Some old switches are actually shorter than new ones, and I'm not
sure about conversion. Usually, it's like this:

if (condition) {
  switch(value) {
  case 1: return "a";
  case 2: return "b";
  case 3: return "c";
  // no default case, execution continues
  }
}
... a lot of common code for `condition` is false or `value` is not
listed in cases ...

Here it's hard to use switch expression, and enhanced switch statement
only becomes longer and cluttered with syntax:
switch(value) {
case 1 -> { return "a"; }
case 2 -> { return "b"; }
case 3 -> { return "c"; }
}

Well, it's possible to refactor to something like

String result = !condition ? null : switch(value) {
  case 1 -> "a";
  case 2 -> "b";
  case 3 -> "c";
  default -> null;
  };
if (result != null) return result;
... a lot of common code for `condition` is false or `value` is not
listed in cases ...

But it's questionable whether this makes the code more readable.

10. Sometimes, one or a few enum values are peeled off in advance. In
this case, a nice conversion becomes problematic. E.g.:

enum Mode {IGNORE, A, B, C}

void updateMode(Mode mode) {
  if (mode == Mode.IGNORE) return;
  System.out.println("Processing...");
  switch(mode) {
    case A -> process("a");
    case B -> process("b");
    case C -> process("c");
  }
}

It's almost convertible to switch expression. However, the switch is
non-exhaustive, and you cannot get exhaustiveness benefits. It's
possible to add a throwing branch, though it's also long and verbose:

void updateMode(Mode mode) {
  if (mode == Mode.IGNORE) return;
  System.out.println("Processing...");
  process(switch(mode) {
    case A -> "a";
    case B -> "b";
    case C -> "c";
    case IGNORE -> throw new AssertionError("impossible; handled before");
    // hooray, exhaustive now!
  });
}

Of course, it would be too much for javac to analyze code to this
extent and allow skipping the IGNORE branch because it was checked
before (the IntelliJ analyzer knows this). However, it's still sad. This
is related to item 3. A short syntax for impossible branches would
probably be nice.

Thank you for reading my very long email.

With best regards,
Tagir Valeev.

From brian.goetz at oracle.com  Sun Sep 18 13:21:56 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 18 Sep 2022 09:21:56 -0400
Subject: [enhanced-switches] My experience of converting old switches to
 new ones
In-Reply-To: <CAE+3fjajCJ7KtR6kjUDzP__zwitt42F9+QGn0NRjEPqNZfHYuA@mail.gmail.com>
References: <CAE+3fjajCJ7KtR6kjUDzP__zwitt42F9+QGn0NRjEPqNZfHYuA@mail.gmail.com>
Message-ID: <d042331b-b68a-6fda-8c23-85911b2e251b@oracle.com>

Thanks for the extensive feedback!


> 3. It's also sad that signaling about impossible cases is quite long.
> `default -> assert false;` is not accepted for obvious reasons.

Java lacks suitable abstraction over effects, so we cannot use our 
regular abstraction tools for simplifying a throw -- you have to do it 
all inline, unfortunately.

We have talked about various sugary things here, such as:

    default -> unreachable;
or
    default -> throw;

but could never get all that excited about it; it's not that powerful,
and invariably someone will want to customize the exception.  You could
have a simple library method:

    AssertionError unreachable() {
        return new AssertionError("got lost in the weeds");
    }

    default -> throw unreachable();

which seems better than a language feature, though you end up with some
"junk" frames on the stack trace.  If that point is really unreachable,
that won't matter.

But as you say, really you'll want to provide some context about the
data that brought you to this point.  Which suggests you want something
that is part of switch, so it can at least reproduce the selector.  I
kind of like your idea about a case that says "impossible", as it is
tied to the switch and can carry the selector value, so it can give you
a better error.  (Ideally, something that the existing synthetic
defaults could be shorthand for.)
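
Until something like that exists in the language, the library-method
approach can at least carry the selector by hand. The following is only a
sketch: the `Unreachable` class, the `selector` parameter, and the message
text are illustrative assumptions, and the `Mode` enum is borrowed from
item 10 of the original mail.

```java
enum Mode { IGNORE, A, B, C }

final class Unreachable {
    private Unreachable() {}

    // Hypothetical helper: builds the error and records the selector value
    // that reached the "impossible" branch.
    static AssertionError unreachable(Object selector) {
        return new AssertionError("unexpected switch selector: " + selector);
    }
}

class ModeNames {
    static String name(Mode mode) {
        return switch (mode) {
            case A -> "a";
            case B -> "b";
            case C -> "c";
            // Assumed to have been filtered out by the caller.
            case IGNORE -> throw Unreachable.unreachable(mode);
        };
    }
}
```

The switch stays exhaustive without a `default`, so adding a new `Mode`
constant is still a compile-time error here.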

> > We need total switch statements.

Is this different from the "default impossible" above?

> 5. At first, I thought that switch expressions are best for return
> values, assignment rvalues and variable declaration initializers, but
> in other contexts they are too verbose and may make things more
> complex than necessary. However, I started liking using them as the
> last argument of the call.

Not unlike lambdas.  A sensible style has emerged in many libraries that
encourage a single lambda argument at the end, for this same reason.

> 7. I really miss `case null`. I saw many switches these days, during
> my conversion quest. And it happens quite often in our codebase that
> the null case is handled separately before the switch (often the same
> as 'default', but sometimes not). In the IntelliJ codebase, we really
> use nulls extensively, even though some people may think that it's a
> bad idea. It's good that we will have `case null` in future.

Hopefully near future!

> 9. Some old switches are actually shorted than new ones, and I'm not
> sure about conversion.

There's nothing wrong with old switches when you need complex control
flow.  Even if we made all switches exhaustive, a `default: break` would
suffice here.  I think there's no need to use the new thing here; you
want some weird control flow, old switches are good for that.

> Usually, it's like this:
>
> if (condition) {
>    switch(value) {
>    case 1: return "a";
>    case 2: return "b";
>    case 3: return "c";
>    // no default case, execution continues
>    }


From amaembo at gmail.com  Mon Sep 19 07:22:34 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Mon, 19 Sep 2022 09:22:34 +0200
Subject: [enhanced-switches] My experience of converting old switches to
 new ones
In-Reply-To: <d042331b-b68a-6fda-8c23-85911b2e251b@oracle.com>
References: <CAE+3fjajCJ7KtR6kjUDzP__zwitt42F9+QGn0NRjEPqNZfHYuA@mail.gmail.com>
 <d042331b-b68a-6fda-8c23-85911b2e251b@oracle.com>
Message-ID: <CAE+3fjZa3Py6DNGOvVSnCrXK4+_kiPDX1ZvoBiogQJD+XnF=cw@mail.gmail.com>

Hello!

> > We need total switch statements.
>
>
> Is this different from the "default impossible" above?

Yes. I mean, currently we cannot have exhaustiveness checks on enum
switch statements having a compilation error when a new enum constant
is added. We have this for switch expressions and for sealed classes,
but not for switch statements over enums.
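
A minimal sketch of the gap, using made-up names (`Signal`,
`SwitchGapDemo`): the switch expression must cover every constant, while
the plain switch statement compiles even though one constant is missing.

```java
enum Signal { RED, YELLOW, GREEN }

class SwitchGapDemo {
    // Switch expression: removing any case below (or adding a new Signal
    // constant) makes this method fail to compile.
    static String viaExpression(Signal s) {
        return switch (s) {
            case RED -> "stop";
            case YELLOW -> "wait";
            case GREEN -> "go";
        };
    }

    // Plain switch statement: GREEN is deliberately missing, and javac
    // accepts this without complaint.
    static String viaStatement(Signal s) {
        String result = "unhandled";
        switch (s) {
            case RED: result = "stop"; break;
            case YELLOW: result = "wait"; break;
        }
        return result;
    }
}
```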

With best regards,
Tagir Valeev.

From forax at univ-mlv.fr  Mon Sep 19 08:07:04 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 19 Sep 2022 10:07:04 +0200 (CEST)
Subject: [enhanced-switches] My experience of converting old switches to
 new ones
In-Reply-To: <CAE+3fjZa3Py6DNGOvVSnCrXK4+_kiPDX1ZvoBiogQJD+XnF=cw@mail.gmail.com>
References: <CAE+3fjajCJ7KtR6kjUDzP__zwitt42F9+QGn0NRjEPqNZfHYuA@mail.gmail.com>
 <d042331b-b68a-6fda-8c23-85911b2e251b@oracle.com>
 <CAE+3fjZa3Py6DNGOvVSnCrXK4+_kiPDX1ZvoBiogQJD+XnF=cw@mail.gmail.com>
Message-ID: <391672854.8279091.1663574824680.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Tagir Valeev" <amaembo at gmail.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Monday, September 19, 2022 9:22:34 AM
> Subject: Re: [enhanced-switches] My experience of converting old switches to new ones

> Hello!
> 
>> > We need total switch statements.
>>
>>
>> Is this different from the "default impossible" above?
> 
> Yes. I mean, currently we cannot have exhaustiveness checks on enum
> switch statements having a compilation error when a new enum constant
> is added. We have this for switch expressions and for sealed classes,
> but not for switch statements over enums.

enum Foo { A, B }

switch(foo) {
  case null -> throw null;
  case A -> ...
  case B -> ...
}

is exhaustive.  

> 
> With best regards,
> Tagir Valeev.

Rémi

From amaembo at gmail.com  Wed Sep 21 12:22:20 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Wed, 21 Sep 2022 14:22:20 +0200
Subject: [string-templates] Processors with side effects
Message-ID: <CAE+3fjYGyo6aXC39OBK05OPebkaxexUFi1oZaVSAbO0odd6AxA@mail.gmail.com>

Hello!

I was thinking about how Java beginners may benefit from string
templates. Some teaching materials rely on System.out.printf to
produce formatted output, like

System.out.printf("Hello %s!", user);

With string templates proposal, we can use

System.out.println(FMT."Hello %s\{user}!");

This is not very exciting. But I realised that PrintStream may
implement TemplateProcessor by itself (returning Void or whatever),
and print directly:

System.out."Hello %s\{user}!";

Will such use cases be encouraged, or this should be considered as
misuse of the feature?

Well, for side effect we may still want to specify formatting options,
like whether it should be a concatenation, or formatting, and with
which locale, so probably it would be better to have an intermediate
method (or even field!) that returns a TemplateProcessor:

System.out.printstr."Hello \{user}!";
System.out.printfmt."Hello %s\{user}!";
System.out.printfmt(myLocale)."Hello %s\{user}!";

That said, in "Safely composing and executing database queries"
section of JEP 430, it's assumed that the DB object always produces a
ResultSet. However, in PreparedStatement there's also executeUpdate()
(returning int) and execute() (returning boolean) which might be
sometimes more appropriate. So a level of indirection between
connection and template processor is probably necessary:

ResultSet resultSet = conn.query()."SELECT \{col} FROM \{table}";
int count = conn.update()."UPDATE \{table} SET \{col} = \{value}";

With best regards,
Tagir Valeev.

From james.laskey at oracle.com  Wed Sep 21 12:54:36 2022
From: james.laskey at oracle.com (Jim Laskey)
Date: Wed, 21 Sep 2022 12:54:36 +0000
Subject: [string-templates] Processors with side effects
In-Reply-To: <CAE+3fjYGyo6aXC39OBK05OPebkaxexUFi1oZaVSAbO0odd6AxA@mail.gmail.com>
References: <CAE+3fjYGyo6aXC39OBK05OPebkaxexUFi1oZaVSAbO0odd6AxA@mail.gmail.com>
Message-ID: <9D921D4D-DA01-4C63-A355-34BE372591A4@oracle.com>


> On Sep 21, 2022, at 9:22 AM, Tagir Valeev <amaembo at gmail.com> wrote:
> 
> Hello!
> 
> I was thinking about how Java beginners may benefit from string
> templates. Some teaching materials rely on System.out.printf to
> produce formatted output, like
> 
> System.out.printf("Hello %s!", user);
> 
> With string templates proposal, we can use
> 
> System.out.println(FMT."Hello %s\{user}!");
> 
> This is not very exciting. But I realised that PrintStream may
> implement TemplateProcessor by itself (returning Void or whatever),
> and print directly:
> 
> System.out."Hello %s\{user}!";
> 
> Will such use cases be encouraged, or this should be considered as
> misuse of the feature?

Misuse will come whether we like or not. The plan is to have a "User Guide to String Templates" influence developers toward safe and reasonable usage.

> 
> Well, for side effect we may still want to specify formatting options,
> like whether it should be a concatenation, or formatting, and with
> which locale, so probably it would be better to have an intermediate
> method (or even field!) that returns a TemplateProcessor:
> 
> System.out.printstr."Hello \{user}!";
> System.out.printfmt."Hello %s\{user}!";
> System.out.printfmt(myLocale)."Hello %s\{user}!";


One flavour you didn't propose was

OUT."Hello \{user}!";

Or

OUT."Hello %s\{user}!";

Similar to "import static java.lang.System.out" used by some developers.

No doubt, there will be significant spin-off and discussion from the JEP's proposal.

> 
> That said, in "Safely composing and executing database queries"
> section of JEP 430, it's assumed that the DB object always produces a
> ResultSet. However, in PreparedStatement there's also executeUpdate()
> (returning int) and execute() (returning boolean) which might be
> sometimes more appropriate. So a level of indirection between
> connection and template processor is probably necessary:
> 
> ResultSet resultSet = conn.query()."SELECT \{col} FROM \{table}";
> int count = conn.update()."UPDATE \{table} SET \{col} = \{value}";

Since the JEP was originally written we've done some more research about what SQL template processors might look like. More expert consultation will take place but the current leaning is toward producing PreparedStatements. So the code will be more like:

PreparedStatement stmt = conn."SELECT \{col} FROM \{table}";
ResultSet rs = stmt.executeQuery();

Or just

ResultSet rs = conn."SELECT \{col} FROM \{table}".executeQuery();

The advantage here, besides the validation, is that the statement will only be compiled and optimized (using metadata) once per callsite/connection and reused with different values per iteration.
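
To make the reuse point concrete, here is a rough sketch in plain Java. It assumes a template arrives as a list of literal fragments plus a list of embedded values (the shape JEP 430 describes); the `SqlSketch` class and `toParameterizedSql` method are hypothetical and not part of any proposed API. The SQL text depends only on the fragments, so it is identical on every iteration and only the bound values change.

```java
import java.util.List;

class SqlSketch {
    // Joins the literal fragments with '?' placeholders. The values are not
    // spliced into the text; a real processor would bind them separately.
    static String toParameterizedSql(List<String> fragments, List<?> values) {
        if (fragments.size() != values.size() + 1) {
            throw new IllegalArgumentException("expected one more fragment than values");
        }
        StringBuilder sql = new StringBuilder(fragments.get(0));
        for (int i = 0; i < values.size(); i++) {
            sql.append('?').append(fragments.get(i + 1));
        }
        return sql.toString();
    }

    public static void main(String[] args) {
        List<String> fragments = List.of("SELECT name FROM users WHERE id = ", " AND active = ", "");
        // Two "iterations" with different values yield the same SQL text.
        System.out.println(toParameterizedSql(fragments, List.of(42, true)));
        System.out.println(toParameterizedSql(fragments, List.of(7, false)));
    }
}
```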

As stated, we will be gathering more direction from the DB community (think separate JEP.)

Cheers,

-- Jim


> 
> With best regards,
> Tagir Valeev.


From brian.goetz at oracle.com  Wed Sep 21 13:03:02 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 21 Sep 2022 09:03:02 -0400
Subject: [string-templates] Processors with side effects
In-Reply-To: <CAE+3fjYGyo6aXC39OBK05OPebkaxexUFi1oZaVSAbO0odd6AxA@mail.gmail.com>
References: <CAE+3fjYGyo6aXC39OBK05OPebkaxexUFi1oZaVSAbO0odd6AxA@mail.gmail.com>
Message-ID: <7c8f8e37-1c6c-2064-22c2-528d74a7639f@oracle.com>


> Hello!
>
> I was thinking about how Java beginners may benefit from string
> templates. Some teaching materials rely on System.out.printf to
> produce formatted output, like
>
> System.out.printf("Hello %s!", user);
>
> With string templates proposal, we can use
>
> System.out.println(FMT."Hello %s\{user}!");
>
> This is not very exciting. But I realised that PrintStream may
> implement TemplateProcessor by itself (returning Void or whatever),
> and print directly:
>
> System.out."Hello %s\{user}!";
>
> Will such use cases be encouraged, or this should be considered as
> misuse of the feature?

We anticipated that some libraries may want to implement
TemplateProcessor in order to do this.  However, whether we do so for
PrintStream will require thought.  The former version may not be
exciting, but it is clear, and users will have no question what is going
on.  The latter represents an opinionated "this is the formatter we will
use for the next century", and is a choice to be taken carefully.  One
of the great things about the current release cadence (plus preview
mechanism) is that it is not necessary any more to do everything at
once; we can let things sit for a while and see how they settle.


From brian.goetz at oracle.com  Wed Sep 28 17:57:19 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 13:57:19 -0400
Subject: Paving the on-ramp
Message-ID: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>

At various points, we've explored the question of which program elements
are most and least helpful for students first learning Java.  After
considering a number of alternatives over the years, I have a simple
proposal for smoothing the "on ramp" to Java programming, while not
creating new things to unlearn.

Markdown source is below, HTML will appear soon at:

https://openjdk.org/projects/amber/design-notes/on-ramp


# Paving the on-ramp

Java is one of the most widely taught programming languages in the world.  Tens
of thousands of educators find that the imperative core of the language combined
with a straightforward standard library is a foundation that students can
comfortably learn on.  Choosing Java gives educators many degrees of freedom:
they can situate students in `jshell` or Notepad or a full-fledged IDE; they can
teach imperative, object-oriented, functional, or hybrid programming styles; and
they can easily find libraries to interact with external data and services.

No language is perfect, and one of the most common complaints about Java is that
it is "too verbose" or has "too much ceremony."  And unfortunately, Java imposes
its heaviest ceremony on those first learning the language, who need and
appreciate it the least.  The declaration of a class and the incantation of
`public static void main` is pure mystery to a beginning programmer.  While
these incantations have principled origins and serve a useful organizing purpose
in larger programs, they have the effect of placing obstacles in the path of
_becoming_ Java programmers. Educators constantly remind us of the litany of
complexity that students have to confront on Day 1 of class -- when they really
just want to write their first program.

As an amusing demonstration of this, in her JavaOne keynote appearance in 2019,
[Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked about when
she learned to program in Java, and how her teacher performed a rap song
to help students memorize `"public static void main"`.  Our hats are off to
creative educators everywhere for this kind of dedication, but teachers
shouldn't have to do this.

Of course, advanced programmers complain about ceremony too.  We will never be
able to satisfy programmers' insatiable appetite for typing fewer keystrokes,
and we shouldn't try, because the goal of programming is to write programs that
are easy to read and are clearly correct, not programs that were easy to type.
But we can try to make the ceremony commensurate with the value it brings to a
program -- and let simple programs be expressed more simply.

## Concept overload

The classic "Hello World" program looks like this in Java:

```
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
```

It may only be five lines, but those lines are packed with concepts that are
challenging to absorb without already having some programming experience and
familiarity with object orientation. Let's break down the concepts a student
confronts when writing their first Java program:

  - **public** (on the class).  The `public` accessibility level is relevant
    only when there is going to be cross-package access; in a simple "Hello
    World" program, there is only one class, which lives in the unnamed
    package.  They haven't even written a one-line program yet; the notion of
    access control -- keeping parts of a program from accessing other parts of
    it -- is still way in their future.

  - **class**.  Our student hasn't set out to write a _class_, or model a
    complex system with objects; they want to write a _program_.  In Java, a
    program is just a `main` method in some class, but at this point our
    student still has no idea what a class is or why they want one.

  - **Methods**.  Methods are of course a key concept in Java, but the
    mechanics of methods -- parameters, return types, and invocation -- are
    still unfamiliar, and the `main` method is invoked magically from the
    `java` launcher rather than from explicit code.

  - **public** (again).  Like the class, the `main` method has to be public,
    but again this is only relevant when programs are large enough to require
    packages to organize them.

  - **static**.  The `main` method has to be static, and at this point,
    students have no context for understanding what a static method is or why
    they want one.  Worse, the early exposure to `static` methods will turn out
    to be a bad habit that must be later unlearned.  Worse still, the fact that
    the `main` method is `static` creates a seam between `main` and other
    methods; either they must become `static` too, or the `main` method must
    trampoline to some sort of "instance main" (more ceremony!)  And if we get
    this wrong, we get the dreaded and mystifying `"cannot be referenced from a
    static context"` error.

  - **main**.  The name `main` has special meaning in a Java program,
    indicating the starting point of a program, but this specialness hides
    behind being an ordinary method name.  This may contribute to the sense of
    "so many magic incantations."

  - **String[]**.  The parameter to `main` is an array of strings, which are
    the arguments that the `java` launcher collected from the command line.
    But our first program -- likely our first dozen -- will not use
    command-line parameters. Requiring the `String[]` parameter is, at this
    point, a mistake waiting to happen, and it will be a long time until this
    parameter makes sense.  Worse, educators may be tempted to explain arrays
    at this point, which further increases the time-to-first-program.

  - **System.out.println**.  If you look closely at this incantation, each
    element in the chain is a different thing -- `System` is a class (what's a
    class again?), `out` is a static field (what's a field?), and `println` is
    an instance method.  The only part the student cares about right now is
    `println`; the rest of it is an incantation that they do not yet understand
    in order to get at the behavior they want.

That's a lot to explain to a student on the first day of class. There's a good
chance that by now, class is over and we haven't written any programs yet, or
the teacher has said "don't worry what this means, you'll understand it later"
six or eight times.  Not only is this a lot of _syntactic_ things to absorb, but
each of those things appeals to a different concept (class, method, package,
return value, parameter, array, static, public, etc.) that the student doesn't
have a framework for understanding yet.  Each of these will have an important
role to play in larger programs, but so far, they only contribute to "wow,
programming is complicated."

It won't be practical (or even desirable) to get _all_ of these concepts out of
the student's face on day 1, but we can do a lot -- and focus on the ones that
do the most to help beginners understand how programs are constructed.

## Goal: a smooth on-ramp

As much as programmers like to rant about ceremony, the real goal here is not
mere ceremony reduction, but providing a graceful _on ramp_ to Java programming.
This on-ramp should be helpful to beginning programmers by requiring only those
concepts that a simple program needs.

Not only should an on-ramp have a gradual slope and offer enough acceleration
distance to get onto the highway at the right speed, but its direction must
align with that of the highway.  When a programmer is ready to learn about more
advanced concepts, they should not have to discard what they've already learned,
but instead easily see how the simple programs they've already written
generalize to more complicated ones, and both the syntactic and conceptual
transformation from "simple" to "full blown" program should be straightforward
and unintrusive.  It is a definite non-goal to create a "simplified dialect of
Java for students".

We identify three simplifications that should aid both educators and students in
navigating the on-ramp to Java, as well as being generally useful to simple
programs beyond the classroom:

 - A more tolerant launch protocol
 - Unnamed classes
 - Predefined static imports for the most critical methods and fields

## A more tolerant launch protocol

The Java Language Specification has relatively little to say about how Java
"programs" get launched, other than saying that there is some way to indicate
which class is the initial class of a program (JLS 12.1.1) and that a public
static method called `main` whose sole argument is of type `String[]` and whose
return is `void` constitutes the entry point of the indicated class.

We can eliminate much of the concept overload simply by relaxing the
interactions between a Java program and the `java` launcher:

 - Relax the requirement that the class, and `main` method, be public.  Public
   accessibility is only relevant when access crosses packages; simple programs
   live in the unnamed package, so cannot be accessed from any other package
   anyway.  For a program whose main class is in the unnamed package, we can
   drop the requirement that the class or its `main` method be public,
   effectively treating the `java` launcher as if it too resided in the unnamed
   package.

 - Make the "args" parameter to `main` optional, by allowing the `java` launcher
   to first look for a main method with the traditional `main(String[])`
   signature, and then (if not found) for a main method with no arguments.

 - Make the `static` modifier on `main` optional, by allowing the `java`
   launcher to invoke an instance `main` method (of either signature) by
   instantiating an instance using an accessible no-arg constructor and then
   invoking the `main` method on it.

This small set of changes to the launch protocol strikes out five of the bullet
points in the above list of concepts: public (twice), static, method parameters,
and `String[]`.
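
The lookup order in the list above can be modelled with reflection.  This is
only an illustration of the described protocol, not the launcher's actual
implementation; the `TolerantLauncherSketch` class, its `LOG` field, and the two
nested sample programs are all invented for the example, and inherited `main`
methods are ignored to keep it short.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class TolerantLauncherSketch {
    // Records which entry point ran, so the behaviour is observable.
    static final List<String> LOG = new ArrayList<>();

    // Prefer main(String[]); fall back to a no-argument main.
    static Method findMain(Class<?> c) throws NoSuchMethodException {
        try {
            return c.getDeclaredMethod("main", String[].class);
        } catch (NoSuchMethodException e) {
            return c.getDeclaredMethod("main");
        }
    }

    static void launch(Class<?> c, String[] args) {
        try {
            Method main = findMain(c);
            main.setAccessible(true); // the method need not be public
            Object receiver = null;
            if (!Modifier.isStatic(main.getModifiers())) {
                // Instance main: create the receiver via the no-arg constructor.
                Constructor<?> ctor = c.getDeclaredConstructor();
                ctor.setAccessible(true);
                receiver = ctor.newInstance();
            }
            if (main.getParameterCount() == 1) {
                main.invoke(receiver, (Object) args);
            } else {
                main.invoke(receiver);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot launch " + c.getName(), e);
        }
    }

    // Sample program with the traditional shape, minus `public`.
    static class StaticArgsMain {
        static void main(String[] args) {
            LOG.add("static main(String[]) with " + args.length + " arg(s)");
        }
    }

    // Sample program using a non-static, no-argument main.
    static class InstanceNoArgsMain {
        void main() {
            LOG.add("instance main()");
        }
    }
}
```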

At this point, our Hello World program is now:

```
class HelloWorld {
    void main() {
        System.out.println("Hello World");
    }
}
```

It's not any shorter by line count, but we've removed a lot of "horizontal
noise" along with a number of concepts.  Students and educators will appreciate
it, but advanced programmers are unlikely to be in any hurry to make these
implicit elements explicit either.

Additionally, the notion of an "instance main" has value well beyond the first
day.  Because excessive use of `static` is considered a code smell, many
educators encourage the pattern of "all the static `main` method does is
instantiate an instance and call an instance `main` method" anyway.  Formalizing
the "instance main" protocol reduces a layer of boilerplate in these cases, and
defers the point at which we have to explain what instance creation is -- and
what `static` is.  (Further, allowing the `main` method to be an instance method
means that it could be inherited from a superclass, which is useful for simple
frameworks such as test runners or service frameworks.)
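
For reference, the trampoline pattern mentioned above looks roughly like this
under today's launch rules.  The class name `HelloTrampoline` and the helper
`message()` are made up for the sketch.

```java
class HelloTrampoline {
    // Ordinary instance method; free to call other instance methods.
    String message() {
        return "Hello World";
    }

    // The "instance main" that holds the actual program logic.
    void main() {
        System.out.println(message());
    }

    // The trampoline: its only job is to create an instance and delegate.
    public static void main(String[] args) {
        new HelloTrampoline().main();
    }
}
```

With an instance-main launch protocol, the last method is the boilerplate that
would no longer need to be written.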

## Unnamed classes

In a simple program, the `class` declaration often doesn't help either, because
other classes (if there are any) are not going to reference it by name, and we
don't extend a superclass or implement any interfaces.  If we say an "unnamed
class" consists of member declarations without a class header, then our Hello
World program becomes:

```
void main() {
    System.out.println("Hello World");
}
```

Such source files can still have fields, methods, and even nested classes, so
that as a program evolves from a few statements to needing some ancillary state
or helper methods, these can be factored out of the `main` method while still
not yet requiring a full class declaration:

```
String greeting() { return "Hello World"; }

void main() {
    System.out.println(greeting());
}
```

This is where treating `main` as an instance method really shines; the user has
just declared two methods, and they can freely call each other. Students need
not confront the confusing distinction between instance and static methods yet;
indeed, if not forced to confront static members on day 1, it might be a while
before they do have to learn this distinction.  The fact that there is a
receiver lurking in the background will come in handy later, but right now it is
not bothering anybody.

[JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be
launched directly without a separate compilation step; this streamlined launcher
pairs well with unnamed classes.

## Predefined static imports

The most important classes, such as `String` and `Integer`, live in the
`java.lang` package, which is automatically on-demand imported into all
compilation units; this is why we do not have to `import java.lang.String` in
every class.  Static imports were not added until Java 5, but no corresponding
facility for automatic on-demand import of common behavior was added at that
time.  Most programs, however, will want to do console IO, and Java forces us to
do this in a roundabout way -- through the static `System.out` and `System.in`
fields.  Basic console input and output is a reasonable candidate for
auto-static import, as one or both are needed by most simple programs.  While
these are currently instance methods accessed through static fields, we can
easily create static methods for `println` and `readln` which are suitable for
static import, and automatically import them.  At which point our first program
is now down to:

```
void main() {
    println("Hello World");
}
```
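
A rough idea of what such static methods could look like is sketched below.  The
holder class name `SimpleIO` is an assumption made for this example (the note
does not say where the methods would live), and `readln` here simply returns
`null` at end of input.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;

final class SimpleIO {
    private static final BufferedReader IN =
            new BufferedReader(new InputStreamReader(System.in));

    private SimpleIO() {}

    // Static wrapper around the instance method System.out.println.
    static void println(Object value) {
        System.out.println(value);
    }

    // Reads one line from standard input without exposing a checked exception.
    static String readln() {
        try {
            return IN.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Today a program would opt in with `import static` on such a class; the proposal
is that this import would happen automatically.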

## Putting this all together

We've discussed several simplifications:

 - Update the launcher protocol to make public, static, and arguments optional
   for main methods, and for main methods to be instance methods (when a
   no-argument constructor is available);
 - Make the class wrapper for "main classes" optional (unnamed classes);
 - Automatically static import methods like `println`

which together whittle our long list of day-1 concepts down considerably.  While
this is still not as minimal as the minimal Python or Ruby program -- statements
must still live in a method -- the goal here is not to win at "code golf".  The
goal is to ensure that concepts not needed by simple programs need not appear in
those programs, while at the same time not encouraging habits that have to be
unlearned as programs scale up.

Each of these simplifications is individually small and unintrusive, and each is
independent of the others.  And each embodies a simple transformation that the
author can easily manually reverse when it makes sense to do so: elided
modifiers and `main` arguments can be added back, the class wrapper can be added
back when the affordances of classes are needed (supertypes, constructors), and
the full qualifier of static-import can be added back.  And these reversals are
independent of one another; they can be done in any combination or any order.

This seems to meet the requirements of our on-ramp; we've eliminated most of the
day-1 ceremony elements without introducing new concepts that need to be
unlearned. The remaining concepts -- a method is a container for statements, and
a program is a Java source file with a `main` method -- are easily understood in
relation to their fully specified counterparts.

## Alternatives

Obviously, we've lived with the status quo for 25+ years, so we could continue
to do so. There were other alternatives explored as well; ultimately, each of
these fell afoul of one of our goals.

### Can't we go further?

Fans of "code golf" -- of which there are many -- are surely right now trying to
figure out how to eliminate the last little bit, the `main` method, and allow
statements to exist at the top-level of a program. We deliberately stopped
short of this because it offers little value beyond the first few minutes, and
even that small value quickly becomes something that needs to be unlearned.

The fundamental problem behind allowing such "loose" statements is that
variables can be declared inside both classes (fields) and methods (local
variables), and they share the same syntactic production but not the same
semantics. So it is unclear (to both compilers and humans) whether a "loose"
variable would be a local or a field. If we tried to adopt some sort of simple
heuristic to collapse this ambiguity (e.g., whether it precedes or follows the
first statement), that may satisfy the compiler, but now simple refactorings
might subtly change the meaning of the program, and we'd be replacing the
explicit syntactic overhead of `void main()` with an invisible "line" in the
program that subtly affects semantics, and a new subtle rule about the meaning
of variable declarations that applies only to unnamed classes. This doesn't
help students, nor is this particularly helpful for all but the most trivial
programs. It quickly becomes a crutch to be discarded and unlearned, which
falls afoul of our "on ramp" goals. Of all the concepts on our list, "methods"
and "a program is specified by a main method" seem the ones that are most worth
asking students to learn early.
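
As a small illustration of "same syntactic production, different semantics", the sketch below declares an `int` once as a field and once as a local. The class name `Counter` and its members are made up for this example only.

```java
// Illustrative only: the names Counter, count, and step are assumptions.
final class Counter {
    // A method may refer to a field that is declared later in the class.
    int next() {
        int step = 1;   // local: must be assigned before use, gone when next() returns
        count += step;
        return count;
    }

    int count;          // field: implicitly starts at 0 and persists across calls
}
```

The two declarations look alike, but the field is default-initialized, visible before its declaration, and retained between calls, while the local is none of those things. A "loose" declaration at the top level would have to silently pick one of these behaviors.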

### Why not "just" use `jshell`?

While JShell is a great interactive tool, leaning too heavily on it as an onramp
would fall afoul of our goals. A JShell session is not a program, but a
sequence of code snippets. When we type declarations into `jshell`, they are
viewed as implicitly static members of some unspecified class, with
accessibility ignored completely, and statements execute in a context where
all previous declarations are in scope. This is convenient for experimentation
-- the primary goal of `jshell` -- but not such a great mental model for
learning to write Java programs. Transforming a batch of working declarations
in `jshell` to a real Java program would not be sufficiently simple or
unintrusive, and would lead to a non-idiomatic style of code, because the
straightforward translation would have us redeclaring each method, class, and
variable declaration as `static`. Further, this is probably not the direction
we want to go when we scale up from a handful of statements and declarations to
a simple class -- we probably want to start using classes as classes, not just
as containers for static members. JShell is a great tool for exploration and
debugging, and we expect many educators will continue to incorporate it into
their curriculum, but it is not the on-ramp programming model we are looking for.

### What about "always local"?

One of the main tensions that `main` introduces is that most class members are
not `static`, but the `main` method is -- and that forces programmers to
confront the seam between static and non-static members. JShell answers this
with "make everything static".

Another approach would be to "make everything local" -- treat a simple program
as being the "unwrapped" body of an implicit main method. We already allow
variables and classes to be declared local to a method. We could add local
methods (a useful feature in its own right) and relax some of the asymmetries
around nesting (again, an attractive cleanup), and then treat a mix of
declarations and statements without a class wrapper as the body of an invisible
`main` method. This seems an attractive model as well -- at first.

While the syntactic overhead of converting back to full-blown classes -- wrap
the whole thing in a `main` method and a `class` declaration -- is far less
intrusive than the transformation inherent in `jshell`, this is still not an
ideal on-ramp. Local variables interact with local classes (and methods, when
we have them) in a very different way than instance fields do with instance
methods and inner classes: their scopes are different (no forward references),
their initialization rules are different, and captured local variables must be
effectively final. This is a subtly different programming model that would then
have to be unlearned when scaling up to full classes. Further, the result of
this wrapping -- where everything is local to the main method -- is also not
"idiomatic Java". So while local methods may be an attractive feature, they are
similarly not the on-ramp we are looking for.
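
The difference between locals and fields described above can be seen in a few lines. This is only a sketch; the class `Scopes` and its members are hypothetical names chosen for the example.

```java
import java.util.function.IntSupplier;

// Illustrative only: Scopes, total, add, and localCapture are assumed names.
final class Scopes {
    private int total = 0;   // instance field: instance methods may mutate it freely

    int add(int n) {
        total += n;
        return total;
    }

    static int localCapture() {
        int base = 10;                    // local, never reassigned: effectively final
        IntSupplier s = () -> base + 1;   // a lambda may read a captured local
        // base++;                        // would not compile: captured locals must be
        //                                // effectively final
        return s.getAsInt();
    }
}
```

A student who gets used to the rules in `localCapture` (no reassignment of captured variables, no forward references) would later have to relearn them when the same declarations become fields.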


From kevinb at google.com  Wed Sep 28 19:49:33 2022
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 28 Sep 2022 12:49:33 -0700
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>

Virtuous.

The quips about horses having fled the barn are coming, but whether they
did is irrelevant; let's just make Java better now.


On Wed, Sep 28, 2022 at 10:57 AM Brian Goetz <brian.goetz at oracle.com> wrote:

> ## Concept overload

I like that the focus is not just on boilerplate but on the offense of
forcing learners to encounter concepts they *will* need to care about but
don't yet.


>  - Relax the requirement that the class, and `main` method, be public.  Public
>    accessibility is only relevant when access crosses packages; simple programs
>    live in the unnamed package, so cannot be accessed from any other package
>    anyway.  For a program whose main class is in the unnamed package, we can
>    drop the requirement that the class or its `main` method be public,
>    effectively treating the `java` launcher as if it too resided in the unnamed
>    package.

Alternative: drop the requirement altogether. Most main methods have no
desire to make themselves publicly callable as `TheClass.main(args)`, but
today they are forced to expose that API anyway. I feel like it would still
be conceptually clean to say that `public` is really about whether other
*code* can access it, not whether a VM can get to it at all.


>  - Make the "args" parameter to `main` optional, by allowing the `java`
>    launcher to first look for a main method with the traditional
>    `main(String[])` signature, and then (if not found) for a main method
>    with no arguments.

This seems to leave users vulnerable to some surprises, where the code they
think is being called isn't. Why not make it a compile-time error to
provide both forms?
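
To see the potential surprise concretely, here is a rough sketch that imitates the proposed lookup order with reflection. It is only an approximation under stated assumptions: the names `LaunchOrder`, `findMain`, `Both`, and `NoArgsOnly` are invented for this example, and nothing here describes how the real `java` launcher is implemented.

```java
import java.lang.reflect.Method;

// Illustrative only: imitates "look for main(String[]) first, then main()".
final class LaunchOrder {
    // Returns the method the proposed lookup order would pick, or null if none.
    static Method findMain(Class<?> c) {
        Class<?>[][] signatures = { { String[].class }, {} };
        for (Class<?>[] signature : signatures) {
            try {
                return c.getDeclaredMethod("main", signature);
            } catch (NoSuchMethodException e) {
                // fall through and try the next signature
            }
        }
        return null;
    }

    // A class that declares both forms; the no-argument one is never selected.
    static class Both {
        static void main(String[] args) {}
        void main() {}
    }

    static class NoArgsOnly {
        void main() {}
    }
}
```

For `Both`, the lookup settles on `main(String[])`, so an author who only ever edits `main()` would see no effect, which is the kind of silent mismatch a diagnostic could catch.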


>  - Make the `static` modifier on `main` optional, by allowing the `java`
>    launcher to invoke an instance `main` method (of either signature) by
>    instantiating an instance using an accessible no-arg constructor and then
>    invoking the `main` method on it.

I'll give the problems I see with this, without a judgement on what should
be done.

What's the whole idea of main? Well, it's the entry point into the program.
But now it's *not* really the entry point; finding the entry point is more
subtle. (Okay, I concede that static initializers are run first either way;
that undercuts *some* of the strength of my argument here.)

Even if this is okay when I'm writing my own new program, understanding it
as I go, then suppose someone else reads my program. That person has the
burden of remembering to check whether `main` is static or not, and
remembering that some constructor code is happening first if it's not.
Classes that have both main and a constructor will be a mixture of some
that call them in one order and some in the other. That's just, like, messy.
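
A tiny sketch of the ordering concern, assuming the launcher behaves as the proposal describes (construct, then call `main`). The names `InstanceMainOrder`, `App`, and `launch` are invented for illustration, and `launch` merely stands in for what the launcher would do.

```java
// Illustrative only: launch() imitates "instantiate, then invoke main".
final class InstanceMainOrder {
    static final StringBuilder LOG = new StringBuilder();

    static class App {
        App() {
            LOG.append("constructor;");   // runs before main under the proposal
        }

        void main() {
            LOG.append("main;");
        }
    }

    // Stand-in for the launcher: no-arg constructor first, then instance main.
    static String launch() {
        LOG.setLength(0);
        new App().main();
        return LOG.toString();
    }
}
```

With a static `main`, by contrast, no constructor runs unless `main` itself calls one, so a reader has to check the modifier to know which code executes first.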

And is it even clear, then, why the VM shouldn't be passing `args` to the
*constructor*, only hoarding it until calling `main`?

On a deep conceptual level... I'd insist that main() *is static*. It is
*the* single entry point into the program; what could be more static than
that? But thinking about our learner, who wrote some `main`s before
learning about static. The instant they learn `static` is a keyword a
method can have, they'll "know" one thing about it already: this is going
to be something new that's *not* true of main(). But then they hear an
explanation that fits `main` perfectly?


> Because excessive use of `static` is considered a code smell, many
> educators encourage the pattern of "all the static `main` method does is
> instantiate an instance and call an instance `main` method" anyway.

Heavy groan. In my opinion, some ideas are too misguided to take seriously.

The value in that practice is if instance `main` accepts parameters like
`PrintStream` and `Console`, and static main passes in `System.out` and
`System.console()`. That makes all your actual program logic unit-testable.
Great! This actually strikes directly at the heart of what the entire
problem with `static` is! But this isn't the case you're addressing.
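
For readers unfamiliar with the practice being described, a minimal sketch follows. The class name `Greeter` and the method name `run` are assumptions made for this example; the point is only that the program logic receives its output stream as a parameter.

```java
import java.io.PrintStream;

// Illustrative only: Greeter and run are assumed names.
final class Greeter {
    // Program logic depends on the stream it is handed, not on System.out.
    void run(PrintStream out) {
        out.println("Hello World");
    }

    // The static entry point only wires in the real console stream.
    public static void main(String[] args) {
        new Greeter().run(System.out);
    }
}
```

A unit test can then pass a `PrintStream` wrapped around a `ByteArrayOutputStream` and inspect what was written, without touching global state.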

Static methods are not a code smell! Static methods that ought to be
overrideable by one of their argument types (Collections.sort()), sure.
Static mutable state is a code smell, definitely -- but a method that
touches that state is equally problematic whether it itself is static or
not. There are some code smells around `static`, but `static` itself is
fresh and flowery.


> (Further, allowing the `main` method to be an instance method
> means that it could be inherited from a superclass, which is useful for
> simple frameworks such as test runners or service frameworks.)

This does not give me a happy feeling. Going into it is a deep discussion
though.

Rest of the response coming soon, I hope.

Just to mention one additional idea. We could permit `main` to optionally
return `int`, becoming the default exit status if `exit` is never called.
Seems elegant for the rare cases where you care about exit status, but (a)
would this feature get in the way in *any* sense for the vast majority of
cases that don't care, or (b) are the cases that care just way too rare for
us to worry about?

I'm not sure about (a). But (b) kinda seems like a yes.

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com  Wed Sep 28 20:10:02 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 16:10:02 -0400
Subject: Paving the on-ramp
In-Reply-To: <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
Message-ID: <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>


>
>      - Relax the requirement that the class, and `main` method, be public.
>        Public accessibility is only relevant when access crosses packages;
>        simple programs live in the unnamed package, so cannot be accessed
>        from any other package anyway.  For a program whose main class is in
>        the unnamed package, we can drop the requirement that the class or
>        its `main` method be public, effectively treating the `java` launcher
>        as if it too resided in the unnamed package.
>
>
> Alternative: drop the requirement altogether. Most main methods have
> no desire to make themselves publicly callable as
> `TheClass.main(args)`, but today they are forced to expose that API
> anyway. I feel like it would still be conceptually clean to say that
> `public` is really about whether other *code* can access it, not
> whether a VM can get to it at all.

I think we're saying the same thing; main need not be public.

>
>      - Make the "args" parameter to `main` optional, by allowing the
>        `java` launcher to first look for a main method with the
>        traditional `main(String[])` signature, and then (if not found)
>        for a main method with no arguments.
>
>
> This seems to leave users vulnerable to some surprises, where the code 
> they think is being called isn't. Why not make it a compile-time error 
> to provide both forms?

Currently, the treatment of methods called "main" is "and also"; it is a
valid method, *and also* (if it has the right shape) can be used as a
main entry point. Making this an error would take some valid programs
and make them invalid, which seems a shift in the interpretation of the
magic name "main". A warning is probably reasonable though.

>
>      - Make the `static` modifier on `main` optional, by allowing the
>        `java` launcher to invoke an instance `main` method (of either
>        signature) by instantiating an instance using an accessible no-arg
>        constructor and then invoking the `main` method on it.
>
>
> On a deep conceptual level... I'd insist that main() *is static*. It 
> is *the* single entry point into the program; what could be more 
> static than that? But thinking about our learner, who wrote some 
> `main`s before learning about static. The instant they learn `static` 
> is a keyword a method can have, they'll "know" one thing about it 
> already: this is going to be something new that's *not* true of 
> main(). But then they hear an explanation that fits `main` perfectly?

John likes to say "static has messed up every job we've ever given it",
and while that seems an exaggeration at first, it often turns out to be
surprisingly accurate. One subtle thing it messes up here is that one
cannot effectively inherit a main() method. But inheriting main() is
super useful! Consider a TestCase class in a test framework, or an
AbstractService class in a services framework. If the abstract class
can provide the main() method, then every test case or service _is also
a program_, one which runs that test case or service.
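
A rough sketch of the inheritance idea, under the assumption that the launcher would instantiate the named subclass and call its inherited instance `main`. The method `banner` and the subclass `EchoService` are invented here for illustration; only the name `AbstractService` comes from the paragraph above.

```java
// Illustrative only: banner() and EchoService are assumed names.
abstract class AbstractService {
    abstract String name();

    String banner() {
        return "Running " + name();
    }

    // Inherited by every concrete service; under the proposal the launcher
    // could construct the subclass and invoke this method.
    void main() {
        System.out.println(banner());
    }
}

final class EchoService extends AbstractService {
    @Override
    String name() {
        return "echo";
    }
}
```

Here `EchoService` declares no `main` of its own, yet launching it would run the framework-supplied entry point against the subclass instance, which a static `main` cannot express.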

But, there is cheese-moving here. In the old model, "main" is just a
disembodied method, which only accidentally lives in a class, and drags
the class along for the ride. In this model, main-ness moves up the
stack, becoming a property of a class, not just something a class has.

This tension is evident in JLS 12, which defines the interaction with
main. It is full of wiggle words, because it is trying to pretend that
Java has no concept of "program", just classes, but at the same time,
there has to be a way to get the computation started. The JLS tries to
pretend that "program" is defined almost extralinguistically (by appeal
to an unspecified launcher program that exists outside of the language),
but nearly trips over its own feet trying to have it both ways.

The debate among educators about whether main should be allowed to do
anything it wants, or should only instantiate an object and call a
single method, illustrates this tension. So what is really going on
here is bringing the notion of "program" to classes in a less
nailed-on-the-side way.

> Just to mention one additional idea. We could permit `main` to
> optionally return `int`, becoming the default exit status if `exit` is
> never called. Seems elegant for the rare cases where you care about
> exit status, but (a) would this feature get in the way in *any* sense
> for the vast majority of cases that don't care, or (b) are the cases
> that care just way too rare for us to worry about?
>
> I'm not sure about (a). But (b) kinda seems like a yes.
>

Considered this (since C lets you do this). Since Java doesn't let you
overload on return types, we have the option to do this later without
making the search order any more complicated, so I left it out.
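
For comparison, this is roughly how a program expresses an exit status today, without any change to `main`. It is a sketch only; the class name `ExitStatusDemo`, the method `run`, and the particular status values are assumptions for illustration.

```java
// Illustrative only: ExitStatusDemo, run, and the status values are assumed.
final class ExitStatusDemo {
    // Program logic computes a status instead of calling System.exit itself.
    static int run(String[] args) {
        return args.length == 0 ? 2 : 0;   // 2 = missing argument, 0 = success
    }

    // Today's equivalent of an int-returning main: forward the status explicitly.
    public static void main(String[] args) {
        System.exit(run(args));
    }
}
```

An `int`-returning `main` would fold the last line into the launcher, which is why it can be added later without disturbing the lookup rules discussed above.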

From kevinb at google.com  Wed Sep 28 20:27:59 2022
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 28 Sep 2022 13:27:59 -0700
Subject: Paving the on-ramp
In-Reply-To: <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
 <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
Message-ID: <CAGKkBktKJ08Paf=5QYruOcUEyKZZ12tV6UeuuuUPJa0EpY5M2A@mail.gmail.com>

On Wed, Sep 28, 2022 at 1:10 PM Brian Goetz <brian.goetz at oracle.com> wrote:

>>  - Relax the requirement that the class, and `main` method, be public.  Public
>>    accessibility is only relevant when access crosses packages; simple programs
>>    live in the unnamed package, so cannot be accessed from any other package
>>    anyway.  For a program whose main class is in the unnamed package, we can
>>    drop the requirement that the class or its `main` method be public,
>>    effectively treating the `java` launcher as if it too resided in the unnamed
>>    package.
>>
>
> Alternative: drop the requirement altogether. Most main methods have no
> desire to make themselves publicly callable as `TheClass.main(args)`, but
> today they are forced to expose that API anyway. I feel like it would still
> be conceptually clean to say that `public` is really about whether other
> *code* can access it, not whether a VM can get to it at all.
>
> I think we're saying the same thing; main need not be public.
>


You seemed quite clearly to be offering that for classes in the default
package only.

>>  - Make the "args" parameter to `main` optional, by allowing the `java`
>>    launcher to first look for a main method with the traditional
>>    `main(String[])` signature, and then (if not found) for a main method
>>    with no arguments.
>>
>
> This seems to leave users vulnerable to some surprises, where the code
> they think is being called isn't. Why not make it a compile-time error to
> provide both forms?
>
> Currently, the treatment of methods called "main" is "and also"; it is a
> valid method, *and also* (if it has the right shape) can be used as a main
> entry point.  Making this an error would take some valid programs and make
> them invalid, which seems a shift in the interpretation of the magic name
> "main".  A warning is probably reasonable though.
>


Oh, yeah, I have a habit of saying "error" when I am always always
perfectly satisfied with a warning. Of course, the warning goes just on the
method that isn't gonna get called, and the user should be advised to
rename it.

>>  - Make the `static` modifier on `main` optional, by allowing the `java`
>>    launcher to invoke an instance `main` method (of either signature) by
>>    instantiating an instance using an accessible no-arg constructor and then
>>    invoking the `main` method on it.
>>
>
> On a deep conceptual level... I'd insist that main() *is static*. It is
> *the* single entry point into the program; what could be more static than
> that? But thinking about our learner, who wrote some `main`s before
> learning about static. The instant they learn `static` is a keyword a
> method can have, they'll "know" one thing about it already: this is going
> to be something new that's *not* true of main(). But then they hear an
> explanation that fits `main` perfectly?
>
>
Sorry, just a quick self-reply of clarification: when I said "main IS
static", that was taking Java's model that everything belongs to a class as
*given*. It's not commentary against "main is really a free function".


> John likes to say "static has messed up every job we've ever given it", and
> while that seems an exaggeration at first, often turns out to be
> surprisingly accurate.  One subtle thing it messes up here is that one
> cannot effectively inherit a main() method.  But inheriting main() is super
> useful!  Consider a TestCase class in a test framework, or an
> AbstractService class in a services framework.  If the abstract class can
> provide the main() method, then every test case or service _is also a
> program_, one which runs that test case or service.
>


I see that that is "a way" to do a thing. But in my view, implementation
inheritance has messed up every job we've ever given it. :-) It will at the
*least* take me some time and reflection to convince myself that
"inheritable main" isn't horrifying.

Most of the specific counter-arguments I laid out to the non-static main
have dropped out of the thread without acknowledgement, so I'm a little
concerned they'll be forgotten in the discussion.

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From forax at univ-mlv.fr  Wed Sep 28 20:49:50 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 28 Sep 2022 22:49:50 +0200 (CEST)
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, September 28, 2022 7:57:19 PM
> Subject: Paving the on-ramp

> At various points, we've explored the question of which program elements are
> most and least helpful for students first learning Java. After considering a
> number of alternatives over the years, I have a simple proposal for smoothing
> the "on ramp" to Java programming, while not creating new things to unlearn.

> Markdown source is below, HTML will appear soon at:

> https://openjdk.org/projects/amber/design-notes/on-ramp

> # Paving the on-ramp

> Java is one of the most widely taught programming languages in the world. Tens
> of thousands of educators find that the imperative core of the language combined
> with a straightforward standard library is a foundation that students can
> comfortably learn on. Choosing Java gives educators many degrees of freedom:
> they can situate students in `jshell` or Notepad or a full-fledged IDE; they can
> teach imperative, object-oriented, functional, or hybrid programming styles; and
> they can easily find libraries to interact with external data and services.

> No language is perfect, and one of the most common complaints about Java is that
> it is "too verbose" or has "too much ceremony." And unfortunately, Java imposes
> its heaviest ceremony on those first learning the language, who need and
> appreciate it the least. The declaration of a class and the incantation of
> `public static void main` is pure mystery to a beginning programmer. While
> these incantations have principled origins and serve a useful organizing purpose
> in larger programs, they have the effect of placing obstacles in the path of
> _becoming_ Java programmers. Educators constantly remind us of the litany of
> complexity that students have to confront on Day 1 of class -- when they really
> just want to write their first program.

> As an amusing demonstration of this, in her JavaOne keynote appearance in 2019,
> [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked about when
> she learned to program in Java, and how her teacher performed a rap song
> to help students memorize `"public static void main"`. Our hats are off to
> creative educators everywhere for this kind of dedication, but teachers
> shouldn't have to do this.

> Of course, advanced programmers complain about ceremony too. We will never be
> able to satisfy programmers' insatiable appetite for typing fewer keystrokes,
> and we shouldn't try, because the goal of programming is to write programs that
> are easy to read and are clearly correct, not programs that were easy to type.
> But we can try to better align the ceremony commensurate with the value it
> brings to a program -- and let simple programs be expressed more simply.

> ## Concept overload

> The classic "Hello World" program looks like this in Java:

> ```
> public class HelloWorld {
>     public static void main(String[] args) {
>         System.out.println("Hello World");
>     }
> }
> ```

> It may only be five lines, but those lines are packed with concepts that are
> challenging to absorb without already having some programming experience and
> familiarity with object orientation. Let's break down the concepts a student
> confronts when writing their first Java program:

> - **public** (on the class). The `public` accessibility level is relevant
> only when there is going to be cross-package access; in a simple "Hello
> World" program, there is only one class, which lives in the unnamed package.
> They haven't even written a one-line program yet; the notion of access
> control -- keeping parts of a program from accessing other parts of it -- is
> still way in their future.

> - **class**. Our student hasn't set out to write a _class_, or model a
> complex system with objects; they want to write a _program_. In Java, a
> program is just a `main` method in some class, but at this point our student
> still has no idea what a class is or why they want one.

> - **Methods**. Methods are of course a key concept in Java, but the mechanics
> of methods -- parameters, return types, and invocation -- are still
> unfamiliar, and the `main` method is invoked magically from the `java`
> launcher rather than from explicit code.

> - **public** (again). Like the class, the `main` method has to be public, but
> again this is only relevant when programs are large enough to require
> packages to organize them.

> - **static**. The `main` method has to be static, and at this point, students
> have no context for understanding what a static method is or why they want
> one. Worse, the early exposure to `static` methods will turn out to be a
> bad habit that must be later unlearned. Worse still, the fact that the
> `main` method is `static` creates a seam between `main` and other methods;
> either they must become `static` too, or the `main` method must trampoline
> to some sort of "instance main" (more ceremony!) And if we get this wrong,
> we get the dreaded and mystifying `"cannot be referenced from a static
> context"` error.

> - **main**. The name `main` has special meaning in a Java program, indicating
> the starting point of a program, but this specialness hides behind being an
> ordinary method name. This may contribute to the sense of "so many magic
> incantations."

> - **String[]**. The parameter to `main` is an array of strings, which are the
> arguments that the `java` launcher collected from the command line. But our
> first program -- likely our first dozen -- will not use command-line
> parameters. Requiring the `String[]` parameter is, at this point, a mistake
> waiting to happen, and it will be a long time until this parameter makes
> sense. Worse, educators may be tempted to explain arrays at this point,
> which further increases the time-to-first-program.

> - **System.out.println**. If you look closely at this incantation, each
> element in the chain is a different thing -- `System` is a class (what's a
> class again?), `out` is a static field (what's a field?), and `println` is
> an instance method. The only part the student cares about right now is
> `println`; the rest of it is an incantation that they do not yet understand
> in order to get at the behavior they want.

> That's a lot to explain to a student on the first day of class. There's a good
> chance that by now, class is over and we haven't written any programs yet, or
> the teacher has said "don't worry what this means, you'll understand it later"
> six or eight times. Not only is this a lot of _syntactic_ things to absorb, but
> each of those things appeals to a different concept (class, method, package,
> return value, parameter, array, static, public, etc) that the student doesn't
> have a framework for understanding yet. Each of these will have an important
> role to play in larger programs, but so far, they only contribute to "wow,
> programming is complicated."

> It won't be practical (or even desirable) to get _all_ of these concepts out of
> the student's face on day 1, but we can do a lot -- and focus on the ones that
> do the most to help beginners understand how programs are constructed.

> ## Goal: a smooth on-ramp

> As much as programmers like to rant about ceremony, the real goal here is not
> mere ceremony reduction, but providing a graceful _on ramp_ to Java programming.
> This on-ramp should be helpful to beginning programmers by requiring only those
> concepts that a simple program needs.

> Not only should an on-ramp have a gradual slope and offer enough acceleration
> distance to get onto the highway at the right speed, but its direction must
> align with that of the highway. When a programmer is ready to learn about more
> advanced concepts, they should not have to discard what they've already learned,
> but instead easily see how the simple programs they've already written
> generalize to more complicated ones, and both the syntactic and conceptual
> transformation from "simple" to "full blown" program should be straightforward
> and unintrusive. It is a definite non-goal to create a "simplified dialect of
> Java for students".

> We identify three simplifications that should aid both educators and students in
> navigating the on-ramp to Java, as well as being generally useful to simple
> programs beyond the classroom:

> - A more tolerant launch protocol
> - Unnamed classes
> - Predefined static imports for the most critical methods and fields

> ## A more tolerant launch protocol

> The Java Language Specification has relatively little to say about how Java
> "programs" get launched, other than saying that there is some way to indicate
> which class is the initial class of a program (JLS 12.1.1) and that a public
> static method called `main` whose sole argument is of type `String[]` and whose
> return is `void` constitutes the entry point of the indicated class.

> We can eliminate much of the concept overload simply by relaxing the
> interactions between a Java program and the `java` launcher:

> - Relax the requirement that the class, and `main` method, be public. Public
> accessibility is only relevant when access crosses packages; simple programs
> live in the unnamed package, so cannot be accessed from any other package
> anyway. For a program whose main class is in the unnamed package, we can
> drop the requirement that the class or its `main` method be public,
> effectively treating the `java` launcher as if it too resided in the unnamed
> package.

> - Make the "args" parameter to `main` optional, by allowing the `java` launcher
> to first look for a main method with the traditional `main(String[])`
> signature, and then (if not found) for a main method with no arguments.

> - Make the `static` modifier on `main` optional, by allowing the `java` launcher
> to invoke an instance `main` method (of either signature) by instantiating an
> instance using an accessible no-arg constructor and then invoking the `main`
> method on it.

> This small set of changes to the launch protocol strikes out five of the bullet
> points in the above list of concepts: public (twice), static, method parameters,
> and `String[]`.
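
The lookup order in the three bullets above can be sketched with reflection. This is only an illustration of the described protocol, not the real `java` launcher: `LauncherSketch`, `launch`, `Traditional`, and `Relaxed` are hypothetical names, and for brevity the sketch only looks at methods declared directly on the class (it ignores inherited `main` methods).

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical illustration of the relaxed launch protocol; NOT the real launcher.
class LauncherSketch {

    // Invokes `main` on mainClass and returns a description of the form that was chosen.
    static String launch(Class<?> mainClass, String[] args) {
        try {
            Method main;
            Object[] callArgs;
            try {
                // Step 1: prefer the traditional main(String[]) signature.
                main = mainClass.getDeclaredMethod("main", String[].class);
                callArgs = new Object[] { args };
            } catch (NoSuchMethodException e) {
                // Step 2: fall back to a main method with no arguments.
                main = mainClass.getDeclaredMethod("main");
                callArgs = new Object[0];
            }
            main.setAccessible(true); // neither the class nor main needs to be public

            String form = main.getParameterCount() == 0 ? "main()" : "main(String[])";
            Object receiver = null;
            if (Modifier.isStatic(main.getModifiers())) {
                form = "static " + form;
            } else {
                // Step 3: an instance main needs a receiver, built with the no-arg constructor.
                Constructor<?> ctor = mainClass.getDeclaredConstructor();
                ctor.setAccessible(true);
                receiver = ctor.newInstance();
                form = "instance " + form;
            }
            main.invoke(receiver, callArgs);
            return form;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not launch " + mainClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(launch(Traditional.class, args));
        System.out.println(launch(Relaxed.class, args));
    }
}

class Traditional {
    public static void main(String[] args) {
        System.out.println("Hello from Traditional");
    }
}

class Relaxed {
    // No public, no static, no String[] parameter.
    void main() {
        System.out.println("Hello from Relaxed");
    }
}
```

Both `Traditional` and `Relaxed` compile as ordinary Java today; only the second step (who is allowed to call `Relaxed.main`) is what the proposal would change.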

> At this point, our Hello World program is now:

> ```
> class HelloWorld {
> void main() {
> System.out.println("Hello World");
> }
> }
> ```

> It's not any shorter by line count, but we've removed a lot of "horizontal
> noise" along with a number of concepts. Students and educators will appreciate
> it, but advanced programmers are unlikely to be in any hurry to make these
> implicit elements explicit either.

> Additionally, the notion of an "instance main" has value well beyond the first
> day. Because excessive use of `static` is considered a code smell, many
> educators encourage the pattern of "all the static `main` method does is
> instantiate an instance and call an instance `main` method" anyway. Formalizing
> the "instance main" protocol reduces a layer of boilerplate in these cases, and
> defers the point at which we have to explain what instance creation is -- and
> what `static` is. (Further, allowing the `main` method to be an instance method
> means that it could be inherited from a superclass, which is useful for simple
> frameworks such as test runners or service frameworks.)

> ## Unnamed classes

> In a simple program, the `class` declaration often doesn't help either, because
> other classes (if there are any) are not going to reference it by name, and we
> don't extend a superclass or implement any interfaces. If we say an "unnamed
> class" consists of member declarations without a class header, then our Hello
> World program becomes:

> ```
> void main() {
> System.out.println("Hello World");
> }
> ```

> Such source files can still have fields, methods, and even nested classes, so
> that as a program evolves from a few statements to needing some ancillary state
> or helper methods, these can be factored out of the `main` method while still
> not yet requiring a full class declaration:

> ```
> String greeting() { return "Hello World"; }

> void main() {
> System.out.println(greeting());
> }
> ```

> This is where treating `main` as an instance method really shines; the user has
> just declared two methods, and they can freely call each other. Students need
> not confront the confusing distinction between instance and static methods yet;
> indeed, if not forced to confront static members on day 1, it might be a while
> before they do have to learn this distinction. The fact that there is a
> receiver lurking in the background will come in handy later, but right now is
> not bothering anybody.

> [JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be
> launched directly without compilation; this streamlined launcher pairs well with
> unnamed classes.

> ## Predefined static imports

> The most important classes, such as `String` and `Integer`, live in the
> `java.lang` package, which is automatically on-demand imported into all
> compilation units; this is why we do not have to `import java.lang.String` in
> every class. Static imports were not added until Java 5, but no corresponding
> facility for automatic on-demand import of common behavior was added at that
> time. Most programs, however, will want to do console IO, and Java forces us to
> do this in a roundabout way -- through the static `System.out` and `System.in`
> fields. Basic console input and output is a reasonable candidate for
> auto-static import, as one or both are needed by most simple programs. While
> these are currently instance methods accessed through static fields, we can
> easily create static methods for `println` and `readln` which are suitable for
> static import, and automatically import them. At which point our first program
> is now down to:

> ```
> void main() {
> println("Hello World");
> }
> ```

> ## Putting this all together

> We've discussed several simplifications:

> - Update the launcher protocol to make public, static, and arguments optional
> for main methods, and for main methods to be instance methods (when a
> no-argument constructor is available);
> - Make the class wrapper for "main classes" optional (unnamed classes);
> - Automatically static import methods like `println`

> which together whittle our long list of day-1 concepts down considerably. While
> this is still not as minimal as the minimal Python or Ruby program -- statements
> must still live in a method -- the goal here is not to win at "code golf". The
> goal is to ensure that concepts not needed by simple programs need not appear in
> those programs, while at the same time not encouraging habits that have to be
> unlearned as programs scale up.

> Each of these simplifications is individually small and unintrusive, and each is
> independent of the others. And each embodies a simple transformation that the
> author can easily manually reverse when it makes sense to do so: elided
> modifiers and `main` arguments can be added back, the class wrapper can be added
> back when the affordances of classes are needed (supertypes, constructors), and
> the full qualifier of static-import can be added back. And these reversals are
> independent of one another; they can be done in any combination or any order.

> This seems to meet the requirements of our on-ramp; we've eliminated most of the
> day-1 ceremony elements without introducing new concepts that need to be
> unlearned. The remaining concepts -- a method is a container for statements, and
> a program is a Java source file with a `main` method -- are easily understood in
> relation to their fully specified counterparts.

> ## Alternatives

> Obviously, we've lived with the status quo for 25+ years, so we could continue
> to do so. There were other alternatives explored as well; ultimately, each of
> these fell afoul of one of our goals.

> ### Can't we go further?

> Fans of "code golf" -- of which there are many -- are surely right now trying to
> figure out how to eliminate the last little bit, the `main` method, and allow
> statements to exist at the top-level of a program. We deliberately stopped
> short of this because it offers little value beyond the first few minutes, and
> even that small value quickly becomes something that needs to be unlearned.

> The fundamental problem behind allowing such "loose" statements is that
> variables can be declared inside both classes (fields) and methods (local
> variables), and they share the same syntactic production but not the same
> semantics. So it is unclear (to both compilers and humans) whether a "loose"
> variable would be a local or a field. If we tried to adopt some sort of simple
> heuristic to collapse this ambiguity (e.g., whether it precedes or follows the
> first statement), that may satisfy the compiler, but now simple refactorings
> might subtly change the meaning of the program, and we'd be replacing the
> explicit syntactic overhead of `void main()` with an invisible "line" in the
> program that subtly affects semantics, and a new subtle rule about the meaning
> of variable declarations that applies only to unnamed classes. This doesn't
> help students, nor is this particularly helpful for all but the most trivial
> programs. It quickly becomes a crutch to be discarded and unlearned, which
> falls afoul of our "on ramp" goals. Of all the concepts on our list, "methods"
> and "a program is specified by a main method" seem the ones that are most worth
> asking students to learn early.
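
To make the field/local difference concrete, here is a small sketch in ordinary Java. The same declaration `int count` behaves differently depending on where it sits; `LooseVariableSketch` and its methods are hypothetical names chosen for illustration.

```java
// Hypothetical illustration: the same variable declaration as a field and as a local.
class LooseVariableSketch {

    // A method may refer to a field that is declared further down the file.
    int bump() {
        return ++count;
    }

    // As a field: implicitly initialized to 0, and it keeps its value between calls.
    int count;

    int bumpLocal() {
        // As a local: no default value (it must be assigned before use),
        // and a fresh variable is created on every call.
        int count = 0;
        return ++count;
    }

    public static void main(String[] args) {
        LooseVariableSketch sketch = new LooseVariableSketch();
        System.out.println(sketch.bump() + ", " + sketch.bump());
        System.out.println(sketch.bumpLocal() + ", " + sketch.bumpLocal());
    }
}
```

A "loose" `int count;` at the top level of a file would have to pick one of these two behaviors, which is the ambiguity the paragraph above describes.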

> ### Why not "just" use `jshell`?

> While JShell is a great interactive tool, leaning too heavily on it as an onramp
> would fall afoul of our goals. A JShell session is not a program, but a
> sequence of code snippets. When we type declarations into `jshell`, they are
> viewed as implicitly static members of some unspecified class, with
> accessibility ignored completely, and statements execute in a context where
> all previous declarations are in scope. This is convenient for experimentation
> -- the primary goal of `jshell` -- but not such a great mental model for
> learning to write Java programs. Transforming a batch of working declarations
> in `jshell` to a real Java program would not be sufficiently simple or
> unintrusive, and would lead to a non-idiomatic style of code, because the
> straightforward translation would have us redeclaring each method, class, and
> variable declaration as `static`. Further, this is probably not the direction
> we want to go when we scale up from a handful of statements and declarations to
> a simple class -- we probably want to start using classes as classes, not just
> as containers for static members. JShell is a great tool for exploration and
> debugging, and we expect many educators will continue to incorporate it into
> their curriculum, but is not the on-ramp programming model we are looking for.

> ### What about "always local"?

> One of the main tensions that `main` introduces is that most class members are
> not `static`, but the `main` method is -- and that forces programmers to
> confront the seam between static and non-static members. JShell answers this
> with "make everything static".

> Another approach would be to "make everything local" -- treat a simple program
> as being the "unwrapped" body of an implicit main method. We already allow
> variables and classes to be declared local to a method. We could add local
> methods (a useful feature in its own right) and relax some of the asymmetries
> around nesting (again, an attractive cleanup), and then treat a mix of
> declarations and statements without a class wrapper as the body of an invisible
> `main` method. This seems an attractive model as well -- at first.

> While the syntactic overhead of converting back to full-blown classes -- wrap
> the whole thing in a `main` method and a `class` declaration -- is far less
> intrusive than the transformation inherent in `jshell`, this is still not an
> ideal on-ramp. Local variables interact with local classes (and methods, when
> we have them) in a very different way than instance fields do with instance
> methods and inner classes: their scopes are different (no forward references),
> their initialization rules are different, and captured local variables must be
> effectively final. This is a subtly different programming model that would then
> have to be unlearned when scaling up to full classes. Further, the result of
> this wrapping -- where everything is local to the main method -- is also not
> "idiomatic Java". So while local methods may be an attractive feature, they are
> similarly not the on-ramp we are looking for.
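
One of the differences named above, that captured locals must be effectively final while fields have no such restriction, can be shown in current Java. This is a minimal sketch; `CaptureSketch` and its method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration of how capture rules differ for fields and locals.
class CaptureSketch {

    int fieldTotal; // instance field: a lambda can update it through the implicit `this`

    int sumWithField(int[] values) {
        fieldTotal = 0;
        Arrays.stream(values).forEach(v -> fieldTotal += v);
        return fieldTotal;
    }

    static int sumWithLocal(int[] values) {
        int total = 0;
        // The lambda version does not compile, because `total` is not effectively final:
        //     Arrays.stream(values).forEach(v -> total += v);
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] values = { 1, 2, 3 };
        System.out.println(new CaptureSketch().sumWithField(values));
        System.out.println(sumWithLocal(values));
    }
}
```

A student who learned the "everything is local" model would have to revisit this rule as soon as the same code moved into a class with fields.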

I agree with the goal; I have several remarks.

- You do not have to declare a class public to run it, so you do not have to explain the first "public".
So the situation is a little less awful than the one you describe :)

- I know several teachers who use an interface instead of a class as the default container for methods for the first weeks, because methods are public by default inside an interface (and nested classes are implicitly static).
The snippet used is something like this:

    interface Hello {
        static void main(String[] args) {
            ...
        }
    }

so technically you do not have to explain "public". 

- You can declare a main() on things other than a class: on an interface, an enum, or a record.
Being able to declare the "main" without having to declare it static is nice, but the semantics you propose create new issues,
because the auto-instantiation does not work if the container is an interface or a record with components.
This feels too magical to me, and as a teacher I will have to explain it at some point.

- Currently there is a nice progression in terms of complexity, in three steps:
first, you have a class with no package and you can only use the classes of the JDK or the classes of the current folder,
then, you have the package declaration and you can have multiple folders,
and finally, if you want to declare non-visible packages, you need a module and the module-info.
I think your idea of a classless compilation unit plays well with the current progression if we consider it as step zero:
first you have a classless class, then class, then package + class, and at the end module + package + class.

- We should not be able to declare fields inside a classless class; students struggle at the beginning to tell the difference between a field and a local variable.
Every syntax that makes that distinction murkier is a bad idea.
So perhaps what we want is a classless container of methods, not a classless class.

- At the beginning, teaching records is easier than teaching classes, because you can do too much with a class while records have a simple syntax and simple semantics.
At my university, real classes (not class as container) are only introduced at week 4, when we start to have mutable thingy.
In a dream world, we should be able to declare records inside a classless class, but I do not see how the compiler would avoid seeing a top-level record instead of a classless class containing records.
I suppose a classless class cannot have nested classes/records/enums/interfaces.

- At my uni, we start by teaching Python and JavaScript, then C, then Java. We do not teach IPython because its semantics are slightly different from Python's.
For the same reason, we do not use jshell for undergraduates, because its semantics are slightly different from Java's.
For the same reason, if the semantics of a classless class are different from the semantics of a regular class, we will not use it either.
I don't think your proposal has that problem; it's more a reminder for me that a classless class cannot have different semantics from a class:
it can do less, but it cannot do more or, worse, do something differently.

- I don't think we can add an auto static import without causing source backward compatibility issues, because you cannot have several static imports using the same last identifier.
For example, if an existing class declares

    import static foo.A.println;

this class will now fail to compile.
That's why no auto static import was added in Java 5.

There is also the problem of the comb rule: Java prefers supertype methods (even static methods) to static imports.
So adding a method println() (or any method named like an auto-imported static method) to a non-final class becomes a hazard.
You may argue that we already have that problem now, which is true, but any auto static import of methods makes this known problem worse.
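
The shadowing hazard can be reproduced in today's Java with an explicit static import. The sketch below uses `Math.abs` as a stand-in for a hypothetical auto-imported `println`; `LibraryBase`, `UserProgram`, and `Standalone` are made-up names.

```java
import static java.lang.Math.abs;

// Hypothetical superclass; imagine `abs` being added to it in a later release.
class LibraryBase {
    static int abs(int x) {
        return -1; // deliberately different from Math.abs so the effect is visible
    }
}

class UserProgram extends LibraryBase {
    static int distance(int x) {
        // The inherited LibraryBase.abs shadows the statically imported Math.abs.
        return abs(x);
    }
}

class Standalone {
    static int distance(int x) {
        // No inherited method named abs here, so the static import applies.
        return abs(x);
    }
}

class CombRuleSketch {
    public static void main(String[] args) {
        System.out.println(UserProgram.distance(-5)); // calls LibraryBase.abs
        System.out.println(Standalone.distance(-5));  // calls Math.abs
    }
}
```

The two `distance` methods have identical bodies but call different `abs` methods, which is the kind of silent change in meaning that an automatic static import would make more likely.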

Rémi

From brian.goetz at oracle.com  Wed Sep 28 20:56:06 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 16:56:06 -0400
Subject: Paving the on-ramp
In-Reply-To: <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>
Message-ID: <d934a6ee-5f34-4564-0c63-0810fd982c23@oracle.com>


> - You can declare a main() on other things than a class, on an
> interface, on an enum or a record.
>   Being able to declare the "main" without having to declare it
> static is nice, but the semantics you propose creates new issues,
>   because the auto-instantiation does not work if the container is an
> interface or a record with components.
>   This feels too magical to me, and as a teacher i will have to
> explain it at some point.

Perhaps, but not on the first day.

> - At the beginning, teaching records is easier than teaching classes
> because you can do too much with a class while records have a simple
> syntax and a simple semantics.

I agree teaching records first is a good teaching strategy; I have a lot
to say about curriculum design, but I'd like to keep that a separate
discussion. Suffice it to say that an important secondary goal here is
unconstraining the order in which things must be taught.

>   In a dream world, we should be able to declare records inside a
> classless class, but i do not see how the compiler will not see a top
> level record instead of a classless class containing records.

Hoping to make this dream possible.

> - At my uni, we start by teaching Python and JavaScript, then C then
> Java. We do not teach ipython because the semantics is slightly
> different from python.
>   For the same reason, we do not use jshell for undergraduates because
> the semantics is slightly different than java.
>   For the same reason, if the semantics of a classless class is
> different from the semantics of a regular class, we will not use it too.

Agree, and this was a strong driving motivation. This is why we have
avoided trying to create a "safe subset for beginners", and instead
focus on allowing unnecessary wrapping to be elided.

> - I don't think we can add an auto static import without causing source
> backward compatibility issues, because you can not have several import
> static using the same last identifier.
>   By example, if an existing class declare
>       import static foo.A.println;
>
>   this class will now fail to compile.
>   That's why no auto static import was added in Java 5.

There is some complexity here, but it does not seem insurmountable.


From forax at univ-mlv.fr  Wed Sep 28 21:13:06 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 28 Sep 2022 23:13:06 +0200 (CEST)
Subject: Paving the on-ramp
In-Reply-To: <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
 <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
Message-ID: <2114262561.15300506.1664399586451.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Kevin Bourrillion" <kevinb at google.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, September 28, 2022 10:10:02 PM
> Subject: Re: Paving the on-ramp
>>> - Make the "args" parameter to `main` optional, by allowing the `java` launcher
>>> to
>>> first look for a main method with the traditional `main(String[])`
>>> signature, and then (if not found) for a main method with no arguments.

>> This seems to leave users vulnerable to some surprises, where the code they
>> think is being called isn't. Why not make it a compile-time error to provide
>> both forms?

> Currently, the treatment of methods called "main" is "and also"; it is a valid
> method, *and also* (if it has the right shape) can be used as a main entry
> point. Making this an error would take some valid programs and make them
> invalid, which seems a shift in the interpretation of the magic name "main". A
> warning is probably reasonable though.

The other solution is to do something similar to the compact constructor of a record: a compact main that has a syntax which is not currently valid in Java.

Rémi

From brian.goetz at oracle.com  Wed Sep 28 21:23:47 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 17:23:47 -0400
Subject: Paving the on-ramp
In-Reply-To: <2114262561.15300506.1664399586451.JavaMail.zimbra@u-pem.fr>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
 <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
 <2114262561.15300506.1664399586451.JavaMail.zimbra@u-pem.fr>
Message-ID: <74baeecb-e3e5-f38c-d875-0977db3e96de@oracle.com>

> The other solution is to do something similar to the compact
> constructor of a record, a compact main that has a syntax which is
> not currently valid in Java.

An early iteration had something like that. I liked it for about five
minutes! Then I started to dislike it, because (a) it was going to
quickly become something that needs to be unlearned and (b) it was
spending syntax on a very narrow use case, narrow in multiple ways. And
fixing (a) by generalizing to "compact methods" didn't feel like a win
either; now it was just two ways to say the same thing.

Of all the concepts that it is worth asking users to internalize early,
I think "methods as aggregations of statements" is it. (Yes, in this
version you still have to confront "void" and "()".)


From forax at univ-mlv.fr  Wed Sep 28 21:35:12 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 28 Sep 2022 23:35:12 +0200 (CEST)
Subject: Paving the on-ramp
In-Reply-To: <74baeecb-e3e5-f38c-d875-0977db3e96de@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
 <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
 <2114262561.15300506.1664399586451.JavaMail.zimbra@u-pem.fr>
 <74baeecb-e3e5-f38c-d875-0977db3e96de@oracle.com>
Message-ID: <1702806108.15315388.1664400912191.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "Kevin Bourrillion" <kevinb at google.com>, "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, September 28, 2022 11:23:47 PM
> Subject: Re: Paving the on-ramp

>> The other solution is to do something similar to the compact
>> constructor of a record, a compact main that has a syntax which is
>> not currently valid in Java.
>
> An early iteration had something like that. I liked it for about five
> minutes! Then I started to dislike it, because (a) it was going to
> quickly become something that needs to be unlearned and (b) it was
> spending syntax on a very narrow use case, narrow in multiple ways. And
> fixing (a) by generalizing to "compact methods" didn't feel like a win
> either; now it was just two ways to say the same thing.
>
> Of all the concepts that it is worth asking users to internalize early,
> I think "methods as aggregations of statements" is it. (Yes, in this
> version you still have to confront "void" and "()".)

That's the main issue with main :)

It's the entry point so it's a special case but at the same time you do not want to spend a lot of effort to make it different from a method, so having a special syntax is too much.

And not making it a method, by allowing statements to be written outside any method as in JavaScript, does not work well, because you cannot write statements directly inside a class.

Rémi

From kevinb at google.com  Thu Sep 29 00:37:47 2022
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 28 Sep 2022 17:37:47 -0700
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <CAGKkBku=CdNABAnQbSwOTeA2CSW39VFJ1ovnDAfhYvMwyxgqeQ@mail.gmail.com>

Again, big fan of getting to a streamlined main() source file.

A major design goal of yours seems clear: to get there without rendering
Java source files explicitly bimorphic ("class" source files all look like
this, "main" source files all look like that). Instead you have a set of
independent features that can compose to get you there in a "smooth ramp".
The design looks heavily influenced by that goal.

And it sounds virtuous. But... is it? Really?

Take a language that has this pretty streamlined already (I'll use the one
I know):

```
fun main() {
    ...
}
```

As my program grows and gets more complex, I will make changes like

* use more other libraries
* add args to main()
* add helper methods
* add constants
* create new classes and use them from here

But: when and why would I be motivated to change *this* code *itself* to
"become" a class, become instantiable, acquire instance state, etc. etc.? I
don't imagine ever having that urge. main() is just main()! It's just a way
in. Isn't it literally just a way to (a) transfer control back and forth
and (b) hand me args?

If I need those other qualities, then I create a class to get them, maybe
even right below main(), and I use it. I'm already going to be regularly
needing to do that anyway just as my code grows.


A quick clarification:

On Wed, Sep 28, 2022 at 12:49 PM Kevin Bourrillion <kevinb at google.com>
wrote:

>> Because excessive use of `static` is considered a code smell, many
>> educators encourage the pattern of "all the static `main` method does is
>> instantiate an instance and call an instance `main` method" anyway.
>>
>
> Heavy groan. In my opinion, some ideas are too misguided to take seriously.
>
> The value in that practice is if instance `main` accepts parameters like
> `PrintStream` and `Console`, and static main passes in `System.out` and
> `System.console()`. That makes all your actual program logic unit-testable.
> Great! This actually strikes directly at the heart of what the entire
> problem with `static` is! But this isn't the case you're addressing.
>
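
The unit-testable variant described in the quoted paragraph might look like the sketch below. `Greeter`, `run`, and `capture` are hypothetical names, and the instance method is called `run` rather than `main` only to keep the two roles visually distinct.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical illustration: static main only wires real collaborators into instance logic.
class Greeter {

    // Entry point: its only job is to pass System.out (and the args) to the instance method.
    public static void main(String[] args) {
        new Greeter().run(System.out, args);
    }

    // Program logic receives its output stream as a parameter, so a test can substitute one.
    void run(PrintStream out, String[] args) {
        String name = args.length > 0 ? args[0] : "World";
        out.println("Hello " + name);
    }

    // What a unit test could do: run the logic against an in-memory stream and inspect it.
    static String capture(String... args) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(buffer, true);
        new Greeter().run(out, args);
        return buffer.toString();
    }
}
```

Here the benefit comes from the injected `PrintStream`, not from the method being an instance method as such, which is the distinction the paragraph draws.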


Note I was only reacting to "static bad!" here. I would be happy if *that*
argument were dropped, but you do still have another valid argument: that
`static` is another backward default, and the viral burden of putting it
not just on main() but every helper method you factor out is pure nuisance.
(I'd suggest mentioning the viral nature of this particular burden
higher/more prominently in the doc, as it's currently out of place under
the "unnamed classes" section.)

(That doesn't mean "so let's do it"; I still hope to see that benefit
carefully measured against the drawbacks. Btw, *some* of those drawbacks
might be eased by disallowing an explicit constructor... and jeez, please
disallow type parameters too... I'm leaving the exact meaning of "disallow"
undefined here.)


To resume with the original text...


On Wed, Sep 28, 2022 at 10:57 AM Brian Goetz <brian.goetz at oracle.com> wrote:

> ## Unnamed classes
>
> In a simple program, the `class` declaration often doesn't help either,
> because
> other classes (if there are any) are not going to reference it by name,
> and we
> don't extend a superclass or implement any interfaces.
>


How do I tell `java` which class file to load and call main() on? Class
name based on file name, I guess?

Tiny side benefit of dropping all the `static`s: then if you also use an
unnamed class you can still make method references to your own helper
methods.


> If we say an "unnamed
> class" consists of member declarations without a class header, then our
> Hello
> World program becomes:
>
> ```
> void main() {
>     System.out.println("Hello World");
> }
> ```
>


One or more class annotations could appear below package/imports?


> Such source files can still have fields, methods, and even nested classes,
>


Do those get compiled to real nested classes, nested inside an unnamed
class? So if I edit a "regular" `Foo.java` file, go down below the last `}`
and add a `main` function there, does that cause the whole `Foo` class
above to be reinterpreted as "nested inside an unnamed class" instead of
top-level?


> Students need
> not confront the confusing distinction between instance and static methods
> yet;
> indeed, if not forced to confront static members on day 1, it might be a
> while
> before they do have to learn this distinction.
>


Well, they'll confront it from the calling side, `str.length()` looks quite
different from unqualified calls and classname-qualified calls. They'd
ideally get a chance to understand that first before making their own
classes.

This is my notion of a natural progression:

1. Write procedural code: calling static methods, using existing data
types, soon calling their instance methods
2. Proceed to creating your own types (from simple data types onward) and
using them too
3. One day learn that your main() function is actually a method of an
instantiable type too... at pub trivia night, then promptly forget it


> The fact that there is a
> receiver lurking in the background will come in handy later,
>


(My claim up above is "I don't think it will.")


-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com  Thu Sep 29 01:36:45 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 21:36:45 -0400
Subject: Paving the on-ramp
In-Reply-To: <CAGKkBku=CdNABAnQbSwOTeA2CSW39VFJ1ovnDAfhYvMwyxgqeQ@mail.gmail.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBku=CdNABAnQbSwOTeA2CSW39VFJ1ovnDAfhYvMwyxgqeQ@mail.gmail.com>
Message-ID: <2e779dc3-8a16-8d89-1e5d-6b14b04998d3@oracle.com>


> A major design goal of yours seems clear: to get there without
> rendering Java source files explicitly bimorphic ("class" source files
> all look like this, "main" source files all look like that). Instead
> you have a set of independent features that can compose to get you
> there in a "smooth ramp". The design looks heavily influenced by that
> goal.

Yes, I sometimes call this "telescoping", because there's a chain of "x
is short for y is short for z". For example, with lambdas:

    x -> e

is-short-for

    (x) -> e

is-short-for

    (var x) -> e

is-short-for

    (int x) -> e  // or whatever the arg is

As a design convention, it enables a mental model where there is really 
just one form, with varying things you could leave out. Early in the 
Lambda days, we saw articles like "there are N forms of lambda 
expressions", and that stuff infuriates me, it is as if people go out of 
their way to find more complex mental models than necessary.
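
As a quick check of the "one form, with things left out" model, the four spellings can be written side by side. This sketch assumes Java 11 or later for the `(var x)` form; `TelescopingLambdas` and `applyAll` are hypothetical names.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical illustration: four spellings of the same lambda.
class TelescopingLambdas {

    // Applies each spelling to the same input and returns the four results.
    static int[] applyAll(int input) {
        IntUnaryOperator shortest = x -> x + 1;
        IntUnaryOperator parenthesized = (x) -> x + 1;
        IntUnaryOperator withVar = (var x) -> x + 1;    // Java 11+
        IntUnaryOperator explicit = (int x) -> x + 1;
        return new int[] {
            shortest.applyAsInt(input),
            parenthesized.applyAsInt(input),
            withVar.applyAsInt(input),
            explicit.applyAsInt(input)
        };
    }

    public static void main(String[] args) {
        for (int result : applyAll(41)) {
            System.out.println(result);
        }
    }
}
```

All four operators compute the same function; the spellings differ only in how much of the parameter declaration is written out.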

> As my program grows and gets more complex, I will make changes like
>
> * use more other libraries
> * add args to main()
> * add helper methods
> * add constants
> * create new classes and use them from here
>
> But: when and why would I be motivated to change *this* code *itself*
> to "become" a class, become instantiable, acquire instance state, etc. 
> etc.? I don't imagine ever having that urge. main() is just main()! 
> It's just a way in. Isn't it literally just a way to (a) transfer 
> control back and forth and (b) hand me args?

This doesn't seem like such a leap to me. You might start out
hardcoding a file path that will be read. Then you might decide to let
that be passed in (so you add the args parameter to main). Then you
might want to treat the filename to be read as a field so it can be
shared across methods, so you turn it into a constructor parameter. One
could imagine "introduce X" refactorings to do all of these. The
process of hardcoding to main() parameter to constructor argument is a
natural sedimentation of things finding their right level. (And even if
you don't do all of this, knowing that it's an ordinary class (like an
enum or a record) just with a concise syntax means you don't have to
learn new concepts. I don't want Foo classes and Bar classes.)

>
> Note I was only reacting to "static bad!" here. I would be happy if 
> *that* argument were dropped, but you do still have another valid 
> argument: that `static` is another backward default, and the viral 
> burden of putting it not just on main() but every helper method you 
> factor out is pure nuisance. (I'd suggest mentioning the viral nature 
> of this particular burden higher/more prominently in the doc, as it's 
> currently out of place under the "unnamed classes" section.)
>
> (That doesn't mean "so let's do it"; I still hope to see that benefit 
> carefully measured against the drawbacks. Btw, *some* of those 
> drawbacks might be eased by disallowing an explicit constructor... and 
> jeez, please disallow type parameters too... I'm leaving the exact 
> meaning of "disallow" undefined here.)

Indeed, I intend that there are no explicit constructors or instance 
initializers here.  (There can't be constructors, because the class is 
unnamed!)  I think I said somewhere "such classes can contain ..." and 
didn't list constructors, but I should have been more explicit.

>
>     ## Unnamed classes
>
>     In a simple program, the `class` declaration often doesn't help
>     either, because
>     other classes (if there are any) are not going to reference it by
>     name, and we
>     don't extend a superclass or implement any interfaces.
>
>
> How do I tell `java` which class file to load and call main() on? 
> Class name based on file name, I guess?

Sadly yes.  More sad stories coming on this front, Jim can tell.

>
>     If we say an "unnamed
>     class" consists of member declarations without a class header,
>     then our Hello
>     World program becomes:
>
>     ```
>     void main() {
>         System.out.println("Hello World");
>     }
>     ```
>
>
>
> One or more class annotations could appear below package/imports?

No package statement (unnamed classes live in the unnamed package), but 
imports are OK.  No class annotations.  No type variables.  No 
superclasses.

>     Such source files can still have fields, methods, and even nested
>     classes,
>
>
> Do those get compiled to real nested classes, nested inside an unnamed 
> class? So if I edit a "regular" `Foo.java` file, go down below the 
> last `}` and add a `main` function there, does that cause the whole 
> `Foo` class above to be reinterpreted as "nested inside an unnamed 
> class" instead of top-level?

To be discussed!

> This is my notion of a natural progression:
>
> 1. Write procedural code: calling static methods, using existing data 
> types, soon calling their instance methods
> 2. Proceed to creating your own types (from simple data types onward) 
> and using them too
> 3. One day learn that your main() function is actually a method of an 
> instantiable type too... at pub trivia night, then promptly forget it

Right.



From guy.steele at oracle.com  Thu Sep 29 03:41:24 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 29 Sep 2022 03:41:24 +0000
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <06823323-6214-438A-80A7-184310F01C55@oracle.com>

This is headed in the right direction, but I worry about the use of dei ex machina that have the property that they are NOT easily explained in terms of something the user could have written. Perhaps these could all be dispensed with by using an alternate strategy of code rewriting (at least as an explanation, if not also as an implementation mechanism).

(1) Instead of having a magic "unnamed" class, which has bizarre properties such as not having a constructor (or at least not a constructor you can mention in a `new` expression), only to then require a second magic rule about what you put in the command line "java ...", why not simply use the much more obvious rule that if a compilation unit doesn't have a class header, then a class header is _supplied_ by the compiler, and the name of the class is taken from the filename of the compilation unit?

(2) Instead of complicating the Java launch protocol, why not leave it alone, and instead use the existing mechanism of "in situation X, if the user fails to provide method Y, the compiler will provide a definition automatically"?  Specifically, in a compilation unit named Foo.java for which a class header has to be provided automatically, if a method with signature "main()" is present but no static method with signature "main(String[])" is present, then a static method with signature "main(String[])" is automatically provided by the compiler.

(2a) If the method with signature "main()" is static, the provided method is

public static void main(String[] args) { main(); }

(2b) If the method with signature "main()" is not static, the provided method is

public static void main(String[] args) { new Foo().main(); }

Notice that this mechanism also automatically makes the keyword "public" optional on the declaration of "main()".
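
As a sketch of rewrite (2b): assume a compilation unit Foo.java that contains nothing but a non-static `void main()`. The body of that method and the message it prints are illustrative assumptions. Under rules (1) and (2b), the result is something the programmer could have typed by hand:

```java
// Hypothetical Foo.java as the beginner writes it (no class header):
//
//     void main() {
//         System.out.println("Hello World");
//     }
//
// The same program after the rewrites described in (1) and (2b).
class Foo {
    // The user's method, unchanged; it is neither public nor static.
    void main() {
        System.out.println("Hello World");
    }

    // Supplied because no static main(String[]) was present.
    // The two methods are ordinary overloads, so this compiles today.
    public static void main(String[] args) {
        new Foo().main();
    }
}
```

Nothing here needs a new launch protocol: the existing launcher finds the supplied `main(String[])` exactly as it always has.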

(3) Instead of speaking of automatic imports, speak of the compiler automatically providing certain import statements if the compilation unit doesn't have a class header.

That way _everything_ (the name of the class when a class header is not provided, the behavior when you write variously abbreviated definitions of method `main`, and the automatic importation of certain libraries) can be explained in terms of source-code rewrites that the programmer can do once the programmer learns enough about more advanced features.

--Guy


On Sep 28, 2022, at 1:57 PM, Brian Goetz <brian.goetz at oracle.com> wrote:

At various points, we've explored the question of which program elements are most and least helpful for students first learning Java.  After considering a number of alternatives over the years, I have a simple proposal for smoothing the "on ramp" to Java programming, while not creating new things to unlearn.

Markdown source is below, HTML will appear soon at:

    https://openjdk.org/projects/amber/design-notes/on-ramp


# Paving the on-ramp

Java is one of the most widely taught programming languages in the world.  Tens
of thousands of educators find that the imperative core of the language combined
with a straightforward standard library is a foundation that students can
comfortably learn on.  Choosing Java gives educators many degrees of freedom:
they can situate students in `jshell` or Notepad or a full-fledged IDE; they can
teach imperative, object-oriented, functional, or hybrid programming styles; and
they can easily find libraries to interact with external data and services.

No language is perfect, and one of the most common complaints about Java is that
it is "too verbose" or has "too much ceremony."  And unfortunately, Java imposes
its heaviest ceremony on those first learning the language, who need and
appreciate it the least.  The declaration of a class and the incantation of
`public static void main` is pure mystery to a beginning programmer.  While
these incantations have principled origins and serve a useful organizing purpose
in larger programs, they have the effect of placing obstacles in the path of
_becoming_ Java programmers. Educators constantly remind us of the litany of
complexity that students have to confront on Day 1 of class -- when they really
just want to write their first program.

As an amusing demonstration of this, in her JavaOne keynote appearance in 2019,
[Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked about when
she learned to program in Java, and how her teacher performed a rap song
to help students memorize `"public static void main"`.  Our hats are off to
creative educators everywhere for this kind of dedication, but teachers
shouldn't have to do this.

Of course, advanced programmers complain about ceremony too.  We will never be
able to satisfy programmers' insatiable appetite for typing fewer keystrokes,
and we shouldn't try, because the goal of programming is to write programs that
are easy to read and are clearly correct, not programs that were easy to type.
But we can try to keep the ceremony commensurate with the value it
brings to a program -- and let simple programs be expressed more simply.

## Concept overload

The classic "Hello World" program looks like this in Java:

```
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
```

It may only be five lines, but those lines are packed with concepts that are
challenging to absorb without already having some programming experience and
familiarity with object orientation. Let's break down the concepts a student
confronts when writing their first Java program:

  - **public** (on the class).  The `public` accessibility level is relevant
    only when there is going to be cross-package access; in a simple "Hello
    World" program, there is only one class, which lives in the unnamed package.
    They haven't even written a one-line program yet; the notion of access
    control -- keeping parts of a program from accessing other parts of it -- is
    still way in their future.

  - **class**.  Our student hasn't set out to write a _class_, or model a
    complex system with objects; they want to write a _program_.  In Java, a
    program is just a `main` method in some class, but at this point our student
    still has no idea what a class is or why they want one.

  - **Methods**.  Methods are of course a key concept in Java, but the mechanics
    of methods -- parameters, return types, and invocation -- are still
    unfamiliar, and the `main` method is invoked magically from the `java`
    launcher rather than from explicit code.

  - **public** (again).  Like the class, the `main` method has to be public, but
    again this is only relevant when programs are large enough to require
    packages to organize them.

  - **static**.  The `main` method has to be static, and at this point, students
    have no context for understanding what a static method is or why they want
    one.  Worse, the early exposure to `static` methods will turn out to be a
    bad habit that must be later unlearned.  Worse still, the fact that the
    `main` method is `static` creates a seam between `main` and other methods;
    either they must become `static` too, or the `main` method must trampoline
    to some sort of "instance main" (more ceremony!)  And if we get this wrong,
    we get the dreaded and mystifying `"cannot be referenced from a static
    context"` error.

  - **main**.  The name `main` has special meaning in a Java program, indicating
    the starting point of a program, but this specialness hides behind being an
    ordinary method name.  This may contribute to the sense of "so many magic
    incantations."

  - **String[]**.  The parameter to `main` is an array of strings, which are the
    arguments that the `java` launcher collected from the command line.  But our
    first program -- likely our first dozen -- will not use command-line
    parameters. Requiring the `String[]` parameter is, at this point, a mistake
    waiting to happen, and it will be a long time until this parameter makes
    sense.  Worse, educators may be tempted to explain arrays at this point,
    which further increases the time-to-first-program.

  - **System.out.println**.  If you look closely at this incantation, each
    element in the chain is a different thing -- `System` is a class (what's a
    class again?), `out` is a static field (what's a field?), and `println` is
    an instance method.  The only part the student cares about right now is
    `println`; the rest of it is an incantation that they do not yet understand
    in order to get at the behavior they want.

That's a lot to explain to a student on the first day of class.  There's a good
chance that by now, class is over and we haven't written any programs yet, or
the teacher has said "don't worry what this means, you'll understand it later"
six or eight times.  Not only is this a lot of _syntactic_ things to absorb, but
each of those things appeals to a different concept (class, method, package,
return value, parameter, array, static, public, etc) that the student doesn't
have a framework for understanding yet.  Each of these will have an important
role to play in larger programs, but so far, they only contribute to "wow,
programming is complicated."

It won't be practical (or even desirable) to get _all_ of these concepts out of
the student's face on day 1, but we can do a lot -- and focus on the ones that
do the most to help beginners understand how programs are constructed.

## Goal: a smooth on-ramp

As much as programmers like to rant about ceremony, the real goal here is not
mere ceremony reduction, but providing a graceful _on ramp_ to Java programming.
This on-ramp should be helpful to beginning programmers by requiring only those
concepts that a simple program needs.

Not only should an on-ramp have a gradual slope and offer enough acceleration
distance to get onto the highway at the right speed, but its direction must
align with that of the highway.  When a programmer is ready to learn about more
advanced concepts, they should not have to discard what they've already learned,
but instead easily see how the simple programs they've already written
generalize to more complicated ones, and both the syntactic and conceptual
transformation from "simple" to "full blown" program should be straightforward
and unintrusive.  It is a definite non-goal to create a "simplified dialect of
Java for students".

We identify three simplifications that should aid both educators and students in
navigating the on-ramp to Java, and that are generally useful to simple
programs beyond the classroom as well:

 - A more tolerant launch protocol
 - Unnamed classes
 - Predefined static imports for the most critical methods and fields

## A more tolerant launch protocol

The Java Language Specification has relatively little to say about how Java
"programs" get launched, other than saying that there is some way to indicate
which class is the initial class of a program (JLS 12.1.1) and that a public
static method called `main` whose sole argument is of type `String[]` and whose
return is `void` constitutes the entry point of the indicated class.

We can eliminate much of the concept overload simply by relaxing the
interactions between a Java program and the `java` launcher:

 - Relax the requirement that the class, and `main` method, be public.  Public
   accessibility is only relevant when access crosses packages; simple programs
   live in the unnamed package, so cannot be accessed from any other package
   anyway.  For a program whose main class is in the unnamed package, we can
   drop the requirement that the class or its `main` method be public,
   effectively treating the `java` launcher as if it too resided in the unnamed
   package.

 - Make the "args" parameter to `main` optional, by allowing the `java` launcher to
   first look for a main method with the traditional `main(String[])`
   signature, and then (if not found) for a main method with no arguments.

 - Make the `static` modifier on `main` optional, by allowing the `java` launcher to
   invoke an instance `main` method (of either signature) by instantiating an
   instance using an accessible no-arg constructor and then invoking the `main`
   method on it.

This small set of changes to the launch protocol strikes out five of the bullet
points in the above list of concepts: public (twice), static, method parameters,
and `String[]`.
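
The three relaxations can be read as a lookup procedure.  The following is a
minimal sketch using reflection, not the actual `java` launcher; the class name
`LaunchSketch` and its `launch` method are hypothetical, and the sketch only
considers methods declared directly in the main class (it ignores inherited
`main` methods).

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper illustrating the relaxed lookup order.
final class LaunchSketch {
    static void launch(Class<?> mainClass, String[] args) {
        try {
            Method main;
            Object[] callArgs;
            try {
                // First choice: the traditional main(String[]) signature.
                main = mainClass.getDeclaredMethod("main", String[].class);
                callArgs = new Object[] { args };
            } catch (NoSuchMethodException e) {
                // Fallback: a main method with no arguments.
                main = mainClass.getDeclaredMethod("main");
                callArgs = new Object[0];
            }
            main.setAccessible(true); // main need not be public

            Object receiver = null;   // stays null for a static main
            if (!Modifier.isStatic(main.getModifiers())) {
                // Instance main: create the receiver with the no-arg constructor.
                Constructor<?> ctor = mainClass.getDeclaredConstructor();
                ctor.setAccessible(true);
                receiver = ctor.newInstance();
            }
            main.invoke(receiver, callArgs);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot launch " + mainClass.getName(), e);
        }
    }
}

// A non-public class with a non-public, non-static, no-arg main.
class HelloWorld {
    void main() {
        System.out.println("Hello World");
    }
}
```

Calling `LaunchSketch.launch(HelloWorld.class, new String[0])` takes the
fallback path: no `main(String[])` is declared, so the no-arg instance `main`
is invoked on a freshly constructed `HelloWorld`.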

At this point, our Hello World program is now:

```
class HelloWorld {
    void main() {
        System.out.println("Hello World");
    }
}
```

It's not any shorter by line count, but we've removed a lot of "horizontal
noise" along with a number of concepts.  Students and educators will appreciate
it, but advanced programmers are unlikely to be in any hurry to make these
implicit elements explicit either.

Additionally, the notion of an "instance main" has value well beyond the first
day.  Because excessive use of `static` is considered a code smell, many
educators encourage the pattern of "all the static `main` method does is
instantiate an instance and call an instance `main` method" anyway.  Formalizing
the "instance main" protocol reduces a layer of boilerplate in these cases, and
defers the point at which we have to explain what instance creation is -- and
what `static` is.  (Further, allowing the `main` method to be an instance method
means that it could be inherited from a superclass, which is useful for simple
frameworks such as test runners or service frameworks.)

## Unnamed classes

In a simple program, the `class` declaration often doesn't help either, because
other classes (if there are any) are not going to reference it by name, and we
don't extend a superclass or implement any interfaces.  If we say an "unnamed
class" consists of member declarations without a class header, then our Hello
World program becomes:

```
void main() {
    System.out.println("Hello World");
}
```

Such source files can still have fields, methods, and even nested classes, so
that as a program evolves from a few statements to needing some ancillary state
or helper methods, these can be factored out of the `main` method while still
not yet requiring a full class declaration:

```
String greeting() { return "Hello World"; }

void main() {
    System.out.println(greeting());
}
```

This is where treating `main` as an instance method really shines; the user has
just declared two methods, and they can freely call each other.  Students need
not confront the confusing distinction between instance and static methods yet;
indeed, if not forced to confront static members on day 1, it might be a while
before they do have to learn this distinction.  The fact that there is a
receiver lurking in the background will come in handy later, but right now is
not bothering anybody.

[JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be
launched directly without compilation; this streamlined launcher pairs well with
unnamed classes.

## Predefined static imports

The most important classes, such as `String` and `Integer`, live in the
`java.lang` package, which is automatically on-demand imported into all
compilation units; this is why we do not have to `import java.lang.String` in
every class.  Static imports were not added until Java 5, but no corresponding
facility for automatic on-demand import of common behavior was added at that
time.  Most programs, however, will want to do console IO, and Java forces us to
do this in a roundabout way -- through the static `System.out` and `System.in`
fields.  Basic console input and output is a reasonable candidate for
auto-static import, as one or both are needed by most simple programs.  While
these are currently instance methods accessed through static fields, we can
easily create static methods for `println` and `readln` which are suitable for
static import, and automatically import them.  At which point our first program
is now down to:

```
void main() {
    println("Hello World");
}
```
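
As a rough sketch of what "static methods for `println` and `readln`" could look
like: the holder class `SimpleIO` below is a made-up name, since this note does
not say where such methods would live, and the exact signatures are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;

// Hypothetical holder for the console methods that would be auto-imported.
final class SimpleIO {
    // One shared reader, so buffered input is not lost between calls.
    private static final BufferedReader IN =
            new BufferedReader(new InputStreamReader(System.in));

    private SimpleIO() {}

    // Static wrapper around the instance method System.out.println.
    static void println(Object value) {
        System.out.println(value);
    }

    // Reads one line from standard input; returns null at end of input.
    static String readln() {
        try {
            return IN.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class Hello {
    void main() {
        // With an automatic static import, the "SimpleIO." qualifier would be dropped.
        SimpleIO.println("Hello World");
    }
}
```

The qualifier is written out here only because today's Java has no automatic
static import; adding it back is the same kind of manual reversal as restoring
an elided modifier.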

## Putting this all together

We've discussed several simplifications:

 - Update the launcher protocol to make public, static, and arguments optional
   for main methods, and for main methods to be instance methods (when a
   no-argument constructor is available);
 - Make the class wrapper for "main classes" optional (unnamed classes);
 - Automatically static import methods like `println`

which together whittle our long list of day-1 concepts down considerably.  While
this is still not as minimal as the minimal Python or Ruby program -- statements
must still live in a method -- the goal here is not to win at "code golf".  The
goal is to ensure that concepts not needed by simple programs need not appear in
those programs, while at the same time not encouraging habits that have to be
unlearned as programs scale up.

Each of these simplifications is individually small and unintrusive, and each is
independent of the others.  And each embodies a simple transformation that the
author can easily manually reverse when it makes sense to do so: elided
modifiers and `main` arguments can be added back, the class wrapper can be added
back when the affordances of classes are needed (supertypes, constructors), and
the full qualifier of static-import can be added back.  And these reversals are
independent of one another; they can be done in any combination or any order.

This seems to meet the requirements of our on-ramp; we've eliminated most of the
day-1 ceremony elements without introducing new concepts that need to be
unlearned. The remaining concepts -- a method is a container for statements, and
a program is a Java source file with a `main` method -- are easily understood in
relation to their fully specified counterparts.

## Alternatives

Obviously, we've lived with the status quo for 25+ years, so we could continue
to do so.  There were other alternatives explored as well; ultimately, each of
these fell afoul of one of our goals.

### Can't we go further?

Fans of "code golf" -- of which there are many -- are surely right now trying to
figure out how to eliminate the last little bit, the `main` method, and allow
statements to exist at the top-level of a program.  We deliberately stopped
short of this because it offers little value beyond the first few minutes, and
even that small value quickly becomes something that needs to be unlearned.

The fundamental problem behind allowing such "loose" statements is that
variables can be declared inside both classes (fields) and methods (local
variables), and they share the same syntactic production but not the same
semantics.  So it is unclear (to both compilers and humans) whether a "loose"
variable would be a local or a field.  If we tried to adopt some sort of simple
heuristic to collapse this ambiguity (e.g., whether it precedes or follows the
first statement), that may satisfy the compiler, but now simple refactorings
might subtly change the meaning of the program, and we'd be replacing the
explicit syntactic overhead of `void main()` with an invisible "line" in the
program that subtly affects semantics, and a new subtle rule about the meaning
of variable declarations that applies only to unnamed classes.  This doesn't
help students, nor is this particularly helpful for all but the most trivial
programs.  It quickly becomes a crutch to be discarded and unlearned, which
falls afoul of our "on ramp" goals.  Of all the concepts on our list, "methods"
and "a program is specified by a main method" seem the ones that are most worth
asking students to learn early.

### Why not "just" use `jshell`?

While JShell is a great interactive tool, leaning too heavily on it as an onramp
would fall afoul of our goals.  A JShell session is not a program, but a
sequence of code snippets.  When we type declarations into `jshell`, they are
viewed as implicitly static members of some unspecified class, with
accessibility ignored completely, and statements execute in a context where
all previous declarations are in scope.  This is convenient for experimentation
-- the primary goal of `jshell` -- but not such a great mental model for
learning to write Java programs.  Transforming a batch of working declarations
in `jshell` to a real Java program would not be sufficiently simple or
unintrusive, and would lead to a non-idiomatic style of code, because the
straightforward translation would have us redeclaring each method, class, and
variable declaration as `static`.  Further, this is probably not the direction
we want to go when we scale up from a handful of statements and declarations to
a simple class -- we probably want to start using classes as classes, not just
as containers for static members. JShell is a great tool for exploration and
debugging, and we expect many educators will continue to incorporate it into
their curriculum, but it is not the on-ramp programming model we are looking for.

### What about "always local"?

One of the main tensions that `main` introduces is that most class members are
not `static`, but the `main` method is -- and that forces programmers to
confront the seam between static and non-static members.  JShell answers this
with "make everything static".

Another approach would be to "make everything local" -- treat a simple program
as being the "unwrapped" body of an implicit main method.  We already allow
variables and classes to be declared local to a method.  We could add local
methods (a useful feature in its own right) and relax some of the asymmetries
around nesting (again, an attractive cleanup), and then treat a mix of
declarations and statements without a class wrapper as the body of an invisible
`main` method. This seems an attractive model as well -- at first.

While the syntactic overhead of converting back to full-blown classes -- wrap
the whole thing in a `main` method and a `class` declaration -- is far less
intrusive than the transformation inherent in `jshell`, this is still not an
ideal on-ramp.  Local variables interact with local classes (and methods, when
we have them) in a very different way than instance fields do with instance
methods and inner classes: their scopes are different (no forward references),
their initialization rules are different, and captured local variables must be
effectively final.  This is a subtly different programming model that would then
have to be unlearned when scaling up to full classes. Further, the result of
this wrapping -- where everything is local to the main method -- is also not
"idiomatic Java".  So while local methods may be an attractive feature, they are
similarly not the on-ramp we are looking for.
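
A small illustration of those differences in today's Java; the class names
`Counter` and `LocalDemo` are invented for the example.

```java
import java.util.function.IntSupplier;

class Counter {
    // An instance method may refer to a field declared later in the class.
    int next() {
        return ++count;
    }

    // Fields get a default value (0 here) and can be reassigned freely.
    int count;
}

class LocalDemo {
    static int run() {
        // A local must be assigned before use, and must be declared before
        // any code that mentions it.
        int base = 10;

        // The lambda may capture base only because base is effectively final.
        IntSupplier plusOne = () -> base + 1;

        // base++;  // uncommenting this makes the capture above a compile error

        return plusOne.getAsInt();
    }
}
```

A student who had learned the "everything is local" rules would have to
relearn each of these points on moving `base` into a field.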



From kevinb at google.com  Thu Sep 29 04:12:58 2022
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 28 Sep 2022 21:12:58 -0700
Subject: Paving the on-ramp
In-Reply-To: <2e779dc3-8a16-8d89-1e5d-6b14b04998d3@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBku=CdNABAnQbSwOTeA2CSW39VFJ1ovnDAfhYvMwyxgqeQ@mail.gmail.com>
 <2e779dc3-8a16-8d89-1e5d-6b14b04998d3@oracle.com>
Message-ID: <CAGKkBkvGztuAamc6K+YsST90vbkJYD3B_YS=bcGT3yavCeHFiA@mail.gmail.com>

Meta-comment: I think you have the right *motivating* use cases (beginners,
small/temporary programs), but I expect pretty much *any* main method to
want to use this, and I don't see why it shouldn't. That makes those use
cases important and worth reasonable attempts at accommodation, regardless
of whether we'd even be doing this for their sake alone.


On Wed, Sep 28, 2022 at 6:36 PM Brian Goetz <brian.goetz at oracle.com> wrote:

>
>
> A major design goal of yours seems clear: to get there without rendering
> Java source files explicitly bimorphic ("class" source files all look like
> this, "main" source files all look like that). Instead you have a set of
> independent features that can compose to get you there in a "smooth ramp".
> The design looks heavily influenced by that goal.
>
>
> Yes, I sometimes call this "telescoping", because there's a chain of "x is
> short for y is short for z".  For example, with lambdas:
>
>     x -> e
>
> is-short-for
>
>     (x) -> e
>
> is-short-for
>
>     (var x) -> e
>
> is-short-for
>
>     (int x) -> e  // or whatever the arg is
>
> As a design convention, it enables a mental model where there is really
> just one form, with varying things you could leave out.  Early in the
> Lambda days, we saw articles like "there are N forms of lambda
> expressions", and that stuff infuriates me, it is as if people go out of
> their way to find more complex mental models than necessary.
>


Right, and that design was good inasmuch as there were good use cases for
every one of the rungs (and there were).


> As my program grows and gets more complex, I will make changes like
>
> * use more other libraries
> * add args to main()
> * add helper methods
> * add constants
> * create new classes and use them from here
>
> But: when and why would I be motivated to change *this* code *itself* to
> "become" a class, become instantiable, acquire instance state, etc. etc.? I
> don't imagine ever having that urge. main() is just main()! It's just a way
> in. Isn't it literally just a way to (a) transfer control back and forth
> and (b) hand me args?
>
> This doesn't seem like such a leap to me.  You might start out hardcoding
> a file path that will be read.  Then you might decide to let that be passed
> in (so you add the args parameter to main).  Then you might want to treat
> the filename to be read as a field so it can be shared across methods, so
> you turn it into a constructor parameter.
>

So far so good up to that last part. A constructor parameter? I thought you
were going to say you just add the field and all your non-static methods
read and write it at will. Getting a bit lost in the twists n' folds.


>   One could imagine "introduce X" refactorings to do all of these.  The
> process of hardcoding to main() parameter to constructor argument is a
> natural sedimentation of things finding their right level.  (And even if
> you don't do all of this, knowing that it's an ordinary class (like an enum
> or a record) just with a concise syntax means you don't have to learn new
> concepts.  I don't want Foo classes and Bar classes.)
>
>
> Note I was only reacting to "static bad!" here. I would be happy if *that*
> argument were dropped, but you do still have another valid argument: that
> `static` is another backward default, and the viral burden of putting it
> not just on main() but every helper method you factor out is pure nuisance.
> (I'd suggest mentioning the viral nature of this particular burden
> higher/more prominently in the doc, as it's currently out of place under
> the "unnamed classes" section.)
>
> (That doesn't mean "so let's do it"; I still hope to see that benefit
> carefully measured against the drawbacks. Btw, *some* of those drawbacks
> might be eased by disallowing an explicit constructor... and jeez, please
> disallow type parameters too... I'm leaving the exact meaning of "disallow"
> undefined here.)
>
>
> Indeed, I intend that there are no explicit constructors or instance
> initializers here.  (There can't be constructors, because the class is
> unnamed!)
>

Hmm, I was under the impression I could drop all my `static`s while keeping
the class signature if I wanted? But, if I can and even then explicit
constrs and initers are banned, then indeed, at least one of my drawbacks
is invalid. I don't think it undercuts my overall case that much.


> I think I said somewhere "such classes can contain ..." and didn't list
> constructors, but I should have been more explicit.
>
>
> ## Unnamed classes
>>
>> In a simple program, the `class` declaration often doesn't help either,
>> because
>> other classes (if there are any) are not going to reference it by name,
>> and we
>> don't extend a superclass or implement any interfaces.
>>
>
> How do I tell `java` which class file to load and call main() on? Class
> name based on file name, I guess?
>
>
> Sadly yes.  More sad stories coming on this front, Jim can tell.
>
>
> If we say an "unnamed
>> class" consists of member declarations without a class header, then our
>> Hello
>> World program becomes:
>>
>> ```
>> void main() {
>>     System.out.println("Hello World");
>> }
>> ```
>>
>
>
> One or more class annotations could appear below package/imports?
>
>
> No package statement (unnamed classes live in the unnamed package), but
> imports are OK.
>

I'm confused; what does any of this have to do with package location? Isn't
that orthogonal to everything we're discussing?

I'm also not sure why we're talking about "unnamed" so much; the condition
we're talking about is really "signatureless" or "body-only", which as far
as I know could be a purely source-level distinction, producing a
completely normal-looking, named class in a classfile. (Wrote that before
Guy's message came in, even.)


> No class annotations.  No type variables.  No superclasses.
>

Okay, in the motivating use cases (beginner/small/temp), yeah, they're
awfully unlikely to want class annotations.

But again, to me every main() method out there is a use case too. And
plenty of class annotations are used for purposes that aren't about
*classes*, just "this whole range of code here". It feels like they should
be allowed, unless you want to talk about `ElementType.COMPILATION_UNIT`
... :-)

>> Such source files can still have fields, methods, and even nested classes,
>>
>
> Do those get compiled to real nested classes, nested inside an unnamed
> class? So if I edit a "regular" `Foo.java` file, go down below the last `}`
> and add a `main` function there, does that cause the whole `Foo` class
> above to be reinterpreted as "nested inside an unnamed class" instead of
> top-level?
>
>
> To be discussed!
>
> This is my notion of a natural progression:
>
> 1. Write procedural code: calling static methods, using existing data
> types, soon calling their instance methods
> 2. Proceed to creating your own types (from simple data types onward) and
> using them too
> 3. One day learn that your main() function is actually a method of an
> instantiable type too... at pub trivia night, then promptly forget it
>
>
> Right.
>
>
>
>

-- 
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From amaembo at gmail.com  Thu Sep 29 07:07:44 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 29 Sep 2022 09:07:44 +0200
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <CAE+3fjYHK7gtXiK1yFZFCEdb1FOKMSkEiiExuYKJLH--q=9qMA@mail.gmail.com>

Hello!

Very interesting writing, thanks! A couple of notes from me:

> ## Unnamed classes
> ...
> Such source files can still have fields, methods, and even nested classes, so
> that as a program evolves from a few statements to needing some ancillary state
> or helper methods, these can be factored out of the `main` method while still

I wonder how we tell apart unnamed class syntax and normal class
syntax. E.g., consider the source file:

// Hello.java
public class Hello {
  // tons of logic
}

void main() {
}

Will it be considered as a correct Java file, having Hello class as a
nested class of top-level unnamed class?
If yes, then, adding a main method after the class declaration, I
change the class semantics, making it an inner class.
This looks like action at a distance and may cause confusion. E.g., I
just wrote a main() method outside of Hello class instead of inside,
and boom,
now Hello is not resolvable from other classes, for no apparent reason.

I assume that the main() method is required for an unnamed class, and
if there are only other top-level declarations,
then it should be a compilation error, right?

> ## Predefined static imports
> ```
> void main() {
>     println("Hello World");
> }
> ```

I wonder how it will play with existing static star imports. We
already saw problems when updating to Java 9 or Java 14, where a
star-imported class named Module or Record became unresolvable. If
existing code already imports a static method named println from
somewhere, will this code become invalid?

With best regards,
Tagir Valeev.

From forax at univ-mlv.fr  Thu Sep 29 07:39:45 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 29 Sep 2022 09:39:45 +0200 (CEST)
Subject: Paving the on-ramp
In-Reply-To: <CAE+3fjYHK7gtXiK1yFZFCEdb1FOKMSkEiiExuYKJLH--q=9qMA@mail.gmail.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAE+3fjYHK7gtXiK1yFZFCEdb1FOKMSkEiiExuYKJLH--q=9qMA@mail.gmail.com>
Message-ID: <910090311.15460599.1664437163375.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Tagir Valeev" <amaembo at gmail.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Thursday, September 29, 2022 9:07:44 AM
> Subject: Re: Paving the on-ramp

> Hello!
> 
> Very interesting writing, thanks! A couple of notes from me:
> 
>> ## Unnamed classes
>> ...
>> Such source files can still have fields, methods, and even nested classes, so
>> that as a program evolves from a few statements to needing some ancillary state
>> or helper methods, these can be factored out of the `main` method while still
> 
> I wonder how we tell apart unnamed class syntax and normal class
> syntax. E.g., consider the source file:
> 
> // Hello.java
> public class Hello {
>  // tons of logic
> }
> 
> void main() {
> }
> 
> Will it be considered as a correct Java file, having Hello class as a
> nested class of top-level unnamed class?
> If yes, then, adding a main method after the class declaration, I
> change the class semantics, making it an inner class.
> This looks like action at a distance and may cause confusion. E.g., I
> just wrote a main() method outside of Hello class instead of inside,
> and boom,
> now Hello is not resolvable from other classes, for no apparent reason.

There are several ways to try to tame that issue:
- we can restrict unnamed classes to only work when run via java Hello.java, so no Hello.class is generated at compile time and there is no problem with Hello being resolvable.
- we can disallow an unnamed class from containing a nested class with the same name as the unnamed class; the error message will still be hard to decipher for beginners.
- we can disallow nested classes in unnamed classes, but that's a bummer because being able to write records inside an unnamed class is a great combo.

> 
> I assume that the main() method is required for an unnamed class, and
> if there are only other top-level declarations,
> then it should be a compilation error, right ?

I do not think you can, because having a file named Foo.java containing only a non-public class Bar is currently legal in Java.

> 
>> ## Predefined static imports
>> ```
>> void main() {
>>     println("Hello World");
>> }
>> ```
> 
> I wonder how it will play with existing static star imports. We
> already saw problems when updating to Java 9 or Java 14, where a
> star-imported class named Module or Record became unresolvable. If
> existing code already imports a static method named println from
> somewhere, will this code become invalid?

Yes, I've asked Brian the same question.
We need the predefined static imports to be resolved after the classical static imports are resolved.

BTW, there is a connection with the templated string spec here, because STR and FMT also need to be predefined static imports.

> 
> With best regards,
> Tagir Valeev.

regards,
Rémi

From forax at univ-mlv.fr  Thu Sep 29 08:01:58 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 29 Sep 2022 10:01:58 +0200 (CEST)
Subject: Paving the on-ramp
In-Reply-To: <06823323-6214-438A-80A7-184310F01C55@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <06823323-6214-438A-80A7-184310F01C55@oracle.com>
Message-ID: <755558451.15481407.1664438518341.JavaMail.zimbra@u-pem.fr>

> From: "Guy Steele" <guy.steele at oracle.com>
> To: "Brian Goetz" <brian.goetz at oracle.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Thursday, September 29, 2022 5:41:24 AM
> Subject: Re: Paving the on-ramp

> This is headed in the right direction, but I worry about the use of dei ex
> machina that have the property that they are NOT easily explained in terms of
> something the user could have written. Perhaps these could all be dispensed
> with by using an alternate strategy of code rewriting, (at least as an
> explanation, if not also as an implementation mechanism).

> (1) Instead of having a magic "unnamed" class, which has bizarre properties such
> as not having a constructor (or at least not a constructor you can mention in a
> `new` expression), only to then require a second magic rule about what you put
> in the command line "java ...", why not simply use the much more obvious rule
> that if a compilation unit doesn't have a class header, then a class header is
> _supplied_ by the compiler, and the name of the class is taken from the
> filename of the compilation unit?

You can have both: an unnamed class is syntactic sugar, but we do not want to allow puzzling combinations.
Declaring a constructor in Java uses the class name, but it's not clear to me that we should allow the class name of an unnamed class to be denotable; it seems too magical to me.

> (2) Instead of complicating the Java launch protocol, why not leave it alone,
> and instead use the existing mechanism of "in situation X, if the user fails to
> provide method Y, the compiler will provide a definition automatically"?
> Specifically, in a compilation unit named Foo.java for which a class header has
> to be provided automatically, if a method with signature "main()" is present
> but no static method with signature "main(String[])" is present, then a static
> method with signature "main(String[])" is automatically provided by the
> compiler.

> (2a) If the method with signature "main()" is static, the provided method is

> public static void main(String[] args) { main(); }

> (2b) If the method with signature "main()" is not static, the provided method is

> public static void main(String[] args) { new Foo().main(); }

> Notice that this mechanism also automatically makes the keyword "public"
> optional on the declaration of "main()".

I agree on that, it also goes well with the warning Kevin was mentioning. 
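
To make rewrite (2b) concrete, here is a hand-written sketch of what Foo.java would be equivalent to under that proposal. The `greeting` helper is hypothetical, and the bridging `main(String[])` is only what the proposal describes the compiler supplying, not something any current compiler emits.

```java
// Foo.java -- hand-written equivalent of the proposed rewrite (2b).
class Foo {
    // Hypothetical helper; an instance method, so no `static` is needed.
    String greeting() {
        return "Hello World";
    }

    // What the user writes: an instance main with no parameters.
    void main() {
        System.out.println(greeting());
    }

    // What the compiler would supply under the proposal: the traditional
    // entry point, which instantiates Foo and delegates to the instance main.
    public static void main(String[] args) {
        new Foo().main();
    }
}
```

Because the bridge is ordinary Java, a programmer can later write it by hand, which is the property the rewriting approach is after.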

> (3) Instead of speaking of automatic imports, speak of the compiler
> automatically providing certain import statements if the compilation unit
> doesn't have a class header.

I disagree about this one: if we can write println() inside an unnamed class, we should be able to write println() inside a classical class.
You can disallow things inside an unnamed class, but you cannot have a behavior that works in an unnamed class and does not work in a classical class; otherwise an unnamed class is not really Java, it's a kid's sandbox.

> That way _everything_ (the name of class when a class header is not provided,
> the behavior when you write variously abbreviated definitions of method `main`,
> and the automatic importation of certain libraries) can be explained in terms
> of source-code rewrites that the programmer can do once the programmer learns
> enough about more advanced features.

> --Guy

Rémi

>> On Sep 28, 2022, at 1:57 PM, Brian Goetz <brian.goetz at oracle.com> wrote:

>> At various points, we've explored the question of which program elements are
>> most and least helpful for students first learning Java. After considering a
>> number of alternatives over the years, I have a simple proposal for smoothing
>> the "on ramp" to Java programming, while not creating new things to unlearn.

>> Markdown source is below, HTML will appear soon at:

>> https://openjdk.org/projects/amber/design-notes/on-ramp

>> # Paving the on-ramp

>> Java is one of the most widely taught programming languages in the world. Tens
>> of thousands of educators find that the imperative core of the language combined
>> with a straightforward standard library is a foundation that students can
>> comfortably learn on. Choosing Java gives educators many degrees of freedom:
>> they can situate students in `jshell` or Notepad or a full-fledged IDE; they can
>> teach imperative, object-oriented, functional, or hybrid programming styles; and
>> they can easily find libraries to interact with external data and services.

>> No language is perfect, and one of the most common complaints about Java is that
>> it is "too verbose" or has "too much ceremony." And unfortunately, Java imposes
>> its heaviest ceremony on those first learning the language, who need and
>> appreciate it the least. The declaration of a class and the incantation of
>> `public static void main` is pure mystery to a beginning programmer. While
>> these incantations have principled origins and serve a useful organizing purpose
>> in larger programs, they have the effect of placing obstacles in the path of
>> _becoming_ Java programmers. Educators constantly remind us of the litany of
>> complexity that students have to confront on Day 1 of class -- when they really
>> just want to write their first program.

>> As an amusing demonstration of this, in her JavaOne keynote appearance in 2019,
>> [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked about when
>> she learned to program in Java, and how her teacher performed a rap song
>> to help students memorize `"public static void main"`. Our hats are off to
>> creative educators everywhere for this kind of dedication, but teachers
>> shouldn't have to do this.

>> Of course, advanced programmers complain about ceremony too. We will never be
>> able to satisfy programmers' insatiable appetite for typing fewer keystrokes,
>> and we shouldn't try, because the goal of programming is to write programs that
>> are easy to read and are clearly correct, not programs that were easy to type.
>> But we can try to better align the ceremony with the value it
>> brings to a program -- and let simple programs be expressed more simply.

>> ## Concept overload

>> The classic "Hello World" program looks like this in Java:

>> ```
>> public class HelloWorld {
>>     public static void main(String[] args) {
>>         System.out.println("Hello World");
>>     }
>> }
>> ```

>> It may only be five lines, but those lines are packed with concepts that are
>> challenging to absorb without already having some programming experience and
>> familiarity with object orientation. Let's break down the concepts a student
>> confronts when writing their first Java program:

>> - **public** (on the class). The `public` accessibility level is relevant
>> only when there is going to be cross-package access; in a simple "Hello
>> World" program, there is only one class, which lives in the unnamed package.
>> They haven't even written a one-line program yet; the notion of access
>> control -- keeping parts of a program from accessing other parts of it -- is
>> still way in their future.

>> - **class**. Our student hasn't set out to write a _class_, or model a
>> complex system with objects; they want to write a _program_. In Java, a
>> program is just a `main` method in some class, but at this point our student
>> still has no idea what a class is or why they want one.

>> - **Methods**. Methods are of course a key concept in Java, but the mechanics
>> of methods -- parameters, return types, and invocation -- are still
>> unfamiliar, and the `main` method is invoked magically from the `java`
>> launcher rather than from explicit code.

>> - **public** (again). Like the class, the `main` method has to be public, but
>> again this is only relevant when programs are large enough to require
>> packages to organize them.

>> - **static**. The `main` method has to be static, and at this point, students
>> have no context for understanding what a static method is or why they want
>> one. Worse, the early exposure to `static` methods will turn out to be a
>> bad habit that must be later unlearned. Worse still, the fact that the
>> `main` method is `static` creates a seam between `main` and other methods;
>> either they must become `static` too, or the `main` method must trampoline
>> to some sort of "instance main" (more ceremony!) And if we get this wrong,
>> we get the dreaded and mystifying `"cannot be referenced from a static
>> context"` error.

>> - **main**. The name `main` has special meaning in a Java program, indicating
>> the starting point of a program, but this specialness hides behind being an
>> ordinary method name. This may contribute to the sense of "so many magic
>> incantations."

>> - **String[]**. The parameter to `main` is an array of strings, which are the
>> arguments that the `java` launcher collected from the command line. But our
>> first program -- likely our first dozen -- will not use command-line
>> parameters. Requiring the `String[]` parameter is, at this point, a mistake
>> waiting to happen, and it will be a long time until this parameter makes
>> sense. Worse, educators may be tempted to explain arrays at this point,
>> which further increases the time-to-first-program.

>> - **System.out.println**. If you look closely at this incantation, each
>> element in the chain is a different thing -- `System` is a class (what's a
>> class again?), `out` is a static field (what's a field?), and `println` is
>> an instance method. The only part the student cares about right now is
>> `println`; the rest of it is an incantation that they do not yet understand
>> in order to get at the behavior they want.

>> That's a lot to explain to a student on the first day of class. There's a good
>> chance that by now, class is over and we haven't written any programs yet, or
>> the teacher has said "don't worry what this means, you'll understand it later"
>> six or eight times. Not only is this a lot of _syntactic_ things to absorb, but
>> each of those things appeals to a different concept (class, method, package,
>> return value, parameter, array, static, public, etc) that the student doesn't
>> have a framework for understanding yet. Each of these will have an important
>> role to play in larger programs, but so far, they only contribute to "wow,
>> programming is complicated."

>> It won't be practical (or even desirable) to get _all_ of these concepts out of
>> the student's face on day 1, but we can do a lot -- and focus on the ones that
>> do the most to help beginners understand how programs are constructed.

>> ## Goal: a smooth on-ramp

>> As much as programmers like to rant about ceremony, the real goal here is not
>> mere ceremony reduction, but providing a graceful _on ramp_ to Java programming.
>> This on-ramp should be helpful to beginning programmers by requiring only those
>> concepts that a simple program needs.

>> Not only should an on-ramp have a gradual slope and offer enough acceleration
>> distance to get onto the highway at the right speed, but its direction must
>> align with that of the highway. When a programmer is ready to learn about more
>> advanced concepts, they should not have to discard what they've already learned,
>> but instead easily see how the simple programs they've already written
>> generalize to more complicated ones, and both the syntactic and conceptual
>> transformation from "simple" to "full blown" program should be straightforward
>> and unintrusive. It is a definite non-goal to create a "simplified dialect of
>> Java for students".

>> We identify three simplifications that should aid both educators and students in
>> navigating the on-ramp to Java, as well as being generally useful to simple
>> programs beyond the classroom:

>> - A more tolerant launch protocol
>> - Unnamed classes
>> - Predefined static imports for the most critical methods and fields

>> ## A more tolerant launch protocol

>> The Java Language Specification has relatively little to say about how Java
>> "programs" get launched, other than saying that there is some way to indicate
>> which class is the initial class of a program (JLS 12.1.1) and that a public
>> static method called `main` whose sole argument is of type `String[]` and whose
>> return is `void` constitutes the entry point of the indicated class.

>> We can eliminate much of the concept overload simply by relaxing the
>> interactions between a Java program and the `java` launcher:

>> - Relax the requirement that the class, and `main` method, be public. Public
>> accessibility is only relevant when access crosses packages; simple programs
>> live in the unnamed package, so cannot be accessed from any other package
>> anyway. For a program whose main class is in the unnamed package, we can
>> drop the requirement that the class or its `main` method be public,
>> effectively treating the `java` launcher as if it too resided in the unnamed
>> package.

>> - Make the "args" parameter to `main` optional, by allowing the `java` launcher
>> to first look for a main method with the traditional `main(String[])`
>> signature, and then (if not found) for a main method with no arguments.

>> - Make the `static` modifier on `main` optional, by allowing the `java` launcher
>> to invoke an instance `main` method (of either signature) by instantiating an
>> instance using an accessible no-arg constructor and then invoking the `main`
>> method on it.

>> This small set of changes to the launch protocol strikes out five of the bullet
>> points in the above list of concepts: public (twice), static, method parameters,
>> and `String[]`.
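
The lookup order in the three bullets above can be sketched with reflection. This is an illustration only, not the `java` launcher's actual implementation; the `LaunchSketch` class and its nested `HelloWorld` are hypothetical names, and inherited `main` methods are ignored for brevity.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Sketch only: mimics the relaxed lookup order described above.
class LaunchSketch {

    // Hypothetical program class: non-public, with an instance main().
    static class HelloWorld {
        static String lastOutput; // recorded so the sketch can be checked

        void main() {
            lastOutput = "Hello World";
        }
    }

    static void launch(Class<?> mainClass, String[] args) {
        try {
            Method main;
            boolean takesArgs;
            try {
                // 1. Prefer the traditional main(String[]) signature.
                main = mainClass.getDeclaredMethod("main", String[].class);
                takesArgs = true;
            } catch (NoSuchMethodException e) {
                // 2. Otherwise fall back to a main method with no arguments.
                main = mainClass.getDeclaredMethod("main");
                takesArgs = false;
            }
            main.setAccessible(true); // the method need not be public

            Object receiver = null;
            if (!Modifier.isStatic(main.getModifiers())) {
                // 3. Instance main: create the receiver via the no-arg constructor.
                Constructor<?> ctor = mainClass.getDeclaredConstructor();
                ctor.setAccessible(true);
                receiver = ctor.newInstance();
            }

            if (takesArgs) {
                main.invoke(receiver, (Object) args);
            } else {
                main.invoke(receiver);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable main method", e);
        }
    }

    public static void main(String[] args) {
        launch(HelloWorld.class, args);
        System.out.println(HelloWorld.lastOutput);
    }
}
```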

>> At this point, our Hello World program is now:

>> ```
>> class HelloWorld {
>>     void main() {
>>         System.out.println("Hello World");
>>     }
>> }
>> ```

>> It's not any shorter by line count, but we've removed a lot of "horizontal
>> noise" along with a number of concepts. Students and educators will appreciate
>> it, but advanced programmers are unlikely to be in any hurry to make these
>> implicit elements explicit either.

>> Additionally, the notion of an "instance main" has value well beyond the first
>> day. Because excessive use of `static` is considered a code smell, many
>> educators encourage the pattern of "all the static `main` method does is
>> instantiate an instance and call an instance `main` method" anyway. Formalizing
>> the "instance main" protocol reduces a layer of boilerplate in these cases, and
>> defers the point at which we have to explain what instance creation is -- and
>> what `static` is. (Further, allowing the `main` method to be an instance method
>> means that it could be inherited from a superclass, which is useful for simple
>> frameworks such as test runners or service frameworks.)

>> ## Unnamed classes

>> In a simple program, the `class` declaration often doesn't help either, because
>> other classes (if there are any) are not going to reference it by name, and we
>> don't extend a superclass or implement any interfaces. If we say an "unnamed
>> class" consists of member declarations without a class header, then our Hello
>> World program becomes:

>> ```
>> void main() {
>>     System.out.println("Hello World");
>> }
>> ```

>> Such source files can still have fields, methods, and even nested classes, so
>> that as a program evolves from a few statements to needing some ancillary state
>> or helper methods, these can be factored out of the `main` method while still
>> not yet requiring a full class declaration:

>> ```
>> String greeting() { return "Hello World"; }

>> void main() {
>>     System.out.println(greeting());
>> }
>> ```

>> This is where treating `main` as an instance method really shines; the user has
>> just declared two methods, and they can freely call each other. Students need
>> not confront the confusing distinction between instance and static methods yet;
>> indeed, if not forced to confront static members on day 1, it might be a while
>> before they do have to learn this distinction. The fact that there is a
>> receiver lurking in the background will come in handy later, but right now is
>> not bothering anybody.

>> [JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be
>> launched directly without compilation; this streamlined launcher pairs well with
>> unnamed classes.

>> ## Predefined static imports

>> The most important classes, such as `String` and `Integer`, live in the
>> `java.lang` package, which is automatically on-demand imported into all
>> compilation units; this is why we do not have to `import java.lang.String` in
>> every class. Static imports were not added until Java 5, but no corresponding
>> facility for automatic on-demand import of common behavior was added at that
>> time. Most programs, however, will want to do console IO, and Java forces us to
>> do this in a roundabout way -- through the static `System.out` and `System.in`
>> fields. Basic console input and output is a reasonable candidate for
>> auto-static import, as one or both are needed by most simple programs. While
>> these are currently instance methods accessed through static fields, we can
>> easily create static methods for `println` and `readln` which are suitable for
>> static import, and automatically import them. At which point our first program
>> is now down to:

>> ```
>> void main() {
>>     println("Hello World");
>> }
>> ```

>> ## Putting this all together

>> We've discussed several simplifications:

>> - Update the launcher protocol to make public, static, and arguments optional
>> for main methods, and for main methods to be instance methods (when a
>> no-argument constructor is available);
>> - Make the class wrapper for "main classes" optional (unnamed classes);
>> - Automatically static import methods like `println`

>> which together whittle our long list of day-1 concepts down considerably. While
>> this is still not as minimal as the minimal Python or Ruby program -- statements
>> must still live in a method -- the goal here is not to win at "code golf". The
>> goal is to ensure that concepts not needed by simple programs need not appear in
>> those programs, while at the same time not encouraging habits that have to be
>> unlearned as programs scale up.

>> Each of these simplifications is individually small and unintrusive, and each is
>> independent of the others. And each embodies a simple transformation that the
>> author can easily manually reverse when it makes sense to do so: elided
>> modifiers and `main` arguments can be added back, the class wrapper can be added
>> back when the affordances of classes are needed (supertypes, constructors), and
>> the full qualifier of static-import can be added back. And these reversals are
>> independent of one another; they can be done in any combination or any order.

>> This seems to meet the requirements of our on-ramp; we've eliminated most of the
>> day-1 ceremony elements without introducing new concepts that need to be
>> unlearned. The remaining concepts -- a method is a container for statements, and
>> a program is a Java source file with a `main` method -- are easily understood in
>> relation to their fully specified counterparts.

>> ## Alternatives

>> Obviously, we've lived with the status quo for 25+ years, so we could continue
>> to do so. There were other alternatives explored as well; ultimately, each of
>> these fell afoul of one of our goals.

>> ### Can't we go further?

>> Fans of "code golf" -- of which there are many -- are surely right now trying to
>> figure out how to eliminate the last little bit, the `main` method, and allow
>> statements to exist at the top-level of a program. We deliberately stopped
>> short of this because it offers little value beyond the first few minutes, and
>> even that small value quickly becomes something that needs to be unlearned.

>> The fundamental problem behind allowing such "loose" statements is that
>> variables can be declared inside both classes (fields) and methods (local
>> variables), and they share the same syntactic production but not the same
>> semantics. So it is unclear (to both compilers and humans) whether a "loose"
>> variable would be a local or a field. If we tried to adopt some sort of simple
>> heuristic to collapse this ambiguity (e.g., whether it precedes or follows the
>> first statement), that may satisfy the compiler, but now simple refactorings
>> might subtly change the meaning of the program, and we'd be replacing the
>> explicit syntactic overhead of `void main()` with an invisible "line" in the
>> program that subtly affects semantics, and a new subtle rule about the meaning
>> of variable declarations that applies only to unnamed classes. This doesn't
>> help students, nor is this particularly helpful for all but the most trivial
>> programs. It quickly becomes a crutch to be discarded and unlearned, which
>> falls afoul of our "on ramp" goals. Of all the concepts on our list, "methods"
>> and "a program is specified by a main method" seem the ones that are most worth
>> asking students to learn early.

>> ### Why not "just" use `jshell`?

>> While JShell is a great interactive tool, leaning too heavily on it as an onramp
>> would fall afoul of our goals. A JShell session is not a program, but a
>> sequence of code snippets. When we type declarations into `jshell`, they are
>> viewed as implicitly static members of some unspecified class, with
>> accessibility ignored completely, and statements execute in a context where
>> all previous declarations are in scope. This is convenient for experimentation
>> -- the primary goal of `jshell` -- but not such a great mental model for
>> learning to write Java programs. Transforming a batch of working declarations
>> in `jshell` to a real Java program would not be sufficiently simple or
>> unintrusive, and would lead to a non-idiomatic style of code, because the
>> straightforward translation would have us redeclaring each method, class, and
>> variable declaration as `static`. Further, this is probably not the direction
>> we want to go when we scale up from a handful of statements and declarations to
>> a simple class -- we probably want to start using classes as classes, not just
>> as containers for static members. JShell is a great tool for exploration and
>> debugging, and we expect many educators will continue to incorporate it into
>> their curriculum, but is not the on-ramp programming model we are looking for.

>> ### What about "always local"?

>> One of the main tensions that `main` introduces is that most class members are
>> not `static`, but the `main` method is -- and that forces programmers to
>> confront the seam between static and non-static members. JShell answers this
>> with "make everything static".

>> Another approach would be to "make everything local" -- treat a simple program
>> as being the "unwrapped" body of an implicit main method. We already allow
>> variables and classes to be declared local to a method. We could add local
>> methods (a useful feature in its own right) and relax some of the asymmetries
>> around nesting (again, an attractive cleanup), and then treat a mix of
>> declarations and statements without a class wrapper as the body of an invisible
>> `main` method. This seems an attractive model as well -- at first.

>> While the syntactic overhead of converting back to full-blown classes -- wrap
>> the whole thing in a `main` method and a `class` declaration -- is far less
>> intrusive than the transformation inherent in `jshell`, this is still not an
>> ideal on-ramp. Local variables interact with local classes (and methods, when
>> we have them) in a very different way than instance fields do with instance
>> methods and inner classes: their scopes are different (no forward references),
>> their initialization rules are different, and captured local variables must be
>> effectively final. This is a subtly different programming model that would then
>> have to be unlearned when scaling up to full classes. Further, the result of
>> this wrapping -- where everything is local to the main method -- is also not
>> "idiomatic Java". So while local methods may be an attractive feature, they are
>> similarly not the on-ramp we are looking for.

From brian.goetz at oracle.com  Thu Sep 29 13:54:59 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 09:54:59 -0400
Subject: Paving the on-ramp
In-Reply-To: <06823323-6214-438A-80A7-184310F01C55@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <06823323-6214-438A-80A7-184310F01C55@oracle.com>
Message-ID: <20e10b28-5d05-f381-5cbd-458418377231@oracle.com>


> (1) Instead of having a magic "unnamed" class, which has bizarre
> properties such as not having a constructor (or at least not a 
> constructor you can mention in a `new` expression), only to then 
> require a second magic rule about what you put in the command line 
> "java ...", why not simply use the much more obvious rule that if a
> compilation unit doesn't have a class header, then a class header is 
> _supplied_ by the compiler, and the name of the class is taken from 
> the filename of the compilation unit?

The implementation does something like this, which is almost a forced 
move due to the vagaries of the various extralinguistic rules like 
"Foo.class should not contain a class other than Foo" (enforced by the 
class loader). So indeed, if Foo.java contains an "unnamed" class,
Foo.class will contain a class called "Foo".

The main difference (if there is one) is the meaning of the name Foo in 
the body of the class. This relates to another "unnamed" JEP in flight,
which is "unnamed variables", such as:

    var _ = mySideEffects();

Here, _ refers to a variable whose name is not entered into the symbol 
table, and is therefore write-once, read-none. The proposal herein for
unnamed classes treats the class name the same way. (Full disclosure: 
since there is a Foo.class with a class called Foo in it, it is hard to 
stop _other_ classes from instantiating it.)

What you are suggesting is to instead take that 
extralinguistically-derived name and make it official. This reduces
some of the restrictions (you can have constructors) but seems like it 
creates new ghosts from different machines, since now there is a name 
that has meaning in the language but which didn't come from any Java 
source code.

> (2) Instead of complicating the Java launch protocol, why not leave it
> alone, and instead use the existing mechanism of "in situation X, if
> the user fails to provide method Y, the compiler will provide a
> definition automatically"? Specifically, in a compilation unit named
> Foo.java for which a class header has to be provided automatically, if
> a method with signature "main()" is present but no static method with
> signature "main(String[])" is present, then a static method with
> signature "main(String[])" is automatically provided by the compiler.

Saying that you can only use these two mechanisms together seems a sharp 
edge that users will get caught on. The two simplifications are
orthogonal; there is "instance main" and there is "low ceremony 
classes", but coupling the two in this way means you have to give up one 
if you don't use the other.

However, your "if you don't provide..." approach is an entirely valid 
way to implement "instance main" -- by injecting additional methods into 
the compiled class rather than modifying the launcher. It would be 
specified slightly differently (since it is also reflectively visible) 
but that's OK.
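
As a rough illustration of the "inject a method" alternative, here is a sketch of what the compiled class might look like if written out by hand. The class name `Foo` and the `runs` counter are hypothetical, and the exact shape of the injected bridge is an assumption; the thread does not specify it.

```java
class Foo {
    // Hypothetical counter, present only so the sketch's effect is observable.
    static int runs = 0;

    // What the programmer wrote: an instance main with no parameters.
    void main() {
        runs++;
        System.out.println("Hello World");
    }

    // What a compiler might supply under this approach (assumed shape):
    // a conventional static entry point that instantiates the class and
    // forwards to the instance main. The launcher itself stays unchanged.
    public static void main(String[] args) {
        new Foo().main();
    }
}
```

Because the bridge would be a real method in the class file, reflection (for example `Foo.class.getDeclaredMethod("main", String[].class)`) would see it, which is the "reflectively visible" difference mentioned above.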

> (3) Instead of speaking of automatic imports, speak of the compiler 
> automatically providing certain import statements if the compilation 
> unit doesn't have a class header.

If we did this, when a class "graduates" from a low-ceremony class to a
full class, its author would have to go back and fix up all the println
calls, and similarly it would put users in a position of "you can have
ceremony reduction X, but only if you qualify for ceremony reduction
Y." It is surely a weaker argument that `println` needs to be
effectively global, but after having programmed without saying
"System.out" in front of println for only a few weeks, one already feels
like going back is a punishment. (It's small, I know, but in some
situations you type it a lot.) We have also seen the need for automatic
imports elsewhere, such as in JEP 430, where a feature of the language
carries with it a static member (the STR and FMT template processors),
and requiring an explicit static import seems burdensome.

Taken together, coupling "instance main" and "auto static imports" to
"no class header" means that we have created a "beginner's dialect" which
is different, and which has to be unlearned and undone as soon as a
class graduates. I would prefer to have these be orthogonal features to
the extent possible.

> That way _everything_ (the name of class when a class header is not 
> provided, the behavior when you write variously abbreviated 
> definitions of method `main`, and the automatic importation of certain 
> libraries) can be explained in terms of source-code rewrites that the 
> programmer can do once the programmer learns enough about more 
> advanced features.
>
> --Guy

From brian.goetz at oracle.com  Thu Sep 29 14:01:19 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 10:01:19 -0400
Subject: Paving the on-ramp
In-Reply-To: <CAGKkBkvGztuAamc6K+YsST90vbkJYD3B_YS=bcGT3yavCeHFiA@mail.gmail.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBku=CdNABAnQbSwOTeA2CSW39VFJ1ovnDAfhYvMwyxgqeQ@mail.gmail.com>
 <2e779dc3-8a16-8d89-1e5d-6b14b04998d3@oracle.com>
 <CAGKkBkvGztuAamc6K+YsST90vbkJYD3B_YS=bcGT3yavCeHFiA@mail.gmail.com>
Message-ID: <fa3c28f5-9461-b3b7-aaf1-d76470bcadd0@oracle.com>


>     Indeed, I intend that there are no explicit constructors or
>     instance initializers here. (There can't be constructors, because
>     the class is unnamed!)
>
>
> Hmm, I was under the impression I could drop all my `static`s while
> keeping the class signature if I wanted? But, if I can and even then 
> explicit constrs and initers are banned, then indeed, at least one of 
> my drawbacks is invalid. I don't think it undercuts my overall case 
> that much.

Yes, you can. Example:

    class InstanceMain implements Serializable {
        public InstanceMain() { }

        public void main() { ... }
    }

and if you `java InstanceMain`, the launcher will do `new 
InstanceMain().main()`.

The two features -- no class header and instance main -- are
orthogonal. If you don't have a class header, you don't get explicit
constructors. If you use instance main, you must have a no-arg
constructor, which could be supplied explicitly (if there is a class
header) or implicitly (whether or not there is a class header).
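
To make the launch behaviour described here concrete, the following is a minimal sketch using reflection. It is not the actual `java` launcher: the class `LauncherSketch`, its helper methods, and the nested `InstanceMain` with its `runs` counter are all hypothetical, and the sketch only looks at methods declared directly on the class (it ignores inherited `main` methods).

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class LauncherSketch {
    // Hypothetical user program: non-public class, instance main, no parameters.
    static class InstanceMain {
        static int runs = 0; // only so the effect is observable
        void main() {
            runs++;
            System.out.println("Hello from an instance main");
        }
    }

    // Prefer the traditional main(String[]); fall back to a no-arg main().
    static Method findMain(Class<?> c) throws NoSuchMethodException {
        try {
            return c.getDeclaredMethod("main", String[].class);
        } catch (NoSuchMethodException e) {
            return c.getDeclaredMethod("main");
        }
    }

    // Invoke main, creating an instance via the no-arg constructor if main is not static.
    static void launch(Class<?> c, String[] args) {
        try {
            Method m = findMain(c);
            m.setAccessible(true); // the entry point need not be public
            Object receiver = null;
            if (!Modifier.isStatic(m.getModifiers())) {
                Constructor<?> ctor = c.getDeclaredConstructor();
                ctor.setAccessible(true);
                receiver = ctor.newInstance();
            }
            if (m.getParameterCount() == 0) {
                m.invoke(receiver);
            } else {
                m.invoke(receiver, (Object) args); // pass the array as one argument
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot launch " + c.getName(), e);
        }
    }

    public static void main(String[] args) {
        launch(InstanceMain.class, args);
    }
}
```

The no-arg constructor requirement shows up in `getDeclaredConstructor()`: if the class declares only constructors with parameters, that lookup fails and the sketch reports that it cannot launch the class.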

>>
>>     One or more class annotations could appear below package/imports?
>
>     No package statement (unnamed classes live in the unnamed
>     package), but imports are OK.
>
>
> I'm confused; what does any of this have to do with package location? 
> Isn't that orthogonal to everything we're discussing?

There's a world where package is relevant here, but it seems pretty 
esoteric. If you define a class with no name, the thing you want to do
with it is launch it directly.? Seems like putting it in a package makes 
little sense here.


From brian.goetz at oracle.com  Thu Sep 29 14:57:57 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 10:57:57 -0400
Subject: Paving the on-ramp
In-Reply-To: <CAE+3fjYHK7gtXiK1yFZFCEdb1FOKMSkEiiExuYKJLH--q=9qMA@mail.gmail.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAE+3fjYHK7gtXiK1yFZFCEdb1FOKMSkEiiExuYKJLH--q=9qMA@mail.gmail.com>
Message-ID: <ba36870a-b91b-1bc4-aceb-d6d295804e03@oracle.com>


On 9/29/2022 3:07 AM, Tagir Valeev wrote:
>> ## Unnamed classes
>> ...
>> Such source files can still have fields, methods, and even nested classes, so
>> that as a program evolves from a few statements to needing some ancillary state
>> or helper methods, these can be factored out of the `main` method while still
> I wonder how we tell apart unnamed class syntax and normal class
> syntax. E.g., consider the source file:
>
> // Hello.java
> public class Hello {
>    // tons of logic
> }
>
> void main() {
> }
>
> Will it be considered as a correct Java file, having Hello class as a
> nested class of top-level unnamed class?
> If yes, then, adding a main method after the class declaration, I
> change the class semantics, making it an inner class.
> This looks like action at a distance and may cause confusion. E.g., I
> just wrote a main() method outside of Hello class instead of inside,
> and boom,
> now Hello is not resolvable from other classes, for no apparent reason.

Yes, this is where the bodies are buried. At some point, the file name
is likely to come into play, even though we would prefer it not. (Note
that we have a little of that issue with "auxiliary classes" today.)
I think the move here is that for unnamed classes, if there is a "top
level" nested class that matches the file name, we call that an error.

> I assume that the main() method is required for an unnamed class, and
> if there are only other top-level declarations,
> then it should be a compilation error, right?

Probably so, yes.

>
>> ## Predefined static imports
>> ```
>> void main() {
>>      println("Hello World");
>> }
>> ```
> I wonder how it will play with existing static star imports. We
> already saw problems when updated to Java 9 or Java 14 that
> star-imported class named Module or Record becomes unresolvable. If
> existing code already imports static method named println from
> somewhere, will this code become invalid?

"Star" is the right word.  Currently we have a scheme where single 
imports beat star imports, so that if someone declares their own 
`println` method, it wins.  Details to be worked out.
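
As a rough illustration of the precedence involved, here is a minimal sketch using today's rules only. The predefined `println` import does not exist yet, so `Math` stands in for it, and the class and method names (`ShadowDemo`, `viaLocal`, `viaImport`) are made up for the example. A method declared in the class shadows a same-named method brought in by a static star import, while other star-imported names remain usable:

```java
import static java.lang.Math.*;   // static star (on-demand) import: max, abs, ...

// Hypothetical demo class; Math stands in for the proposed predefined imports.
class ShadowDemo {
    // Declared locally, so it shadows the star-imported Math.max.
    // It deliberately returns the smaller value to make the difference visible.
    static int max(int a, int b) {
        return a < b ? a : b;
    }

    static int viaLocal() {
        return max(3, 5);   // resolves to ShadowDemo.max, not Math.max
    }

    static int viaImport() {
        return abs(-7);     // no local abs, so the star import supplies Math.abs
    }

    public static void main(String[] args) {
        System.out.println(viaLocal());   // prints 3
        System.out.println(viaImport());  // prints 7
    }
}
```

If an automatic import of `println` followed the same precedence, existing code that declares or single-imports its own `println` would presumably keep its current meaning; that is an assumption about a design that is still open.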


From james.laskey at oracle.com  Thu Sep 29 16:47:17 2022
From: james.laskey at oracle.com (Jim Laskey)
Date: Thu, 29 Sep 2022 16:47:17 +0000
Subject: Paving the on-ramp
In-Reply-To: <CAE+3fjYHK7gtXiK1yFZFCEdb1FOKMSkEiiExuYKJLH--q=9qMA@mail.gmail.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAE+3fjYHK7gtXiK1yFZFCEdb1FOKMSkEiiExuYKJLH--q=9qMA@mail.gmail.com>
Message-ID: <0A644B30-2AB1-425C-8DCA-1F706BD586BB@oracle.com>

    // Hello.java
    public class Hello {
     // tons of logic
    }

    void main() {
    }

and

    void main() {
    }

    // Hello.java
    public class Hello {
     // tons of logic
    }

are equivalent, as though the file content is wrapped in an outer class.

public class {$name} {
    // Hello.java
    public class Hello {
     // tons of logic
    }

    void main() {
    }
}

The trigger for an unnamed class is a method or field defined at the top level. So the order doesn't matter.

{$name} is derived from the source file name and must be a valid identifier.

When running the source launcher, the class name doesn't matter (we could allow only with the source launcher).
When compiling with javac we have to stuff the class somewhere and using a name derived from the source makes sense.
So if the source is Hello.java you can access the class Hello.Hello from an external reference.

Cheers,

-- Jim


On Sep 29, 2022, at 4:07 AM, Tagir Valeev <amaembo at gmail.com> wrote:

Hello!

Very interesting writing, thanks! A couple of notes from me:

## Unnamed classes
...
Such source files can still have fields, methods, and even nested classes, so
that as a program evolves from a few statements to needing some ancillary state
or helper methods, these can be factored out of the `main` method while still

I wonder how we tell apart unnamed class syntax and normal class
syntax. E.g., consider the source file:

// Hello.java
public class Hello {
 // tons of logic
}

void main() {
}

Will it be considered as a correct Java file, having Hello class as a
nested class of top-level unnamed class?
If yes, then, adding a main method after the class declaration, I
change the class semantics, making it an inner class.
This looks like action at a distance and may cause confusion. E.g., I
just wrote a main() method outside of Hello class instead of inside,
and boom,
now Hello is not resolvable from other classes, for no apparent reason.

I assume that the main() method is required for an unnamed class, and
if there are only other top-level declarations,
then it should be a compilation error, right?

## Predefined static imports
```
void main() {
   println("Hello World");
}
```

I wonder how it will play with existing static star imports. We
already saw problems when updated to Java 9 or Java 14 that
star-imported class named Module or Record becomes unresolvable. If
existing code already imports static method named println from
somewhere, will this code become invalid?

With best regards,
Tagir Valeev.


From james.laskey at oracle.com  Thu Sep 29 16:53:42 2022
From: james.laskey at oracle.com (Jim Laskey)
Date: Thu, 29 Sep 2022 16:53:42 +0000
Subject: Paving the on-ramp
In-Reply-To: <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
Message-ID: <49A8B9F9-D0D1-4E80-9BAD-E870E7CF6C90@oracle.com>

Another safer approach we are playing with is to synthesize a public static void main method in an unnamed class when missing. The contents of that method would then invoke the user's main.
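
A minimal sketch of what that synthesis could amount to, written as ordinary Java. The class name `Hello` (derived from a hypothetical Hello.java) and the exact shape of the synthesized method are assumptions, not what the prototype necessarily emits:

```java
// Roughly what an unnamed class in Hello.java might expand to (assumed shape).
class Hello {
    // --- written by the user at the top level of the file ---
    String greeting() {
        return "Hello World";
    }

    void main() {
        System.out.println(greeting());
    }

    // --- synthesized because no static main(String[]) was declared ---
    public static void main(String[] args) {
        new Hello().main();   // instantiate, then call the user's instance main
    }
}
```

With a bridge like this, the existing launcher would find a conventional entry point and would not need to change how it looks up `main`.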

Cheers,

-- Jim


On Sep 28, 2022, at 4:49 PM, Kevin Bourrillion <kevinb at google.com> wrote:

Virtuous.

The quips about horses having fled the barn are coming, but whether they did is irrelevant; let's just make Java better now.


On Wed, Sep 28, 2022 at 10:57 AM Brian Goetz <brian.goetz at oracle.com> wrote:

## Concept overload

I like that the focus is not just on boilerplate but on the offense of forcing learners to encounter concepts they *will* need to care about but don't yet.


 - Relax the requirement that the class, and `main` method, be public.  Public
   accessibility is only relevant when access crosses packages; simple programs
   live in the unnamed package, so cannot be accessed from any other package
   anyway.  For a program whose main class is in the unnamed package, we can
   drop the requirement that the class or its `main` method be public,
   effectively treating the `java` launcher as if it too resided in the unnamed
   package.

Alternative: drop the requirement altogether. Most main methods have no desire to make themselves publicly callable as `TheClass.main(args)`, but today they are forced to expose that API anyway. I feel like it would still be conceptually clean to say that `public` is really about whether other *code* can access it, not whether a VM can get to it at all.


 - Make the "args" parameter to `main` optional, by allowing the `java` launcher to
   first look for a main method with the traditional `main(String[])`
   signature, and then (if not found) for a main method with no arguments.

This seems to leave users vulnerable to some surprises, where the code they think is being called isn't. Why not make it a compile-time error to provide both forms?


 - Make the `static` modifier on `main` optional, by allowing the `java` launcher to
   invoke an instance `main` method (of either signature) by instantiating an
   instance using an accessible no-arg constructor and then invoking the `main`
   method on it.

I'll give the problems I see with this, without a judgement on what should be done.

What's the whole idea of main? Well, it's the entry point into the program. But now it's not really the entry point; finding the entry point is more subtle. (Okay, I concede that static initializers are run first either way; that undercuts *some* of the strength of my argument here.)

Even if this is okay when I'm writing my own new program, understanding it as I go, then suppose someone else reads my program. That person has the burden of remembering to check whether `main` is static or not, and remembering that some constructor code is happening first if it's not. Classes that have both main and a constructor will be a mixture of some that call them in one order and some in the other. That's just, like, messy.

And is it even clear, then, why the VM shouldn't be passing `args` to the constructor, only hoarding it until calling `main`?

On a deep conceptual level... I'd insist that main() *is static*. It is *the* single entry point into the program; what could be more static than that? But think about our learner, who wrote some `main`s before learning about static. The instant they learn `static` is a keyword a method can have, they'll "know" one thing about it already: this is going to be something new that's *not* true of main(). But then they hear an explanation that fits `main` perfectly?


Because excessive use of `static` is considered a code smell, many
educators encourage the pattern of "all the static `main` method does is
instantiate an instance and call an instance `main` method" anyway.

Heavy groan. In my opinion, some ideas are too misguided to take seriously.

The value in that practice is if instance `main` accepts parameters like `PrintStream` and `Console`, and static main passes in `System.out` and `System.console()`. That makes all your actual program logic unit-testable. Great! This actually strikes directly at the heart of what the entire problem with `static` is! But this isn't the case you're addressing.
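
A minimal sketch of that testable arrangement, assuming a hypothetical `Greeter` class. Only `PrintStream` is shown; `Console` could be passed in the same way:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical example class; the names are illustrative only.
class Greeter {
    // Instance entry point: the output stream is a parameter, so a test can swap it.
    void main(PrintStream out) {
        out.println("Hello World");
    }

    // Static entry point: does nothing except wire in the real System.out.
    public static void main(String[] args) {
        new Greeter().main(System.out);
    }

    // How a unit test might capture the program's output without touching System.out.
    static String captured() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        new Greeter().main(new PrintStream(buffer, true));
        return buffer.toString().strip();
    }
}
```

Here `Greeter.captured()` returns the text the program would have printed, which is what makes the program logic checkable from a test.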

Static methods are not a code smell! Static methods that ought to be overridable by one of their argument types (Collections.sort()), sure. Static mutable state is a code smell, definitely -- but a method that touches that state is equally problematic whether it itself is static or not. There are some code smells around `static`, but `static` itself is fresh and flowery.


(Further, allowing the `main` method to be an instance method
means that it could be inherited from a superclass, which is useful for simple
frameworks such as test runners or service frameworks.)

This does not give me a happy feeling. Going into it is a deep discussion though.

Rest of the response coming soon, I hope.

Just to mention one additional idea. We could permit `main` to optionally return `int`, becoming the default exit status if `exit` is never called. Seems elegant for the rare cases where you care about exit status, but (a) would this feature get in the way in *any* sense for the vast majority of cases that don't care, or (b) are the cases that care just way too rare for us to worry about?

I'm not sure about (a). But (b) kinda seems like a yes.

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com


From angelos.bimpoudis at oracle.com  Thu Sep 29 17:03:33 2022
From: angelos.bimpoudis at oracle.com (Angelos Bimpoudis)
Date: Thu, 29 Sep 2022 17:03:33 +0000
Subject: Draft JEP: Unnamed local variables and patterns
Message-ID: <SA2PR10MB4667912B493AE1937C77CC8382579@SA2PR10MB4667.namprd10.prod.outlook.com>

Dear experts,

The draft JEP for unnamed local variables and patterns, which has been previously discussed on this list, is available at:

https://bugs.openjdk.org/browse/JDK-8294349

Comments welcomed!
Angelos

From brian.goetz at oracle.com  Thu Sep 29 17:06:22 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 13:06:22 -0400
Subject: Paving the on-ramp
In-Reply-To: <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <CAGKkBkvNzrfBKiYpr+eGT2ruTb0c6=C7LX1jbQtWmWsanWFwPw@mail.gmail.com>
Message-ID: <259fc28b-acc8-ac75-d163-a1d6ef34883f@oracle.com>

This question came up in a few different forms, but there's a reason why 
I've only relaxed the "must be public" for classes in the unnamed 
package: security.

There may be existing classes with a package-private instance main() 
method, and they may have reasonably assumed that these are only 
callable from within the package.  If the launcher can barge in and open 
non-public classes and call non-public methods, that may be surprising.  
Restricting the "main can be non-public" rule to the unnamed package is 
justifiable because we can reasonably treat the launcher as being part 
of the unnamed package (and therefore this rule falls out from ordinary 
access control) and because distributing libraries that use the unnamed 
package is discouraged; it is meant instead for local experimentation.

Framing the launcher as "just some Java code in the unnamed package" 
also demystifies the launcher a bit.
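
To make that framing concrete, here is a minimal sketch of a launcher written as plain Java code in the unnamed package. It is an illustration under stated assumptions, not the real `java` launcher: `LauncherSketch`, `InstanceHello`, and `ClassicHello` are made-up names, inherited `main` methods are ignored, and error handling is reduced to a single wrapped exception. Because it never calls `setAccessible`, it can reach a non-public `main` only through ordinary access control, which is the point of restricting the relaxation to the unnamed package:

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for the launcher, living in the same (unnamed) package
// as the programs it starts.
class LauncherSketch {
    static void launch(Class<?> mainClass, String[] args) {
        try {
            Method main;
            boolean takesArgs = true;
            try {
                // Prefer the traditional main(String[]) signature...
                main = mainClass.getDeclaredMethod("main", String[].class);
            } catch (NoSuchMethodException e) {
                // ...and only then fall back to a no-argument main().
                main = mainClass.getDeclaredMethod("main");
                takesArgs = false;
            }
            Object receiver = null;
            if (!Modifier.isStatic(main.getModifiers())) {
                // Instance main: create the receiver via the no-arg constructor.
                Constructor<?> ctor = mainClass.getDeclaredConstructor();
                receiver = ctor.newInstance();
            }
            if (takesArgs) {
                main.invoke(receiver, (Object) args);
            } else {
                main.invoke(receiver);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot launch " + mainClass.getName(), e);
        }
    }

    public static void main(String[] args) {
        launch(InstanceHello.class, args);
        launch(ClassicHello.class, args);
        System.out.println(InstanceHello.lastRun);
        System.out.println(ClassicHello.lastRun);
    }
}

// Non-public class with a non-public, non-static, no-argument main.
class InstanceHello {
    static String lastRun = "";
    void main() { lastRun = "instance main()"; }
}

// Traditional form, still found first by the lookup above.
class ClassicHello {
    static String lastRun = "";
    public static void main(String[] args) {
        lastRun = "static main with " + args.length + " args";
    }
}
```

If the same `launch` method lived in a different package, the reflective calls on `InstanceHello` would fail with an `IllegalAccessException`, which mirrors the concern above about the launcher barging into non-public code.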

On 9/28/2022 3:49 PM, Kevin Bourrillion wrote:
>
>      - Relax the requirement that the class, and `main` method, be public.
>        Public accessibility is only relevant when access crosses packages;
>        simple programs live in the unnamed package, so cannot be accessed
>        from any other package anyway.  For a program whose main class is in
>        the unnamed package, we can drop the requirement that the class or
>        its `main` method be public, effectively treating the `java` launcher
>        as if it too resided in the unnamed package.
>
>
> Alternative: drop the requirement altogether. Most main methods have 
> no desire to make themselves publicly callable as 
> `TheClass.main(args)`, but today they are forced to expose that API 
> anyway. I feel like it would still be conceptually clean to say that 
> `public` is really about whether other *code* can access it, not 
> whether a VM can get to it at all.

From brian.goetz at oracle.com  Thu Sep 29 18:20:42 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 14:20:42 -0400
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <39a19c35-d15a-a05f-ac3e-6aa7fd3adbcb@oracle.com>

One thing that this forces us to confront, in some manner or other, is 
the set of rules about the relationship between the file name and the 
class name, which are spread out in a few different places.

The compiler issues an error when a top-level *public* class does not 
match the file it is in.  Order is irrelevant; the following are valid 
Foo.java files:

--
public class Foo { }
class Bar { }
--
class Foo { }
class Bar { }
--
class Bar { }
public class Foo { }
--

but the following are illegal for Foo.java:

--
public class Foo { }
public class Bar { }
--
public class Bar { }
--

The standard class loader implementation enforces that a class file 
X.class must contain a class called X (javap will warn about this too).  
When compiling code in-memory, the javac "FileManager" abstraction still 
requires a "file name" for each source unit being compiled, even if 
there is no actual file.


The prototype implementation we have does infer a class name from the 
file name in the obvious way; we just don't enter it into the symbol 
table.  But, if you put this in Foo.java:

--
void main() { }
--

You'll get a Foo.class with class Foo in it, and if you put that class 
file on the class path, *other* classes can instantiate it by name.  So 
the prototype is currently in a half-here, half-there situation.  It is 
probably overkill to try to have some ACC_UNNAMED marking to prevent 
this.  So we might accept this odd state, or we might embrace it as Guy 
suggests, and go ahead and enter Foo in the symbol table, and even let 
people declare constructors.  That means that the name "unnamed class" 
would no longer be an accurate name (a shame, since its friends "unnamed 
module" and "unnamed package" have been saving it a seat).  This is a 
workable direction, though not a forced move.


The other connection point with the file name is the one Tagir brought 
up, which is the effect of accidentally putting a method or field 
outside the braces.  If you have a class

--
class Foo { }
void x() { }
--

today, this is an error; under this proposal, this becomes a valid 
unnamed class with a *nested* class Foo, which may not be what was 
meant.  We can reduce the possibility of this by issuing a warning/error 
if an unnamed class has a "top-level" nested class whose name matches 
the file.  This seems reasonably consistent with the existing rules 
constraining file names and class names.  If we combined this with the 
previous move, this becomes "An unnamed class cannot have a top-level 
nested class of the same name".
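
The two checks can be sketched side by side as follows. This is only a toy model: `FileNameRules` and its methods are hypothetical, class declarations are reduced to lists of simple names, and nothing here reflects how javac actually implements either rule:

```java
import java.util.List;

// Hypothetical helper modelling the file-name rules discussed above.
class FileNameRules {
    // "Foo.java" -> "Foo"
    static String baseName(String fileName) {
        return fileName.endsWith(".java")
                ? fileName.substring(0, fileName.length() - ".java".length())
                : fileName;
    }

    // Existing rule: every *public* top-level class must be named after the file.
    static boolean publicClassesMatchFile(String fileName, List<String> publicTopLevelClasses) {
        String expected = baseName(fileName);
        return publicTopLevelClasses.stream().allMatch(expected::equals);
    }

    // Proposed rule: an unnamed class may not contain a nested class named after the file.
    static boolean nestedClassClashesWithFile(String fileName, List<String> nestedClasses) {
        return nestedClasses.contains(baseName(fileName));
    }

    public static void main(String[] args) {
        // public Foo plus non-public Bar in Foo.java: valid today
        System.out.println(publicClassesMatchFile("Foo.java", List.of("Foo")));         // true
        // public Foo and public Bar in Foo.java: illegal today
        System.out.println(publicClassesMatchFile("Foo.java", List.of("Foo", "Bar")));  // false
        // class Foo { } next to a stray void x() { } in Foo.java: flagged by the proposed rule
        System.out.println(nestedClassClashesWithFile("Foo.java", List.of("Foo")));     // true
    }
}
```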


On 9/28/2022 1:57 PM, Brian Goetz wrote:
> At various points, we've explored the question of which program 
> elements are most and least helpful for students first learning Java.  
> After considering a number of alternatives over the years, I have a 
> simple proposal for smoothing the "on ramp" to Java programming, while 
> not creating new things to unlearn.
>
> Markdown source is below, HTML will appear soon at:
>
> https://openjdk.org/projects/amber/design-notes/on-ramp
>
>
> # Paving the on-ramp
>
> Java is one of the most widely taught programming languages in the world.  Tens
> of thousands of educators find that the imperative core of the language combined
> with a straightforward standard library is a foundation that students can
> comfortably learn on.  Choosing Java gives educators many degrees of freedom:
> they can situate students in `jshell` or Notepad or a full-fledged IDE; they can
> teach imperative, object-oriented, functional, or hybrid programming styles; and
> they can easily find libraries to interact with external data and services.
>
> No language is perfect, and one of the most common complaints about Java is that
> it is "too verbose" or has "too much ceremony."  And unfortunately, Java imposes
> its heaviest ceremony on those first learning the language, who need and
> appreciate it the least.  The declaration of a class and the incantation of
> `public static void main` is pure mystery to a beginning programmer.  While
> these incantations have principled origins and serve a useful organizing purpose
> in larger programs, they have the effect of placing obstacles in the path of
> _becoming_ Java programmers. Educators constantly remind us of the litany of
> complexity that students have to confront on Day 1 of class -- when they really
> just want to write their first program.
>
> As an amusing demonstration of this, in her JavaOne keynote appearance in 2019,
> [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked about when
> she learned to program in Java, and how her teacher performed a rap song
> to help students memorize `"public static void main"`.  Our hats are off to
> creative educators everywhere for this kind of dedication, but teachers
> shouldn't have to do this.
>
> Of course, advanced programmers complain about ceremony too. We will never be
> able to satisfy programmers' insatiable appetite for typing fewer keystrokes,
> and we shouldn't try, because the goal of programming is to write programs that
> are easy to read and are clearly correct, not programs that were easy to type.
> But we can try to better align the ceremony commensurate with the value it
> brings to a program -- and let simple programs be expressed more simply.
>
> ## Concept overload
>
> The classic "Hello World" program looks like this in Java:
>
> ```
> public class HelloWorld {
>     public static void main(String[] args) {
>         System.out.println("Hello World");
>     }
> }
> ```
>
> It may only be five lines, but those lines are packed with concepts that are
> challenging to absorb without already having some programming experience and
> familiarity with object orientation. Let's break down the concepts a student
> confronts when writing their first Java program:
>
>   - **public** (on the class).  The `public` accessibility level is relevant
>     only when there is going to be cross-package access; in a simple "Hello
>     World" program, there is only one class, which lives in the unnamed package.
>     They haven't even written a one-line program yet; the notion of access
>     control -- keeping parts of a program from accessing other parts of it -- is
>     still way in their future.
>
>   - **class**.  Our student hasn't set out to write a _class_, or model a
>     complex system with objects; they want to write a _program_.  In Java, a
>     program is just a `main` method in some class, but at this point our student
>     still has no idea what a class is or why they want one.
>
>   - **Methods**.  Methods are of course a key concept in Java, but the mechanics
>     of methods -- parameters, return types, and invocation -- are still
>     unfamiliar, and the `main` method is invoked magically from the `java`
>     launcher rather than from explicit code.
>
>   - **public** (again).  Like the class, the `main` method has to be public, but
>     again this is only relevant when programs are large enough to require
>     packages to organize them.
>
>   - **static**.  The `main` method has to be static, and at this point, students
>     have no context for understanding what a static method is or why they want
>     one.  Worse, the early exposure to `static` methods will turn out to be a
>     bad habit that must be later unlearned.  Worse still, the fact that the
>     `main` method is `static` creates a seam between `main` and other methods;
>     either they must become `static` too, or the `main` method must trampoline
>     to some sort of "instance main" (more ceremony!)  And if we get this wrong,
>     we get the dreaded and mystifying `"cannot be referenced from a static
>     context"` error.
>
>   - **main**.  The name `main` has special meaning in a Java program, indicating
>     the starting point of a program, but this specialness hides behind being an
>     ordinary method name.  This may contribute to the sense of "so many magic
>     incantations."
>
>   - **String[]**.  The parameter to `main` is an array of strings, which are the
>     arguments that the `java` launcher collected from the command line.  But our
>     first program -- likely our first dozen -- will not use command-line
>     parameters. Requiring the `String[]` parameter is, at this point, a mistake
>     waiting to happen, and it will be a long time until this parameter makes
>     sense.  Worse, educators may be tempted to explain arrays at this point,
>     which further increases the time-to-first-program.
>
>   - **System.out.println**.  If you look closely at this incantation, each
>     element in the chain is a different thing -- `System` is a class (what's a
>     class again?), `out` is a static field (what's a field?), and `println` is
>     an instance method.  The only part the student cares about right now is
>     `println`; the rest of it is an incantation that they do not yet understand
>     in order to get at the behavior they want.
>
> That's a lot to explain to a student on the first day of class.  There's a good
> chance that by now, class is over and we haven't written any programs yet, or
> the teacher has said "don't worry what this means, you'll understand it later"
> six or eight times.  Not only is this a lot of _syntactic_ things to absorb, but
> each of those things appeals to a different concept (class, method, package,
> return value, parameter, array, static, public, etc.) that the student doesn't
> have a framework for understanding yet.  Each of these will have an important
> role to play in larger programs, but so far, they only contribute to "wow,
> programming is complicated."
>
> It won't be practical (or even desirable) to get _all_ of these concepts out of
> the student's face on day 1, but we can do a lot -- and focus on the ones that
> do the most to help beginners understand how programs are constructed.
>
> ## Goal: a smooth on-ramp
>
> As much as programmers like to rant about ceremony, the real goal here is not
> mere ceremony reduction, but providing a graceful _on ramp_ to Java programming.
> This on-ramp should be helpful to beginning programmers by requiring only those
> concepts that a simple program needs.
>
> Not only should an on-ramp have a gradual slope and offer enough acceleration
> distance to get onto the highway at the right speed, but its direction must
> align with that of the highway.  When a programmer is ready to learn about more
> advanced concepts, they should not have to discard what they've already learned,
> but instead easily see how the simple programs they've already written
> generalize to more complicated ones, and both the syntactic and conceptual
> transformation from "simple" to "full blown" program should be straightforward
> and unintrusive.  It is a definite non-goal to create a "simplified dialect of
> Java for students".
>
> We identify three simplifications that should aid both educators and students in
> navigating the on-ramp to Java, as well as being generally useful to simple
> programs beyond the classroom:
>
>  - A more tolerant launch protocol
>  - Unnamed classes
>  - Predefined static imports for the most critical methods and fields
>
> ## A more tolerant launch protocol
>
> The Java Language Specification has relatively little to say about how Java
> "programs" get launched, other than saying that there is some way to indicate
> which class is the initial class of a program (JLS 12.1.1) and that a public
> static method called `main` whose sole argument is of type `String[]` and whose
> return is `void` constitutes the entry point of the indicated class.
>
> We can eliminate much of the concept overload simply by relaxing the
> interactions between a Java program and the `java` launcher:
>
>  - Relax the requirement that the class, and `main` method, be public.  Public
>    accessibility is only relevant when access crosses packages; simple programs
>    live in the unnamed package, so cannot be accessed from any other package
>    anyway.  For a program whose main class is in the unnamed package, we can
>    drop the requirement that the class or its `main` method be public,
>    effectively treating the `java` launcher as if it too resided in the unnamed
>    package.
>
>  - Make the "args" parameter to `main` optional, by allowing the `java` launcher to
>    first look for a main method with the traditional `main(String[])`
>    signature, and then (if not found) for a main method with no arguments.
>
>  - Make the `static` modifier on `main` optional, by allowing the `java` launcher to
>    invoke an instance `main` method (of either signature) by instantiating an
>    instance using an accessible no-arg constructor and then invoking the `main`
>    method on it.
>
> This small set of changes to the launch protocol strikes out five of the bullet
> points in the above list of concepts: public (twice), static, method parameters,
> and `String[]`.
>
> At this point, our Hello World program is now:
>
> ```
> class HelloWorld {
>     void main() {
>         System.out.println("Hello World");
>     }
> }
> ```
>
> It's not any shorter by line count, but we've removed a lot of "horizontal
> noise" along with a number of concepts.  Students and educators will appreciate
> it, but advanced programmers are unlikely to be in any hurry to make these
> implicit elements explicit either.
>
> Additionally, the notion of an "instance main" has value well beyond the first
> day.  Because excessive use of `static` is considered a code smell, many
> educators encourage the pattern of "all the static `main` method does is
> instantiate an instance and call an instance `main` method" anyway.  Formalizing
> the "instance main" protocol reduces a layer of boilerplate in these cases, and
> defers the point at which we have to explain what instance creation is -- and
> what `static` is.  (Further, allowing the `main` method to be an instance method
> means that it could be inherited from a superclass, which is useful for simple
> frameworks such as test runners or service frameworks.)
>
> ## Unnamed classes
>
> In a simple program, the `class` declaration often doesn't help either, because
> other classes (if there are any) are not going to reference it by name, and we
> don't extend a superclass or implement any interfaces.  If we say an "unnamed
> class" consists of member declarations without a class header, then our Hello
> World program becomes:
>
> ```
> void main() {
>     System.out.println("Hello World");
> }
> ```
>
> Such source files can still have fields, methods, and even nested classes, so
> that as a program evolves from a few statements to needing some ancillary state
> or helper methods, these can be factored out of the `main` method while still
> not yet requiring a full class declaration:
>
> ```
> String greeting() { return "Hello World"; }
>
> void main() {
>     System.out.println(greeting());
> }
> ```
>
> This is where treating `main` as an instance method really shines; the user has
> just declared two methods, and they can freely call each other.  Students need
> not confront the confusing distinction between instance and static methods yet;
> indeed, if not forced to confront static members on day 1, it might be a while
> before they do have to learn this distinction.  The fact that there is a
> receiver lurking in the background will come in handy later, but right now is
> not bothering anybody.
>
> [JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be
> launched directly without compilation; this streamlined launcher pairs well with
> unnamed classes.
>
> ## Predefined static imports
>
> The most important classes, such as `String` and `Integer`, live in the
> `java.lang` package, which is automatically on-demand imported into all
> compilation units; this is why we do not have to `import java.lang.String` in
> every class.  Static imports were not added until Java 5, but no corresponding
> facility for automatic on-demand import of common behavior was added at that
> time.  Most programs, however, will want to do console IO, and Java forces us to
> do this in a roundabout way -- through the static `System.out` and `System.in`
> fields.  Basic console input and output is a reasonable candidate for
> auto-static import, as one or both are needed by most simple programs.  While
> these are currently instance methods accessed through static fields, we can
> easily create static methods for `println` and `readln` which are suitable for
> static import, and automatically import them.  At which point our first program
> is now down to:
>
> ```
> void main() {
>     println("Hello World");
> }
> ```
>
> ## Putting this all together
>
> We've discussed several simplifications:
>
>  - Update the launcher protocol to make public, static, and arguments optional
>    for main methods, and to allow main methods to be instance methods (when a
>    no-argument constructor is available);
>  - Make the class wrapper for "main classes" optional (unnamed classes);
>  - Automatically static import methods like `println`
>
> which together whittle our long list of day-1 concepts down considerably.  While
> this is still not as minimal as the minimal Python or Ruby program -- statements
> must still live in a method -- the goal here is not to win at "code golf".  The
> goal is to ensure that concepts not needed by simple programs need not appear in
> those programs, while at the same time not encouraging habits that have to be
> unlearned as programs scale up.
>
> Each of these simplifications is individually small and unintrusive, and each is
> independent of the others.  And each embodies a simple transformation that the
> author can easily manually reverse when it makes sense to do so: elided
> modifiers and `main` arguments can be added back, the class wrapper can be added
> back when the affordances of classes are needed (supertypes, constructors), and
> the full qualifier of static-import can be added back.  And these reversals are
> independent of one another; they can be done in any combination or any order.
>
> This seems to meet the requirements of our on-ramp; we've eliminated most of the
> day-1 ceremony elements without introducing new concepts that need to be
> unlearned. The remaining concepts -- a method is a container for statements, and
> a program is a Java source file with a `main` method -- are easily understood in
> relation to their fully specified counterparts.
>
> ## Alternatives
>
> Obviously, we've lived with the status quo for 25+ years, so we could 
> continue
> to do so.  There were other alternatives explored as well; ultimately,
> each of
> these fell afoul of one of our goals.
>
> ### Can't we go further?
>
> Fans of "code golf" -- of which there are many -- are surely right now 
> trying to
> figure out how to eliminate the last little bit, the `main` method, 
> and allow
> statements to exist at the top-level of a program.  We deliberately
> stopped
> short of this because it offers little value beyond the first few 
> minutes, and
> even that small value quickly becomes something that needs to be 
> unlearned.
>
> The fundamental problem behind allowing such "loose" statements is that
> variables can be declared inside both classes (fields) and methods (local
> variables), and they share the same syntactic production but not the same
> semantics.  So it is unclear (to both compilers and humans) whether a
> "loose"
> variable would be a local or a field.  If we tried to adopt some sort
> of simple
> heuristic to collapse this ambiguity (e.g., whether it precedes or 
> follows the
> first statement), that may satisfy the compiler, but now simple 
> refactorings
> might subtly change the meaning of the program, and we'd be replacing the
> explicit syntactic overhead of `void main()` with an invisible "line" 
> in the
> program that subtly affects semantics, and a new subtle rule about the 
> meaning
> of variable declarations that applies only to unnamed classes.  This
> doesn't
> help students, nor is this particularly helpful for all but the most 
> trivial
> programs.  It quickly becomes a crutch to be discarded and unlearned,
> which
> falls afoul of our "on ramp" goals.  Of all the concepts on our list,
> "methods"
> and "a program is specified by a main method" seem the ones that are 
> most worth
> asking students to learn early.
>
> ### Why not "just" use `jshell`?
>
> While JShell is a great interactive tool, leaning too heavily on it as 
> an onramp
> would fall afoul of our goals.  A JShell session is not a program, but a
> sequence of code snippets.  When we type declarations into `jshell`,
> they are
> viewed as implicitly static members of some unspecified class, with
> accessibility ignored completely, and statements execute in a
> context where
> all previous declarations are in scope.  This is convenient for
> experimentation
> -- the primary goal of `jshell` -- but not such a great mental model for
> learning to write Java programs.  Transforming a batch of working
> declarations
> in `jshell` to a real Java program would not be sufficiently simple or
> unintrusive, and would lead to a non-idiomatic style of code, because the
> straightforward translation would have us redeclaring each method, 
> class, and
> variable declaration as `static`.  Further, this is probably not the
> direction
> we want to go when we scale up from a handful of statements and 
> declarations to
> a simple class -- we probably want to start using classes as classes, 
> not just
> as containers for static members. JShell is a great tool for 
> exploration and
> debugging, and we expect many educators will continue to incorporate 
> it into
> their curriculum, but it is not the on-ramp programming model we are
> looking for.
>
> ### What about "always local"?
>
> One of the main tensions that `main` introduces is that most class 
> members are
> not `static`, but the `main` method is -- and that forces programmers to
> confront the seam between static and non-static members. JShell 
> answers this
> with "make everything static".
>
> Another approach would be to "make everything local" -- treat a simple 
> program
> as being the "unwrapped" body of an implicit main method.  We already
> allow
> variables and classes to be declared local to a method.  We could add
> local
> methods (a useful feature in its own right) and relax some of the 
> asymmetries
> around nesting (again, an attractive cleanup), and then treat a mix of
> declarations and statements without a class wrapper as the body of an 
> invisible
> `main` method. This seems an attractive model as well -- at first.
>
> While the syntactic overhead of converting back to full-blown classes 
> -- wrap
> the whole thing in a `main` method and a `class` declaration -- is far 
> less
> intrusive than the transformation inherent in `jshell`, this is still 
> not an
> ideal on-ramp.  Local variables interact with local classes (and
> methods, when
> we have them) in a very different way than instance fields do with 
> instance
> methods and inner classes: their scopes are different (no forward 
> references),
> their initialization rules are different, and captured local variables 
> must be
> effectively final.  This is a subtly different programming model that
> would then
> have to be unlearned when scaling up to full classes. Further, the 
> result of
> this wrapping -- where everything is local to the main method -- is 
> also not
> "idiomatic Java".  So while local methods may be an attractive
> feature, they are
> similarly not the on-ramp we are looking for.
>
>

From john.r.rose at oracle.com  Thu Sep 29 20:04:15 2022
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 29 Sep 2022 13:04:15 -0700
Subject: Paving the on-ramp
In-Reply-To: <20e10b28-5d05-f381-5cbd-458418377231@oracle.com>
References: <20e10b28-5d05-f381-5cbd-458418377231@oracle.com>
Message-ID: <B07993CE-54F3-4269-A609-F9D95CA0216E@oracle.com>

On Sep 29, 2022, at 6:55 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
>
>> (3) Instead of speaking of automatic imports, speak of the compiler automatically providing certain import statements if the compilation unit doesn't have a class header.
>
> If we did this, when a class "graduates" from a low-ceremony class to a full class, then they'd have to go back and fix up all the println calls, and similarly it would put users in a position of "you can have ceremony reduction X, but only if you qualify for ceremony reduction Y."
>
> Taken together, coupling "instance main" and "auto static imports" to "no class header" means that we have created a "beginners dialect" which is different, and which has to be unlearned and undone as soon as a class graduates.  I would prefer to have these be orthogonal features to the extent possible.

I like the principle behind Guy's moves for removing magic, by implicitly adding stuff you could have had explicitly.

But adding `public static main` when there is an instance `main` is not a big payoff, since (a) you don't want to apply such a rule to all class files in existence today, and (b) applying it only to the unnamed classes couples the two features in a (probably) confusing way.

So, with my VM hat on, I say, fine, let's add another trick to the launcher's bag of tricks:  If (1) a class is mentioned on a command line, then (2) we look for a somewhat wider range of methods (but all named `main`).
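
As a rough illustration of what "a somewhat wider range of methods" could mean, here is a reflective sketch that follows the search order from Brian's note: `main(String[])` first, then `main()`, with instance methods invoked on an object built from a no-arg constructor. This is not the real `java` launcher; `TolerantLauncher`, `InstanceNoArgs`, `StaticWithArgs` and `LOG` are names invented for the example, and inherited `main` methods are ignored for brevity.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only; the actual launcher is native code with its own rules.
class TolerantLauncher {
    // Records which main methods ran, so the behavior is easy to observe.
    static final List<String> LOG = new ArrayList<>();

    // Prefer the traditional main(String[]); fall back to a no-arg main().
    static Method findMain(Class<?> c) {
        try {
            return c.getDeclaredMethod("main", String[].class);
        } catch (NoSuchMethodException noArgsVariant) {
            try {
                return c.getDeclaredMethod("main");
            } catch (NoSuchMethodException none) {
                throw new IllegalStateException("no main method in " + c.getName());
            }
        }
    }

    // Invoke main, instantiating the class first if main is an instance method.
    static void launch(Class<?> c, String... args) {
        Method main = findMain(c);
        try {
            main.setAccessible(true); // public is optional in this sketch
            Object receiver = null;
            if (!Modifier.isStatic(main.getModifiers())) {
                Constructor<?> ctor = c.getDeclaredConstructor(); // needs a no-arg constructor
                ctor.setAccessible(true);
                receiver = ctor.newInstance();
            }
            if (main.getParameterCount() == 0) {
                main.invoke(receiver);
            } else {
                main.invoke(receiver, (Object) args); // cast avoids varargs spreading
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot launch " + c.getName(), e);
        }
    }

    // Two sample "programs" with different shapes of main.
    static class InstanceNoArgs {
        void main() { LOG.add("instance main()"); }
    }

    static class StaticWithArgs {
        static void main(String[] args) {
            LOG.add("static main(String[]) with " + args.length + " args");
        }
    }

    public static void main(String[] args) {
        launch(InstanceNoArgs.class);
        launch(StaticWithArgs.class, "a", "b");
        System.out.println(LOG);
    }
}
```

Note that the "wider range" stays narrow: the method name is fixed, and only the modifiers and the parameter list vary.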

Again, having implicit static-imports that one could have written explicitly is very good, and it is a fine de-mystification move to say "one will be written for you if you didn't write it already".  I think that's a way to explain our handling of `java.lang.*` today, isn't it?

So there's a small risk to adding more "stuff" to `java.lang.*`.  (The problem with `Module` was mentioned in this thread.) And something equivalent to `import static java.lang.StaticImports.*` will further poke the bear, depending on how rich we make the set of imported names.

Here's a suggestion regarding static imports, specifically, that would match the on-ramp goals and mitigate the risk of name pollution from new *static* imports:

  1. If there is no `import java.lang.*` the program acts as if it were inserted.
  2. If there are *no imports at all*, the program acts as if *two* imports were inserted:  `import java.lang.*` and `import static java.lang.StaticImports.*` (or whatever the name is).

The effect of this is that an empty set of imports gets a predictable, useful, and up-to-date set of default names.  That makes for good on-ramp conditions.  To get control over those imports, the user starts adding explicit imports at the top of the file.  We proceed up the on-ramp by a series of one-line changes, not wholesale refactorings.

This is akin to today's mitigation of the problem with `java.lang.Module`:  You mitigate by *adding another import*, a by-name import of your chosen class named `Module`.  That's how Java has always worked.  Removing an intrusive static import from `java.lang` would (under the above rule) be mitigated more simply; just add any import at all, even a redundant `import java.lang.*`.  That's a little magic, but the story is clear:  You get a certain "menu" of imports if you don't specify *any*.
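
The "add another import" mitigation relies on an existing rule: a single-type (by-name) import shadows any type of the same name brought in by an on-demand import. Here is a small sketch of that rule. It uses `Proxy`, which exists in both `java.net` and `java.lang.reflect`, as a stand-in for the `Module` clash; the class name `ImportShadowing` is made up for the example.

```java
import java.lang.reflect.*;   // on-demand import: offers java.lang.reflect.Proxy
import java.net.*;            // on-demand import: offers java.net.Proxy
import java.net.Proxy;        // by-name import: shadows both on-demand candidates

class ImportShadowing {
    // Without the by-name import above, the simple name Proxy would be
    // ambiguous here and the file would not compile.
    static String proxyOwner() {
        return Proxy.class.getName();
    }

    public static void main(String[] args) {
        System.out.println(proxyOwner());
    }
}
```

The fix is a one-line addition at the top of the file, which is the kind of incremental step the on-ramp is aiming for.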

(Q:  What would break if we also auto-imported `java.util.*` under the null-import condition?  How disruptive would that be?)

I agree, in hindsight, with Guy's point about unnamed classes in named packages.  I don't see a deep coupling between those two parts of the language, so don't make a shallow one.

In general, shallow couplings lead to the problem of "beginner's dialect" Brian mentioned:  If simplifications A and B are coupled, when you graduate from one you have to "complicate" the other as well.  In the case of the unnamed package, when you graduate your program to a named package (perhaps because it is now a unit test or utility that needs package API access) you might not want to graduate it, at the same time, from its unnamed format.

With my VM hat on again, I have a tentative suggestion for "fixing" the problem with an *unintentionally* linkable/denotable class. (As pointed out, that could be a class named `Foo` just because it is anonymous in a file that happens to have the "pretty name" `Foo.java`.)

Suggestion:  Allow classfiles (in newer classfile versions) to specify `ACC_PRIVATE` in their `access_flags` for the class.  With the obvious (!?) meaning:  A class marked private (at the VM level) will fail access checks except to itself and its nestmates (if any).  Roll it out as a VM feature first, and later as a slightly-incompatible language change for nested classes.  Heck, even named classes (that's a compatible extension).

(Immediate use cases: All non-denotable classes are compiled `ACC_PRIVATE`.  That includes both "on ramp" unnamed classes and also any "inner class" which doesn't have a linkable bytecode name.)

Second suggestion (independent of first):  In the example of `Foo.java`, "poison" the name `Foo` in a predictable way (prepend `$` or add `$unnamed` for example), and also mark the class as Synthetic (or with a new attribute).  Then, liberalize the launcher *ever so slightly* so as to allow either (1) the exactly matching name as today, or (2) the predictably poisoned name (`Foo$unnamed`) if the class is also marked as synthetic/unnamed/whatever with an attribute.  This will put unnamed classes on a common footing with other classes (local & anonymous inner classes) that already have linkable-but-unpredictable names.  This is simpler than supporting `ACC_PRIVATE` and probably easier in the resolver (since there are just two names to check instead of one).
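
To make the two-name lookup concrete, here is a hedged sketch of how a launcher might resolve the requested name under this second suggestion. Everything here is illustrative: `PoisonedNameResolver` is an invented name, the `$unnamed` suffix is just the example spelling from above, and `Class.isSynthetic()` stands in for whatever attribute would actually mark such classes.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch; no such lookup exists in the launcher today.
class PoisonedNameResolver {
    // The two names to check: the exact one, then the predictably poisoned one.
    static List<String> candidates(String requested) {
        return List.of(requested, requested + "$unnamed");
    }

    // Accept an exact match unconditionally; accept the poisoned name only
    // if the class is also marked (approximated here by the synthetic flag).
    static Optional<Class<?>> resolve(String requested, ClassLoader loader) {
        List<String> names = candidates(requested);
        for (int i = 0; i < names.size(); i++) {
            try {
                Class<?> c = Class.forName(names.get(i), false, loader);
                boolean exactMatch = (i == 0);
                if (exactMatch || c.isSynthetic()) {
                    return Optional.of(c);
                }
            } catch (ClassNotFoundException e) {
                // Not found under this name; fall through to the next candidate.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ClassLoader loader = PoisonedNameResolver.class.getClassLoader();
        System.out.println(resolve("java.lang.String", loader).isPresent());
        System.out.println(resolve("no.such.Program", loader).isPresent());
    }
}
```

The point of the marker check is that an ordinary class that happens to be called `Foo$unnamed` would not be picked up by accident.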

Third suggestion, probably not usable:  We have properly anonymous classes in the VM (VMACs), which have names that not even the class itself can resolve; they have a special ability to self-resolve `CONSTANT_Class` but it is hardwired and doesn't go through a class-loader.  We could try to do something like this for unnamed classes, *but* it would not scale well to unnamed classes *which have named nested classes*.  To name those nested classes `Foo.Bar` you need a resolvable name like `Foo$unnamed$Bar`.  (But the classes could be marked `ACC_PRIVATE`; see above.)

I don't know a clean way to fix the syntax ambiguity between (a) nested class/interface of unnamed class (new) and (b) non-public top-level (package-member) class/interface (old).  Here are two dirty workarounds, both of which make such secondary classes into inherently non-linkable inner classes:

1. Put all your nested classes together in a method body.
2. Put all your nested classes in an instance initializer (magic braces!).

Both have the problem that the class names don't scope to the whole top-level (unnamed) class, so they are non-starters I guess, but might jog someone else's imagination for a better workaround.

Here's another workaround, which I guess Brian already mentioned:

3. If your user is wishing for nested classes or interfaces (or more likely records), then it's time to learn about type definitions, so require them to "graduate" to a top-level class *at that time*.

Tentative suggestion, again for brainstorming:  A way to smooth *that* move might be to provide yet more syntaxes to declare a *class which is not denotable but which has a body*.  Something like a truncated class header with a body:  `class /*empty header*/ { ...body here... }`.  The rule would be: If you are defining classes, it's time to acknowledge you are defining a top-level one to surround them, but you don't have to name it yet; it's "just there".

(On this slippery slope, maybe allow nested unnamed classes as well? And/or unnamed-but-denotable constructors: `class { ... String field; class(String field) { this.field = field; } ... `.  This doesn't appeal much to me, at least until we have compelling new use cases for anonymous classes, not already covered by `new Object() { ... }`.  Enhanced inference could make such a class into a poly-type-expression, someday, for some contexts where supers would be inferred.  I think that's what C# does in this vein.)

OK, that's enough BS (brainstorming, of course) from me.

From guy.steele at oracle.com  Thu Sep 29 20:58:45 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 29 Sep 2022 20:58:45 +0000
Subject: Paving the on-ramp
In-Reply-To: <B07993CE-54F3-4269-A609-F9D95CA0216E@oracle.com>
References: <20e10b28-5d05-f381-5cbd-458418377231@oracle.com>
 <B07993CE-54F3-4269-A609-F9D95CA0216E@oracle.com>
Message-ID: <1F34BC13-4FC8-4061-877C-72DF08EB1EB9@oracle.com>


> On Sep 29, 2022, at 4:04 PM, John Rose <john.r.rose at oracle.com> wrote:
> . . .
> 
> I like the principle behind Guy's moves for removing magic, by implicitly adding stuff you could have had explicitly.

Thanks for the phrasing of those last nine words. And someone else has pointed out to me that the expression of my ideas would have been clearer, more general, and more accurate if I had spoken in terms of "implicit declaration" under thus-and-so circumstances, rather than assuming that the compiler is necessarily the mechanism by which those implicit declarations are handled.

I'm not insistent on the particular solutions I suggested; I'm just happy to have gotten everyone else thinking in that direction.

--Guy


From john.r.rose at oracle.com  Thu Sep 29 22:20:11 2022
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 29 Sep 2022 15:20:11 -0700
Subject: Paving the on-ramp
In-Reply-To: <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>
Message-ID: <81497E7C-98E8-4650-9F18-FC6BF3EBF3F9@oracle.com>

On 28 Sep 2022, at 13:49, Remi Forax wrote:

> - We should not be able to declare fields inside a classless class, 
> students struggle at the beginning to tell the difference between a 
> field and a local variable.
> Every syntax that makes that distinction murkier is a bad idea.
> So perhaps what we want is a classless container of methods, not a 
> classless class.

Hmmm...  That would be an interface.  I'll pull on that thread a 
little:

An interface has no non-static fields and (bonus) its static fields are 
always constant.  So you can teach interface *as a container* without 
getting into mutability.

Methods would have to be implicitly decorated with `default` in an 
anonymous *interface*.

The execution of an instance-main anonymous interface would look almost 
*exactly* like that for a class:

`public static void main(String[] av) { new <ThisClass>(){}.main(); }`

The only difference is the `{}`.  Abstracts would be forbidden in an 
anonymous interface:  Every method has a body, just as every field has 
an initializer.
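
A minimal sketch of how that might look once written out in today's Java. The names `Greeter` and `GreeterLauncher` are made up for illustration, and the `default` modifiers are spelled out because nothing makes them implicit today.

```java
// Hypothetical stand-in for the launcher-generated entry point.
class GreeterLauncher {
    public static void main(String[] av) {
        // An interface cannot be instantiated directly, hence the empty
        // anonymous-class body `{}`.
        new Greeter() {}.main();
    }
}

// What an "anonymous interface" program might amount to: a container of
// methods with bodies, no instance fields, no constructors or initializers.
interface Greeter {
    default String greeting() { return "Hello World"; }

    default void main() { System.out.println(greeting()); }
}
```

Because every method has a body, the empty anonymous class has nothing left to implement, which is why forbidding abstract methods matters in this scheme.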

Bonus:  No instance initializers, since it's an interface. (No 
constructors either.)  So the headaches about initialization-related 
syntaxes go away without additional special pleading.

Objection:  *That's no interface!*  Well, true.  Except it is an 
interface to the system, being a launch point.  (Is that just a bad 
pun?)  Also, folks use interfaces today as an idiom for a lightweight 
container of Java code (at least, I do that).

Bonus:  If the "instance main" feature is supported *only for 
interface containers* then some issues of accidentally creating a main 
(in existing code) go away, simply because the attack surface (for 
accidents) gets smaller.  Yes, that's a yucky bonus.

From brian.goetz at oracle.com  Thu Sep 29 22:20:21 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 18:20:21 -0400
Subject: Paving the on-ramp: couplings
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <cbef21b3-6d05-7b7a-b788-c07d372b2b6b@oracle.com>

I thought it would be useful to enumerate where the couplings are 
between the various features here.  The goal is to avoid what John is 
calling "shallow couplings".  I'll break the features down into more 
granularity:

  - predefined static imports
  - public is optional on main
  - args are optional on main
  - main can be instance or static
  - unnamed classes

And they interact with these (currently):

  - unnamed packages
  - constructors

Coupling: unnamed classes must live in the unnamed package.

Coupling: public is only optional on main methods in the unnamed package.

Coupling: instance main requires a no-arg constructor.

Coupling: unnamed classes don't get constructors.

Coupling: unnamed classes must have a main.


Before we set to arguing whether these couplings are OK or not, what 
others have I missed?


(Bonus naming round: while I like the concept of unnamed classes, it may 
not be a perfect fit; if we decide the fit is too poor, we could call 
them "implicit classes".)


On 9/28/2022 1:57 PM, Brian Goetz wrote:
> At various points, we've explored the question of which program 
> elements are most and least helpful for students first learning Java.
> After considering a number of alternatives over the years, I have a 
> simple proposal for smoothing the "on ramp" to Java programming, while 
> not creating new things to unlearn.
>
> Markdown source is below, HTML will appear soon at:
>
> https://openjdk.org/projects/amber/design-notes/on-ramp
>
>
> # Paving the on-ramp
>
> Java is one of the most widely taught programming languages in the 
> world.  Tens
> of thousands of educators find that the imperative core of the 
> language combined
> with a straightforward standard library is a foundation that students can
> comfortably learn on.  Choosing Java gives educators many degrees of
> freedom:
> they can situate students in `jshell` or Notepad or a full-fledged 
> IDE; they can
> teach imperative, object-oriented, functional, or hybrid programming 
> styles; and
> they can easily find libraries to interact with external data and 
> services.
>
> No language is perfect, and one of the most common complaints about 
> Java is that
> it is "too verbose" or has "too much ceremony."  And unfortunately,
> Java imposes
> its heaviest ceremony on those first learning the language, who need and
> appreciate it the least.  The declaration of a class and the
> incantation of
> `public static void main` is pure mystery to a beginning programmer.
> While
> these incantations have principled origins and serve a useful 
> organizing purpose
> in larger programs, they have the effect of placing obstacles in the 
> path of
> _becoming_ Java programmers. Educators constantly remind us of the 
> litany of
> complexity that students have to confront on Day 1 of class -- when 
> they really
> just want to write their first program.
>
> As an amusing demonstration of this, in her JavaOne keynote appearance 
> in 2019,
> [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked 
> about when
> she learned to program in Java, and how her teacher performed a rap song
> to help students memorize `"public static void main"`.  Our hats are
> off to
> creative educators everywhere for this kind of dedication, but teachers
> shouldn't have to do this.
>
> Of course, advanced programmers complain about ceremony too. We will 
> never be
> able to satisfy programmers' insatiable appetite for typing fewer 
> keystrokes,
> and we shouldn't try, because the goal of programming is to write 
> programs that
> are easy to read and are clearly correct, not programs that were easy 
> to type.
> But we can try to better align the ceremony commensurate with the value it
> brings to a program -- and let simple programs be expressed more simply.
>
> ## Concept overload
>
> The classic "Hello World" program looks like this in Java:
>
> ```
> public class HelloWorld {
>     public static void main(String[] args) {
>         System.out.println("Hello World");
>     }
> }
> ```
>
> It may only be five lines, but those lines are packed with concepts 
> that are
> challenging to absorb without already having some programming 
> experience and
> familiarity with object orientation. Let's break down the concepts a 
> student
> confronts when writing their first Java program:
>
>   - **public** (on the class).  The `public` accessibility level is
> relevant
>     only when there is going to be cross-package access; in a simple
> "Hello
>     World" program, there is only one class, which lives in the
> unnamed package.
>     They haven't even written a one-line program yet; the notion of access
>     control -- keeping parts of a program from accessing other parts
> of it -- is
>     still way in their future.
>
>   - **class**.  Our student hasn't set out to write a _class_, or model a
>     complex system with objects; they want to write a _program_.  In
> Java, a
>     program is just a `main` method in some class, but at this point
> our student
>     still has no idea what a class is or why they want one.
>
>   - **Methods**.  Methods are of course a key concept in Java, but the
> mechanics
>     of methods -- parameters, return types, and invocation -- are still
>     unfamiliar, and the `main` method is invoked magically from the `java`
>     launcher rather than from explicit code.
>
>   - **public** (again).  Like the class, the `main` method has to be
> public, but
>     again this is only relevant when programs are large enough to require
>     packages to organize them.
>
>   - **static**.  The `main` method has to be static, and at this
> point, students
>     have no context for understanding what a static method is or why
> they want
>     one.  Worse, the early exposure to `static` methods will turn out
> to be a
>     bad habit that must be later unlearned.  Worse still, the fact
> that the
>     `main` method is `static` creates a seam between `main` and other
> methods;
>     either they must become `static` too, or the `main` method must
> trampoline
>     to some sort of "instance main" (more ceremony!)  And if we get
> this wrong,
>     we get the dreaded and mystifying `"cannot be referenced from a static
>     context"` error.
>
>   - **main**.  The name `main` has special meaning in a Java program,
> indicating
>     the starting point of a program, but this specialness hides behind
> being an
>     ordinary method name.  This may contribute to the sense of "so
> many magic
>     incantations."
>
>   - **String[]**.  The parameter to `main` is an array of strings,
> which are the
>     arguments that the `java` launcher collected from the command
> line.  But our
>     first program -- likely our first dozen -- will not use command-line
>     parameters. Requiring the `String[]` parameter is, at this point,
> a mistake
>     waiting to happen, and it will be a long time until this parameter
> makes
>     sense.  Worse, educators may be tempted to explain arrays at this
> point,
>     which further increases the time-to-first-program.
>
>   - **System.out.println**.  If you look closely at this incantation, each
>     element in the chain is a different thing -- `System` is a class
> (what's a
>     class again?), `out` is a static field (what's a field?), and
> `println` is
>     an instance method.  The only part the student cares about right
> now is
>     `println`; the rest of it is an incantation that they do not yet
> understand
>     in order to get at the behavior they want.
>
> That's a lot to explain to a student on the first day of class.
> There's a good
> chance that by now, class is over and we haven't written any programs 
> yet, or
> the teacher has said "don't worry what this means, you'll understand 
> it later"
> six or eight times.  Not only is this a lot of _syntactic_ things to
> absorb, but
> each of those things appeals to a different concept (class, method, 
> package,
> return value, parameter, array, static, public, etc) that the student 
> doesn't
> have a framework for understanding yet.  Each of these will have an
> important
> role to play in larger programs, but so far, they only contribute to "wow,
> programming is complicated."
>
> It won't be practical (or even desirable) to get _all_ of these 
> concepts out of
> the student's face on day 1, but we can do a lot -- and focus on the 
> ones that
> do the most to help beginners understand how programs are constructed.
>
> ## Goal: a smooth on-ramp
>
> As much as programmers like to rant about ceremony, the real goal here 
> is not
> mere ceremony reduction, but providing a graceful _on ramp_ to Java 
> programming.
> This on-ramp should be helpful to beginning programmers by requiring 
> only those
> concepts that a simple program needs.
>
> Not only should an on-ramp have a gradual slope and offer enough 
> acceleration
> distance to get onto the highway at the right speed, but its direction 
> must
> align with that of the highway.  When a programmer is ready to learn
> about more
> advanced concepts, they should not have to discard what they've 
> already learned,
> but instead easily see how the simple programs they've already written
> generalize to more complicated ones, and both the syntactic and conceptual
> transformation from "simple" to "full blown" program should be
> straightforward
> and unintrusive.  It is a definite non-goal to create a "simplified
> dialect of
> Java for students".
>
> We identify three simplifications that should aid both educators and 
> students in
> navigating the on-ramp to Java, and that should also be generally useful to
> simple
> programs beyond the classroom:
>
>  - A more tolerant launch protocol
>  - Unnamed classes
>  - Predefined static imports for the most critical methods and fields
>
> ## A more tolerant launch protocol
>
> The Java Language Specification has relatively little to say about how 
> Java
> "programs" get launched, other than saying that there is some way to 
> indicate
> which class is the initial class of a program (JLS 12.1.1) and that a 
> public
> static method called `main` whose sole argument is of type `String[]` 
> and whose
> return is `void` constitutes the entry point of the indicated class.
>
> We can eliminate much of the concept overload simply by relaxing the
> interactions between a Java program and the `java` launcher:
>
>  - Relax the requirement that the class, and `main` method, be
> public.  Public
>    accessibility is only relevant when access crosses packages; simple
> programs
>    live in the unnamed package, so cannot be accessed from any other
> package
>    anyway.  For a program whose main class is in the unnamed package,
> we can
>    drop the requirement that the class or its `main` method be public,
>    effectively treating the `java` launcher as if it too resided in
> the unnamed
>    package.
>
>  - Make the "args" parameter to `main` optional, by allowing the
> `java` launcher to
>    first look for a main method with the traditional `main(String[])`
>    signature, and then (if not found) for a main method with no arguments.
>
>  - Make the `static` modifier on `main` optional, by allowing the
> `java` launcher to
>    invoke an instance `main` method (of either signature) by
> instantiating an
>    instance using an accessible no-arg constructor and then invoking
> the `main`
>    method on it.
>
> This small set of changes to the launch protocol strikes out five of 
> the bullet
> points in the above list of concepts: public (twice), static, method 
> parameters,
> and `String[]`.
>
> At this point, our Hello World program is now:
>
> ```
> class HelloWorld {
>     void main() {
>         System.out.println("Hello World");
>     }
> }
> ```
>
> It's not any shorter by line count, but we've removed a lot of "horizontal
> noise" along with a number of concepts.  Students and educators will
> appreciate
> it, but advanced programmers are unlikely to be in any hurry to make these
> implicit elements explicit either.
>
> Additionally, the notion of an "instance main" has value well beyond 
> the first
> day.? Because excessive use of `static` is considered a code smell, many
> educators encourage the pattern of "all the static `main` method does is
> instantiate an instance and call an instance `main` method" anyway.
> Formalizing
> the "instance main" protocol reduces a layer of boilerplate in these 
> cases, and
> defers the point at which we have to explain what instance creation is 
> -- and
> what `static` is.  (Further, allowing the `main` method to be an
> instance method
> means that it could be inherited from a superclass, which is useful 
> for simple
> frameworks such as test runners or service frameworks.)
>
> ## Unnamed classes
>
> In a simple program, the `class` declaration often doesn't help 
> either, because
> other classes (if there are any) are not going to reference it by 
> name, and we
> don't extend a superclass or implement any interfaces.  If we say an
> "unnamed
> class" consists of member declarations without a class header, then 
> our Hello
> World program becomes:
>
> ```
> void main() {
>     System.out.println("Hello World");
> }
> ```
>
> Such source files can still have fields, methods, and even nested 
> classes, so
> that as a program evolves from a few statements to needing some 
> ancillary state
> or helper methods, these can be factored out of the `main` method 
> while still
> not yet requiring a full class declaration:
>
> ```
> String greeting() { return "Hello World"; }
>
> void main() {
>     System.out.println(greeting());
> }
> ```
>
> This is where treating `main` as an instance method really shines; the user has
> just declared two methods, and they can freely call each other. Students need
> not confront the confusing distinction between instance and static methods yet;
> indeed, if not forced to confront static members on day 1, it might be a while
> before they do have to learn this distinction. The fact that there is a
> receiver lurking in the background will come in handy later, but right now it is
> not bothering anybody.
>
> [JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be
> launched directly without a separate compilation step; this streamlined
> launcher pairs well with unnamed classes.
>
> ## Predefined static imports
>
> The most important classes, such as `String` and `Integer`, live in the
> `java.lang` package, which is automatically on-demand imported into all
> compilation units; this is why we do not have to `import java.lang.String` in
> every class. Static imports were not added until Java 5, but no corresponding
> facility for automatic on-demand import of common behavior was added at that
> time. Most programs, however, will want to do console IO, and Java forces us to
> do this in a roundabout way -- through the static `System.out` and `System.in`
> fields. Basic console input and output is a reasonable candidate for
> auto-static import, as one or both are needed by most simple programs. While
> these are currently instance methods accessed through static fields, we can
> easily create static methods for `println` and `readln` which are suitable for
> static import, and automatically import them. At which point our first program
> is now down to:
>
> ```
> void main() {
>     println("Hello World");
> }
> ```
>
> ## Putting this all together
>
> We've discussed several simplifications:
>
>  - Update the launcher protocol to make public, static, and arguments optional
>    for main methods, and for main methods to be instance methods (when a
>    no-argument constructor is available);
>  - Make the class wrapper for "main classes" optional (unnamed classes);
>  - Automatically static import methods like `println`
>
> which together whittle our long list of day-1 concepts down considerably. While
> this is still not as minimal as the minimal Python or Ruby program -- statements
> must still live in a method -- the goal here is not to win at "code golf". The
> goal is to ensure that concepts not needed by simple programs need not appear in
> those programs, while at the same time not encouraging habits that have to be
> unlearned as programs scale up.
>
> Each of these simplifications is individually small and unintrusive, and each is
> independent of the others. And each embodies a simple transformation that the
> author can easily manually reverse when it makes sense to do so: elided
> modifiers and `main` arguments can be added back, the class wrapper can be added
> back when the affordances of classes are needed (supertypes, constructors), and
> the full qualifier of static-import can be added back. And these reversals are
> independent of one another; they can be done in any combination or any order.
>
> This seems to meet the requirements of our on-ramp; we've eliminated most of the
> day-1 ceremony elements without introducing new concepts that need to be
> unlearned. The remaining concepts -- a method is a container for statements, and
> a program is a Java source file with a `main` method -- are easily understood in
> relation to their fully specified counterparts.
>
> ## Alternatives
>
> Obviously, we've lived with the status quo for 25+ years, so we could continue
> to do so. There were other alternatives explored as well; ultimately, each of
> these fell afoul of one of our goals.
>
> ### Can't we go further?
>
> Fans of "code golf" -- of which there are many -- are surely right now trying to
> figure out how to eliminate the last little bit, the `main` method, and allow
> statements to exist at the top-level of a program. We deliberately stopped
> short of this because it offers little value beyond the first few minutes, and
> even that small value quickly becomes something that needs to be unlearned.
>
> The fundamental problem behind allowing such "loose" statements is that
> variables can be declared inside both classes (fields) and methods (local
> variables), and they share the same syntactic production but not the same
> semantics. So it is unclear (to both compilers and humans) whether a "loose"
> variable would be a local or a field. If we tried to adopt some sort of simple
> heuristic to collapse this ambiguity (e.g., whether it precedes or follows the
> first statement), that may satisfy the compiler, but now simple refactorings
> might subtly change the meaning of the program, and we'd be replacing the
> explicit syntactic overhead of `void main()` with an invisible "line" in the
> program that subtly affects semantics, and a new subtle rule about the meaning
> of variable declarations that applies only to unnamed classes. This doesn't
> help students, nor is this particularly helpful for all but the most trivial
> programs. It quickly becomes a crutch to be discarded and unlearned, which
> falls afoul of our "on ramp" goals. Of all the concepts on our list, "methods"
> and "a program is specified by a main method" seem the ones that are most worth
> asking students to learn early.
>
> ### Why not "just" use `jshell`?
>
> While JShell is a great interactive tool, leaning too heavily on it as an
> on-ramp would fall afoul of our goals. A JShell session is not a program, but a
> sequence of code snippets. When we type declarations into `jshell`, they are
> viewed as implicitly static members of some unspecified class, with
> accessibility ignored completely, and statements execute in a context where
> all previous declarations are in scope. This is convenient for experimentation
> -- the primary goal of `jshell` -- but not such a great mental model for
> learning to write Java programs. Transforming a batch of working declarations
> in `jshell` to a real Java program would not be sufficiently simple or
> unintrusive, and would lead to a non-idiomatic style of code, because the
> straightforward translation would have us redeclaring each method, class, and
> variable declaration as `static`. Further, this is probably not the direction
> we want to go when we scale up from a handful of statements and declarations to
> a simple class -- we probably want to start using classes as classes, not just
> as containers for static members. JShell is a great tool for exploration and
> debugging, and we expect many educators will continue to incorporate it into
> their curriculum, but it is not the on-ramp programming model we are looking
> for.
>
> ### What about "always local"?
>
> One of the main tensions that `main` introduces is that most class members are
> not `static`, but the `main` method is -- and that forces programmers to
> confront the seam between static and non-static members. JShell answers this
> with "make everything static".
>
> Another approach would be to "make everything local" -- treat a simple program
> as being the "unwrapped" body of an implicit main method. We already allow
> variables and classes to be declared local to a method. We could add local
> methods (a useful feature in its own right) and relax some of the asymmetries
> around nesting (again, an attractive cleanup), and then treat a mix of
> declarations and statements without a class wrapper as the body of an invisible
> `main` method. This seems an attractive model as well -- at first.
>
> While the syntactic overhead of converting back to full-blown classes -- wrap
> the whole thing in a `main` method and a `class` declaration -- is far less
> intrusive than the transformation inherent in `jshell`, this is still not an
> ideal on-ramp. Local variables interact with local classes (and methods, when
> we have them) in a very different way than instance fields do with instance
> methods and inner classes: their scopes are different (no forward references),
> their initialization rules are different, and captured local variables must be
> effectively final. This is a subtly different programming model that would then
> have to be unlearned when scaling up to full classes. Further, the result of
> this wrapping -- where everything is local to the main method -- is also not
> "idiomatic Java". So while local methods may be an attractive feature, they are
> similarly not the on-ramp we are looking for.
>
>
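
To make the "always local" objection above concrete, here is a rough sketch of the two programming models side by side. The class and member names (`FieldsVersusLocals`, `greet`, `count`, `localStyle`) are made up for illustration and do not come from the draft; since Java has no local methods today, a lambda stands in for one.

```java
import java.util.function.Supplier;

public class FieldsVersusLocals {

    // Instance-member style: greet() may refer to `count` even though the
    // field is declared further down, and it may mutate the field freely.
    String greet() {
        count++;
        return "Hello #" + count;
    }

    int count = 0;

    // Local style: declarations must precede their use, and a local captured
    // by a lambda (or local class) must be effectively final. A mutable
    // counter therefore needs a workaround such as a one-element array.
    static String localStyle() {
        int[] count = {0};
        Supplier<String> greet = () -> {
            count[0]++;
            return "Hello #" + count[0];
        };
        greet.get();
        return greet.get();
    }

    public static void main(String[] args) {
        FieldsVersusLocals program = new FieldsVersusLocals();
        program.greet();
        System.out.println(program.greet());   // prints "Hello #2"
        System.out.println(localStyle());      // prints "Hello #2"
    }
}
```

Both halves compute the same thing, but the local version cannot simply declare `int count = 0;` and increment it from the lambda; that is the kind of rule that would have to be unlearned when moving to fields and instance methods.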

From brian.goetz at oracle.com  Fri Sep 30 19:07:09 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 30 Sep 2022 15:07:09 -0400
Subject: Paving the on-ramp: couplings
In-Reply-To: <cbef21b3-6d05-7b7a-b788-c07d372b2b6b@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <cbef21b3-6d05-7b7a-b788-c07d372b2b6b@oracle.com>
Message-ID: <cc93d475-6a83-96b6-ab4d-2eb271b8090a@oracle.com>


> Coupling: unnamed classes must live in the unnamed package.

The rationale for this is that the only thing you can do with an unnamed
class is run it from the command line, and it may well be the only class
in your program. If you're going to the effort of organizing into
packages and distributing a JAR, you're well outside the use case for an
unnamed class.

Another way to phrase this coupling is: distribution -> requires named 
classes.

> Coupling: public is only optional on main methods in the unnamed package.

This is largely a forced move, because giving the launcher additional
privileges to open classes in existing packages would allow running of
"main" methods that are not allowed today, which seems a compromise to
the accessibility model. Situating the launcher in the unnamed package
seems an entirely unsurprising thing, and again, people don't (or
shouldn't) distribute code in the unnamed package.

Another way to phrase this coupling is: distribution -> requires public 
entry points.

> Coupling: instance main requires a no-arg constructor.

Pretty hard to imagine getting around this one; seems intrinsic to the 
"instance main" feature.

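As a rough illustration of why the no-arg constructor seems intrinsic to instance main, here is a sketch of what a launcher might do reflectively. This is not the actual launcher protocol; `InstanceMainLauncherSketch`, `launch`, `Hello`, and `NeedsArg` are hypothetical names, and the sketch only handles a zero-argument `main`.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class InstanceMainLauncherSketch {

    // Hypothetical user program: a non-public, non-static, no-argument main.
    public static class Hello {
        public static int runs = 0;

        void main() {
            runs++;
            System.out.println("Hello World");
        }
    }

    // Hypothetical class whose only constructor takes an argument; the
    // launcher has no way to know what to pass, so it cannot create a receiver.
    public static class NeedsArg {
        public NeedsArg(String ignored) { }

        void main() { }
    }

    // Assumed launcher behavior: find main(), and if it is an instance
    // method, create the receiver via the no-arg constructor first.
    public static void launch(Class<?> mainClass) throws ReflectiveOperationException {
        Method main = mainClass.getDeclaredMethod("main");
        main.setAccessible(true);
        if (Modifier.isStatic(main.getModifiers())) {
            main.invoke(null);
            return;
        }
        // Throws NoSuchMethodException when there is no no-arg constructor.
        Constructor<?> ctor = mainClass.getDeclaredConstructor();
        ctor.setAccessible(true);
        Object receiver = ctor.newInstance();
        main.invoke(receiver);
    }

    public static void main(String[] args) throws Exception {
        launch(Hello.class);
        try {
            launch(NeedsArg.class);
        } catch (NoSuchMethodException e) {
            System.out.println("No no-arg constructor; cannot run instance main");
        }
    }
}
```

Without a no-arg constructor the launcher would have to invent constructor arguments, which is why the coupling looks hard to avoid.
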
> Coupling: unnamed classes don't get constructors.

This one could be decoupled, though I'm not sure it helps.

> Coupling: unnamed classes must have a main.

If we interpret unnamed as really unnamed, the only thing you can do 
with an unnamed class is run it via the launcher, so not having a main 
would be silly.


From forax at univ-mlv.fr  Fri Sep 30 19:41:31 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 30 Sep 2022 21:41:31 +0200 (CEST)
Subject: Paving the on-ramp: couplings
In-Reply-To: <cc93d475-6a83-96b6-ab4d-2eb271b8090a@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
 <cbef21b3-6d05-7b7a-b788-c07d372b2b6b@oracle.com>
 <cc93d475-6a83-96b6-ab4d-2eb271b8090a@oracle.com>
Message-ID: <446462979.16648698.1664566891769.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Friday, September 30, 2022 9:07:09 PM
> Subject: Re: Paving the on-ramp: couplings

>> Coupling: unnamed classes must live in the unnamed package.

> The rationale for this is that the only thing you can do with an unnamed class
> is run it from the command line, and it may well be the only class in your
> program. If you're going to the effort of organizing into packages and
> distributing a JAR, you're well outside the use case for an unnamed class.

> Another way to phrase this coupling is: distribution -> requires named classes.

>> Coupling: public is only optional on main methods in the unnamed package.

> This is largely a forced move, because giving the launcher additional privileges
> to open classes in existing packages would allow running of "main" methods that
> are not allowed today, which seems a compromise to the accessibility model.
> Situating the launcher in the unnamed package seems an entirely unsurprising
> thing, and again, people don't (or shouldn't) distribute code in the unnamed
> package.

> Another way to phrase this coupling is: distribution -> requires public entry
> points.

>> Coupling: instance main requires a no-arg constructor.

> Pretty hard to imagine getting around this one; seems intrinsic to the "instance
> main" feature.

Technically, you could store the array of arguments in a field, but fields should not be allowed; see below.

>> Coupling: unnamed classes don't get constructors.

> This one could be decoupled, though I'm not sure it helps.

>> Coupling: unnamed classes must have a main.

> If we interpret unnamed as really unnamed, the only thing you can do with an
> unnamed class is run it via the launcher, so not having a main would be silly.

Coupling: a nested class should not have the same name as the file name (minus ".java") of an unnamed class.

This avoids confusion between a top-level class and a nested class of an unnamed class (as proposed by Tagir).

Coupling: unnamed classes don't get fields. 

If there is no constructor, there is no way to properly initialize fields. And field syntax is too close to the local variable syntax when there is no enclosing class. 

Rémi