Paving the on-ramp

Brian Goetz brian.goetz at oracle.com
Thu Sep 29 18:20:42 UTC 2022


One thing that this forces us to confront, in some manner or other, are 
the rules about the relationship between the file name and the class 
name, which are spread out in a few different places.

The compiler issues an error when a top-level *public* class does not 
match the file it is in.  Order is irrelevant; the following are valid 
Foo.java files:

--
public class Foo { }
class Bar { }
--
class Foo { }
class Bar { }
--
class Bar { }
public class Foo { }
--

but the following are illegal for Foo.java:

--
public class Foo { }
public class Bar { }
--
public class Bar { }
--

The standard class loader implementation enforces that a class file 
X.class must contain a class called X (javap will warn about this too.)  
When compiling code in-memory, the javac "FileManager" abstraction still 
requires a "file name" for each source unit being compiled, even if 
there is no actual file.


The prototype implementation we have does infer a class name from the 
file name in the obvious way; we just don't enter it into the symbol 
table.  But, if you put this in Foo.java:

--
void main() { }
--

You'll get a Foo.class with class Foo in it, and if you put that class 
file on the class path, *other* classes can instantiate it by name.  So 
the prototype is currently in a half-here, half-there situation.  It is 
probably overkill to try to have some ACC_UNNAMED marking to prevent 
this.  So we might accept this odd state, or we might embrace it as Guy 
suggests, and go ahead and enter Foo in the symbol table, and even let 
people declare constructors.  That means that the name "unnamed class" 
would no longer be an accurate name (a shame, since its friends "unnamed 
module" and "unnamed package" have been saving it a seat.)  This is a 
workable direction, though not a forced move.


The other connection point with the file name is the one Tagir brought 
up, which is the effect of accidentally putting a method or field 
outside the braces.  If you have a class

--
class Foo { }
void x() { }
--

today, this is an error; under this proposal, this becomes a valid 
unnamed class with a *nested* class Foo, which may not be what was 
meant.  We can reduce the possibility of this by issuing a warning/error 
if an unnamed class has a "top-level" nested class whose name matches 
the file.  This seems reasonably consistent with the existing rules 
constraining file names and class names.  If we combined this with the 
previous move, this becomes "An unnamed class cannot have a top-level 
nested class of the same name".



On 9/28/2022 1:57 PM, Brian Goetz wrote:
> At various points, we've explored the question of which program 
> elements are most and least helpful for students first learning Java.  
> After considering a number of alternatives over the years, I have a 
> simple proposal for smoothing the "on ramp" to Java programming, while 
> not creating new things to unlearn.
>
> Markdown source is below, HTML will appear soon at:
>
> https://openjdk.org/projects/amber/design-notes/on-ramp
>
>
> # Paving the on-ramp
>
> Java is one of the most widely taught programming languages in the 
> world.  Tens
> of thousands of educators find that the imperative core of the 
> language combined
> with a straightforward standard library is a foundation that students can
> comfortably learn on.  Choosing Java gives educators many degrees of 
> freedom:
> they can situate students in `jshell` or Notepad or a full-fledged 
> IDE; they can
> teach imperative, object-oriented, functional, or hybrid programming 
> styles; and
> they can easily find libraries to interact with external data and 
> services.
>
> No language is perfect, and one of the most common complaints about 
> Java is that
> it is "too verbose" or has "too much ceremony."  And unfortunately, 
> Java imposes
> its heaviest ceremony on those first learning the language, who need and
> appreciate it the least.  The declaration of a class and the 
> incantation of
> `public static void main` is pure mystery to a beginning programmer.  
> While
> these incantations have principled origins and serve a useful 
> organizing purpose
> in larger programs, they have the effect of placing obstacles in the 
> path of
> _becoming_ Java programmers. Educators constantly remind us of the 
> litany of
> complexity that students have to confront on Day 1 of class -- when 
> they really
> just want to write their first program.
>
> As an amusing demonstration of this, in her JavaOne keynote appearance 
> in 2019,
> [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked 
> about when
> she learned to program in Java, and how her teacher performed a rap song
> to help students memorize `"public static void main"`.  Our hats are 
> off to
> creative educators everywhere for this kind of dedication, but teachers
> shouldn't have to do this.
>
> Of course, advanced programmers complain about ceremony too. We will 
> never be
> able to satisfy programmers' insatiable appetite for typing fewer 
> keystrokes,
> and we shouldn't try, because the goal of programming is to write 
> programs that
> are easy to read and are clearly correct, not programs that were easy 
> to type.
> But we can try to better align the ceremony commensurate with the value it
> brings to a program -- and let simple programs be expressed more simply.
>
> ## Concept overload
>
> The classic "Hello World" program looks like this in Java:
>
> ```
> public class HelloWorld {
>     public static void main(String[] args) {
>         System.out.println("Hello World");
>     }
> }
> ```
>
> It may only be five lines, but those lines are packed with concepts 
> that are
> challenging to absorb without already having some programming 
> experience and
> familiarity with object orientation. Let's break down the concepts a 
> student
> confronts when writing their first Java program:
>
>   - **public** (on the class).  The `public` accessibility level is 
> relevant
>     only when there is going to be cross-package access; in a simple 
> "Hello
>     World" program, there is only one class, which lives in the 
> unnamed package.
>     They haven't even written a one-line program yet; the notion of access
>     control -- keeping parts of a program from accessing other parts 
> of it -- is
>     still way in their future.
>
>   - **class**.  Our student hasn't set out to write a _class_, or model a
>     complex system with objects; they want to write a _program_.  In 
> Java, a
>     program is just a `main` method in some class, but at this point 
> our student
>     still has no idea what a class is or why they want one.
>
>   - **Methods**.  Methods are of course a key concept in Java, but the 
> mechanics
>     of methods -- parameters, return types, and invocation -- are still
>     unfamiliar, and the `main` method is invoked magically from the `java`
>     launcher rather than from explicit code.
>
>   - **public** (again).  Like the class, the `main` method has to be 
> public, but
>     again this is only relevant when programs are large enough to require
>     packages to organize them.
>
>   - **static**.  The `main` method has to be static, and at this 
> point, students
>     have no context for understanding what a static method is or why 
> they want
>     one.  Worse, the early exposure to `static` methods will turn out 
> to be a
>     bad habit that must be later unlearned.  Worse still, the fact 
> that the
>     `main` method is `static` creates a seam between `main` and other 
> methods;
>     either they must become `static` too, or the `main` method must 
> trampoline
>     to some sort of "instance main" (more ceremony!)  And if we get 
> this wrong,
>     we get the dreaded and mystifying `"cannot be referenced from a static
>     context"` error.
>
>   - **main**.  The name `main` has special meaning in a Java program, 
> indicating
>     the starting point of a program, but this specialness hides behind 
> being an
>     ordinary method name.  This may contribute to the sense of "so 
> many magic
>     incantations."
>
>   - **String[]**.  The parameter to `main` is an array of strings, 
> which are the
>     arguments that the `java` launcher collected from the command 
> line.  But our
>     first program -- likely our first dozen -- will not use command-line
>     parameters. Requiring the `String[]` parameter is, at this point, 
> a mistake
>     waiting to happen, and it will be a long time until this parameter 
> makes
>     sense.  Worse, educators may be tempted to explain arrays at this 
> point,
>     which further increases the time-to-first-program.
>
>   - **System.out.println**.  If you look closely at this incantation, each
>     element in the chain is a different thing -- `System` is a class 
> (what's a
>     class again?), `out` is a static field (what's a field?), and 
> `println` is
>     an instance method.  The only part the student cares about right 
> now is
>     `println`; the rest of it is an incantation that they do not yet 
> understand
>     in order to get at the behavior they want.
>
> That's a lot to explain to a student on the first day of class.  
> There's a good
> chance that by now, class is over and we haven't written any programs 
> yet, or
> the teacher has said "don't worry what this means, you'll understand 
> it later"
> six or eight times.  Not only is this a lot of _syntactic_ things to 
> absorb, but
> each of those things appeals to a different concept (class, method, 
> package,
> return value, parameter, array, static, public, etc) that the student 
> doesn't
> have a framework for understanding yet.  Each of these will have an 
> important
> role to play in larger programs, but so far, they only contribute to "wow,
> programming is complicated."
>
> It won't be practical (or even desirable) to get _all_ of these 
> concepts out of
> the student's face on day 1, but we can do a lot -- and focus on the 
> ones that
> do the most to help beginners understand how programs are constructed.
>
> ## Goal: a smooth on-ramp
>
> As much as programmers like to rant about ceremony, the real goal here 
> is not
> mere ceremony reduction, but providing a graceful _on ramp_ to Java 
> programming.
> This on-ramp should be helpful to beginning programmers by requiring 
> only those
> concepts that a simple program needs.
>
> Not only should an on-ramp have a gradual slope and offer enough 
> acceleration
> distance to get onto the highway at the right speed, but its direction 
> must
> align with that of the highway.  When a programmer is ready to learn 
> about more
> advanced concepts, they should not have to discard what they've 
> already learned,
> but instead easily see how the simple programs they've already written
> generalize to more complicated ones, and both the syntatic and conceptual
> transformation from "simple" to "full blown" program should be 
> straightforward
> and unintrusive.  It is a definite non-goal to create a "simplified 
> dialect of
> Java for students".
>
> We identify three simplifications that should aid both educators and 
> students in
> navigating the on-ramp to Java, as well as being generally useful to 
> simple
> programs beyond the classroom as well:
>
>  - A more tolerant launch protocol
>  - Unnamed classes
>  - Predefined static imports for the most critical methods and fields
>
> ## A more tolerant launch protocol
>
> The Java Language Specification has relatively little to say about how 
> Java
> "programs" get launched, other than saying that there is some way to 
> indicate
> which class is the initial class of a program (JLS 12.1.1) and that a 
> public
> static method called `main` whose sole argument is of type `String[]` 
> and whose
> return is `void` constitutes the entry point of the indicated class.
>
> We can eliminate much of the concept overload simply by relaxing the
> interactions between a Java program and the `java` launcher:
>
>  - Relax the requirement that the class, and `main` method, be 
> public.  Public
>    accessibility is only relevant when access crosses packages; simple 
> programs
>    live in the unnamed package, so cannot be accessed from any other 
> package
>    anyway.  For a program whose main class is in the unnamed package, 
> we can
>    drop the requirement that the class or its `main` method be public,
>    effectively treating the `java` launcher as if it too resided in 
> the unnamed
>    package.
>
>  - Make the "args" parameter to `main` optional, by allowing the 
> `java` launcher to
>    first look for a main method with the traditional `main(String[])`
>    signature, and then (if not found) for a main method with no arguments.
>
>  - Make the `static` modifier on `main` optional, by allowing the 
> `java` launcher to
>    invoke an instance `main` method (of either signature) by 
> instantiating an
>    instance using an accessible no-arg constructor and then invoking 
> the `main`
>    method on it.
>
> This small set of changes to the launch protocol strikes out five of 
> the bullet
> points in the above list of concepts: public (twice), static, method 
> parameters,
> and `String[]`.
>
> At this point, our Hello World program is now:
>
> ```
> class HelloWorld {
>     void main() {
>         System.out.println("Hello World");
>     }
> }
> ```
>
> It's not any shorter by line count, but we've removed a lot of "horizontal
> noise" along with a number of concepts.  Students and educators will 
> appreciate
> it, but advanced programmers are unlikely to be in any hurry to make these
> implicit elements explicit either.
>
> Additionally, the notion of an "instance main" has value well beyond 
> the first
> day.  Because excessive use of `static` is considered a code smell, many
> educators encourage the pattern of "all the static `main` method does is
> instantiate an instance and call an instance `main` method" anyway.  
> Formalizing
> the "instance main" protocol reduces a layer of boilerplate in these 
> cases, and
> defers the point at which we have to explain what instance creation is 
> -- and
> what `static` is.  (Further, allowing the `main` method to be an 
> instance method
> means that it could be inherited from a superclass, which is useful 
> for simple
> frameworks such as test runners or service frameworks.)
>
> ## Unnamed classes
>
> In a simple program, the `class` declaration often doesn't help 
> either, because
> other classes (if there are any) are not going to reference it by 
> name, and we
> don't extend a superclass or implement any interfaces.  If we say an 
> "unnamed
> class" consists of member declarations without a class header, then 
> our Hello
> World program becomes:
>
> ```
> void main() {
>     System.out.println("Hello World");
> }
> ```
>
> Such source files can still have fields, methods, and even nested 
> classes, so
> that as a program evolves from a few statements to needing some 
> ancillary state
> or helper methods, these can be factored out of the `main` method 
> while still
> not yet requiring a full class declaration:
>
> ```
> String greeting() { return "Hello World"; }
>
> void main() {
>     System.out.println(greeting());
> }
> ```
>
> This is where treating `main` as an instance method really shines; the 
> user has
> just declared two methods, and they can freely call each other.  
> Students need
> not confront the confusing distinction between instance and static 
> methods yet;
> indeed, if not forced to confront static members on day 1, it might be 
> a while
> before they do have to learn this distinction.  The fact that there is a
> receiver lurking in the background will come in handy later, but right 
> now is
> not bothering anybody.
>
> [JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be
> launched directly without compilation; this streamlined launcher pairs 
> well with
> unnamed classes.
>
> ## Predefined static imports
>
> The most important classes, such as `String` and `Integer`, live in the
> `java.lang` package, which is automatically on-demand imported into all
> compilation units; this is why we do not have to `import 
> java.lang.String` in
> every class.  Static imports were not added until Java 5, but no 
> corresponding
> facility for automatic on-demand import of common behavior was added 
> at that
> time.  Most programs, however, will want to do console IO, and Java 
> forces us to
> do this in a roundabout way -- through the static `System.out` and 
> `System.in`
> fields.  Basic console input and output is a reasonable candidate for
> auto-static import, as one or both are needed by most simple 
> programs.  While
> these are currently instance methods accessed through static fields, 
> we can
> easily create static methods for `println` and `readln` which are 
> suitable for
> static import, and automatically import them.  At which point our 
> first program
> is now down to:
>
> ```
> void main() {
>     println("Hello World");
> }
> ```
>
> ## Putting this all together
>
> We've discussed several simplifications:
>
>  - Update the launcher protocol to make public, static, and arguments 
> optional
>    for main methods, and for main methods to be instance methods (when a
>    no-argument constructor is available);
>  - Make the class wrapper for "main classes" optional (unnamed classes);
>  - Automatically static import methods like `println`
>
> which together whittle our long list of day-1 concepts down 
> considerably.  While
> this is still not as minimal as the minimal Python or Ruby program -- 
> statements
> must still live in a method -- the goal here is not to win at "code 
> golf".  The
> goal is to ensure that concepts not needed by simple programs need not 
> appear in
> those programs, while at the same time not encouraging habits that 
> have to be
> unlearned as programs scale up.
>
> Each of these simplifications is individually small and unintrusive, 
> and each is
> independent of the others.  And each embodies a simple transformation 
> that the
> author can easily manually reverse when it makes sense to do so: elided
> modifiers and `main` arguments can be added back, the class wrapper 
> can be added
> back when the affordances of classes are needed (supertypes, 
> constructors), and
> the full qualifier of static-import can be added back.  And these 
> reversals are
> independent of one another; they can done in any combination or any order.
>
> This seems to meet the requirements of our on-ramp; we've eliminated 
> most of the
> day-1 ceremony elements without introducing new concepts that need to be
> unlearned. The remaining concepts -- a method is a container for 
> statements, and
> a program is a Java source file with a `main` method -- are easily 
> understood in
> relation to their fully specified counterparts.
>
> ## Alternatives
>
> Obviously, we've lived with the status quo for 25+ years, so we could 
> continue
> to do so.  There were other alternatives explored as well; ultimately, 
> each of
> these fell afoul of one of our goals.
>
> ### Can't we go further?
>
> Fans of "code golf" -- of which there are many -- are surely right now 
> trying to
> figure out how to eliminate the last little bit, the `main` method, 
> and allow
> statements to exist at the top-level of a program.  We deliberately 
> stopped
> short of this because it offers little value beyond the first few 
> minutes, and
> even that small value quickly becomes something that needs to be 
> unlearned.
>
> The fundamental problem behind allowing such "loose" statements is that
> variables can be declared inside both classes (fields) and methods (local
> variables), and they share the same syntactic production but not the same
> semantics.  So it is unclear (to both compilers and humans) whether a 
> "loose"
> variable would be a local or a field.  If we tried to adopt some sort 
> of simple
> heuristic to collapse this ambiguity (e.g., whether it precedes or 
> follows the
> first statement), that may satisfy the compiler, but now simple 
> refactorings
> might subtly change the meaning of the program, and we'd be replacing the
> explicit syntactic overhead of `void main()` with an invisible "line" 
> in the
> program that subtly affects semantics, and a new subtle rule about the 
> meaning
> of variable declarations that applies only to unnamed classes.  This 
> doesn't
> help students, nor is this particularly helpful for all but the most 
> trivial
> programs.  It quickly becomes a crutch to be discarded and unlearned, 
> which
> falls afoul of our "on ramp" goals.  Of all the concepts on our list, 
> "methods"
> and "a program is specified by a main method" seem the ones that are 
> most worth
> asking students to learn early.
>
> ### Why not "just" use `jshell`?
>
> While JShell is a great interactive tool, leaning too heavily on it as 
> an onramp
> would fall afoul of our goals.  A JShell session is not a program, but a
> sequence of code snippets.  When we type declarations into `jshell`, 
> they are
> viewed as implicitly static members of some unspecified class, with
> accessibility is ignored completely, and statements execute in a 
> context where
> all previous declarations are in scope.  This is convenient for 
> experimentation
> -- the primary goal of `jshell` -- but not such a great mental model for
> learning to write Java programs.  Transforming a batch of working 
> declarations
> in `jshell` to a real Java program would not be sufficiently simple or
> unintrusive, and would lead to a non-idiomatic style of code, because the
> straightforward translation would have us redeclaring each method, 
> class, and
> variable declaration as `static`.  Further, this is probably not the 
> direction
> we want to go when we scale up from a handful of statements and 
> declarations to
> a simple class -- we probably want to start using classes as classes, 
> not just
> as containers for static members. JShell is a great tool for 
> exploration and
> debugging, and we expect many educators will continue to incorporate 
> it into
> their curriculum, but is not the on-ramp programming model we are 
> looking for.
>
> ### What about "always local"?
>
> One of the main tensions that `main` introduces is that most class 
> members are
> not `static`, but the `main` method is -- and that forces programmers to
> confront the seam between static and non-static members. JShell 
> answers this
> with "make everything static".
>
> Another approach would be to "make everything local" -- treat a simple 
> program
> as being the "unwrapped" body of an implicit main method.  We already 
> allow
> variables and classes to be declared local to a method.  We could add 
> local
> methods (a useful feature in its own right) and relax some of the 
> asymmetries
> around nesting (again, an attractive cleanup), and then treat a mix of
> declarations and statements without a class wrapper as the body of an 
> invisible
> `main` method. This seems an attractive model as well -- at first.
>
> While the syntactic overhead of converting back to full-blown classes 
> -- wrap
> the whole thing in a `main` method and a `class` declaration -- is far 
> less
> intrusive than the transformation inherent in `jshell`, this is still 
> not an
> ideal on-ramp.  Local variables interact with local classes (and 
> methods, when
> we have them) in a very different way than instance fields do with 
> instance
> methods and inner classes: their scopes are different (no forward 
> references),
> their initialization rules are different, and captured local variables 
> must be
> effectively final.  This is a subtly different programming model that 
> would then
> have to be unlearned when scaling up to full classes. Further, the 
> result of
> this wrapping -- where everything is local to the main method -- is 
> also not
> "idiomatic Java".  So while local methods may be an attractive 
> feature, they are
> similarly not the on-ramp we are looking for.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-spec-observers/attachments/20220929/18deff5e/attachment-0001.htm>


More information about the amber-spec-observers mailing list