Fwd: Java on-ramp - Class declaration for Entrypoints

Thu Oct 6 16:45:09 UTC 2022

This was received today on the -comments list; Stephen has an alternate 
proposal for the on-ramp, making "entry points" (programs) a first class 
concept.

This proposal is really three proposals in one, and they are 
pretty separable:

  - A syntax-heavy proposal for "entrypoints";
  - An alternate way to address console IO, with new classes for 
SystemOut rather than new static functionality;
  - A further evolution on the "instance main" notion to use an 
interface to identify the "instance main" entry point, rather than 
extending the existing structural recognition.

I will mostly speak to the first of these.  The second is completely 
orthogonal (it is not essential to this proposal, and could be taken 
separately without dragging in anything else). This is largely an API 
design question, and can be discussed when we get to how to best expose 
new functionality for console IO.  The third has already come up on the 
EG list: "instance main" is an opportunity to switch to 
something more nominal.  I'll come back to this later in this mail.

To the main part of this proposal, I explored several similar points in 
my internal exploration (including also a method-centric version of the 
same basic split), prior to publishing my "On Ramp" document.  (Spoiler: 
ultimately I didn't feel these approaches carried their weight.)

In its favor, the proposal is principled: of all the concepts that 
we ask users to confront in Hello World, most have syntactic 
representations, yet perhaps the most important one for the situation at 
hand -- a *program* -- has no syntactic representation in the program 
at all.  Java takes the worse-is-better approach here of saying "a 
program is just a class with a main method."  This is a pragmatic 
tradeoff, but it means that in the big list of concepts that users are 
confronted with, we are not able to whittle it down to zero -- users 
still have to confront methods (and their affordances such as return 
types and parameter lists.)  There's still a conceptual gap here, 
because these affordances largely support invoking methods (which the 
user has not yet learned about), but somehow the "main" method gets 
called magically by the launcher.  I spent some time exploring whether 
we could remove these last things from the list, but concluded that the 
juice was not worth the squeeze.

Stephen approaches this by a classic "split" move -- split methods into 
two kinds: "main" methods and "regular" ones, giving special syntax to 
the new thing, and then adopting a similar "unnamed classes" approach 
for classes with a main method.  This splitting is motivated by several 
forces: that a "program" is an important concept to represent, and that 
the current treatment of "main" in the language is frustratingly 
structural in a language where nearly everything else is nominal.

I sympathize with the concern that "main-ness" is too magic; when I 
learned Java 25+ years ago, among my first reactions was "but where's 
the program".  That a Java program is merely a soup of classes loosely 
contained by tooling switches or environment variables, and a program 
could have as many "main" entry points as it has classes, does take some 
getting used to (though less now than it did then, when linkers roamed 
the earth.)  And I sympathize with the desire to fix the mistakes of the 
past.  But I think that the "entrypoint" proposal strikes the wrong 
balance.  It has way too much new surface syntax, while at the same 
time, leaving the existing static main protocol flapping in the breeze.  
Given where we are, I don't see it as carrying its weight.  Stephen 
thinks it is "better bang for buck", but I disagree, not necessarily 
because of the bang part, but the buck part -- it is dramatically more 
expensive in all the dimensions (spec, syntax, user perception, 
implementation cost, etc).

The two sub-proposals are more compelling, and both are sure to play 
into the discussions to come in any case.  New classes vs new static 
entry points is a totally valid API design discussion to have.  
Similarly, with the addition of "instance main", it is a totally 
sensible question to ask whether we want to extend the existing 
structural recognition of main methods to the new thing, or break from 
that and use interfaces.  Both approaches have pros and cons, and we can 
discuss those.
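
To make the structural/nominal distinction concrete, here is a rough sketch in present-day Java. The `InstanceMain` interface and the helper names are invented for this illustration and come from neither proposal; the point is only that structural recognition inspects the shape of a method, while nominal recognition asks whether the class has declared its intent.

```java
// Hypothetical marker for the "nominal" approach; not a real JDK type.
interface InstanceMain {
    void main();
}

class Recognition {
    // Structural: a class qualifies if it happens to declare a no-arg,
    // non-static void method named "main", whatever types it implements.
    static boolean hasStructuralMain(Class<?> cls) {
        try {
            var m = cls.getDeclaredMethod("main");
            return m.getReturnType() == void.class
                    && !java.lang.reflect.Modifier.isStatic(m.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Nominal: a class qualifies only if it says so by implementing the interface.
    static boolean hasNominalMain(Class<?> cls) {
        return InstanceMain.class.isAssignableFrom(cls);
    }
}

// Recognized structurally only: it has the right shape but declares nothing.
class ByShape {
    void main() { System.out.println("found by shape"); }
}

// Recognized both ways: it implements the interface and so also has the shape.
class ByName implements InstanceMain {
    @Override
    public void main() { System.out.println("found by name"); }
}
```

With the nominal approach a class like `ByShape` would not be launchable until it implements the interface, which is the trade-off under discussion.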

-------- Forwarded Message --------
Subject: 	Java on-ramp - Class declaration for Entrypoints
Date: 	Thu, 6 Oct 2022 12:27:06 +0100
From: 	Stephen Colebourne <scolebourne at joda.org>
To: 	amber-spec-comments at openjdk.java.net

Hi all,
I wrote up my response to the on-ramp discussion as a blog post having
given it a few days thought:

https://blog.joda.org/2022/10/fully-defined-entrypoints.html

I think what I've come up with is a lot more powerful, more useful to
experienced developers, and more consistent during the learning
process. It is, of course, a bigger change.

This email to amber-spec-comments reads the idea into the legal-world
of OpenJDK.

thanks
Stephen

<h4>Entrypoints</h4>

When a Java program starts, some kind of class file needs to be run.
It could be a normal <code>class</code>, but that isn't ideal as we
don't really want static/instance variables, subclasses, parent
interfaces, access control etc.
One suggestion was for it to be a normal <code>interface</code>, but
that isn't ideal as we don't want to mark the methods as
<code>default</code> or allow <code>abstract</code> methods.


I'd like to propose that what Java needs is a new kind of class
declaration for entrypoints.


I don't think this is overly radical. We already have two alternate
class declarations - <code>record</code> and <code>enum</code>. They
have alternate syntax that compiles to a class file without being
explicitly a <code>class</code> in source code.
What we need here is a new kind - <code>entrypoint</code> - that
compiles to a class file but has different syntax rules, just like
<code>record</code> and <code>enum</code> do.


I believe this is a fundamentally better approach than the minor
tweaks in the official proposal, because it will be useful to
developers of all skill levels, including framework authors.
ie. it has a much better "bang for buck".


The simplest entrypoint would be:

<pre>
// MyMain.java
entrypoint {
SystemOut.println("Hello World");
}
</pre>

In the source code we have various things:

<ul>
<li>Inferred class name from file name, the class file is
<code>MyMain$entrypoint</code></li>
<li>Top-level code, no need to discuss methods initially</li>
<li>No access to parameters</li>
<li>New classes <code>SystemOut</code>, <code>SystemIn</code> and
<code>SystemErr</code></li>
<li>No constructor; as a new kind of class declaration, it doesn't need 
one</li>
</ul>

The classes like <code>SystemOut</code> may seem like a small
change, but it would have been much simpler for me from 25 years ago
to understand.
I don't favour more static imports for them (either here or more
generally), as I think <code>SystemOut.println("Hello World")</code>
is simple enough.
More static imports would be too magical in my opinion.
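
As a rough illustration, such a class could be written in today's Java as a thin wrapper. The post names `SystemOut` but does not define its API beyond `println`, so everything else below (including the demo class) is an assumption.

```java
// Hypothetical sketch: SystemOut is named in the proposal but not specified.
// Here it is assumed to be a thin static wrapper over System.out.
final class SystemOut {
    private SystemOut() {}  // no instances; static helpers only

    static void println(Object value) {
        System.out.println(value);
    }
}

class SystemOutDemo {
    public static void main(String[] args) {
        SystemOut.println("Hello World");
    }
}
```

The call site reads as one class name plus one method, with no static field access for the learner to decode.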


The next steps when learning Java are for the instructor to expand
the entrypoint.

<ul>
<li>Add a named method (always private, any name although
<code>main()</code> would be common)</li>
<li>Add parameters to the method (maybe String[], maybe String...)</li>
<li>Add return type to the method (void is default return type)</li>
<li>Group code into a block</li>
<li>Add additional methods (always private)</li>
</ul>

Here are some valid examples. Note that instructors can choose the
order to explain each feature:

<pre>
entrypoint {
SystemOut.println("Hello World");
}
entrypoint main() {
SystemOut.println("Hello World");
}
entrypoint main(String[] args) {
SystemOut.println("Hello World");
}
entrypoint {
main() {
SystemOut.println("Hello World");
}
}
entrypoint {
void main(String[] args) {
SystemOut.println("Hello World");
}
}
entrypoint {
main(String[] args) {
output("Hello World");
}
output(String text) {
SystemOut.println(text);
}
}
</pre>

Note that there are never any static methods, static variables,
instance variables or access control. If you need any of that you need
a class.
Thus we have proper separation of concerns for the entrypoint of
systems, which would be Best Practice even for experienced developers.

<h4>Progressing to classes</h4>

During initial learning, the entrypoint class declaration and normal
class declaration would be kept in separate files:

<pre>
// MyMain.java
entrypoint {
SystemOut.println(new Person().name());
}
// Person.java
public class Person {
String name() {
return "Bob";
}
}
</pre>

However, at some point the instructor would embed an entrypoint (of
any valid syntax) in a normal <code>class</code>.

<pre>
public class Person {
entrypoint {
SystemOut.println(new Person().name());
}
String name() {
return "Bob";
}
}
</pre>

We discover that an <code>entrypoint</code> is normally wrapped in a
<code>class</code> which then offers the ability to add
static/instance variables and access control.


Note that since all methods on the entrypoint are private and the
entrypoint is anonymous, there is no way for the rest of the code to
invoke it without hackery.
Note also that the entrypoint does not get any special favours like
an instance of the outer class, thus there is no issue with no-arg
constructors - if you want an instance you have to use
<code>new</code> (the alternative is unhelpful magic that harms
learnability IMO).


Finally, we see that our old-style static main method is revealed to
be just a normal entrypoint:

<pre>
public class Person {
entrypoint public static void main(String[] args) {
SystemOut.println(new Person().name());
}
String name() {
return "Bob";
}
}
</pre>

ie. when a method is declared as <code>public static void
main(String[])</code> the keyword <code>entrypoint</code> is
implicitly added.


What experienced developers gain from this is a clearer way to
express what the entrypoint actually is, and more power in expressing
whether they want the command line arguments or not.

<h4>Full-featured entrypoints</h4>
<p>
Everything above is what most Java developers would need to know.
But an entrypoint would actually be a whole lot more powerful.
</p>
<p>
The basic entrypoint would compile to a class something like this:
</p>
<pre>
// MyMain.java
entrypoint startHere(String[] args) {
SystemOut.println("Hello World");
}
// MyMain$entrypoint.class
public final class MyMain$entrypoint implements java.lang.Entrypoint {
@Override
public void main(Runtime runtime) {
runtime.execute(() -> startHere(runtime.args()));
}
private void startHere(String[] args) {
SystemOut.println("Hello World");
}
}
</pre>
<p>
Note that it is <code>final</code> and methods are <code>private</code>.
</p>
<p>
The <code>Entrypoint</code> interface would be:
</p>
<pre>
public interface java.lang.Entrypoint {
/**
* Invoked by the JVM to launch the program.
* When the method completes, the JVM terminates.
*/
public abstract void main(Runtime runtime);
}
</pre>
<p>
The <code>Runtime.execute</code> method would be something like:
</p>
<pre>
public void execute(ThrowableRunnable runnable) {
try {
runnable.run();
System.exit(0);
} catch (Throwable ex) {
ex.printStackTrace();
System.exit(1);
}
}
</pre>
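
The <code>execute</code> method above relies on a <code>ThrowableRunnable</code> type that the post does not define. A minimal runnable sketch, assuming it is simply a <code>Runnable</code> whose method may throw, and returning the exit status instead of calling <code>System.exit</code> so the logic can be exercised directly:

```java
// Hypothetical functional interface: like Runnable, but run() may throw.
@FunctionalInterface
interface ThrowableRunnable {
    void run() throws Throwable;
}

class ExecuteSketch {
    // Same control flow as the execute method above, except that the status
    // is returned to the caller rather than passed to System.exit.
    static int execute(ThrowableRunnable runnable) {
        try {
            runnable.run();
            return 0;
        } catch (Throwable ex) {
            ex.printStackTrace();
            return 1;
        }
    }

    public static void main(String[] args) {
        // A real launcher would end the process with the resulting status.
        System.exit(execute(() -> System.out.println("Hello World")));
    }
}
```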
<p>
The JVM would do the following:
</p>
<ul>
<li>Load the class file specified on the command line</li>
<li>If it implements <code>java.lang.Entrypoint</code> call the
no-args constructor and invoke it</li>
<li>Else look for a legacy <code>public static void
main(String[])</code>, and invoke that</li>
</ul>
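
The steps above can be approximated with reflection in present-day Java. This is only a sketch: the proposed <code>java.lang.Entrypoint</code> and the <code>Runtime</code> object passed to it do not exist in the JDK, so the code declares stand-ins named <code>EntrypointSketch</code> and <code>LaunchContext</code>, and the two sample applications are likewise invented.

```java
// Hypothetical stand-ins for the proposed java.lang.Entrypoint and the
// runtime object it receives; neither exists in the JDK.
interface EntrypointSketch {
    void main(LaunchContext runtime);
}

final class LaunchContext {
    private final String[] args;
    LaunchContext(String[] args) { this.args = args; }
    String[] args() { return args; }
}

class LauncherSketch {
    // Step 1: load the class named on the command line.
    static void launch(String className, String[] args) throws Exception {
        launch(Class.forName(className), args);
    }

    static void launch(Class<?> cls, String[] args) throws Exception {
        if (EntrypointSketch.class.isAssignableFrom(cls)) {
            // Step 2: call the no-args constructor and invoke the entry point.
            var ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            ((EntrypointSketch) ctor.newInstance()).main(new LaunchContext(args));
        } else {
            // Step 3: fall back to a legacy public static void main(String[]).
            var legacy = cls.getMethod("main", String[].class);
            legacy.invoke(null, (Object) args);
        }
    }
}

class NewStyleApp implements EntrypointSketch {
    @Override
    public void main(LaunchContext runtime) {
        System.out.println("entrypoint " + runtime.args()[0]);
    }
}

class LegacyApp {
    public static void main(String[] args) {
        System.out.println("legacy " + args[0]);
    }
}
```

Calling `LauncherSketch.launch(NewStyleApp.class, args)` takes the interface path, while `LauncherSketch.launch(LegacyApp.class, args)` falls through to the static method.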
<p>
Note that <code>java.lang.Entrypoint</code> is a <b>normal interface
that can be implemented by anyone and do anything</b>!
</p>
<p>
This last point is critical to enhancing the bang-for-buck. I was
intrigued by things like <a
href="https://www.azul.com/blog/superfast-application-startup-java-on-crac/">Azul
CRaC</a> which wants to own the whole lifecycle of the JVM run.
Wouldn't that be more powerful if they could control the whole
lifecycle through <code>Entrypoint</code>?
Another possible use is to reset the state when an application has
finished, allowing the same JVM to be reused - a bit like
Function-as-a-Service providers or build system daemons do.
(I suspect it may be possible to enhance the entrypoint concept to
control the shutdown hooks and to catch things like
<code>System.exit</code> but that is beyond the scope of this blog.)
For example, here is a theoretical application framework entrypoint:
</p>
<pre>
// FrameworkApplication.java - an Open Source library
public interface FrameworkApplication extends Entrypoint {
public default void main(Runtime runtime) {
// do framework things
start();
// do framework things
}
public abstract void start();
}
</pre>
<p>
Applications just implement this interface, and they can run it by
specifying their own class name on the command line, yet it is a
full-featured framework application!
</p>
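
A runnable approximation of that framework pattern in today's Java might look as follows. <code>ProposedEntrypoint</code> and <code>ProposedRuntime</code> are stand-ins for the proposed types, <code>ShopApp</code> is an invented application, and the static <code>main</code> is there only because no current launcher would recognize the interface.

```java
// Hypothetical stand-ins for the proposed java.lang.Entrypoint and its argument.
interface ProposedEntrypoint {
    void main(ProposedRuntime runtime);
}

final class ProposedRuntime {
    private final String[] args;
    ProposedRuntime(String[] args) { this.args = args; }
    String[] args() { return args; }
}

// The framework supplies main(); applications only supply start().
interface FrameworkApp extends ProposedEntrypoint {
    @Override
    default void main(ProposedRuntime runtime) {
        System.out.println("framework: setup");
        start();
        System.out.println("framework: teardown");
    }

    void start();
}

class ShopApp implements FrameworkApp {
    @Override
    public void start() {
        System.out.println("shop: running");
    }

    // Stand-in for the launcher, which under the proposal would
    // instantiate ShopApp and call main(runtime) itself.
    public static void main(String[] args) {
        new ShopApp().main(new ProposedRuntime(args));
    }
}
```

The application class contains no lifecycle code of its own; the default method on the framework interface wraps `start()` on both sides.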