Paving the on-ramp

John Rose john.r.rose at oracle.com
Thu Sep 29 20:04:15 UTC 2022


On Sep 29, 2022, at 6:55 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
>
>> (3) Instead of speaking of automatic imports, speak of the compiler automatically providing certain import statements if the compilation unit doesn’t have a class header.
>
> If we did this, when a class "graduates" from a low-ceremony class to a full class, then they'd have to go back and fix up all the println calls, and similarly it would put users in a position of "you can have ceremony reduction X, but only if you qualify for ceremony reduction Y."
>>
> Taken together, coupling "instance main" and "auto static imports" to "no class header" means that we have created a "beginners dialect" which is different, and which has to be unlearned and undone as soon as a class graduates.  I would prefer to have these be orthogonal features to the extent possible.

I like the principle behind Guy’s moves for removing magic, by implicitly adding stuff you could have had explicitly.

But adding `public static main` when there is an instance `main` is not a big payoff, since (a) you don’t want to apply such a rule to all class files in existence today, and (b) applying it only to the unnamed classes couples the two features in a (probably) confusing way.

So, with my VM hat on, I say, fine, let’s add another trick to the launcher’s bag of tricks:  If (1) a class is mentioned on a command line, then (2) we look for a somewhat wider range of methods (but all named `main`).
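
A minimal sketch of what such a launcher trick might look like, written with plain reflection. The lookup order (static `main(String[])` first, then a no-argument instance `main`) and the names `LauncherSketch`, `Classic`, and `Relaxed` are assumptions for illustration only; the thread does not fix the exact set of accepted signatures.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical launcher logic: accept more than one shape of `main`.
class LauncherSketch {
    // Traditional entry point.
    static class Classic {
        static String log = "";
        public static void main(String[] args) { log = "static"; }
    }

    // Assumed "wider" shape: an instance main with no parameters.
    static class Relaxed {
        static String log = "";
        void main() { log = "instance"; }
    }

    // Runs the class's entry point and reports which shape was used.
    static String launch(Class<?> c, String[] args) {
        try {
            // 1. Prefer the existing contract: static main(String[]).
            Method m = findMain(c, String[].class);
            if (m != null && Modifier.isStatic(m.getModifiers())) {
                m.setAccessible(true);
                m.invoke(null, (Object) args);
                return "static main(String[])";
            }
            // 2. Otherwise try an instance main() on a fresh object.
            m = findMain(c);
            if (m != null && !Modifier.isStatic(m.getModifiers())) {
                Constructor<?> ctor = c.getDeclaredConstructor();
                ctor.setAccessible(true);
                m.setAccessible(true);
                m.invoke(ctor.newInstance());
                return "instance main()";
            }
            throw new IllegalStateException("no usable main in " + c.getName());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the declared `main` with the given parameter types, or null.
    private static Method findMain(Class<?> c, Class<?>... params) {
        try {
            return c.getDeclaredMethod("main", params);
        } catch (NoSuchMethodException e) {
            return null;
        }
    }
}
```

The point of the sketch is that the extra lookup lives entirely in the launcher, so existing class files are untouched.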

Again, having implicit static-imports that one could have written explicitly is very good, and it is a fine de-mystification move to say “one will be written for you if you didn’t write it already”.  I think that’s a way to explain our handling of `java.lang.*` today, isn’t it?

So there’s a small risk to adding more “stuff” to `java.lang.*`.  (The problem with `Module` was mentioned in this thread.) And something equivalent to `import static java.lang.StaticImports.*` will further poke the bear, depending on how rich we make the set of imported names.

Here’s a suggestion regarding static imports, specifically, that would match the on-ramp goals and mitigate the risk of name pollution from new *static* imports:

  1. If there is no `import java.lang.*` the program acts as if it were inserted.
  2. If there are *no imports at all*, the program acts as if *two* imports were inserted:  `import java.lang.*` and `import static java.lang.StaticImports.*` (or whatever the name is).
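
The two rules above can be sketched as a small function over the import lines of a compilation unit. This is only a model of the rule, not compiler code; `DefaultImports` and `implied` are made-up names, imports are compared as exact strings, and `java.lang.StaticImports` is the placeholder name from this thread (no such class exists today).

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the proposed "menu of default imports" rule.
class DefaultImports {
    static final String LANG = "import java.lang.*;";
    // Placeholder name taken from the discussion; purely hypothetical.
    static final String STATICS = "import static java.lang.StaticImports.*;";

    // Given the imports the user wrote, returns the ones treated as inserted.
    static List<String> implied(List<String> explicit) {
        List<String> added = new ArrayList<>();
        if (!explicit.contains(LANG)) {
            added.add(LANG);      // rule 1: java.lang.* unless already written
        }
        if (explicit.isEmpty()) {
            added.add(STATICS);   // rule 2: statics only when there are no imports at all
        }
        return added;
    }
}
```

Under this model, writing any single import (even a redundant `import java.lang.*;`) switches off the static import and nothing else.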

The effect is that an empty set of imports gets a predictable, useful, and up-to-date set of default names.  That makes for good on-ramp conditions.  To get control over those imports, the user starts adding explicit imports at the top of the file.  We proceed up the on-ramp by a series of one-line changes, not wholesale refactorings.

This is akin to today’s mitigation of the problem with `java.lang.Module`:  You mitigate by *adding another import*, a by-name import of your chosen class named `Module`.  That’s how Java has always worked.  Removing an intrusive static import from `java.lang` would (under the above rule) be mitigated more simply; just add any import at all, even a redundant `import java.lang.*`.  That’s a little magic, but the story is clear:  You get a certain “menu” of imports if you don’t specify *any*.

(Q:  What would break if we also auto-imported `java.util.*` under the null-import condition?  How disruptive would that be??)

I agree, in hindsight, with Guy’s point about unnamed classes in named packages.  I don’t see a deep coupling between those two parts of the language, so don’t make a shallow one.

In general, shallow couplings lead to the problem of “beginner’s dialect” Brian mentioned:  If simplifications A and B are coupled, then when you graduate from one you have to “complicate” the other as well.  In the case of the unnamed package, when you graduate your program to a named package (perhaps because it is now a unit test or utility that needs package API access) you might not want to graduate it, at the same time, from its unnamed format.

With my VM hat on again, I have a tentative suggestion for “fixing” the problem with an *unintentionally* linkable/denotable class. (As pointed out, that could be a class named `Foo` just because it is anonymous in a file that happens to have the “pretty name” `Foo.java`.)

Suggestion:  Allow classfiles (in newer classfile versions) to specify `ACC_PRIVATE` in their `access_flags` for the class.  With the obvious (!?) meaning:  A class marked private (at the VM level) will fail access checks except to itself and its nestmates (if any).  Roll it out as a VM feature first, and later as a slightly-incompatible language change for nested classes.  Heck, even named classes (that’s a compatible extension).

(Immediate use cases: All non-denotable classes are compiled `ACC_PRIVATE`.  That includes both “on ramp” unnamed classes and also any “inner class” which doesn’t have a linkable bytecode name.)
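
A rough model of the access check I have in mind, using the existing `Class.isNestmateOf` reflection API (Java 11 and later). The `accPrivate` parameter stands in for the proposed class-level `ACC_PRIVATE` bit, which no classfile version defines today; `PrivateClassCheck` and `mayAccess` are illustrative names, and the ordinary public/package rules are deliberately not modeled.

```java
// Toy model of "a private class is accessible only to itself and its nestmates".
class PrivateClassCheck {
    // A nested class, so it shares a nest with PrivateClassCheck.
    static class Inner { }

    // accPrivate simulates the hypothetical class-level ACC_PRIVATE flag.
    static boolean mayAccess(Class<?> accessor, Class<?> target, boolean accPrivate) {
        if (!accPrivate) {
            return true; // existing access rules would apply here; not modeled
        }
        // A class is always a nestmate of itself, so self-access passes too.
        return accessor.isNestmateOf(target);
    }
}
```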

Second suggestion (independent of first):  In the example of `Foo.java`, “poison” the name `Foo` in a predictable way (prepend `$` or add `$unnamed` for example), and also mark the class as Synthetic (or with a new attribute).  Then, liberalize the launcher *ever so slightly* so as to allow either (1) the exactly matching name as today, or (2) the predictably poisoned name (`Foo$unnamed`) if the class is also marked as synthetic/unnamed/whatever with an attribute.  This will put unnamed classes on a common footing with other classes (local & anonymous inner classes) that already have linkable-but-unpredictable names.  This is simpler than supporting `ACC_PRIVATE` and probably easier in the resolver (since there are just two names to check instead of one).
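
A small sketch of the two-name lookup, with a `Map` standing in for whatever the launcher actually consults. The names `PoisonedNameLookup` and `resolve`, the `$unnamed` suffix, and the boolean "marked as unnamed" value are all assumptions for illustration; a real launcher would read a classfile attribute instead.

```java
import java.util.Map;
import java.util.Optional;

// Toy model of a launcher that tries the exact name, then the poisoned name.
class PoisonedNameLookup {
    // classes: binary class name -> true if the class carries the (hypothetical) unnamed marker.
    static Optional<String> resolve(String requested, Map<String, Boolean> classes) {
        // (1) Exact match, as the launcher does today.
        if (classes.containsKey(requested)) {
            return Optional.of(requested);
        }
        // (2) Predictably poisoned name, accepted only if the class is marked.
        String poisoned = requested + "$unnamed";
        if (Boolean.TRUE.equals(classes.get(poisoned))) {
            return Optional.of(poisoned);
        }
        return Optional.empty();
    }
}
```

So `java Foo` would still work for a file `Foo.java` holding an unnamed class, while ordinary code could not link against `Foo` by accident.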

Third suggestion, probably not usable:  We have properly anonymous classes in the VM (VMACs), which have names that not even the class itself can resolve; they have a special ability to self-resolve `CONSTANT_Class` but it is hardwired and doesn’t go through a class-loader.  We could try to do something like this for unnamed classes, *but* it would not scale well to unnamed classes *which have named nested classes*.  To name those nested classes `Foo.Bar` you need a resolvable name like `Foo$unnamed$Bar`.  (But the classes could be marked `ACC_PRIVATE`; see above.)

I don’t know a clean way to fix the syntax ambiguity between (a) nested class/interface of unnamed class (new) and (b) non-public top-level (package-member) class/interface (old).  Here are two dirty workarounds, both of which make such secondary classes into inherently non-linkable inner classes:

1. Put all your nested classes together in a method body.
2. Put all your nested classes in an instance initializer (magic braces!).

Both have the problem that the class names don’t scope to the whole top-level (unnamed) class, so they are non-starters I guess, but might jog someone else’s imagination for a better workaround.
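
To make the scoping problem with workaround 1 concrete, here is a small example in today's Java; `LocalTypesSketch` and `Point` are illustrative names only. The local class is usable inside its method but cannot be named from any sibling method.

```java
// Workaround 1 in today's Java: declare helper types inside a method body.
class LocalTypesSketch {
    static String run() {
        // Local class: javac gives it a compiler-chosen binary name,
        // so other source files cannot refer to it by a simple name.
        class Point {
            final int x, y;
            Point(int x, int y) { this.x = x; this.y = y; }
        }
        Point p = new Point(1, 2);
        return p.x + "," + p.y + " local=" + Point.class.isLocalClass();
    }

    // A sibling method could not mention Point at all:
    // static Point origin() { ... }   // would not compile, Point is out of scope here
}
```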

Here’s another workaround, which I guess Brian already mentioned:

3. If your user is wishing for nested classes or interfaces (or more likely records), then it’s time to learn about type definitions, so require them to “graduate” to a top-level class *at that time*.

Tentative suggestion, again for brainstorming:  A way to smooth *that* move might be to provide yet more syntaxes to declare a *class which is not denotable but which has a body*.  Something like a truncated class header with a body:  `class /*empty header*/ { …body here…}`.  The rule would be: If you are defining classes, it’s time to acknowledge you are defining a top-level one to surround them, but you don’t have to name it yet; it’s “just there”.

(On this slippery slope, maybe allow nested unnamed classes as well… And/or unnamed-but-denotable constructors: `class { … String field; class(String field) { this.field = field; } … `.  This doesn’t appeal much to me, at least until we have compelling new use cases for anonymous classes, not already covered by `new Object() { … }`.  Enhanced inference could make such a class into a poly-type-expression, someday, for some contexts where supers would be inferred.  I think that’s what C# does in this vein.)

OK, that’s enough BS (brainstorming, of course) from me.


More information about the amber-spec-observers mailing list