EG meeting, 2022-02-09 [SoV-3: constructor questions]

Wed Feb 9 18:32:07 UTC 2022

On 8 Feb 2022, at 19:04, Dan Smith wrote:

> "SoV-3: constructor questions": Dan asked about validation for <init> 
> and <new> methods. Answer: JVM doesn't care about <init> methods in 
> abstract classes, the rules about <new> methods still uncertain.

On the question of JVM validation of `<new>` methods, I’m in favor of 
as few rules as possible, ideally treating `<new>` as just another name. 
Its super-power is not in its restrictions but in its 
conventionality:  It’s the obvious choice for constructor factory 
methods.  But it is not necessarily limited to that use.

At the other extreme, the maximal limitation would be that a `<new>` 
method can only occur as the translation of a value-class constructor.  
Any evidence in the classfile that it was not such a translation would 
be grounds for failing a validation check.  We’d make as many such rules 
as we can think of.

Arguments against minimal rules:

  - Having a special method identifier in the JVMs without other 
restrictions would be a new thing, and hence suspicious.
  - Limiting the use of `<new>` as much as possible makes it clear to 
higher layers of the code (javac and reflection) what is going on in the 
class file, as a “reflection” of the source file.
  - Reflection of an irregular (non-source-conforming) `<new>` method 
has to be messy.  (Is it really a constructor?  Or is it just a method 
named `<new>`?)

Arguments in favor of minimal rules:

  - It is a new thing in the JVM for any descriptor to be constrained to 
mention the class named by the constant pool entry that the 
`ClassFile.this_class` item refers to (JVMS 4.1).  (It is suspicious.)
  - A maximal limitation would break hidden classes.  (They must 
sometimes return a supertype from their factories, since the HC is not 
always name-able in a descriptor.  HCs only work because of the previous 
point.)
  - A limitation might preclude a perhaps-desirable future translation 
strategy that used `<new>` factories uniformly to translate `new` source 
code expressions (identity or value objects, uniformly).
  - A limitation could remove a natural translation strategy for 
“canonical factory methods” in non-concrete types.  This is a 
hypothetical language feature for Java or some other language.  (E.g., 
`new List(a,b,c)` instead of `List.of(a,b,c)`, removing the need for the 
user to remember whether the word was `of` or `make` or `build` or some 
other designer choice.)
  - Most any limitation would preclude ad hoc use of `<new>` factories 
by translation strategies of other languages, such as Scala and Clojure, 
which surely have their own uses of JVM object life cycles.  We want to 
be friendly to non-Java languages.
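
To make the “canonical factory method” point concrete, here is a minimal 
sketch of how such a factory is written in Java today.  The names `Pair`, 
`PairImpl`, and `PairDemo` are hypothetical and not part of any proposal.  
The factory lives in a non-concrete type and returns an implementation the 
caller never names, which is also the shape a hidden-class factory takes 
when it returns a supertype.

```java
/** Small demo entry point; all names in this sketch are hypothetical. */
final class PairDemo {
    public static void main(String[] args) {
        Pair p = Pair.of("a", 1);
        System.out.println(p.first() + ", " + p.second());
    }
}

/** A non-concrete type with a conventionally named static factory. */
interface Pair {
    Object first();
    Object second();

    // Today the designer picks the name ("of" here) and callers must
    // remember it.  Under the hypothetical language feature, source code
    // such as `new Pair(a, b)` could instead translate to a static method
    // named <new> declared on this interface.
    static Pair of(Object first, Object second) {
        return new PairImpl(first, second);
    }
}

/** Implementation class; the factory's return type is its supertype. */
record PairImpl(Object first, Object second) implements Pair {}
```

Note that the return type of `of` is `Pair`, not `PairImpl`, so a rule 
that forced a `<new>` descriptor to name the declaring class exactly would 
not fit this pattern.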

Compromise positions:

  - Require a `<new>` method to be `ACC_STATIC` but allow it for any 
purpose (i.e., any access and any descriptor).
  - Require a `<new>` method to return either the class named by 
`this_class` or some supertype (TBD how *this* should be checked).

I would prefer the first compromise:  The method must be `static`, but 
otherwise the JVM asks no questions.
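
The two compromise checks can be sketched roughly as follows.  This is an 
illustration only, not how any JVM implements validation:  the class and 
method names are hypothetical, and the second check assumes both types 
are already loaded as `Class` objects, which sidesteps exactly the part 
marked TBD above.

```java
/** Sketch of the two compromise checks; every name here is hypothetical. */
final class NewMethodCheck {
    // Method access flag values from JVMS 4.6, Table 4.6-A.
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_STATIC = 0x0008;

    /**
     * First compromise: a method named <new> must be static.
     * Access and descriptor are deliberately not inspected.
     */
    static boolean passesStaticOnly(String methodName, int accessFlags) {
        if (!methodName.equals("<new>")) {
            return true; // the rule applies only to <new>
        }
        return (accessFlags & ACC_STATIC) != 0;
    }

    /**
     * Second compromise: the return type is this_class or a supertype.
     * Assumption: both types are available as loaded Class objects.
     */
    static boolean passesReturnCheck(Class<?> thisClass, Class<?> returnType) {
        return returnType.isAssignableFrom(thisClass);
    }

    public static void main(String[] args) {
        System.out.println(passesStaticOnly("<new>", ACC_PUBLIC | ACC_STATIC));
        System.out.println(passesStaticOnly("<new>", ACC_PUBLIC));
        System.out.println(passesReturnCheck(String.class, CharSequence.class));
    }
}
```

The first check needs nothing beyond the method's own name and flags, 
while the second has to reason about the class hierarchy.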

Regarding reflection, I think it would be OK to surface all of the 
`<new>` methods (of whatever signature) on the `getConstructors` list, 
even if they return “something odd”.  Alternatively, to prevent a 
sharp edge we could have a new list `Class::getFactories`, and *copy* 
(not move) entries from that list onto `getConstructors` exactly when 
the return type matches the enclosing class.  That is a more natural 
move for reflection (which operates on runtime types) than for class 
file structuring (which is more static).
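
A minimal sketch of the copy-not-move idea, assuming Java 16 or later 
for records.  `Class::getFactories` does not exist today, so reflected 
`<new>` methods are modeled here with a hypothetical `Factory` record 
rather than any real reflection type.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of deriving a constructor view from a list of factories. */
final class FactoryReflectionSketch {
    /** Hypothetical stand-in for one reflected <new> method. */
    record Factory(Class<?> declaringClass,
                   Class<?> returnType,
                   List<Class<?>> parameterTypes) {}

    /**
     * Returns the factories that would also appear on the constructor
     * list: those whose return type is exactly the declaring class.
     * The input list is left untouched (entries are copied, not moved).
     */
    static List<Factory> constructorView(List<Factory> factories) {
        List<Factory> result = new ArrayList<>();
        for (Factory f : factories) {
            if (f.returnType() == f.declaringClass()) {
                result.add(f);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Factory> factories = List.of(
            new Factory(String.class, String.class, List.of(char[].class)),
            new Factory(String.class, CharSequence.class, List.of()));
        System.out.println(constructorView(factories).size());
    }
}
```

The comparison uses runtime `Class` identity, which is why this filter 
sits more comfortably in reflection than in class file validation.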

The reason I prefer to require `static` marking is that it would prevent 
the funny name from appearing on the list of regular methods, via 
reflection.