Draft JVMS changes for Primitive Objects (JEP 401)

Dan Heidinga heidinga at redhat.com
Wed Aug 11 19:24:48 UTC 2021


And continuing on the "long-overdue" theme, here's my long-overdue
review of the spec changes.

A big thank you to you, Dan S., for the careful spec writeup efforts.
I think this captures our discussions well.

--Dan

== Section 2.11.5 Object Creation and Manipulation
> Create a new class instance: new, withfield.
Should that also include "defaultvalue"? The semantics aren't quite
the same because of the structural equality of primitive class types
but it is conceptually very similar.  And in the instruction section,
we state "The defaultvalue instruction is similar to the new
instruction" which lends credence to including it in this list.

== Section 4.1 The ClassFile Structure
The `ACC_PRIM_SUPER` flag is introduced, and restrictions on classes
with the flag are called out in various sections, such as:
4.5 Fields > ACC_PRIM_SUPER flag set, each field must have its
ACC_STATIC flag set.
4.6 Methods > In a primitive class, and in an abstract class that has
its ACC_PRIM_SUPER flag set, a method that has its ACC_SYNCHRONIZED
flag set must also have its ACC_STATIC flag set.
5.3 Creation and Loading > implements PrimitiveObject if the opposite
is true (ACC_PRIM_SUPER, no instance initialization method).

I didn't see static constraints called out to enforce these
restrictions (should they be?).  Having the handling of
ACC_PRIM_SUPER in one place would make the VM's job of validating it
easier.
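
To make the "one place" idea concrete, here is a minimal sketch of what a
consolidated check could look like.  It covers only the 4.5 and 4.6 member
restrictions quoted above.  The `PrimSuperCheck` and `Member` names are
hypothetical, and the bit used for ACC_PRIM_SUPER is a placeholder for
illustration, not a value taken from the draft.

```java
import java.util.ArrayList;
import java.util.List;

public class PrimSuperCheck {
    public static final int ACC_STATIC = 0x0008;        // JVMS 4.5 / 4.6
    public static final int ACC_SYNCHRONIZED = 0x0020;  // JVMS 4.6
    // Placeholder bit for illustration only; not taken from the draft.
    public static final int ACC_PRIM_SUPER = 0x0040;

    /** Hypothetical stand-in for a parsed field_info or method_info. */
    public static final class Member {
        final String name;
        final int flags;

        public Member(String name, int flags) {
            this.name = name;
            this.flags = flags;
        }
    }

    /** Returns one message per violated restriction; empty if the flag is unset. */
    public static List<String> check(int classFlags,
                                     List<Member> fields,
                                     List<Member> methods) {
        List<String> errors = new ArrayList<>();
        if ((classFlags & ACC_PRIM_SUPER) == 0) {
            return errors; // restrictions apply only when the flag is set
        }
        // 4.5: every field must have ACC_STATIC set.
        for (Member f : fields) {
            if ((f.flags & ACC_STATIC) == 0) {
                errors.add("field " + f.name + " must be ACC_STATIC");
            }
        }
        // 4.6: an ACC_SYNCHRONIZED method must also be ACC_STATIC.
        for (Member m : methods) {
            boolean isSynchronized = (m.flags & ACC_SYNCHRONIZED) != 0;
            boolean isStatic = (m.flags & ACC_STATIC) != 0;
            if (isSynchronized && !isStatic) {
                errors.add("synchronized method " + m.name + " must be ACC_STATIC");
            }
        }
        return errors;
    }
}
```

A VM would presumably throw ClassFormatError rather than collect messages;
the list is only there to make the sketch easy to exercise.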

== 4.6 Methods
> Design discussion: this section requires that unnamed factory methods (named <new>) are static
> and have a return type that matches their declaring class or interface. By restricting the
> descriptor in this way, clients can rely on a predictable, useful return type.
>
> Alternatively, we could allow a subtype or supertype as the return type, or impose no constraints
> at all. One potential use case is a hidden class, which is incapable of naming its class type in
> a descriptor.

Because these are static methods, I thought we had agreed they could
name any superclass as the return type, to accommodate the hidden
class requirements.  This does allow some strange behaviour (e.g.,
after some bytecode manipulation), as the following pseudo-code
shows:
```
primitive class Strange {
    Strange() { //<new>()Ljava/lang/Object;
        return new String();
    }
}
```
The contract on `<new>` is more convention than requirement.  In cases
where the return value needs to be used as a primitive value, it would
need to go through a checkcast to validate it when a different return
type is named.

While this doesn't give the hidden class full powers to be checked in
the checkcast, it can still be checked against the PrimitiveObject
interface or its ACC_PRIM_SUPER type.  Seems like a reasonable setup
and avoids the VM having to check the name matches on the return type
of the descriptor.
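
As a rough analogy in today's Java (a sketch only: ordinary classes stand in
for primitive classes, static methods returning Object stand in for `<new>`
methods with a wider return type, and all names here are hypothetical), the
caller-side cast is what catches a factory that breaks the convention:

```java
public class FactoryCastSketch {
    /** Stand-in for the class the caller expects the factory to produce. */
    static final class Point {
        final int x;

        Point(int x) {
            this.x = x;
        }
    }

    // Stand-in for a <new> method whose descriptor names Object as its return type.
    static Object honestFactory() {
        return new Point(1);
    }

    // Stand-in for the "Strange" case: same descriptor, unrelated result.
    static Object strangeFactory() {
        return "not a Point";
    }

    /** The cast compiles to a checkcast, which validates the actual type. */
    static boolean castSucceeds(Object o) {
        try {
            Point p = (Point) o;
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }
}
```

Here `castSucceeds(honestFactory())` is true and
`castSucceeds(strangeFactory())` is false, so the mismatch surfaces at the
use site rather than at class load time.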

> A method of a class or interface named <new> (2.9.4) must have its ACC_STATIC flag set.

Should interfaces be able to implement `<new>` or should we prevent
that like we do for `<init>`?  Preventing it now gives us the most
freedom later if we retcon this as a general object factory.

> If the name of the method is <new>, then the descriptor must denote a return type
> that is a type of the current class or interface. For a primitive class, the return
> type must be an inlinable reference type.

Same questions about requiring the return type to match (I don't think
we should) and if we should prevent interfaces from implementing it (I
think yes).

== 4.7.31 The JavaFlags Attribute
> We're having some good internal discussions about default values & null, and will send something out when that settles into something stable.
I expect (hope?) this will change as the internal discussion
solidifies.  I'm not a fan of this "kitchen sink" approach as it
becomes an attractive nuisance to wedge other flags into.  The
suggestion to use more focused attributes (`PrimitiveClassProperties`
& `ReferenceDefaultPrimitiveClass`) matches the existing conventions
for naming / using attributes.

== 5.3.5 Deriving a Class from a class File Representation
> Alternatively, we could more uniformly claim that the class is "considered to implement" the
> expected interface, regardless of what it implements by inheritance. The difference in
> behavior might be observable, say, via reflection.
I think we need to honour programmer intent here.  If a class
implements one of the interfaces by inheritance, then the programmer
has specified their intent and we should go with it (or flag it as an
error).

> An abstract class implements IdentityObject if it declares an instance initialization method
> and does not have its ACC_PRIM_SUPER flag set; and implements PrimitiveObject if the opposite
> is true (ACC_PRIM_SUPER, no instance initialization method). Instance initialization methods
> and ACC_PRIM_SUPER represent two channels for subclass instance creation, and this analysis
> determines whether only one channel is "open".

The rules expressed here don't cover the cases outlined in chapter 4 -
namely that a class that has ACC_PRIM_SUPER must only have static
fields and only its static methods can be synchronized.
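
For reference, my reading of the quoted rule is a four-way case split on two
inputs.  A minimal sketch, assuming the two inputs are already known from
parsing (the class, enum, and method names are hypothetical):

```java
public class SuperInterfaceInference {
    enum Implied { IDENTITY_OBJECT, PRIMITIVE_OBJECT, NEITHER }

    /**
     * Which interface an abstract class implicitly implements under the
     * quoted draft rule, given whether ACC_PRIM_SUPER is set and whether
     * the class declares an instance initialization method.
     */
    static Implied draftRule(boolean primSuper, boolean declaresInit) {
        if (declaresInit && !primSuper) {
            return Implied.IDENTITY_OBJECT;   // only the <init> channel is open
        }
        if (primSuper && !declaresInit) {
            return Implied.PRIMITIVE_OBJECT;  // only the ACC_PRIM_SUPER channel is open
        }
        return Implied.NEITHER;               // both channels open, or both closed
    }
}
```

Nothing in this split looks at fields or synchronized methods, which is the
gap noted above.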

> Alternatively, we could ignore instance initialization methods and rely entirely on
> ACC_PRIM_SUPER. In practice, abstract classes written in the Java programming language always
> have instance initialization methods, so the difference in behavior is only relevant to classes
> produced via other languages or tools.

My preference is for the VM to check the rules against the explicitly
set ACC_PRIM_SUPER bit.  It means the language can own when to set it
and the VM only has to do consistency checks on it.

== 5.4.3.1 Field Resolution
> Thus, a field reference with a type like Qjava/lang/String; is permitted. Since it's impossible
> to declare a field with such a type (see 5.4.2), resolution of the reference will fail anyway
> with a NoSuchFieldError.

I'm a little confused by "resolution of the reference will fail with
NoSuchFieldError" as my reading of 5.4.2 says we would reject any
class that has a Q descriptor that doesn't name a Q type.  How would
resolution of such a field reference ever occur?

== Bytecodes
new > tolerable because the Identity class requires no initialization.
Do we need changes to the JVMTI spec to indicate that Identity isn't
passed to the ClassFileLoadHook and is not modifiable?  And maybe a
rule in static constraints that says Identity has an abstract <init>
method? (Or have we dropped that idea?)  We need a way to say, via the
spec, that code in Identity.<init>()V will never run, no matter how a
user adds it there.

withfield > use of the withfield instruction is restricted to
nestmates of the field's declaring class.
I'm glad to see this, as the discussion on it went around and around.
Limiting to the nest still seems like the right choice to me.  So +1
to this.

On Fri, Jul 2, 2021 at 7:49 PM Dan Smith <daniel.smith at oracle.com> wrote:
>
> Here's a long-overdue refresh of the proposed JVMS changes to support Primitive Objects:
>
> http://cr.openjdk.java.net/~dlsmith/jep401/latest
>
> (Sorry to dump this on the weekend, not looking for same-day feedback. :-))
>
> I *think* I've captured all the key JVMS-related pieces that we expect to include with JEP 401, but please let me know if I missed something.
>
> In a number of areas, there are still open design questions. I've called those out in discussion blocks. Often, I've made a somewhat arbitrary choice for how to resolve the open question, based on my mood at the time. :-) While it's useful to get something down on paper, all of these will be more carefully explored and resolved in the coming months. If you see something *not* called out that you think still needs further discussion, let me know.
>
> New term to look out for: "inlinable reference type", which is spec-speak for "Q type". (And its companion, "standard reference type", for "L type".) Why not call it a "primitive value type", like we do in the Java language? Because, unlike the language model, in the JVM it works best to treat all class types as reference types that participate in a single substitutability/subtyping graph, even though the JVM can optimize away the references in many cases. Our generics story leans heavily on the JVM handling types in this way. Given that mismatch, it seems too confusing to try to force the same terminology into the different models.
>
> There are supplementary "cleanup" changes included in the bundle, if you're interested in exploring them. Most of these fall under the umbrella of the "Better-defined JVM class file validation" JEP I proposed a few weeks ago, but "JVM Types Cleanup" is new.
>


