Using the variable instantiation to define if is primitive type (Bucket 3) or value type (Bucket 2)

Wed Jan 5 13:56:04 UTC 2022

----- Original Message -----
> From: "Anderson Vasconcelos Pires" <andvasp at gmail.com>
> To: valhalla-spec-comments at openjdk.java.net, "valhalla-dev" <valhalla-dev at openjdk.java.net>
> Sent: Tuesday, January 4, 2022 1:31:57 PM
> Subject: Using the variable instantiation to define if is primitive type (Bucket 3) or value type (Bucket 2)

> Hi guys,

Hi Anderson,

> 
> I am very excited about Valhalla and what it will do for Java. I was
> wondering if the language proposals could be simplified a little bit
> without affecting performance.
> 
> Please let me (maybe the community) know if the proposal below could be
> applied or not.
> 
> From the new proposal that separates the user type in 3 Buckets (1, 2, 3),
> I believe the Primitive.ref (Bucket 3) and value class (Bucket 2) are
> equivalent, right?

Not exactly.
They both represent the same kind of type, a nullable object with value semantics, but they are different beasts:
a Primitive.ref is an interface, whereas a value class is a concrete class.

> 
> If so, I was wondering if we could have just one class declaration and
> distinguish the variable from how it is created.
> 
> The runtime variable would be a value object if we use the new keyword to
> instantiate the variable, if not would be primitive value. I think it makes
> a little bit of sense since the new is used to create an object today,
> right?
> 
> See the example below:
> 
> [primitive or value] class Point {
>    int x;
>    int y;
> }
> 
> Point p1 = Point(1, 1); // primitive value   - p1 is value of Point. Could
> be defined at compile time as new Point() in the actual proposal;
> Point p2 = new Point(1, 1); // value object - p2 instance of Point.ref.
> Could be defined at compile time as new Point.ref();
> 
> Assuming the Point.ref is a subtype of Point, I believe this could be
> possible, right? If necessary, the right type could be defined at compile
> time in some cases.

Point.ref is NOT a subtype of Point; it works the other way around: Point.ref is a supertype of Point (if Point is a primitive).

[...]

> 
> Anderson.

Let me try to explain how it works; at least, this is how I explain it to myself.

Valhalla has more or less two related goals: we want to avoid paying for abstraction (for example, not having to allocate an Optional on the heap) and we want to improve information density (an array of objects should not automatically be an array of references to the heap).

For the first goal, we need to change the meaning of ==, hashCode, synchronized and weak references, because they all require either an address on the heap or an object header.
We call these objects "value objects": == compares all the fields, hashCode mixes the values of some of the fields, synchronized throws an IllegalMonitorStateException, and for weak references we have not yet figured out how to deal with them.
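
As a rough approximation in today's Java (this is an illustration, not Valhalla syntax), a record already has the field-based equals and hashCode described above; the change for a value object is that == itself would behave like the field comparison. The names PointValue, ValueSemanticsSketch and sameFields are made up for this sketch.

```java
// Hypothetical stand-in for a value object. A record is still an identity
// object today, but its equals/hashCode are derived from its fields only.
record PointValue(int x, int y) {}

class ValueSemanticsSketch {
    // What == is meant to do for a value object: compare all the fields.
    static boolean sameFields(PointValue a, PointValue b) {
        return a.x() == b.x() && a.y() == b.y();
    }

    public static void main(String[] args) {
        PointValue p1 = new PointValue(1, 1);
        PointValue p2 = new PointValue(1, 1);
        // Field-based comparison, independent of where p1 and p2 live in memory.
        System.out.println(sameFields(p1, p2));
        // The record's generated equals/hashCode follow the same idea.
        System.out.println(p1.equals(p2));
        System.out.println(p1.hashCode() == p2.hashCode());
    }
}
```

With today's records, p1 == p2 still compares identities; the sketch only shows the comparison that == is supposed to perform once no identity is available.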

For the second goal, we need to introduce objects that are represented not by a reference but by the immediate values of their fields, as primitives are.
We call these objects "primitive objects": they are not nullable, they are tearable (they may be assigned in several CPU instructions, which is bad for concurrent code), all zeroes must be a valid default value, and they have a box to interact with concurrent code and generic code.
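
The closest analogy available in today's Java is int versus Integer, so here is a small sketch using them (the class and method names are made up for the illustration). It only shows the "not nullable, all zeroes is the default" part, not tearing.

```java
// int plays the role of a primitive object, Integer the role of its box.
class DefaultValueSketch {
    // Elements are stored as immediate values; the default element is all zeroes.
    static int[] flat(int n) {
        return new int[n];
    }

    // Elements are references; the default element is null.
    static Integer[] boxed(int n) {
        return new Integer[n];
    }

    public static void main(String[] args) {
        System.out.println(flat(3)[0]);   // prints 0
        System.out.println(boxed(3)[0]);  // prints null
    }
}
```

No constructor runs to produce the 0 in the flat array, which is why all zeroes has to be a valid value for a primitive object.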

We can remark that
- primitive objects are value objects, because a primitive object has no address on the heap.
- we can try to retrofit the builtin primitive values (int, double, etc.) to be primitive objects.
- the box for a primitive object can be pushed to corner cases if we overhaul generics to work with primitive objects.
- the box for a primitive object can be an interface rather than a concrete class (unlike Integer, Double, etc.), which is more efficient.

But we cannot have only one thingy that combines the properties of a value object and a primitive object, because a primitive object breaks encapsulation: it is tearable and its default value bypasses the constructor. The dichotomy between value objects and primitive objects represents that tradeoff.

More on Point.ref: let's say I declare a primitive Point
  primitive class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
      this.x = x;
      this.y = y;
    }
  }

This is what the compiler generates:
  sealed interface Point.ref permits Point {
  }
  primitive class Point implements Point.ref {
    private final int x;
    private final int y;

    Point(int x, int y) {
      this.x = x;
      this.y = y;
    }
  }

Point.ref is an interface automatically generated by the compiler, so the relation between a Point and a Point.ref is a subtyping relation, not a boxing relation.
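
To make the subtyping relation concrete, here is a sketch that compiles with today's Java (17 or later), under two assumptions of mine: PointRef stands in for Point.ref, which is not a legal identifier today, and a record stands in for the primitive class. SubtypingSketch, widen and narrow are names invented for the example.

```java
// Stand-in for the compiler-generated Point.ref interface.
sealed interface PointRef permits Point {}

// Stand-in for the primitive class; a record is still an identity object today.
record Point(int x, int y) implements PointRef {}

class SubtypingSketch {
    // Going from Point to PointRef is plain subtyping: no conversion method is called.
    static PointRef widen(Point p) {
        return p;
    }

    // In this sketch, going back is an ordinary downcast.
    static Point narrow(PointRef ref) {
        return (Point) ref;
    }

    public static void main(String[] args) {
        Point p = new Point(1, 1);
        PointRef ref = widen(p);
        PointRef none = null; // the interface type is nullable
        System.out.println(ref instanceof Point);
        System.out.println(narrow(ref).equals(p));
        System.out.println(none == null);
    }
}
```

Compare this with Integer and int, where Integer.valueOf and intValue are called to go from one to the other.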

We may revisit this in the future, because we currently have two ways to represent the same thing: LPoint; and LPoint.ref;

regards,
Rémi