[lworld] RFR: 8367792: [lworld] Remove the Unsafe remnants of old valhalla prototypes

Fri Sep 26 16:01:36 UTC 2025

On Wed, 17 Sep 2025 16:52:46 GMT, Chen Liang <liach at openjdk.org> wrote:

>> There are some remnants from previous iterations of value objects in Unsafe. We should aim to remove them for cleaner code in the future.
>
> The vector support prototype was last updated 5 months ago. makePrivateBuffer introduces significant complexity into core library code and JVM. Besides this complexity, the vector prototype also introduces "multifield", which is also uncertain.
> 
> I will reach out to Paul Sandoz and Jatin Bhateja to check on the status of this prototype - it hasn't been updated since the new lworld model, maybe our mainline hacks can continue to work after the vector classes are marked value.

Hi @liach , 

Currently, the Unsafe.put* APIs expect to operate on a mutable value. Without Unsafe.makePrivateBuffer, there is no way to transition a value object to the larval state.

<img width="800" height="400" alt="image" src="https://github.com/user-attachments/assets/af826cda-55e1-4b0c-a2ea-62592f7623d6" />

Here is a typical update kernel for the nary operation fallback implementation. 

<img width="500" height="200" alt="image" src="https://github.com/user-attachments/assets/4a31baa7-52b8-4e0b-8c42-924407bb5665" />

**Here are some relevant FAQs on the need for multifield annotation.**

Q. Why do we need a @multifield-annotated field in VectorPayloads instead of simply retaining the array-backed storage?
A.  Even today, Vector instances are immutable: each modification or application of an operation generates a new vector. Each new vector has a distinct backing storage in the form of an array; thus, no two vectors ever share their backing storage, which makes vectors immutable.

     Vector<Float>  newVector  =  Vec1.lanewise(VectorOperators.ADD, Vec2);

Since arrays are always allocated on the heap, they carry an identity, namely the distinct heap address of each new backing storage array.

This contradicts the philosophy of value type instances, which are identity-free; the compiler treats two values with the same contents as equivalent entities and is free to substitute one with another. 

By replacing the existing array-backed storage with @multifield-annotated storage, we ensure that the payload adheres to true value semantics. A @multifield is seen as a bundle of fields, and the encapsulating payload is a value class. Unlike an array, a multifield is never allocated explicit heap storage.

Here is an example:

<img width="388" height="503" alt="image" src="https://github.com/user-attachments/assets/8f6b5a07-a8e0-4912-b909-9d892f40d92a" />

Even though Payload is a value class, two of its instances whose backing arrays hold the same contents are not equal, because arrays have identity.
By treating vectors as value objects, we expect that two vectors with the same contents should be equal.
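
The identity problem with array-backed storage can be reproduced on a stock JDK. The sketch below is only an analogy: it uses a record (hypothetical name `ArrayPayload`) in place of a value class, since value classes need a Valhalla build. A record's `equals` compares a reference component such as an array with `Objects.equals`, which for arrays is reference equality, so the array's identity leaks into the comparison in much the same way.

```java
import java.util.Arrays;

public class ArrayIdentityDemo {
    // Hypothetical stand-in for an array-backed payload. This is a record,
    // not a Valhalla value class; it is used here only as an analogy.
    record ArrayPayload(float[] lanes) {}

    // Two payloads with equal lane contents but distinct backing arrays.
    static boolean sameContentsEqual() {
        ArrayPayload p1 = new ArrayPayload(new float[] {1f, 2f, 3f, 4f});
        ArrayPayload p2 = new ArrayPayload(new float[] {1f, 2f, 3f, 4f});
        // The array component is compared by reference, so this is false.
        return p1.equals(p2);
    }

    // The lane contents themselves are element-wise equal.
    static boolean laneContentsEqual() {
        ArrayPayload p1 = new ArrayPayload(new float[] {1f, 2f, 3f, 4f});
        ArrayPayload p2 = new ArrayPayload(new float[] {1f, 2f, 3f, 4f});
        return Arrays.equals(p1.lanes(), p2.lanes());
    }

    public static void main(String[] args) {
        System.out.println("payloads equal:      " + sameContentsEqual());
        System.out.println("lane contents equal: " + laneContentsEqual());
    }
}
```

The payloads compare as different even though every lane matches, which is the behavior the array-backed Payload above exhibits.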

Q.  Is there any alternative to @multifield?
A.  All we need to ensure is that the backing storage has no identity.  Thus, we could have multiple primitive type fields in the payload, one for each lane of the vector. 

<img width="450" height="310" alt="image" src="https://github.com/user-attachments/assets/af977c64-6373-4533-aceb-6d283cecd094" />

Consider the modified payload class ‘TrueValuePayload’ above. Here we create a separate primitive-typed field for each lane of the vector; thus, two vector values with the same contents are treated as equivalent entities. 
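
The per-lane layout can be sketched on a stock JDK in the same spirit. Again this is an analogy, not the prototype's code: `PerLanePayload` and its field names are hypothetical, and a record stands in for a value class. Because every component is a primitive, equality depends only on the lane contents.

```java
public class PerLanePayloadDemo {
    // Hypothetical four-lane payload with one primitive field per lane.
    // A record is used as a stand-in for a Valhalla value class.
    record PerLanePayload(float lane0, float lane1, float lane2, float lane3) {}

    // Two separately created payloads with identical lane values.
    static boolean sameContentsEqual() {
        PerLanePayload p1 = new PerLanePayload(1f, 2f, 3f, 4f);
        PerLanePayload p2 = new PerLanePayload(1f, 2f, 3f, 4f);
        // Primitive components are compared by value, so this is true.
        return p1.equals(p2);
    }

    // Changing a single lane makes the payloads different.
    static boolean differentContentsEqual() {
        PerLanePayload p1 = new PerLanePayload(1f, 2f, 3f, 4f);
        PerLanePayload p2 = new PerLanePayload(1f, 2f, 3f, 5f);
        return p1.equals(p2);
    }

    public static void main(String[] args) {
        System.out.println("same contents equal:      " + sameContentsEqual());
        System.out.println("different contents equal: " + differentContentsEqual());
    }
}
```

No backing array is involved, so nothing with identity takes part in the comparison.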

With @multifield we intend to create identity-free backing storage. 

Q.  What are the complications with explicit fields for each lane in the container payload?
A.  First, at load time, build_layout randomizes the fields' layout, but that is not a major concern since fields can always be updated using their effective offsets.
A bigger concern is that the C2 compiler will see the scalar fields as different inputs to InlineTypeNode. The compiler creates an InlineTypeNode for each instance of a value class and scalarizes its fields; with separate scalar fields we may observe multiple inputs in the IR, one for each lane. Each scalar field will be of a primitive Type, while the Type of the InlineTypeNode will be PayloadType, since an inline type node corresponds to a value object. We expect to deal with a vector-typed field, i.e., one that carries the ideal type TypeVect. 

Vector IR is forwarded to its user every time we perform a vector_unbox operation over VectorPayload; this way, vector IR inputs are directly forwarded to the vector operation nodes.

Keeping multiple scalar fields, one for each lane, in the VectorPayload class while creating a vector IR for a bundle of lanes will require kludges in the code. Consider the following case:

     var v1 = FloatVector.fromArray(SP, arr1, index);
     var v2 = FloatVector.fromArray(SP, arr2, index);
     var res = v1.lanewise(VectorOperators.ADD, v2);

v1 and v2 are concrete vector class instances, where the vector class is a value type. With stock Valhalla, the IR corresponding to the fromArray API will look like the following:

<img width="374" height="224" alt="image" src="https://github.com/user-attachments/assets/5f923efe-5b42-4e47-b04f-13abd4cfeee8" />

The expected IR, however, should look like the following:

<img width="354" height="313" alt="image" src="https://github.com/user-attachments/assets/26d1ce93-f4d4-48fd-afaf-1b32a0904d2d" />

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/1593#issuecomment-3339344741