No subject

João Mendonça jf.mend at gmail.com
Thu Jun 16 11:17:25 UTC 2022


Hello,


I would like to bring to your consideration the following set of
observations and user-model suggestions, in the hope that they will bring
some useful ideas to the development of the Valhalla project.



*Definition*
*shared-mutable* - a variable that is mutable (non-final) and can be shared
between threads; shared-mutables are the non-final subset of the
shared-variables (§17.4.1.
<https://docs.oracle.com/javase/specs/jls/se18/html/jls-17.html#jls-17.4.1>)



*Observations*
Shared-mutables are the only variables that have these two apparently
independent properties:

   1. lack definite-assignment (§16.
   <https://docs.oracle.com/javase/specs/jls/se18/html/jls-16.html>) - the
   variable is initialized with a default-value if not definitely-assigned (
   §4.12.5.
   <https://docs.oracle.com/javase/specs/jls/se18/html/jls-4.html#jls-4.12.5>
   )
   2. allow data-races - the variable may be read/written while being
   written by another thread, with both events happening in an unpredictable
   order (§17.4.5.
   <https://docs.oracle.com/javase/specs/jls/se18/html/jls-17.html#jls-17.4.5>
   )
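
To make property 1 concrete, here is a minimal sketch in today's Java (not the syntax proposed later in this message). It shows fields and array components receiving a default-value, while a local variable has to be definitely-assigned before it is read. The class and member names are invented for the illustration.

```java
public class DefaultValuesDemo {
    // Non-final class and instance variables: no definite-assignment required.
    static long classCounter;      // default-value 0L
    String instanceName;           // default-value null
    double instanceRatio;          // default-value 0.0

    // Collects the values seen without any explicit initialization.
    static String describe() {
        DefaultValuesDemo d = new DefaultValuesDemo();
        int[] components = new int[2];   // array components get default-value 0
        int local;                       // a local variable has no default-value
        // System.out.println(local);    // would not compile: not definitely-assigned
        local = 1;
        return classCounter + " " + d.instanceName + " " + d.instanceRatio
                + " " + components[0] + " " + local;
    }

    public static void main(String[] args) {
        System.out.println(describe());  // prints "0 null 0.0 0 1"
    }
}
```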

Via properties 1 and 2, nullability and encoding-mode, respectively, affect the
semantics of variables in a way that is unique to shared-mutables:

   1. if not definitely-assigned, the variable is initialized with:
      - if nullable: the null value, regardless of type
      - if not nullable: the zero-value of the type
   2. in a data-race, the value read/written:
      - if reference: has unpredictable origin in *one* of the various
      writes
      - if inline, either:
         - has unpredictable origin in one of the various writes
         - is torn i.e. has distinct internal parts with separate
         unpredictable origins in *more than one* of the various writes (
         §17.7.
         <https://docs.oracle.com/javase/specs/jls/se18/html/jls-17.html#jls-17.7>
         )
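
A real data-race is not reproducible on demand, so the following sketch only replays one unlucky interleaving by hand, on a single thread. It assumes a hypothetical two-component value stored inline as two separate slots; the names are invented, and this is a simulation of the outcome described above, not a demonstration of what any particular JVM does.

```java
public class TearingSketch {
    // Simulated inline storage of a two-component value: one logical write
    // is really two separate stores, one per slot.
    static final int[] slots = new int[2];

    static void writeFirstHalf(int first)   { slots[0] = first; }
    static void writeSecondHalf(int second) { slots[1] = second; }

    // Replays, deterministically, an interleaving of writer A storing (1, 1)
    // and writer B storing (2, 2), followed by a read.
    static int[] tornRead() {
        writeFirstHalf(1);    // writer A starts
        writeFirstHalf(2);    // writer B starts ...
        writeSecondHalf(2);   // ... and finishes
        writeSecondHalf(1);   // writer A finishes
        return new int[] { slots[0], slots[1] };
    }

    public static void main(String[] args) {
        int[] seen = tornRead();
        // Prints "2, 1": a value that neither A nor B ever wrote.
        System.out.println(seen[0] + ", " + seen[1]);
    }
}
```

With reference encoding, the same two writers would each store a single reference, so a reader could only ever see all of A's value or all of B's.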


These are the 3 kinds of shared-mutable variables (§4.12.3.
<https://docs.oracle.com/javase/specs/jls/se18/html/jls-4.html#jls-4.12.3>):

   - non-final class variables
   - non-final instance variables
   - array components

The remaining kinds of variables don't have any of the above properties:

   - final class variables
   - final instance variables
   - method parameters
   - constructor parameters
   - lambda parameters
   - exception parameters
   - local variables



*User-model*

For class-authors:

   - *value-knob* to reject identity - Applicable on class declarations, if
   used by a class-author to indicate that the class instances don't require
   identity (a value-class), the runtime will be free to copy these values and
   choose between reference or inline encoding everywhere except in
   shared-mutables, as doing so does not introduce any semantic changes to the
   program. In shared-mutables, however, value-instances can only be inlined
   if atomicity is guaranteed, which will depend on the hardware and the
   variable bit-size (value plus nullability).
   - *tearable-knob* to allow tearing - Applicable on value-class
   declarations, may be used by the class-author to hand the class-user the
   responsibility of how to avoid tearing, freeing the runtime to always
   inline instances in shared-mutables (bikeshedding: when dealing with
   tearable value-classes, the "terrible" sound can work as a warning for the
   dangers of neglecting this responsibility). Conversely, if this knob is not
   used, instances will be kept atomic, which allows the class-author to
   guarantee constructor invariants, which may be useful for the class
   implementation and class-users to rely upon.
   - *zero-knob* to allow using the zero-value as a default - by omitting
   it, the class-author forces shared-mutables of this type to be either
   definitely-assigned or nullable.
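
As a rough illustration of why a class-author might or might not want the zero-knob, here is a sketch in today's Java using two hypothetical classes: a Rational whose constructor invariant rules out the all-zero field combination, and a Complex for which all-zero fields form a meaningful value. The classes and the helper method are assumptions made for the example only.

```java
public class ZeroValueSketch {
    // Constructor invariant: the denominator is never zero, so the
    // all-zero combination 0/0 is not a valid instance.
    static final class Rational {
        final int num, den;
        Rational(int num, int den) {
            if (den == 0) throw new IllegalArgumentException("zero denominator");
            this.num = num;
            this.den = den;
        }
    }

    // No invariant to protect: all-zero fields mean 0 + 0i.
    static final class Complex {
        final double re, im;
        Complex(double re, double im) { this.re = re; this.im = im; }
    }

    // True if the constructor accepts the all-zero field combination.
    static boolean rationalAcceptsAllZero() {
        try {
            new Rational(0, 0);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(rationalAcceptsAllZero());          // prints "false"
        Complex zero = new Complex(0.0, 0.0);
        System.out.println(zero.re + " + " + zero.im + "i");   // prints "0.0 + 0.0i"
    }
}
```

A zero-default for Rational would hand class-users an instance that the constructor itself refuses to build, which is the situation the zero-knob lets the class-author opt out of.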


For class-users:

   - *not-nullable-knob* to exclude null from a variable's value-set -
   Applicable on any variable declaration. Since identity-types lack a
   zero-value, any non-nullable shared-mutable with an identity-type must be
   definitely-assigned. For nullable variables, in either encoding-mode, the
   runtime is free to choose the encoding for the extra bit of information
   required to represent null. In shared-mutables that are not
   definitely-assigned, this knob controls the default-value: either null or
   the zero-value of the type.
   - *atomic-knob* to avoid tearing - Applicable on shared-mutable
   declarations, may be used by the class-user to reverse the effect of the
   tearable-knob, thereby restoring atomicity.



*Nullable types*

   - For compatibility, we cannot have a nullable-knob in the new
   user-model since unadorned types must remain nullable as they are now
   - (!) as a non-nullable-knob is pretty concise although not very readable
   - In method bodies, var will mitigate the majority of the noise of (!)
   - In method signatures, the proliferation of (!) in arguments and return
   types will look ugly
   - The compiler will be able to help us avoid the majority of
   NullPointerExceptions
   - Old APIs can compatibly update return types to be non-nullable where
   appropriate, which is more convenient for new client code. Also, removing
   the nullability overhead may increase performance. Ex: Stream::findAny
   can be updated to return Optional!<T>
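
For comparison, this is roughly how non-nullability has to be approximated in today's Java, where it is checked at run time rather than by the compiler. The sketch uses only Objects.requireNonNull, Optional and Stream from the standard library; the class and method names are invented.

```java
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Stream;

public class NonNullSketch {
    // Run-time stand-in for a non-nullable parameter: null is rejected on entry.
    static int lengthOf(String s) {
        return Objects.requireNonNull(s, "s must not be null").length();
    }

    // By convention the returned Optional is never null, but the type
    // Optional<String> cannot express that today.
    static Optional<String> anyLetter() {
        return Stream.of("a", "b").findAny();
    }

    // True if passing null fails at run time.
    static boolean rejectsNull() {
        try {
            lengthOf(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("abc"));            // prints "3"
        System.out.println(rejectsNull());              // prints "true"
        System.out.println(anyLetter().isPresent());    // prints "true"
    }
}
```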



*Zero-knob vs no-zero-knob*

I am going with the zero-knob because I feel it gives us the safest and
most common default:

   - Allows a more cautious API introduction - A late addition of the
   zero-default to a class doesn't break client code, but a late removal does.
   - Definite-assignment is safe - Without a zero-default, class-users are
   forced to definitely-assign their shared-mutables, preventing
   missed-initialization-bugs.
   - It's the right default for value Records - The vast majority of
   Records are semantically value classes, since using any identity operations
   on them would be a bug (locking or identity comparison). Making these
   Records value-classes will prevent such bugs. So I am predicting that the
   vast majority of value-classes written by average developers will be
   Records, which mostly don't have a sensible zero-value.
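
The point about identity operations on Records can be seen in today's Java with a small sketch. The Point record below is hypothetical; the behaviour shown is that of current identity-based records, not of the value-classes proposed here.

```java
public class RecordIdentitySketch {
    // Hypothetical record, semantically a plain value.
    record Point(int x, int y) {}

    // Value comparison: what the author of the record almost certainly means.
    static boolean sameByEquals() {
        return new Point(1, 2).equals(new Point(1, 2));
    }

    // Identity comparison: compares object identity, not contents.
    static boolean sameByIdentity() {
        Point a = new Point(1, 2);
        Point b = new Point(1, 2);
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(sameByEquals());    // prints "true"
        System.out.println(sameByIdentity());  // prints "false"
    }
}
```

Code that relies on the second comparison is depending on an accident of allocation, which is the kind of bug that declaring the record a value-class is meant to rule out.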



*Migration of value-based classes*

For compatibility with existing code, no value-based class can be tearable,
and somewhat amazingly, not even Double or Long. The reason is that where in
the current model we have a field declaration such as:

ValueBasedClass v = someValue;

v is always reference encoded and, therefore, atomic. In the new model, the
encoding-mode is fully encapsulated, so the only way for v to remain atomic
is for none of the migrated value-based classes to be declared tearable.
For Double and Long, this is a bit awkward, because it means that for these
two primitives, and for them alone, the two field declarations in each of
the following pairs will not be semantically equivalent:

long v;    // tearable
Long! v;   // atomic
double d; // tearable
Double! d; // atomic

Regardless of this peculiarity, the major downside of being forced to make all
value-based classes atomic is that, depending on: target architecture,
primitive bit-size and nullability, we may not get inline encoding where we
otherwise could. So, even though in the new model we can still achieve the
same inlining as before (as the old primitives are still available), in a
few situations, the runtime may have to resort to reference encoding to
ensure atomicity, even if atomicity is not needed. I think this is a
relatively small price to pay for compatibility.
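
To illustrate the long/double case in today's Java: §17.7 allows a write to a non-volatile long or double to be treated as two separate 32-bit writes, while volatile long and double are always read and written atomically. The sketch below declares both kinds of field and computes, on a single thread, the shape a torn 64-bit value would have. The class, field and method names are invented, and whether tearing is ever observed depends on the JVM and hardware.

```java
import java.util.concurrent.atomic.AtomicLong;

public class LongTearingSketch {
    long plain;                  // §17.7: a write may be split into two 32-bit halves
    volatile long atomicField;   // §17.7: volatile long is always written atomically
    final AtomicLong boxed = new AtomicLong();   // library alternative, also atomic

    // Combines the high 32 bits of one write with the low 32 bits of another,
    // which is what a reader of a torn 64-bit value would see.
    static long torn(long highSource, long lowSource) {
        return (highSource & 0xFFFFFFFF00000000L) | (lowSource & 0x00000000FFFFFFFFL);
    }

    public static void main(String[] args) {
        // Writers store -1L and 0L; the torn value is neither of them.
        System.out.println(Long.toHexString(torn(-1L, 0L)));   // prints "ffffffff00000000"
    }
}
```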


*Sample code*

// For brevity, imports and the modifiers public, final, extends and
// implements are omitted.

// Declaration of the primitive wrappers.
zero value class Boolean {...}
zero value class Char {...}
zero value class Byte {...}
zero value class Short {...}
zero value class Integer {...}
zero value class Float {...}
zero value class Long {...}
zero value class Double {...}

// declaration of some value-based classes
zero value class Optional {...}
value class Instant {...}
value class LocalDate {...}

// declaration of some value classes
tearable value class Rational {...}
tearable zero value class Complex {...}

// Fields

class C {
    double _tearable_0d;
    Double! _atomic_0d;
    Integer! _atomic_2i = 2;           // Integer! <==> int
    Instant t_null;
    Instant! t_error;                  // error: Blank field not initialized
    Instant! t = Instant.now();
    LocalDate! ld;                     // error: Blank field not initialized
    atomic Rational! r = new Rational(2, 3);
    final atomic Rational r2;          // error: Final fields are already atomic
    Rational r_null;
    Rational! r3;                      // error: Blank field not initialized
}

// Local Variables and Arrays

var _2zeros_d = new Double![2];
var _3zeros_L = new atomic Long![3];
var ints = new atomic Integer![3];     // error: Integer is already atomic
atomic Complex nullableComplex;        // error: local variables are already atomic
var s_nulls = new String[3];
var s_error = new String![3];          // error: array components not initialized
var letters = new String![]{"a", "b"};
String nullable_letter_a = letters[0];
var nonNullable_letter_b = letters[1];
letters[0] = "z";
letters[1] = null;                     // error: cannot convert from null to String!
var _3emptyOpts = new Optional!<String>[3];


Kind regards,
João Mendonça

More information about the valhalla-spec-observers mailing list