No subject
João Mendonça
jf.mend at gmail.com
Thu Jun 16 11:17:25 UTC 2022
Hello,
I would like to bring to your consideration the following set of
observations and user-model suggestions, in the hope that they will bring
some useful ideas to the development of the Valhalla project.
*Definition*
*shared-mutable* - a variable that is mutable (non-final) and can be shared
between threads; shared-mutables are the non-final subset of the
shared-variables (§17.4.1.
<https://docs.oracle.com/javase/specs/jls/se18/html/jls-17.html#jls-17.4.1>)
*Observations*
Shared-mutables are the only variables that have these two apparently
independent properties:
1. lack definite-assignment (§16.
<https://docs.oracle.com/javase/specs/jls/se18/html/jls-16.html>) - the
variable is initialized with a default-value if not definitely-assigned (
§4.12.5.
<https://docs.oracle.com/javase/specs/jls/se18/html/jls-4.html#jls-4.12.5>
)
2. allow data-races - the variable may be read/written while being
written by another thread, with both events happening in an unpredictable
order (§17.4.5.
<https://docs.oracle.com/javase/specs/jls/se18/html/jls-17.html#jls-17.4.5>
)
Via properties 1 and 2, nullability and encoding-mode, respectively, affect the
semantics of variables in a way that is unique to shared-mutables:
1. if not definitely-assigned, the variable is initialized with:
- if nullable: the null value, regardless of type
- if not nullable: the zero-value of the type
2. in a data-race, the value read/written:
- if reference: has unpredictable origin in *one* of the various
writes
- if inline, either:
- has unpredictable origin in one of the various writes
- is torn i.e. has distinct internal parts with separate
unpredictable origins in *more than one* of the various writes (
§17.7.
<https://docs.oracle.com/javase/specs/jls/se18/html/jls-17.html#jls-17.7>
)
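To make the difference between the two encodings concrete, here is a minimal single-threaded simulation in today's Java. The names Pair, FlatSlot and TearingSketch are hypothetical, and the interleaving is scripted by hand rather than produced by real threads, so it only illustrates what a torn read would look like under inline encoding.

```java
// Hypothetical illustration only: a scripted interleaving, not a real data-race.
class Pair {
    final int lo;
    final int hi;
    Pair(int lo, int hi) { this.lo = lo; this.hi = hi; }
}

// Models inline encoding: the two parts live side by side and are written one at a time.
class FlatSlot {
    int lo;
    int hi;
    void writeFirstHalf(Pair p) { lo = p.lo; }
    void writeSecondHalf(Pair p) { hi = p.hi; }
    Pair read() { return new Pair(lo, hi); }
}

class TearingSketch {
    // Reads the slot halfway through overwriting `previous` with `next`.
    static Pair readMidWrite(Pair previous, Pair next) {
        FlatSlot slot = new FlatSlot();
        slot.writeFirstHalf(previous);
        slot.writeSecondHalf(previous);
        slot.writeFirstHalf(next);   // the writer is "interrupted" here
        Pair seen = slot.read();     // the reader observes parts of both writes
        slot.writeSecondHalf(next);
        return seen;
    }

    public static void main(String[] args) {
        Pair seen = readMidWrite(new Pair(1, 1), new Pair(2, 2));
        System.out.println(seen.lo + "," + seen.hi);
    }
}
```

With (1,1) being overwritten by (2,2), the scripted read sees (2,1), a value that no writer ever produced. Under reference encoding the slot would hold a single pointer, so a reader could only ever see one complete Pair or the other.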
These are the 3 kinds of shared-mutable variables (§4.12.3.
<https://docs.oracle.com/javase/specs/jls/se18/html/jls-4.html#jls-4.12.3>):
- non-final class variables
- non-final instance variables
- array components
The remaining kinds of variables don't have any of the above properties:
- final class variables
- final instance variables
- method parameters
- constructor parameters
- lambda parameters
- exception parameters
- local variables
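The split between the two lists can be checked in today's Java. The following is a minimal sketch with a hypothetical class DefaultsSketch: the three shared-mutable kinds silently receive a default-value, while a local variable must be definitely-assigned before use.

```java
// Hypothetical class used only to show which variables receive default-values.
class DefaultsSketch {
    static int classCounter;   // non-final class variable: defaults to 0
    String instanceName;       // non-final instance variable: defaults to null
    double instanceRatio;      // non-final instance variable: defaults to 0.0

    static int[] freshInts() {
        return new int[2];     // array components: default to 0
    }

    static String[] freshStrings() {
        return new String[2];  // array components: default to null
    }

    static int localMustBeAssigned() {
        int local;
        // return local;       // would not compile: local might not have been initialized
        local = 7;
        return local;
    }

    public static void main(String[] args) {
        DefaultsSketch d = new DefaultsSketch();
        System.out.println(classCounter + " " + d.instanceName + " " + d.instanceRatio);
        System.out.println(freshInts()[0] + " " + freshStrings()[0]);
        System.out.println(localMustBeAssigned());
    }
}
```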
*User-model*
For class-authors:
- *value-knob* to reject identity - Applicable on class declarations, if
used by a class-author to indicate that the class instances don't require
identity (a value-class), the runtime will be free to copy these values and
choose between reference or inline encoding everywhere except in
shared-mutables, as doing so does not introduce any semantic changes to the
program. In shared-mutables, however, value-instances can only be inlined
if atomicity is guaranteed, which will depend on the hardware and the
variable bit-size (value plus nullability).
- *tearable-knob* to allow tearing - Applicable on value-class
declarations, may be used by the class-author to hand the class-user the
responsibility of how to avoid tearing, freeing the runtime to always
inline instances in shared-mutables (bikeshedding: when dealing with
tearable value-classes, the "terrible" sound can work as a warning for the
dangers of neglecting this responsibility). Conversely, if this knob is not
used, instances will be kept atomic, which allows the class-author to
guarantee constructor invariants, which may be useful for the class
implementation and class-users to rely upon.
- *zero-knob* to allow using the zero-value as a default - by omission,
the class-author will force shared-mutables of this type to either be
definitely-assigned or nullable.
For class-users:
- *not-nullable-knob* to exclude null from a variable's value-set -
Applicable on any variable declaration. Since identity-types lack a
zero-value, any non-nullable shared-mutable with an identity-type must be
definitely-assigned. For nullable variables, in either encoding-mode, the
runtime is free to choose the encoding for the extra bit of information
required to represent null. In shared-mutables that are not
definitely-assigned, this knob controls the default-value: either null or
the zero-value of the type.
- *atomic-knob* to avoid tearing - Applicable on shared-mutable
declarations, may be used by the class-user to reverse the effect of the
tearable-knob, thereby restoring atomicity.
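As a rough analogy for the atomic-knob, this is how a class-user can keep a multi-field value atomic in today's Java: the value is immutable and is only ever published through a single reference. The classes Rational and AtomicHolder below are hypothetical stand-ins, and java.util.concurrent.atomic.AtomicReference is used here as one possible choice, not as the mechanism the proposal would rely on.

```java
// Hypothetical stand-in for a value class with a constructor invariant.
class Rational {
    final long num;
    final long den;
    Rational(long num, long den) {
        if (den == 0) throw new IllegalArgumentException("denominator must not be zero");
        this.num = num;
        this.den = den;
    }
}

// Hypothetical holder: readers only ever see a fully constructed Rational.
class AtomicHolder {
    private final java.util.concurrent.atomic.AtomicReference<Rational> current =
            new java.util.concurrent.atomic.AtomicReference<>(new Rational(0, 1));

    Rational get() { return current.get(); }

    void set(Rational r) { current.set(r); }

    // Replaces the value only if it is still the instance the caller last saw.
    boolean replace(Rational expected, Rational next) {
        return current.compareAndSet(expected, next);
    }
}
```

Because num and den travel together behind one reference, no reader can observe the numerator of one write combined with the denominator of another, so the constructor invariant holds for every value read.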
*Nullable types*
- For compatibility, we cannot have a nullable-knob in the new
user-model since unadorned types must remain nullable as they are now
- (!) as a non-nullable-knob is pretty concise although not very readable
- In method bodies, var will mitigate the majority of the noise of (!)
- In method signatures, the proliferation of (!) in arguments and return
types will look ugly
- The compiler will be able to help us avoid the majority of
NullPointerExceptions
- Old APIs can compatibly update return types to be non-nullable where
appropriate, which is more convenient for new client code. Also, removing
the nullability overhead may increase performance. Ex: Stream::findAny
can be updated to return Optional!<T>
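For comparison, this is roughly what the same guarantees look like in today's Java, where they can only be enforced at run time. The class NonNullSketch and its methods are hypothetical; Objects.requireNonNull stands in for what a String! parameter would let the compiler check, and anyWord returns a plain Optional<String> whose reference is itself nullable as far as the type system is concerned.

```java
// Hypothetical sketch: run-time stand-ins for checks that (!) would move to compile time.
class NonNullSketch {
    // Stand-in for a `String!` parameter: null is rejected at the method boundary.
    static int lengthOf(String text) {
        java.util.Objects.requireNonNull(text, "text must not be null");
        return text.length();
    }

    // Today's shape of Stream::findAny: callers get an Optional<String>.
    static java.util.Optional<String> anyWord(java.util.List<String> words) {
        return words.stream().findAny();
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("abc"));
        System.out.println(anyWord(java.util.Collections.<String>emptyList()).isPresent());
    }
}
```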
*Zero-knob vs no-zero-knob*
I am going with the zero-knob because I feel it gives us the safest and
most common default:
- Allows a more cautious API introduction - A late addition of the
zero-default to a class doesn't break client code, but a late removal does.
- Definite-assignment is safe - Without a zero-default, class-users are
forced to definitely-assign their shared-mutables, preventing
missed-initialization-bugs.
- It's the right default for value Records - The vast majority of
Records are semantically value classes, since using any identity operations
on them would be a bug (locking or identity comparison). Making these
Records value-classes will prevent such bugs. So I am predicting that the
vast majority of value-classes written by average developers will be
Records, which mostly don't have a sensible zero-value.
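Both points in the last bullet can be seen with an ordinary class in today's Java. The sketch below uses a hypothetical Fraction class (a plain identity class, not a Record or a value class): identity comparison of equal values gives a surprising answer, and the all-zero instance is exactly the one the constructor forbids.

```java
// Hypothetical value-like class, written as an ordinary identity class.
class Fraction {
    final long num;
    final long den;

    Fraction(long num, long den) {
        if (den == 0) throw new IllegalArgumentException("denominator must not be zero");
        this.num = num;
        this.den = den;
    }

    @Override public boolean equals(Object o) {
        return o instanceof Fraction
                && ((Fraction) o).num == num
                && ((Fraction) o).den == den;
    }

    @Override public int hashCode() {
        return 31 * Long.hashCode(num) + Long.hashCode(den);
    }
}

class ZeroDefaultSketch {
    // Identity comparison of two equal values: false today, and almost always a bug.
    static boolean sameIdentity() { return new Fraction(1, 2) == new Fraction(1, 2); }

    static boolean sameValue() { return new Fraction(1, 2).equals(new Fraction(1, 2)); }

    // A zero-default would produce 0/0, which the constructor rejects.
    static boolean zeroIsConstructible() {
        try {
            new Fraction(0, 0);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }
}
```

A class like this has no sensible zero-value, so under the proposal it would be declared without the zero-knob, and its shared-mutables would have to be definitely-assigned or nullable.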
*Migration of value-based classes*
For compatibility with existing code, no value-based class can be tearable,
and somewhat amazingly, not even Double or Long. The reason is that where in
the current model we have a field declaration such as:
ValueBasedClass v = someValue;
v is always reference encoded and, therefore, atomic. In the new model, the
encoding-mode is fully encapsulated, so the only way for v to remain atomic
is for none of the migrated value-based classes to be declared tearable.
For Double and Long, this is a bit awkward, because it means that for these
two primitives, and for them alone, the two field declarations in each of
the following pairs will not be semantically equivalent:
long v; // tearable
Long! v; // atomic
double d; // tearable
Double! d; // atomic
Regardless of this peculiarity, the major downside of being forced to make all
value-based classes atomic is that, depending on the target architecture, the
primitive bit-size and nullability, we may not get inline encoding where we
otherwise could. So, even though in the new model we can still achieve the
same inlining as before (as the old primitives are still available), in a
few situations, the runtime may have to resort to reference encoding to
ensure atomicity, even if atomicity is not needed. I think this is a
relatively small price to pay for compatibility.
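The long/Long asymmetry already has a counterpart in today's Java, per §17.7. The following is a minimal sketch with a hypothetical class LongAtomicitySketch; the torn method only computes what a reader could observe if a 64-bit write were split into two 32-bit halves, it does not provoke a real race.

```java
// Hypothetical sketch of the three atomicity levels available for a 64-bit field today.
class LongAtomicitySketch {
    long plainValue;           // §17.7: a write may be performed as two 32-bit writes
    volatile long atomicValue; // §17.7: writes and reads of volatile long are always atomic
    Long boxedValue = 0L;      // a reference: writes and reads are always atomic

    // Value seen if the high half of newValue has landed but the low half has not.
    static long torn(long oldValue, long newValue) {
        long highOfNew = newValue & 0xFFFFFFFF00000000L;
        long lowOfOld = oldValue & 0x00000000FFFFFFFFL;
        return highOfNew | lowOfOld;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(torn(0L, -1L)));
    }
}
```

Overwriting 0 with -1 could, on an implementation that splits the write, expose a value that is neither of the two. Whether a given JVM actually splits non-volatile 64-bit writes depends on the implementation and the hardware.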
*Sample code*
// For brevity, imports and the modifiers public, final, extends and
// implements are omitted.
// Declaration of the primitive wrappers.
zero value class Boolean {...}
zero value class Character {...}
zero value class Byte {...}
zero value class Short {...}
zero value class Integer {...}
zero value class Float {...}
zero value class Long {...}
zero value class Double {...}
// declaration of some value-based classes
zero value class Optional {...}
value class Instant {...}
value class LocalDate {...}
// declaration of some value classes
tearable value class Rational {...}
tearable zero value class Complex {...}
// Fields
class C {
double _tearable_0d;
Double! _atomic_0d;
Integer! _atomic_2i = 2; // Integer! <==> int
Instant t_null;
Instant! t_error; // error: Blank field not initialized
Instant! t = Instant.now();
LocalDate! ld; // error: Blank field not initialized
atomic Rational! r = new Rational(2, 3);
final atomic Rational r2; // error: Final fields are already atomic
Rational r_null;
Rational! r3; // error: Blank field not initialized
}
// Local Variables and Arrays
var _2zeros_d = new Double![2];
var _3zeros_L = new atomic Long![3];
var ints = new atomic Integer![3]; // error: Integer is already atomic
atomic Complex nullableComplex; // error: local variables are already atomic
var s_nulls = new String[3];
var s_error = new String![3]; // error: array components not initialized
var letters = new String![]{"a", "b"};
String nullable_letter_a = letters[0];
var nonNullable_letter_b = letters[1];
letters[0] = "z";
letters[1] = null; // error: cannot convert from null to String!
var _3emptyOpts = new Optional!<String>[3];
Kind regards,
João Mendonça
More information about the valhalla-spec-observers
mailing list