Null-restricted types: Why so complicated?

Fri Jan 19 17:51:56 UTC 2024

On Fri, Jan 19, 2024 at 10:07 AM John Bossons <jbossons at gmail.com> wrote:

> Thanks for your comments. I was not sufficiently explicit.
>
> Let me focus on implicit. I guess my dislike is of introducing a 'fake'
> constructor into the definition of a class. I say 'fake' because, as I
> understand it, the only purpose of the implicit constructor is to indicate
> to the JVM/compiler that a never-null instance can be created. But in Java
> idiom that means that a developer can invoke the public implicit
> constructor, which will cause confusion.
>
It is a real constructor, like the default no-arg one generated by javac,
and developers CAN always invoke it. It is, however, a watered-down
version: the existing default constructor can perform side effects, such as
a field initializer like `private List<Integer> values = new ArrayList<>();`
that javac injects into the constructor, while the implicit one must give
that up so that the JVM can construct zero instances cheaply yet correctly.
As a result, this constructor cannot contain any side-effect code or
declare a custom superconstructor call, so it may look like a "fake" one,
yet it is no different from a true constructor.
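
To make the contrast concrete, here is a minimal sketch in today's Java (no
Valhalla syntax) of the kind of side effect a javac-generated default
constructor carries. The names `DefaultCtorDemo` and `Holder` are made up
for illustration and do not come from the JEP draft.

```java
import java.util.ArrayList;
import java.util.List;

class DefaultCtorDemo {
    // Hypothetical class, for illustration only.
    static final class Holder {
        // javac copies this initializer into the generated no-arg
        // constructor, so constructing a Holder allocates a list.
        private List<Integer> values = new ArrayList<>();

        List<Integer> values() {
            return values;
        }
    }

    public static void main(String[] args) {
        Holder h = new Holder();  // runs the generated default constructor
        // The field was assigned by the constructor, so it is not left at
        // its zero value (null).
        System.out.println(h.values() != null);
    }
}
```

An implicit constructor, as I understand the draft, could not carry such an
initializer: every field would simply keep its zero value (0, false, or
null), which is what lets the JVM produce the instance without running any
code.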

>
> Maybe it would be better to require a potentially null-restricted class to
> extend a marker interface ('extends NeverNullPossible'? Or maybe, looking
> ahead to my next comment, 'extends AllZerosIsNull'?). That would enable the
> compiler to catch an invalid use of the ! marker in a declaration, just as
> the proposed implicit constructor does, while conforming better to common
> Java idiom.
>
Having a zero instance is indeed a capability of the class itself, and such
classes must be final. My only objection to marker interfaces is that I
don't think the Java compiler ever emits an error simply because a class
implements an unsuitable interface.
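
As one example of what I mean, with an existing marker interface: the sketch
below compiles even though the class is plainly unsuitable for
java.io.Serializable, and the problem only shows up at run time. The class
and method names (`MarkerInterfaceDemo`, `Job`, `failsAtRuntime`) are
hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class MarkerInterfaceDemo {
    // Compiles without an error, although Thread is not serializable.
    static final class Job implements Serializable {
        private static final long serialVersionUID = 1L;
        final Thread worker = new Thread(() -> {});
    }

    // Returns true if serializing o fails with NotSerializableException.
    static boolean failsAtRuntime(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return false;
        } catch (NotSerializableException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(failsAtRuntime(new Job()));
    }
}
```

A new marker interface for zero instances would need compiler checks of a
kind that, as far as I know, no existing marker interface gets.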

>
> My further suggestion is that appending ! to a type should mean that the
> default initialized value of an instance (all fields zero) is equivalent to
> null, so that
>       Range![] a = new Range![100];  // allocated with zero values
>       System.out.println(a[5]);      // throws NullPointerException (zero
> fields)
> This better conforms to current idiom, where the initial initialization is
> with nulls and the println invocation on a null array element or field
> throws a NPE.
>
Consider this: how would you differentiate a null Range from a Range[0,
0]? Are both all zero bits?
This is where the zero instance comes in: first of all, the zero instance
has always been a VALID instance of the class, yet its inlined
representation is all zero bits, which means it would coincide with an
all-zero encoding of null. Thus, we introduce the null-restricted concept
so that no null has to be represented at all, avoiding the performance
pitfalls of doing so.
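
The collision can be sketched in plain Java by simulating a flat layout by
hand. Everything here is an assumption for illustration: `Range` is a record
standing in for the value class, and the 64-bit packing (lo in the upper
half, hi in the lower half) is just one layout a VM could pick.

```java
class ZeroVersusNull {
    // Hypothetical stand-in for the Range value class from the thread.
    record Range(int lo, int hi) {}

    // Assumed flat layout: lo in the upper 32 bits, hi in the lower 32 bits.
    static long pack(Range r) {
        return ((long) r.lo() << 32) | (r.hi() & 0xFFFF_FFFFL);
    }

    static Range unpack(long bits) {
        return new Range((int) (bits >>> 32), (int) bits);
    }

    public static void main(String[] args) {
        long[] flat = new long[4];  // freshly allocated: all zero bits
        // A never-written slot decodes as the valid instance Range(0, 0) ...
        System.out.println(unpack(flat[2]));
        // ... and an explicitly stored Range(0, 0) has exactly the same bits.
        System.out.println(pack(new Range(0, 0)) == flat[2]);
    }
}
```

If all-zero bits also had to mean null, a stored Range(0, 0) and an empty
slot would be indistinguishable in this layout.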

Also, adding on to Anh Mai's comment, recall that Range is a value class (a
prerequisite to null-restriction), so its identity doesn't matter; the VM
is fully permitted to inline the nullable range array with 9-byte units (8
bytes + a single bit indicating nullity), and that is still somewhat of a
memory win over pointing to regular objects. But we might need some
fine-grained control to ensure the VM allocates an inlined array instead of
a pointer array here. With such a layout, testing for null and throwing
NPE would simply be checking one bit, which is also more reliable than
scanning the whole byte range of the element.
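
A hand-rolled approximation of such a nullable inlined array, in today's
Java, might look like the sketch below. It is only a model: the class name
`NullableFlatRanges` is invented, and the nullity bits live in a separate
BitSet rather than next to each 8-byte payload as a real VM layout might
place them.

```java
import java.util.BitSet;

class NullableFlatRanges {
    // Hypothetical stand-in for the Range value class from the thread.
    record Range(int lo, int hi) {}

    private final long[] payload;  // 8 bytes of field data per element
    private final BitSet present;  // one nullity bit per element; clear = null

    NullableFlatRanges(int length) {
        payload = new long[length];
        present = new BitSet(length);
    }

    void set(int i, Range r) {
        if (r == null) {
            present.clear(i);
            return;
        }
        payload[i] = ((long) r.lo() << 32) | (r.hi() & 0xFFFF_FFFFL);
        present.set(i);
    }

    Range get(int i) {
        if (!present.get(i)) {
            return null;  // the null test inspects one bit, not the payload
        }
        long bits = payload[i];
        return new Range((int) (bits >>> 32), (int) bits);
    }

    public static void main(String[] args) {
        NullableFlatRanges a = new NullableFlatRanges(100);
        System.out.println(a.get(5));  // never written, so null
        a.set(5, new Range(0, 0));
        System.out.println(a.get(5));  // a real Range(0, 0), distinct from null
    }
}
```

With the extra bit, Range(0, 0) and null no longer share a representation,
at the price of the additional storage per element.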

>
> As you say, my suggestion means runtime testing to determine if all fields
> are zero, which has a performance cost. This will only occur if the JVM
> implements the ! specification, which it presumably will only do if the
> object is small. And the cost will be small (I am presuming) relative to
> savings from allowing the memory footprint to match that of primitives. Am
> I wrong? There is value in conforming to current idiom.
>
> Turning to the LooselyConsistentValue, I withdraw my comments. I
> mistakenly presumed that its use would be required, which is false. It
> simply enables a single-threaded (or volatile-protected) application to
> allow additional inlining, which is harmless.
>

> John
>
> On Thu, Jan 18, 2024 at 4:56 PM - <liangchenblue at gmail.com> wrote:
>
>> Hi John,
>>
>> On Thu, Jan 18, 2024 at 2:30 PM John Bossons <jbossons at gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> Maybe I am missing something, but the proposal seems to be trying to do
>>> too much.
>>>
>>> Specifically: Why not simply provide that appending ! to a type
>>> specification for an object (field, array element, or parameter) means that
>>> the object is not only null-restricted but also never zero and
>>> necessarily non-atomic unless small?
>>>
>> First, a reminder that some objects cannot be non-atomic, mostly when
>> their fields have dependencies or constraints on each other: if you have
>> a range, you cannot allow its lower bound to be larger than its upper
>> bound. Non-atomic representations cannot avoid this pitfall. Also, you
>> seem to misunderstand non-atomic: if an object is non-atomic, each of its
>> fields can be updated independently of the others, so a 3-d position can
>> be non-atomic, but a range cannot. Non-atomicity is dangerous, and it
>> should not be the default. However, if an atomic class is small enough,
>> like OptionalInt (many architectures now offer atomic handling of 16
>> bytes or so), the JVM may choose to apply non-atomic optimizations to it
>> for better performance without violating its object constraints.
>>
>>>
>>> Why complicate the specification with an implicit constructor that a
>>> developer will never explicitly invoke? Why permit a developer to 'opt in'
>>> to non-atomic?
>>>
>> The implicit constructor can always be called; its existence asks
>> programmers to affirm that the zero-filled inlined instance is a valid
>> instance. And this instance is different from a null, as null is a pointer,
>> yet the zero-instance has a different size defined by the class layout in
>> the stack/heap.
>>
>>>
>>> Sure, that means trying to read a zero value triggers a NPE. That just
>>> means that a type that can legitimately have a zero value cannot be
>>> specified as null-restricted, since a zero value (e.g. a {null, null} Name)
>>> is the equivalent of a null unrestricted value object. Why go beyond that?
>>> If a non-null zero value is possible, the type cannot be null-restricted
>>> and so can only be an unrestricted JEP 401 value type. End of story.
>>>
>> You see, the inlined zero instance and the null pointer have different
>> sizes, and thus they are not interchangeable. Converting the inlined zero
>> instance to null to throw NPE is complex and hurts performance, as you
>> would scan unrelated bits on almost every field access.
>>
>> And as for unrestricted value types, yes, they exist and can possibly be
>> inlined as well if the type is small enough (i.e. it has space for an
>> extra bit indicating nullity). But a reminder: the nullity bit itself
>> cannot be non-atomic with respect to the rest of the object, because the
>> two depend on each other! You don't want the nullity bit to indicate null
>> while the rest of the object indicates some sort of non-null value, which
>> can happen in a non-atomic context.
>>
>>>
>>> With respect to non-atomic, what is new? Yes, unexpected  instances may
>>> occur without synchronization if the object is larger than the word size of
>>> the implementation. Why do we need to extend a LooselyConsistentValue
>>> interface to know/permit that?
>>>
>> Unexpected instances don't occur without synchronization if you use
>> finals, such as in Java's String or immutable List.of(). These APIs may
>> capture any "permitted value" from the arrays passed in, but once
>> constructed, the captured value remains constant no matter which thread
>> observes the String/List object reference. (Technically, the JVM
>> implements this with a store-store fence between the end of the field
>> writes in the constructor and the point where the object reference is
>> shared, and a load-load fence between the object reference read and the
>> field read.) Value classes are about the safety of final fields in
>> programming, rather than close encounters of the third kind with
>> synchronization, volatiles, and fences.
>>
>>>
>>> Can we not keep this 'simple' (if that word has meaning in this
>>> context)? What am I missing?
>>>
>> I think you are missing a bit about how the layout (how inlining is
>> represented in memory) and value classes (the thread safety their final
>> fields offer) work, and what "non-atomic" means. Feel free to ask more
>> questions.
>>
>>>
>>> John
>>>
>>>
>>> --
>>> Phone:  (416) 450-3584 (cell)
>>>
>>
>
> --
> Phone:  (416) 450-3584 (cell)
>