Null-restricted types: Why so complicated?

Fri Jan 19 20:27:18 UTC 2024

Hi John,
The implicit constructor really exists to indicate that the zero default is
available. The ability to null-restrict a field or array element follows
from that zero default, because the default can stand in for null when all
you have is a zero-filled byte representation of an object. You shouldn't
declare such a constructor solely to gain the ability to null-restrict.

From your previous comments I can see that you like null as a default
because it ensures safety, and the Valhalla designers do too! The point is
that while null-restriction is alluring as a way to achieve maximum
optimization, you shouldn't always go for it. An example from Valhalla is
LocalDateTime: despite being value-based and a candidate for value class
migration, it should not have a zero instance, because a zero at 1970-01-01
is a valid value and, unlike null, is not suitable for indicating an
invalid date-time. Likewise, if we don't want a null Range to be
interpreted as [0, 0], we won't declare Range with an implicit constructor.
A positive example for an implicit constructor would be post-migration
Optional: the default instance will contain a null pointer, equivalent to
the current Optional.empty(), and it's a good default value in place of
null.
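
To make the contrast concrete, here is a minimal sketch in today's Java (no
Valhalla syntax). It assumes, for illustration only, that an all-zero
LocalDateTime would decode as epoch second 0 in UTC; ZeroDefaultDemo and its
methods are hypothetical names, not part of any proposal.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Optional;

// Hypothetical demo class; it only mimics what a "zero instance" would mean.
class ZeroDefaultDemo {
    // Assumption: a zero-filled date-time reads back as epoch second 0 (UTC).
    static LocalDateTime zeroDateTime() {
        return LocalDateTime.ofEpochSecond(0L, 0, ZoneOffset.UTC);
    }

    // A "zero" Optional wraps a null pointer, which is what Optional.empty() is today.
    static Optional<String> zeroOptional() {
        return Optional.ofNullable(null);
    }

    public static void main(String[] args) {
        System.out.println(zeroDateTime());             // a perfectly valid timestamp
        System.out.println(zeroOptional().isPresent()); // the "no value" case
    }
}
```

The first value cannot double as "no date-time" because it is a real
timestamp, while the second already means "no value", which is why only the
latter is a good candidate for an implicit constructor.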

So one might ask: without null restriction, don't we lose many chances for
optimization? Luckily, modern hardware allows our N (Maurizio's flattening
upper limit) to be quite large, and sometimes there are free bits in the
layout (for example, from boolean fields) that allow us to represent null
without extra space.
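
As a rough illustration of the "free bits" idea, the sketch below packs a
hypothetical value class { int count; boolean flag; } into a single long and
uses a spare bit next to the boolean to mark null. The class name, the bit
positions, and the encoding are all my assumptions; this is not how HotSpot
actually lays out objects.

```java
// Hypothetical packing of { int count; boolean flag; } into one long.
class SpareBitNullDemo {
    private static final long FLAG_BIT = 1L << 32; // the one bit the boolean really needs
    private static final long NULL_BIT = 1L << 33; // a spare bit in the boolean's byte

    // Encoded form of null: only the spare bit is set.
    static final long NULL = NULL_BIT;

    // All-zero bits encode (0, false), a valid non-null value.
    static long encode(int count, boolean flag) {
        return (count & 0xFFFF_FFFFL) | (flag ? FLAG_BIT : 0L);
    }

    static boolean isNull(long bits) {
        return (bits & NULL_BIT) != 0;
    }

    static int count(long bits) {
        if (isNull(bits)) throw new NullPointerException("null element");
        return (int) bits; // the low 32 bits hold the int field
    }

    static boolean flag(long bits) {
        if (isNull(bits)) throw new NullPointerException("null element");
        return (bits & FLAG_BIT) != 0;
    }
}
```

The null check here is a single bit test, and the nullable encoding takes no
more space than the non-null one.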

As for your question about a Long![] array, it will be almost the same as a
long[] array. Perhaps the only difference is that, specification-wise,
Long! won't allow word tearing while long does, a relic from the early days
of the Java programming language. According to Valhalla expert group
members, the last platform that really needed to tear long was some robot
platform from around 2013, so I think asking for more de jure safety in
Long! compared to long is not a bad thing. If you need nulls, you still
have to resort to a Long[] array or an OptionalLong![] array.
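
For reference, this is how the closest equivalents behave in today's Java.
Long![] and OptionalLong![] do not exist yet, so the sketch only shows the
defaults of long[] and Long[], and emulates an "empty by default"
OptionalLong array by filling it explicitly; LongArrayDefaultsDemo and its
methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.OptionalLong;

// Hypothetical demo class showing default array contents in current Java.
class LongArrayDefaultsDemo {
    // A long[] starts zero-filled; there is no way to tell "unset" from 0.
    static long primitiveDefault() {
        long[] a = new long[4];
        return a[0];
    }

    // A Long[] starts null-filled; unset elements are detectable.
    static Long boxedDefault() {
        Long[] a = new Long[4];
        return a[0];
    }

    // Emulation only: a null-restricted OptionalLong array would presumably
    // start out as empty optionals; today we have to fill it by hand.
    static OptionalLong[] emptyFilled(int n) {
        OptionalLong[] a = new OptionalLong[n];
        Arrays.fill(a, OptionalLong.empty());
        return a;
    }
}
```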

On Fri, Jan 19, 2024 at 1:21 PM John Bossons <jbossons at gmail.com> wrote:

> Hi Liang Chen,
>
> I totally agree with your earlier comment re more efficient checking. The
> big win comes from value classes. Null-restricted classes are a lesser but
> useful additional win. But I remain concerned about catching an undefined
> element of a Long![] array. Maybe we just have to wait for 128-bit word
> sizes, so we don't need null restriction for boxed primitives!
>
> I think you are convincing me that any array of 'large' null-restricted
> elements is unsafe. Maybe that's the answer -- Leave null-restricted types
> as defined, but don't implement arrays of null-restricted elements larger
> than ints! Or, to put this in more acceptable terms, let the JVM make
> that decision.
>
> John
>
> PS:  I still maintain that use of a public implicit constructor to
> indicate null-restriction is not a good choice, because of the constraints
> it imposes on developers. See Effective Java, Item 1, for the design
> pattern it renders ineffective. A marker interface (analogous to
> Serializable or Iterable) would avoid this effect.
>
> On Fri, Jan 19, 2024 at 12:52 PM - <liangchenblue at gmail.com> wrote:
>
>>
>>
>> On Fri, Jan 19, 2024 at 10:07 AM John Bossons <jbossons at gmail.com> wrote:
>>
>>> Thanks for your comments. I was not sufficiently explicit.
>>>
>>> Let me focus on implicit. I guess my dislike is of introducing a 'fake'
>>> constructor into the definition of a class. I say 'fake' because, as I
>>> understand it, the only purpose of the implicit constructor is to indicate
>>> to the JVM/compiler that a never-null instance can be created. But in Java
>>> idiom that means that a developer can invoke the public implicit
>>> constructor, which will cause confusion.
>>>
>> It is a real constructor, like the default no-arg one generated by javac,
>> and developers always CAN invoke it. It is, however, a watered-down
>> version: our existing default constructor can perform side effects, such
>> as a `private List<Integer> values = new ArrayList<>();` field initializer
>> injected into the constructor, while the implicit one must give that up so
>> that the JVM can construct zero instances cheaply yet correctly. As a
>> result, this constructor cannot declare any side-effect code or a custom
>> superconstructor call, so it will look like a "fake" one, yet it is no
>> different from a true constructor.
>>
>>>
>>> Maybe it would be better to require a potentially null-restricted class
>>> to extend a marker interface ('extends NeverNullPossible'? Or maybe,
>>> looking ahead to my next comment, 'extends AllZerosIsNull'?). That would
>>> enable the compiler to catch an invalid use of the ! marker in a
>>> declaration, just as the proposed implicit constructor does, while
>>> conforming better to common Java idiom.
>>>
>> A zero instance is indeed a class capability, and such classes must be
>> final. My only objection to marker interfaces is that I don't think the
>> Java compiler ever emits errors simply because your class implements an
>> unsuitable interface.
>>
>>>
>>> My further suggestion is that appending ! to a type should mean that the
>>> default initialized value of an instance (all fields zero) is equivalent to
>>> null, so that
>>>       Range![] a = new Range![100];  // allocated with zero values
>>>       System.out.println(a[5]);      // throws NullPointerException (zero fields)
>>> This better conforms to current idiom, where the initial initialization
>>> is with nulls and the println invocation on a null array element or field
>>> throws a NPE.
>>>
>> Consider this: how would you differentiate a null Range versus a Range[0,
>> 0]? Are both all zero bits?
>> This is where the zero instance starts: before anything, the zero
>> instance has always been a VALID instance of an object, yet its inlined
>> representation will be all zero bits, which means it will coincide with
>> null; thus, we introduce the null-restricted concept to avoid the
>> performance pitfalls we will suffer to represent a null.
>>
>> Also, adding on to Anh Mai's comment, recall that Range is a value class
>> (a prerequisite to null-restriction), so its identity doesn't matter; the
>> VM is totally permitted to inline the null-friendly Range array with
>> 9-byte units (8 bytes plus a single bit indicating nullity), and that is
>> still somewhat of a memory win over linking to regular objects. But we
>> might need some fine-grained control to ensure the VM allocates an inlined
>> array instead of a pointer array in this case. Here, testing for null and
>> throwing an NPE would simply mean checking one bit, which is also more
>> reliable than scanning a whole byte interval.
>>
>>>
>>> As you say, my suggestion means runtime testing to determine if all
>>> fields are zero, which has a performance cost. This will only occur if the
>>> JVM implements the ! specification, which it presumably will only do if the
>>> object is small. And the cost will be small (I am presuming) relative to
>>> savings from allowing the memory footprint to match that of primitives. Am
>>> I wrong? There is value in conforming to current idiom.
>>>
>>> Turning to the LooselyConsistentValue, I withdraw my comments. I
>>> mistakenly presumed that its use would be required, which is false. It
>>> simply enables a single-threaded (or volatile-protected) application to
>>> allow additional inlining, which is harmless.
>>>
>>
>>> John
>>>
>>> On Thu, Jan 18, 2024 at 4:56 PM - <liangchenblue at gmail.com> wrote:
>>>
>>>> Hi John,
>>>>
>>>> On Thu, Jan 18, 2024 at 2:30 PM John Bossons <jbossons at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Maybe I am missing something, but the proposal seems to be trying to
>>>>> do too much.
>>>>>
>>>>> Specifically: Why not simply provide that appending ! to a type
>>>>> specification for an object (field, array element, or parameter) means
>>>>> that the object is not only null-restricted but also never zero and
>>>>> necessarily non-atomic unless small?
>>>>>
>>>> First, a reminder that some objects cannot be non-atomic, mostly when
>>>> fields have dependencies/constraints on each other: if you have a range,
>>>> you cannot allow its lower bound to be larger than its upper bound.
>>>> Non-atomic representations cannot avoid this pitfall. Also, you seem
>>>> to misunderstand non-atomic: if an object is non-atomic, each of its
>>>> fields can be updated independently of the others, so a 3-d position can
>>>> be non-atomic, but a range cannot. Non-atomicity is dangerous, and it
>>>> should not be the default. However, if an atomic class is small enough,
>>>> like OptionalInt (many architectures now have atomic handling of 16
>>>> bytes, etc.), the JVM may choose to apply non-atomic optimizations to it
>>>> for better performance without violating its object constraints.
>>>>
>>>>>
>>>>> Why complicate the specification with an implicit constructor that a
>>>>> developer will never explicitly invoke? Why permit a developer to 'opt in'
>>>>> to non-atomic?
>>>>>
>>>> The implicit constructor can always be called; its existence asks
>>>> programmers to affirm that the zero-filled inlined instance is a valid
>>>> instance. And this instance is different from a null, as null is a pointer,
>>>> yet the zero-instance has a different size defined by the class layout in
>>>> the stack/heap.
>>>>
>>>>>
>>>>> Sure, that means trying to read a zero value triggers a NPE. That just
>>>>> means that a type that can legitimately have a zero value cannot be
>>>>> specified as null-restricted, since a zero value (e.g. a {null, null} Name)
>>>>> is the equivalent of a null unrestricted value object. Why go beyond that?
>>>>> If a non-null zero value is possible, the type cannot be null-restricted
>>>>> and so can only be an unrestricted JEP 401 value type. End of story.
>>>>>
>>>> You see, the inlined zero instance and the null pointer have different
>>>> sizes, and thus they are not interchangeable. Converting the inlined
>>>> zero instance to null in order to throw an NPE is complex and hurts
>>>> performance, as you would scan unrelated bits on almost every field
>>>> access.
>>>>
>>>> And as for unrestricted value types: yes, they exist and can possibly be
>>>> inlined as well if the restricted type is small enough (i.e. has space
>>>> for an extra bit indicating nullity). But a reminder: the nullity bit
>>>> itself cannot be non-atomic with respect to the rest of the object,
>>>> because it depends on it! You don't want the nullity bit to indicate
>>>> null while the rest of the object indicates some sort of non-null value,
>>>> which can happen in a non-atomic context.
>>>>
>>>>>
>>>>> With respect to non-atomic, what is new? Yes, unexpected  instances
>>>>> may occur without synchronization if the object is larger than the word
>>>>> size of the implementation. Why do we need to extend a
>>>>> LooselyConsistentValue interface to know/permit that?
>>>>>
>>>> Unexpected instances don't occur without synchronization if you use
>>>> finals, such as in Java's String or immutable List.of(). These APIs may
>>>> capture any "permitted value" from the arrays passed in, but once
>>>> constructed, the captured value remains constant no matter which thread
>>>> observes the String/List object reference. (Technically, the JVM
>>>> implements this with a store-store fence between the end of the field
>>>> writes in the constructor and the point where the object reference is
>>>> shared, and a load-load fence between the object reference read and the
>>>> field read.) Value classes are about the safety of final fields in
>>>> programming rather than close encounters of the third kind with
>>>> synchronization, volatiles, and fences.
>>>>
>>>>>
>>>>> Can we not keep this 'simple' (if that word has meaning in this
>>>>> context)? What am I missing?
>>>>>
>>>> I think you are missing a bit about how the layout (how inlining is
>>>> represented in memory) and value classes (the thread safety their final
>>>> fields offer) work, and what "non-atomic" means. Feel free to ask more
>>>> questions.
>>>>
>>>>>
>>>>> John
>>>>>
>>>>>
>>>>> --
>>>>> Phone:  (416) 450-3584 (cell)
>>>>>
>>>>
>>>
>>>
>>
>
>