Null-restricted types: Why so complicated?

John Bossons jbossons at gmail.com
Fri Jan 19 19:21:21 UTC 2024


Hi Liang Chen,

I totally agree with your earlier comment re more efficient checking. The
big win comes from value classes. Null-restricted classes are a lesser but
useful additional win. But I remain concerned about catching an undefined
element of a Long![] array. Maybe we just have to wait for 128-bit word
sizes, so we don't need null restriction for boxed primitives!

I think you are convincing me that any array of 'large' null-restricted
elements is unsafe. Maybe that's the answer -- leave null-restricted types
as defined, but don't implement arrays of null-restricted elements larger
than an int! Or, to put this in more acceptable terms, let the JVM make
that decision.

John

PS:  I still maintain that use of a public implicit constructor to indicate
null-restriction is not a good choice, because of the constraints it
imposes on developers. See Effective Java, Item 1, for the design pattern
it renders ineffective. A marker interface (analogous to Serializable or
Cloneable) would avoid this effect.
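To make the Item 1 concern concrete, here is a rough sketch (the class is
standard Java; the `public implicit Celsius();` line uses the prototype
syntax as I understand it and does not compile on a released JDK; all
names are mine, purely for illustration):

    // Today an author can hide the constructor and funnel all creation
    // through validating static factories (Effective Java, Item 1):
    public final class Celsius {
        private final double value;
        private Celsius(double value) { this.value = value; }
        public static Celsius of(double value) {
            if (Double.isNaN(value)) throw new IllegalArgumentException("NaN");
            return new Celsius(value);
        }
    }

    // Under the proposal, making Celsius eligible for null-restriction
    // (it would also have to be a value class) would require declaring
    // something like
    //     public implicit Celsius();
    // i.e. a public, argument-less, validation-free constructor that any
    // caller may invoke, bypassing of() entirely.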

On Fri, Jan 19, 2024 at 12:52 PM - <liangchenblue at gmail.com> wrote:

>
>
> On Fri, Jan 19, 2024 at 10:07 AM John Bossons <jbossons at gmail.com> wrote:
>
>> Thanks for your comments. I was not sufficiently explicit.
>>
>> Let me focus on implicit. I guess my dislike is of introducing a 'fake'
>> constructor into the definition of a class. I say 'fake' because, as I
>> understand it, the only purpose of the implicit constructor is to indicate
>> to the JVM/compiler that a never-null instance can be created. But in Java
>> idiom that means that a developer can invoke the public implicit
>> constructor, which will cause confusion.
>>
> It is a real constructor, like the default no-arg one generated by javac,
> and developers always CAN invoke it. It is, however, a watered-down
> version: the existing default constructor can run side effects such as the
> field initializer `private List<Integer> values = new ArrayList<>();`
> injected into the generated constructor, while the implicit one must give
> that up so that the JVM can construct zero instances cheaply yet
> correctly. As a result, this constructor cannot declare any side-effect
> code or a custom superconstructor call, so it may look like a "fake" one,
> yet it is no different from a true constructor.
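> For example, a minimal sketch (the `value`/`implicit` keywords follow the
> prototype syntax and won't compile on a released JDK; the class names are
> just illustrative):
>
>     import java.util.ArrayList;
>     import java.util.List;
>
>     class Holder {
>         // this initializer is code javac injects into the generated
>         // no-arg constructor -- a side effect that constructor performs
>         private List<Integer> values = new ArrayList<>();
>     }
>
>     value class Point {
>         private final int x, y;
>         // no body, no field initializers, no custom super call: the JVM
>         // may materialize the all-zero Point without running any code
>         public implicit Point();
>     }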
>
>>
>> Maybe it would be better to require a potentially null-restricted class
>> to implement a marker interface ('implements NeverNullPossible'? Or maybe,
>> looking ahead to my next comment, 'implements AllZerosIsNull'?). That would
>> enable the compiler to catch an invalid use of the ! marker in a
>> declaration, just as the proposed implicit constructor does, while
>> conforming better to common Java idiom.
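>> Roughly, what I have in mind is (the interface name is just my placeholder
>> from above, and `value class` is the Valhalla prototype syntax; this is a
>> sketch of the idea, not proposed syntax):
>>
>>     public interface NeverNullPossible {}   // marker: no members
>>
>>     value class Range implements NeverNullPossible {
>>         private final int lo, hi;
>>         // ... constructors, accessors ...
>>     }
>>
>>     // The compiler could then reject a declaration like `Foo!` whenever
>>     // Foo does not implement the marker.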
>>
> A zero instance is a class capacity indeed, and such classes must be
> final. My only cent against marker interfaces is that I don't think Java
> compiler ever emits errors simply because your class implements an
> unsuitable interface.
>
>>
>> My further suggestion is that appending ! to a type should mean that the
>> default initialized value of an instance (all fields zero) is equivalent to
>> null, so that
>>       Range![] a = new Range![100];   // allocated with zero values
>>       System.out.println(a[5]);       // throws NullPointerException (zero fields)
>> This better conforms to current idiom, where the array is initially
>> filled with nulls and the println invocation on a null array element or
>> field throws an NPE.
>>
> Consider this: how would you differentiate a null Range from a
> Range[0, 0]? Are both all zero bits?
> This is where the zero instance comes in: first of all, the zero instance
> has always been a VALID instance of the class, yet its inlined
> representation will be all zero bits, which means it will coincide with
> null; thus we introduce the null-restricted concept to avoid the
> performance pitfalls we would suffer in order to represent a null.
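> To put it concretely (a rough sketch; the flattened layout shown is an
> assumption about what an inlining VM might do, not anything the spec
> promises):
>
>     value class Range {                  // prototype Valhalla syntax
>         private final int lo, hi;        // 8 bytes when flattened
>         public Range(int lo, int hi) { this.lo = lo; this.hi = hi; }
>     }
>
>     // Flattened into an array element, new Range(0, 0) occupies 8 bytes
>     // that are all zero -- exactly the bit pattern one might want to
>     // reserve for "null". Without extra bits or checks, the two become
>     // indistinguishable.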
>
> Also, adding on to Anh Mai's comment, recall that Range is a value class
> (a prerequisite for null-restriction), so its identity doesn't matter; the
> VM is entirely permitted to inline the null-friendly Range array with
> 9-byte units (8 bytes plus a single bit indicating nullity), and that is
> still somewhat of a memory win over linking to regular objects. But we
> might need some fine-grained control to ensure the VM allocates an inlined
> array instead of a pointer array in that case. Testing for null and
> throwing an NPE would then be simply checking one bit, which is also more
> reliable than scanning a whole byte interval.
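> As a rough picture of what such an inlined, null-friendly layout could
> look like (purely illustrative; actual layouts, padding, and alignment are
> entirely up to the VM):
>
>     // Per element of a null-friendly, flattened Range[]:
>     //   bytes 0-3 : int lo
>     //   bytes 4-7 : int hi
>     //   byte  8   : nullity flag (only 1 bit used)
>     //
>     // a[i] == null  <=>  the nullity flag of element i is clear,
>     // regardless of whatever bits happen to sit in lo/hi.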
>
>>
>> As you say, my suggestion means runtime testing to determine if all
>> fields are zero, which has a performance cost. This will only occur if the
>> JVM implements the ! specification, which it presumably will only do if the
>> object is small. And the cost will be small (I am presuming) relative to
>> savings from allowing the memory footprint to match that of primitives. Am
>> I wrong? There is value in conforming to current idiom.
>>
>> Turning to LooselyConsistentValue, I withdraw my comments. I
>> mistakenly presumed that its use would be required, which is false. It
>> simply enables a single-threaded (or volatile-protected) application to
>> allow additional inlining, which is harmless.
>>
>
>> John
>>
>> On Thu, Jan 18, 2024 at 4:56 PM - <liangchenblue at gmail.com> wrote:
>>
>>> Hi John,
>>>
>>> On Thu, Jan 18, 2024 at 2:30 PM John Bossons <jbossons at gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Maybe I am missing something, but the proposal seems to be trying to do
>>>> too much.
>>>>
>>>> Specifically: Why not simply provide that appending ! to a type
>>>> specification for an object (field, array element, or parameter) means
>>>> that the object is not only null-restricted but also never zero and
>>>> necessarily non-atomic unless small?
>>>>
>>> First, a reminder that some objects cannot be non-atomic, mostly when
>>> fields have dependencies/constraints on each other: if you have a range,
>>> you cannot allow its lower bound to be larger than its upper bound, and
>>> non-atomic representations cannot guarantee that. Also, you seem to
>>> misunderstand non-atomic: if an object is non-atomic, each of its fields
>>> can update independently of the others, so a 3-d position can be
>>> non-atomic, but not so for a range. Non-atomicity is dangerous, and it
>>> should not be the default. However, if an atomic class is small enough,
>>> like OptionalInt (many architectures now have atomic handling of 16
>>> bytes, etc.), the JVM may choose to apply non-atomic optimizations to it
>>> for better performance without violating its object constraints.
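>>> For instance, an illustrative sketch (value-class prototype syntax;
>>> whether a particular VM would actually tear such a field is an
>>> implementation detail, not something I am asserting it will do):
>>>
>>>     value class Range {
>>>         final int lo, hi;
>>>         Range(int lo, int hi) {
>>>             if (lo > hi) throw new IllegalArgumentException("lo > hi");
>>>             this.lo = lo;
>>>             this.hi = hi;
>>>         }
>>>     }
>>>
>>>     // With a non-atomic, flattened, non-volatile field `Range r`, one
>>>     // thread storing new Range(0, 10) while another stores
>>>     // new Range(20, 30) could in principle be observed by a reader as
>>>     // the torn value {lo=20, hi=10} -- a Range no constructor ever
>>>     // produced. A 3-d position has no such cross-field invariant, so a
>>>     // torn position is still a valid (if surprising) value.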
>>>
>>>>
>>>> Why complicate the specification with an implicit constructor that a
>>>> developer will never explicitly invoke? Why permit a developer to 'opt in'
>>>> to non-atomic?
>>>>
>>> The implicit constructor can always be called; its existence asks
>>> programmers to affirm that the zero-filled inlined instance is a valid
>>> instance. And this instance is different from a null: null is a pointer,
>>> while the zero instance has a different size, defined by the class
>>> layout, on the stack/heap.
>>>
>>>>
>>>> Sure, that means trying to read a zero value triggers an NPE. That just
>>>> means that a type that can legitimately have a zero value cannot be
>>>> specified as null-restricted, since a zero value (e.g. a {null, null} Name)
>>>> is the equivalent of a null unrestricted value object. Why go beyond that?
>>>> If a non-null zero value is possible, the type cannot be null-restricted
>>>> and so can only be an unrestricted JEP 401 value type. End of story.
>>>>
>>> You see, the inlined zero instance and the null pointer have different
>>> sizes, and thus they are not interchangeable. Converting the inlined zero
>>> instance to null in order to throw an NPE is complex and hurts
>>> performance, as you would scan unrelated bits on almost every field
>>> access.
>>>
>>> And as for unrestricted value types, yes, they exist and can possibly be
>>> inlined as well if the type is small enough (i.e. has space for an extra
>>> bit indicating nullity). But as a reminder, the nullity bit itself cannot
>>> be updated non-atomically with respect to (it depends on) the rest of the
>>> object! You don't want the nullity bit to indicate null while the rest of
>>> the object indicates some sort of non-null value, which can happen in a
>>> non-atomic context.
>>>
>>>>
>>>> With respect to non-atomic, what is new? Yes, unexpected instances may
>>>> occur without synchronization if the object is larger than the word size of
>>>> the implementation. Why do we need to extend a LooselyConsistentValue
>>>> interface to know/permit that?
>>>>
>>> Unexpected instances don't occur without synchronization if you use
>>> finals, as in Java's String or the immutable List.of(). These APIs may
>>> capture any "permitted value" from the arrays passed in, but once
>>> constructed, the captured value remains constant no matter which thread
>>> observes the String/List object reference. (Technically, the JVM
>>> implements this with a store-store fence between the end of the field
>>> writes in the constructor and the point where the object reference is
>>> shared, and a load-load fence between the read of the object reference
>>> and the reads of its fields.) Value classes are about the safety of final
>>> fields in ordinary programming, not about close encounters of the third
>>> kind with synchronization, volatiles, and fences.
>>>
>>>>
>>>> Can we not keep this 'simple' (if that word has meaning in this
>>>> context)? What am I missing?
>>>>
>>> I think you are missing a bit about how the layout (how inlining is
>>> represented in memory) and value classes (the thread safety their finals
>>> offer) work, and what "non-atomic" means. Feel free to ask more
>>> questions.
>>>
>>>>
>>>> John
>>>>
>>>>
>>>> --
>>>> Phone:  (416) 450-3584 (cell)
>>>>
>>>
>>
>> --
>> Phone:  (416) 450-3584 (cell)
>>
>

-- 
Phone:  (416) 450-3584 (cell)

