Null restriction on interface typed fields

Mon Jun 17 21:57:17 UTC 2024

There is a large category of hazards going under the collective name 
"escaping construction", by which an incompletely initialized object may 
be observed by alien code (such as a virtual method overridden by a 
subclass, or the super-constructor, or any code to which `this` has been 
implicitly or explicitly exposed during construction).  We've had this 
hazard forever, but the advent of non-nullable value types for which the 
physical zero is not in the value set makes this hazard considerably worse.
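
To make the hazard concrete, here is a minimal sketch in today's Java 
(no Valhalla features).  The names `ThisEscapeDemo`, `Base`, `Derived`, 
and `label` are invented for illustration.  The super-constructor makes 
a virtual call that lands in subclass code before the subclass 
constructor body has run, so the field is observed at its default.

```java
public class ThisEscapeDemo {
    static class Base {
        // Records whatever the virtual call returned during construction.
        final String seenDuringConstruction;

        Base() {
            // Virtual call from the super-constructor: this dispatches to
            // the subclass override before the subclass has set its fields.
            seenDuringConstruction = label();
        }

        String label() {
            return "base";
        }
    }

    static class Derived extends Base {
        private final String name;

        Derived(String name) {
            super();          // Base() runs first and calls label()
            this.name = name; // too late for the call made above
        }

        @Override
        String label() {
            return name;      // still the default (null) when called from Base()
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived("ready");
        System.out.println(d.seenDuringConstruction); // prints "null"
        System.out.println(d.label());                // prints "ready"
    }
}
```

With a reference field, the early reader sees null.  With a 
null-restricted value type whose all-zero representation is outside the 
value set, the same early read would instead yield a value that no 
constructor ever produced.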

On 6/17/2024 5:19 PM, Olexandr Rotan wrote:
>
> This makes sense, I didn't even think about primitives. Although, I 
> might not fully understand what is meant by "somehow observed"? Does it 
> refer to bypassing VM-provided access to a value, such as reading 
> memory using Arenas and other low-level APIs? And if yes, how is it 
> possible to even restrict this access?
>
> And also, it's a discovery that even if an object value is null, its 
> primitive fields are still initialized. Is that really how this works? 
> If not, then I guess I just missed the point here, because how can you 
> "observe" a nonexistent object?
>
>
> On Tue, Jun 18, 2024, 00:05 Brian Goetz <brian.goetz at oracle.com> wrote:
>
>     What you are describing is a possible, though less desirable, way
>     to implement nullity control, one which could be described as
>     "erased nullity".  That is, the language does its type checking
>     (as it does with generics) and then throws out the nullity
>     information (as it does with generics), potentially inserting
>     checks where it wouldn't be able to trust the contents of the
>     variable (as it does with generics).  But in the face of
>     interference or separate compilation anomalies, these checks can
>     fail.  It would obviously be better if we could achieve "bang
>     means provably never null, ever", and that is what I was
>     describing through the flow analysis I was alluding to.  (Again,
>     all of this will be the subject of more extended writeups, in due
>     time.)
>
>     A good way to think about why this is important is not with null
>     references, but with the physical zero of a value type.  There are
>     going to be some value types for which the all-zero representation
>     is not only not a good default, but not even a member of the value
>     set of the class, such as:
>
>         value class IsTrue {
>             private boolean isTrue;
>             private boolean isFalse;
>
>             IsTrue(boolean b) {
>                 isTrue = b;
>                 isFalse = !b;
>             }
>
>             boolean isSane() {
>                 return isTrue || isFalse;
>             }
>         }
>
>     If we reviewed this class, we would reasonably conclude that
>     `isSane` is always true.  However, if we can somehow observe an
>     `IsTrue!` before it is written, both booleans would be seen to be
>     false, and the sanity check would not hold.  In the case of a null
>     reference, dereferencing and maybe getting an NPE is "not so
>     unsafe" because we stop before we use the invalid (null) value,
>     but in the case of an uninitialized `IsTrue!`, something far worse
>     has happened -- a value has escaped that was never created by a
>     constructor, which we might actually use.
>
>     This is why we are pursuing strengthening DA analysis in the VM,
>     so that such instances will provably never be observed.
>
>     On 6/17/2024 4:56 PM, Olexandr Rotan wrote:
>>
>>     Sorry if I am missing something, but if a null-restricted type
>>     contains a null value (as the VM initializes it), isn't the null
>>     value itself a marker of the fact that the value hasn't been set yet?
>>
>>     As I understand, if you assert non-nullity of a nullable type or a
>>     type of unknown nullability with some sort of syntax (like ! in
>>     many languages), when in fact the value is null, an exception should
>>     be thrown, so the VM is already obligated to perform a null check on
>>     these assignments. The same goes for a reflective set, which should
>>     be treated as unknown nullity in any condition, I guess, unless the
>>     VM could specialize instructions for the same method call but
>>     different nullity of args (either way the VM must check the nullity
>>     of the value in at least part of the situations). This leads to the
>>     fact that if a person somehow manages to sneak through compiler
>>     analysis and tries to set null to a non-nullable field, it will
>>     immediately fail. Therefore, if the value is null, it is unset.
>>
>>     That said, does it in fact matter that the VM initializes these
>>     fields with null, if the null bit (I guess that is how it is stored)
>>     is in fact a marker of whether the field has been initialized or not?
>>     The VM could just check if the value of the field is null, and if so,
>>     throw an exception. Moreover, one way or another, as you said, checks
>>     on value access must be performed. So does this analysis really
>>     converge to "check if field is null", or is there more to it that
>>     I am missing?
>>
>>
>>     On Mon, Jun 17, 2024, 22:16 Chen Liang <chen.l.liang at oracle.com>
>>     wrote:
>>
>>         Indeed, another feature about the strict fields is their
>>         strong encapsulation against reflective setters; another
>>         safeguard against representable invalid values and their
>>         friendliness toward constant folding. John Rose's chart here
>>         seems up-to-date: objects-reloaded (openjdk.org)
>>         <https://cr.openjdk.org/~jrose/values/objects-reloaded.pdf> (This
>>         strictness can also benefit the upcoming Stable Values JEP)
>>
>>         Also for readers' convenience, I think this is the earliest
>>         and most comprehensive document about the strict field
>>         designs, seems still accurate for the most part: Cleanup on
>>         aisle zero: Constructive classes (openjdk.org)
>>         <https://cr.openjdk.org/~jrose/jls/constructive-classes.html>
>>
>>         - Chen
>>         ------------------------------------------------------------------------
>>         *From:* Brian Goetz <brian.goetz at oracle.com>
>>         *Sent:* Monday, June 17, 2024 1:00 PM
>>         *To:* Chen Liang <chen.l.liang at oracle.com>;
>>         valhalla-dev at openjdk.org <valhalla-dev at openjdk.org>
>>         *Subject:* Re: Null restriction on interface typed fields
>>         Yes, the eventual plan is that all references can be
>>         null-restricted.  We've been struggling with a specific
>>         problem, though, which is that for a reference type like
>>         `String!` or `Runnable!`, the VM is going to initialize such
>>         variables with .. null.  This is obviously a safety problem,
>>         because we've put a value in a variable that is not provably
>>         within the value set of the variable's type.  It was for this
>>         reason that earlier discussion focused on nullity control for
>>         (implicitly constructible) values first, and other types later.
>>
>>         Since then, we've figured out that we can solve this problem
>>         with better flow analysis.  Just like the DA analysis done by
>>         the compiler, the VM can do a similar analysis during
>>         verification time of fields that the compiler marks as "must
>>         be written before they are read" (where any this-escape might
>>         lead to reading those fields.)  This goes under the name of
>>         "strict fields", and we should be writing more about this soon.
>>
>>         Once we have this tool in our kit, the limitations on what
>>         types can be null-restricted -- and the safety with which we
>>         can enforce this -- will be greatly broadened.
>>
>>         On 6/9/2024 5:04 AM, Chen Liang wrote:
>>>         Hello valhalla community,
>>>         In our current iteration of valhalla, we only support null
>>>         restriction on value fields, as for references, null is a
>>>         valid default value. Meanwhile, interfaces are not
>>>         restricted to be value or identity, yet some value classes
>>>         will only be represented via interfaces, most notably
>>>         Map.entry().
>>>
>>>         In addition, consider this scenario in the ClassFile API: we
>>>         have ClassBuilder.withMethod(xxx, Consumer<MethodBuilder>)
>>>         and withMethodBody(xxx, Consumer<CodeBuilder>). A
>>>         straightforward implementation of withMethodBody would be
>>>         withMethod(xxx, new WithCode(cbConsumer)), where WithCode is
>>>         (value) record WithCode(Consumer<CodeBuilder> build)
>>>         implements Consumer<MethodBuilder>...
>>>         In this WithCode record, we can see that we are interested
>>>         in declaring "build" null-restricted; if build is a value
>>>         lambda that simply captures context variables, then WithCode
>>>         can be inlined to the same captures should "build" be NR,
>>>         without concerns about representing a null build value in
>>>         the inlined form.
>>>
>>>         Another example is in Algebraic types:
>>>         sealed interface Operation permits O1, O2, ...
>>>         record O1(Interface1 xxx) implements Operation
>>>         record O2(Interface2 xxx) implements Operation
>>>         For value-based Interface1 and Interface2 values, we might
>>>         seek to NR the fields so the record representations can be
>>>         simpler.
>>>
>>>         Thus, I recommend considering support for NR on interface
>>>         fields in addition to on explicit value type fields. Though
>>>         this might lead down to the rabbit hole of "heap pollution
>>>         of null on reference pointers", I still believe its benefit
>>>         outweighs its cost, especially in cases where these values
>>>         can be restricted on the stack in escape analysis, as in
>>>         most functional APIs (function only invoked conditionally,
>>>         function objects not stored when method exits).
>>>
>>>         Chen Liang
>>>
>>
>