Null restriction on interface typed fields
Brian Goetz
brian.goetz at oracle.com
Mon Jun 17 21:05:43 UTC 2024
What you are describing is a possible, though less desirable, way to
implement nullity control, one which could be described as "erased
nullity". That is, the language does its type checking (as it does with
generics) and then throws out the nullity information (as it does with
generics), potentially inserting checks where it wouldn't be able to
trust the contents of the variable (as it does with generics). But in
the face of interference or separate compilation anomalies, these checks
can fail. It would obviously be better if we could achieve "bang means
provably never null, ever", and that is what I was describing through
the flow analysis I was alluding to. (Again, all of this will be the
subject of more extended writeups, in due time.)
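
To make the generics analogy concrete, here is a minimal sketch in today's Java of an erased check failing after heap pollution. The class and method names (`ErasedCheckDemo`, `pollute`, `firstElement`) are hypothetical and purely illustrative; they are not part of any Valhalla prototype, and nullity is not involved here, only the analogous erased-generics mechanism.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasedCheckDemo {
    // Unchecked cast: javac warns, but after erasure nothing is verified here.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<String> pollute() {
        List raw = new ArrayList<Integer>();
        raw.add(42);                 // an Integer ends up behind a List<String> type
        return (List<String>) raw;
    }

    static String firstElement(List<String> list) {
        // javac inserts a cast to String on the erased result of get();
        // that synthetic check is what fails for a polluted list.
        return list.get(0);
    }

    public static void main(String[] args) {
        try {
            firstElement(pollute());
        } catch (ClassCastException e) {
            System.out.println("synthetic check failed: " + e.getClass().getSimpleName());
        }
    }
}
```

Static type checking accepted every line, yet the inserted check still fails at run time. An "erased nullity" scheme would presumably be exposed to the same kind of late failure, which is what the stronger "provably never null" goal is meant to rule out.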
A good way to think about why this is important is not with null
references, but with the physical zero of a value type. There are going
to be some value types for which the all-zero representation is not only
not a good default, but not even a member of the value set of the class,
such as:
    value class IsTrue {
        private boolean isTrue;
        private boolean isFalse;

        IsTrue(boolean b) {
            isTrue = b;
            isFalse = !b;
        }

        boolean isSane() {
            return isTrue || isFalse;
        }
    }
If we reviewed this class, we would reasonably conclude that `isSane` is
always true. However, if we can somehow observe an `IsTrue!` before it
is written, both booleans would be seen to be false, and the sanity
check would not hold. In the case of a null reference, dereferencing
and maybe getting an NPE is "not so unsafe" because we stop before we
use the invalid (null) value, but in the case of an uninitialized
`IsTrue!`, something far worse has happened -- a value has escaped that
was never created by a constructor, which we might actually use.
This is why we are pursuing strengthening DA analysis in the VM, so that
such instances will provably never be observed.
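
As a rough illustration of "observed before it is written", the following sketch uses an ordinary identity class in today's Java (value classes and `!` types are not assumed to be available). The names `ZeroObservedDemo`, `Base`, and `saneDuringConstruction` are hypothetical; the this-escape through a superclass constructor is just one assumed way such an early read can happen.

```java
public class ZeroObservedDemo {
    static class Base {
        final boolean saneDuringConstruction;

        Base() {
            // this-escape: an overridable method runs before subclass fields are written
            saneDuringConstruction = isSane();
        }

        boolean isSane() {
            return true;
        }
    }

    // Plain class standing in for the value class IsTrue shown earlier
    static class IsTrue extends Base {
        private final boolean isTrue;
        private final boolean isFalse;

        IsTrue(boolean b) {
            super();          // Base() calls isSane() while both fields are still false
            isTrue = b;
            isFalse = !b;
        }

        @Override
        boolean isSane() {
            return isTrue || isFalse;
        }
    }

    public static void main(String[] args) {
        IsTrue t = new IsTrue(true);
        System.out.println(t.saneDuringConstruction); // prints false
        System.out.println(t.isSane());               // prints true
    }
}
```

The invariant that looks unbreakable on review is violated during construction, because the all-zero state is read before the constructor body has assigned the fields. A "written before read" guarantee is aimed at excluding exactly this kind of read.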
On 6/17/2024 4:56 PM, Olexandr Rotan wrote:
>
> Sorry if I am missing something, but if a null-restricted type
> contains a null value (as the VM initializes it), isn't the null value
> itself a marker of the fact that the value hasn't been set yet?
>
> As I understand it, if you assert non-nullity of a nullable type or a
> type of unknown nullability with some sort of syntax (like ! in many
> languages) when in fact the value is null, an exception should be
> thrown, so the VM is already obligated to perform a null check on these
> assignments. The same goes for a reflective set, which should be treated
> as unknown nullity in any case, I guess, unless the VM could specialize
> instructions for the same method call but different nullity of args
> (either way the VM must check the nullity of the value in at least some
> situations). This means that even if a person somehow manages
> to sneak past compiler analysis and tries to set null to a non-nullable
> field, it will immediately fail. Therefore, if the value is null, it is
> unset.
>
> That said, does it in fact matter that the VM initializes these fields
> with null, if the null bit (I guess that is how it is stored) is in fact
> a marker of whether the field has been initialized or not? The VM could
> just check if the value of the field is null, and if so, throw an
> exception. Moreover, one way or another, as you said, checks on value
> access must be performed. So does this analysis really converge to
> "check if field is null", or is there more to it that I am missing?
>
>
> On Mon, Jun 17, 2024, 22:16 Chen Liang <chen.l.liang at oracle.com> wrote:
>
> Indeed, further features of strict fields are their strong
> encapsulation against reflective setters (another safeguard
> against representable invalid values) and their friendliness toward
> constant folding. John Rose's chart here seems up-to-date:
> objects-reloaded (openjdk.org)
> <https://cr.openjdk.org/~jrose/values/objects-reloaded.pdf> (This
> strictness can also benefit the upcoming Stable Values JEP)
>
> Also, for readers' convenience, I think this is the earliest and
> most comprehensive document about the strict field design, and it
> still seems accurate for the most part: Cleanup on aisle zero:
> Constructive classes (openjdk.org)
> <https://cr.openjdk.org/~jrose/jls/constructive-classes.html>
>
> - Chen
> ------------------------------------------------------------------------
> *From:* Brian Goetz <brian.goetz at oracle.com>
> *Sent:* Monday, June 17, 2024 1:00 PM
> *To:* Chen Liang <chen.l.liang at oracle.com>;
> valhalla-dev at openjdk.org <valhalla-dev at openjdk.org>
> *Subject:* Re: Null restriction on interface typed fields
> Yes, the eventual plan is that all references can be
> null-restricted. We've been struggling with a specific problem,
> though, which is that for a reference type like `String!` or
> `Runnable!`, the VM is going to initialize such variables with ...
> null. This is obviously a safety problem, because we've put a
> value in a variable that is not provably within the value set of
> the variable's type. It was for this reason that earlier
> discussion focused on nullity control for (implicitly
> constructible) values first, and other types later.
>
> Since then, we've figured out that we can solve this problem with
> better flow analysis. Just like the DA analysis done by the
> compiler, the VM can do a similar analysis during verification
> time of fields that the compiler marks as "must be written before
> they are read" (where any this-escape might lead to reading those
> fields.) This goes under the name of "strict fields", and we
> should be writing more about this soon.
>
> Once we have this tool in our kit, the limitations on what types
> can be null-restricted -- and the safety with which we can enforce
> this -- will be greatly broadened.
>
> On 6/9/2024 5:04 AM, Chen Liang wrote:
>> Hello Valhalla community,
>> In our current iteration of Valhalla, we only support null
>> restriction on value fields because, for references, null is a valid
>> default value. Meanwhile, interfaces are not restricted to be
>> value or identity, yet some value classes will only be
>> represented via interfaces, most notably Map.entry().
>>
>> In addition, consider this scenario in the ClassFile API: we have
>> ClassBuilder.withMethod(xxx, Consumer<MethodBuilder>) and
>> withMethodBody(xxx, Consumer<CodeBuilder>). A straightforward
>> implementation of withMethodBody would be withMethod(xxx, new
>> WithCode(cbConsumer)), where WithCode is (value) record
>> WithCode(Consumer<CodeBuilder> build) implements
>> Consumer<MethodBuilder>...
>> In this WithCode record, we can see that we are interested in
>> declaring "build" null-restricted; if build is a value lambda
>> that simply captures context variables, then WithCode can be
>> inlined to the same captures should "build" be NR, without
>> concerns about representing a null build value in the inlined form.
>>
>> Another example is in algebraic types:
>>     sealed interface Operation permits O1, O2, ...
>>     record O1(Interface1 xxx)
>>     record O2(Interface2 xxx)
>> For value-based Interface1 and Interface2 values, we might seek
>> to NR the fields so the record representations can be simpler.
>>
>> Thus, I recommend considering support for NR on interface fields
>> in addition to on explicit value type fields. Though this might
>> lead down to the rabbit hole of "heap pollution of null on
>> reference pointers", I still believe its benefit outweighs its
>> cost, especially in cases where these values can be restricted on
>> the stack in escape analysis, as in most functional APIs
>> (function only invoked conditionally, function objects not stored
>> when method exits).
>>
>> Chen Liang
>>
>