Null restriction on interface typed fields
Brian Goetz
brian.goetz at oracle.com
Mon Jun 17 21:57:17 UTC 2024
There is a large category of hazards going under the collective name
"escaping construction", by which an incompletely initialized object may
be observed by alien code (such as a virtual method overridden by a
subclass, or the super-constructor, or any code to which `this` has been
implicitly or explicitly exposed during construction). We've had this
hazard forever, but the advent of non-nullable value types for which the
physical zero is not in the value set makes this hazard considerably worse.
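
To make the hazard concrete, here is a minimal sketch in today's Java. It uses an ordinary identity class as a stand-in for a value class, and the names `Base`, `Flags`, `onConstruct`, and `EscapeDemo` are invented for illustration; nothing here is the proposed strict-field mechanism itself. A super-constructor calls a virtual method, so subclass code reads fields that still hold their all-zero defaults.

```java
// Illustrative only: hypothetical names, plain identity classes standing in
// for the value types discussed in this thread.
abstract class Base {
    Base() {
        // Virtual call before the subclass constructor body has run:
        // `this` has implicitly escaped to subclass code.
        onConstruct();
    }

    abstract void onConstruct();
}

final class Flags extends Base {
    private final boolean isTrue;
    private final boolean isFalse;

    // What the hook observed during construction; stays null if it never ran.
    // Deliberately has no initializer, so the value written by the hook
    // is not overwritten after super() returns.
    Boolean saneDuringConstruction;

    Flags(boolean b) {
        super();        // runs onConstruct() before the assignments below
        isTrue = b;
        isFalse = !b;
    }

    @Override
    void onConstruct() {
        // Both final fields still hold their default value (false) here.
        saneDuringConstruction = isSane();
    }

    boolean isSane() {
        return isTrue || isFalse;
    }
}

class EscapeDemo {
    public static void main(String[] args) {
        Flags f = new Flags(true);
        System.out.println("sane during construction: " + f.saneDuringConstruction);
        System.out.println("sane after construction:  " + f.isSane());
    }
}
```

Running `EscapeDemo` reports `false` for the check made during construction and `true` afterwards: the invariant that looks unbreakable on code review does not hold for an observer that sees the object early.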
On 6/17/2024 5:19 PM, Olexandr Rotan wrote:
>
> This makes sense; I didn't even think about primitives. Although, I
> might not fully understand what is meant by "somehow observed". Does it
> refer to bypassing VM-provided access to a value, such as reading
> memory using Arenas and other low-level APIs? And if so, how is it
> possible to even restrict this access?
>
> Also, it's news to me that even if an object's value is null, its
> primitive fields are still initialized. Is that really how this works?
> If not, then I guess I just missed the point here, because how can you
> "observe" a nonexistent object?
>
>
> On Tue, Jun 18, 2024, 00:05 Brian Goetz <brian.goetz at oracle.com> wrote:
>
> What you are describing is a possible, though less desirable, way
> to implement nullity control, one which could be described as
> "erased nullity". That is, the language does its type checking
> (as it does with generics) and then throws out the nullity
> information (as it does with generics), potentially inserting
> checks where it wouldn't be able to trust the contents of the
> variable (as it does with generics). But in the face of
> interference or separate compilation anomalies, these checks can
> fail. It would obviously be better if we could achieve "bang
> means provably never null, ever", and that is what I was
> describing through the flow analysis I was alluding to. (Again,
> all of this will be the subject of more extended writeups, in due
> time.)
>
> A good way to think about why this is important is not with null
> references, but with the physical zero of a value type. There are
> going to be some value types for which the all-zero representation
> is not only not a good default, but not even a member of the value
> set of the class, such as:
>
>     value class IsTrue {
>         private boolean isTrue;
>         private boolean isFalse;
>
>         IsTrue(boolean b) {
>             isTrue = b;
>             isFalse = !b;
>         }
>
>         boolean isSane() {
>             return isTrue || isFalse;
>         }
>     }
>
> If we reviewed this class, we would reasonably conclude that
> `isSane` is always true. However, if we can somehow observe an
> `IsTrue!` before it is written, both booleans would be seen to be
> false, and the sanity check would not hold. In the case of a null
> reference, dereferencing and maybe getting an NPE is "not so
> unsafe" because we stop before we use the invalid (null) value,
> but in the case of an uninitialized `IsTrue!`, something far worse
> has happened -- a value has escaped that was never created by a
> constructor, which we might actually use.
>
> This is why we are pursuing strengthening DA analysis in the VM,
> so that such instances will provably never be observed.
>
> On 6/17/2024 4:56 PM, Olexandr Rotan wrote:
>>
>> Sorry if I am missing something, but if a null-restricted type
>> contains a null value (as the VM initializes it), isn't the null value
>> itself a marker of the fact that the value hasn't been set yet?
>>
>> As I understand it, if you assert non-nullity of a nullable type or a
>> type of unknown nullability with some sort of syntax (like ! in
>> many languages) when in fact the value is null, an exception should
>> be thrown, so the VM is already obligated to perform a null check on
>> these assignments. The same goes for a reflective set, which should be
>> treated as having unknown nullity in any case, I guess, unless the VM
>> could specialize instructions for the same method call but
>> different nullity of args (either way, the VM must check the nullity of
>> the value in at least some situations). This means that
>> if a person even somehow manages to sneak past compiler
>> analysis and tries to set a non-nullable field to null, it will
>> immediately fail. Therefore, if a value is null, it is unset.
>>
>> That said, does it in fact matter that the VM initializes these fields
>> with null, if the null bit (I guess that is how it is stored) is in
>> fact a marker of whether the field has been initialized or not? The VM
>> could just check whether the value of the field is null and, if so,
>> throw an exception. Moreover, one way or another, as you said, checks
>> on value access must be performed. So does this analysis really
>> come down to "check if field is null", or is there more to it that
>> I am missing?
>>
>>
>> On Mon, Jun 17, 2024, 22:16 Chen Liang <chen.l.liang at oracle.com>
>> wrote:
>>
>> Indeed, another feature of strict fields is their
>> strong encapsulation against reflective setters, which is another
>> safeguard against representable invalid values and makes them
>> friendlier to constant folding. John Rose's chart here
>> seems up-to-date: objects-reloaded (openjdk.org)
>> <https://cr.openjdk.org/~jrose/values/objects-reloaded.pdf> (This
>> strictness can also benefit the upcoming Stable Values JEP)
>>
>> Also, for readers' convenience, I think this is the earliest
>> and most comprehensive document about the strict field
>> design, and it still seems accurate for the most part: Cleanup on
>> aisle zero: Constructive classes (openjdk.org)
>> <https://cr.openjdk.org/~jrose/jls/constructive-classes.html>
>>
>> - Chen
>> ------------------------------------------------------------------------
>> *From:* Brian Goetz <brian.goetz at oracle.com>
>> *Sent:* Monday, June 17, 2024 1:00 PM
>> *To:* Chen Liang <chen.l.liang at oracle.com>;
>> valhalla-dev at openjdk.org <valhalla-dev at openjdk.org>
>> *Subject:* Re: Null restriction on interface typed fields
>> Yes, the eventual plan is that all references can be
>> null-restricted. We've been struggling with a specific
>> problem, though, which is that for a reference type like
>> `String!` or `Runnable!`, the VM is going to initialize such
>> variables with ... null. This is obviously a safety problem,
>> because we've put a value in a variable that is not provably
>> within the value set of the variable's type. It was for this
>> reason that earlier discussion focused on nullity control for
>> (implicitly constructible) values first, and other types later.
>>
>> Since then, we've figured out that we can solve this problem
>> with better flow analysis. Just like the DA analysis done by
>> the compiler, the VM can do a similar analysis during
>> verification time of fields that the compiler marks as "must
>> be written before they are read" (where any this-escape might
>> lead to reading those fields.) This goes under the name of
>> "strict fields", and we should be writing more about this soon.
>>
>> Once we have this tool in our kit, the limitations on what
>> types can be null-restricted -- and the safety with which we
>> can enforce this -- will be greatly broadened.
>>
>> On 6/9/2024 5:04 AM, Chen Liang wrote:
>>> Hello valhalla community,
>>> In our current iteration of valhalla, we only support null
>>> restriction on value fields, as for references, null is a
>>> valid default value. Meanwhile, interfaces are not
>>> restricted to be value or identity, yet some value classes
>>> will only be represented via interfaces, most notably
>>> Map.entry().
>>>
>>> In addition, consider this scenario in the ClassFile API: we
>>> have ClassBuilder.withMethod(xxx, Consumer<MethodBuilder>)
>>> and withMethodBody(xxx, Consumer<CodeBuilder>). A
>>> straightforward implementation of withMethodBody would be
>>> withMethod(xxx, new WithCode(cbConsumer)), where WithCode is
>>> (value) record WithCode(Consumer<CodeBuilder> build)
>>> implements Consumer<MethodBuilder>...
>>> In this WithCode record, we can see that we are interested
>>> in declaring "build" null-restricted; if build is a value
>>> lambda that simply captures context variables, then WithCode
>>> can be inlined to the same captures should "build" be NR,
>>> without concerns about representing a null build value in
>>> the inlined form.
>>>
>>> Another example is in Algebraic types:
>>>     sealed interface Operation permits O1, O2, ...
>>>     record O1(Interface1 xxx)
>>>     record O2(Interface2 xxx)
>>> For value-based Interface1 and Interface2 values, we might
>>> seek to NR the fields so the record representations can be
>>> simpler.
>>>
>>> Thus, I recommend considering support for NR on interface
>>> fields in addition to on explicit value type fields. Though
>>> this might lead down the rabbit hole of "heap pollution
>>> of null on reference pointers", I still believe its benefit
>>> outweighs its cost, especially in cases where these values
>>> can be restricted on the stack in escape analysis, as in
>>> most functional APIs (function only invoked conditionally,
>>> function objects not stored when method exits).
>>>
>>> Chen Liang
>>>
>>
>