Revisiting default values

Tobi Ajila Tobi_Ajila at ca.ibm.com
Tue Jul 28 17:33:15 UTC 2020


> Bucket #3 classes must be reference-default, and fields/arrays of their
> inline type are illegal outside of the declaring class. The declaring class
> can provide a flat array factory if it wants to. (A new idea from Tobi,
> he'll write it up for the thread.)

```
// Formerly a concrete class, but now it's abstract or maybe an interface
public sealed abstract class LegacyType permits LegacyType.val {
    // factory methods
    // in some cases this already exists
    public static LegacyType makeALegacyType(...);
    // can be flattened
    public static LegacyType[] newALegacyTypeArray(int size);
}

// this type is hidden, only LegacyType knows about it
private inline class LegacyType.val extends LegacyType { ... }
```

This approach is based on what Kevin mentioned earlier, "For all of these
types, there is one really fantastic default value that does everything you
would want it to do: null. That is why these types should not become
inline-types, or certainly not val-default inline types ...". Essentially,
by making these types reference-default and by providing an avenue to
restrict the value-projection to the reference-default type, the writer
maintains control of where and when the value-projection is allowed to be
observed, thus solving the bad default problem. The writer also has the
ability to supply a flattened array factory with initialized elements.
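
To make the shape of this concrete, here is a minimal sketch in plain
Java 17, which has no inline classes, so nothing is actually flattened.
A private nested class stands in for the hidden LegacyType.val, and null
plays the role of the bad default. The int payload, the extra "initial"
parameter on the array factory, and the name Val are assumptions added
for illustration only; they are not part of the proposal.

```java
import java.util.Arrays;
import java.util.Objects;

// Sketch only: plain Java 17, so nothing here is actually flattened.
// The permits clause is omitted, so the permitted subclasses are the ones
// declared in this same source file (here, only the private nested Val).
sealed abstract class LegacyType {
    abstract int payload();

    // Factory for single instances; callers only ever see LegacyType.
    static LegacyType makeALegacyType(int payload) {
        return new Val(payload);
    }

    // Array factory: every element is set before the array is returned,
    // so callers never observe a default (here: null) element.
    static LegacyType[] newALegacyTypeArray(int size, LegacyType initial) {
        Objects.requireNonNull(initial, "initial");
        LegacyType[] array = new Val[size];
        Arrays.fill(array, initial);
        return array;
    }

    // Stand-in for the hidden LegacyType.val: private, so code outside
    // LegacyType cannot declare fields or arrays of this type.
    private static final class Val extends LegacyType {
        private final int payload;

        Val(int payload) { this.payload = payload; }

        @Override
        int payload() { return payload; }
    }
}
```

A caller would write something like
"LegacyType[] as = LegacyType.newALegacyTypeArray(16, LegacyType.makeALegacyType(0));"
and has no way to name the hidden type, so it cannot declare a field of
that type or allocate an array of it on its own.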

This approach is appealing for the following reasons: no additional JVM
complexity (i.e., no bytecode checks for the bad default value) and no javac
boilerplate (i.e., guards on member access, guards on method entries, etc.).
On the other hand, there are two big drawbacks: no instance field flattening
for these types, and creating flattened arrays is a bit unnatural since it
has to be done via a factory.

Going back to Brian's comment:

> I'd suggest, though, we back away from implementation techniques (you've
> got a good menu going already), and focus more on "what language do we want
> to build."  You claim:
> > I don't think totally excluding Buckets #2 and #3 is a very good
> > outcome.
> Which I think is a reasonable hypothesis, but I suggest we focus the
> discussion on whether we believe this or not, and what we might want to do
> about it (and when), first.

I think it would help if we had a clear sense as to what proportion of
inline-types we think will have this "bad default" problem. Last year, when
we discussed null-default inline types, the thinking was that about 75% of
the motivation for null-defaults was migrating VBC, 20% for security, and 5%
for "I want null in my value set." My assumption is that the vast majority
of inline-types will not be migrated types; they will be new types. If this
is correct, then it would appear that the default value problem is really a
problem for a minority of inline-types.

All the solutions proposed have some kind of cost associated with them, and
these costs vary (e.g., JVM complexity, throughput overhead, JIT compilation
time). If the default value problem only affects a minority of the types, I
would argue that the costs should be limited to types that opt in to not
exposing their default or uninitialized value. How we feel about this will
determine which direction we choose to take when exploring the solution
space.

So, in short, I want to second Brian's comment: I think it's important to
decide not only whether we want this kind of feature but also what we are
willing to give up to get it.

--Tobi

"valhalla-spec-experts" <valhalla-spec-experts-retn at openjdk.java.net> wrote
on 2020/07/21 02:41:11 PM:

> From: Dan Smith <daniel.smith at oracle.com>
> To: valhalla-spec-experts <valhalla-spec-experts at openjdk.java.net>
> Cc: Brian Goetz <brian.goetz at oracle.com>
> Date: 2020/07/21 02:41 PM
> Subject: [EXTERNAL] Re: Revisiting default values
> Sent by: "valhalla-spec-experts"
> <valhalla-spec-experts-retn at openjdk.java.net>
>
>
> > On Jul 20, 2020, at 10:27 AM, Brian Goetz <brian.goetz at oracle.com> wrote:
> >
> > That said, doing so in the language is potentially more viable.
> > It would mean, for classes that opt into this treatment:
> >
> >  - Ensuring that `C.default` evaluates to the right thing
> >  - Preventing `this` from escaping the constructor (which might be
> >    a good thing to enforce for inline classes anyway)
> >  - Ensuring all fields are DA (which we do already), and that
> >    assignments to fields in ctors are not their default value
> >  - Translating `new Foo[n]` (and reflective equivalent) with
> >    something that initializes the array elements
> >
> > The goal is to keep default instances from being observed.  If we
> > lock down `this` from constructors, the major cost here is
> > instantiating arrays of these things, but we already optimize array
> > initialization loops like this pretty well.
> >
> > Overall this doesn't seem terrible.  It means that the cost of
> > this is borne by the users of classes that opt into this treatment,
> > and keeps the complexity out of the VM.  It does mean that
> > "attackers" can generate bytecode to generate bad instances (a
> > problem we have with multiple vectors today.)
> >
> > Call this "L".
>
> More letters!
>
> Expanding on ways to support Bucket #3 by ensuring initialization of
> fields/arrays:
>
> ---
>
> Option L: Language requires field/array initialization
>
> An inline class may be declared to have no default. Fields and
> arrays of that class's inline type must be provably initialized (via
> compiler analysis) before they are read or published.
>
> Instance fields of the class's inline type must be initialized
> before a method call involving 'this' occurs. (It's already illegal
> to allow the constructor to return before initialization.)
>
> Static fields... seem hopeless, so maybe must have a reference type
> (perhaps implicitly). Maybe we can do an analysis that permits some
> very simple cases, but once you allow method calls of almost any
> sort, you've lost. (We'd have to prove that no initialization of
> *other* classes triggered by <clinit> refers to the field before it
> has been initialized.)
>
> Arrays must be initialized at creation time, either with an array
> initializer ("Address[] as = { x, y, z };") or via a trusted API
> ("Address[] as = Arrays.of(i -> x);"). We might introduce a language
> sugar for the trusted API ("Address[] as = { i -> x };"). We *could*
> support two-stage initialization via things like 'Arrays.fill', but
> analysis to track uninitialized arrays from creation to filling
> doesn't seem worthwhile.
>
> This is less expressive, obviously. In particular, many comfortable
> idioms for initializing an array won't work. As a case study: what
> happens in generic code like ArrayList? When it wants to allocate
> its array (we're in a specialized world where T has been specialized
> to 'QAddress;'), what value does it fill the array with? Nothing is
> available, because at this point the list is empty, and it's just
> allocating storage for later. I guess ArrayList (and similar data
> structures) has to have a special back door, and we're left to trust
> the author not to expose the uninitialized payload.
>
> As with all language features, there's also the question of what
> happens when a class file doesn't conform to the language's rules.
> Option L can't really stand alone—it needs to be backed up by some
> other option when the language's guarantees fail.
>
> ---
>
> Option M: JVM requires field/array initialization
>
> Inline class files can indicate that their default instance is
> invalid. Fields and arrays of that class's inline type must be
> provably initialized (via verification or related analysis) before
> they are read or published.
>
> All the compile-time analysis of Option L applies here, because the
> language compiler needs to be sure its generated class files are valid.
>
> We can use some new verification types to track the initialization
> status of 'this', the way we do to require 'super' calls today. You
> don't have a fully formed 'Foo', capable of being passed to other
> methods, etc., until all fields are initialized. This would also
> apply to 'defaultvalue' for an inline class with a field of a
> default-less inline type.
>
> Again, static fields are hopeless, it's an error to use the inline
> type as a static field type.
>
> 'anewarray' of the inline type is illegal, except within a trusted
> API. That API promises to initialize every array component before
> publishing the array. (We won't try to guarantee this with an
> analysis—the API is trusted because it has been vetted by humans.)
> In addition to some standard factory methods, we could decide that
> the inline class itself is always a trusted API.
>
> (A related approach was discussed at our last EG meeting, but with
> much less expressiveness: inline-typed fields are always illegal,
> and arrays can only be allocated by the class author.)
>
> This closes the backdoor of other bytecode not playing by the
> language's rules. The expressiveness problems of Option L remain—
> e.g., ArrayList's early allocation strategy is impossible.
>
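
For reference, the "trusted API" that Options L and M both rely on for
array creation could look roughly like the sketch below. It is written
in today's Java, with null standing in for the invalid default instance.
The class name InitializedArrays, the method signature, and the explicit
allocator argument are assumptions for illustration; the "Arrays.of"
spelling in the quoted text is itself only a proposal.

```java
import java.util.Objects;
import java.util.function.IntFunction;

// Hypothetical helper: allocates an array and initializes every component
// before the array is handed back to the caller.
final class InitializedArrays {
    private InitializedArrays() {}

    static <T> T[] of(int length,
                      IntFunction<T[]> allocator,
                      IntFunction<? extends T> init) {
        T[] array = allocator.apply(length);
        if (array.length != length) {
            throw new IllegalArgumentException("allocator returned wrong length");
        }
        for (int i = 0; i < length; i++) {
            // Reject null, the stand-in here for a "bad default" element.
            array[i] = Objects.requireNonNull(init.apply(i),
                    "initializer returned null at index " + i);
        }
        return array; // published only after every component is set
    }
}
```

Usage would be along the lines of
"String[] as = InitializedArrays.of(3, String[]::new, i -> "x" + i);".
Note that this shape cannot serve the ArrayList case described above,
since at allocation time there is no element value to initialize with.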

