Revisiting default values
Brian Goetz
brian.goetz at oracle.com
Mon Jul 20 16:27:55 UTC 2020
Responding to Kevin's tangent:
- Of the ones on Dan's list, one could argue that even some of the
ones in Bucket 1 are questionable, such as `char` or `Instant`. The
ones that really seem like slam dunks are: numerics (int, long, etc),
boolean, and maybe Optional. That's a small list.
(Another candidate for bucket 1: BigDecimal.)
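To make the `Instant` concern concrete, here is a small sketch of what an all-zero `Instant` denotes and how it renders west of UTC. The class and method names (`EpochDefaultDemo`, `zeroInstant`, `dateAt`) are made up for illustration; only the `java.time` calls are real API.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical demo class; only the java.time calls are real API.
class EpochDefaultDemo {
    // What an all-zero Instant (0 seconds, 0 nanos) denotes.
    static Instant zeroInstant() {
        return Instant.ofEpochSecond(0, 0);
    }

    // The calendar date that instant falls on at a given UTC offset.
    static LocalDate dateAt(Instant instant, ZoneOffset offset) {
        return instant.atOffset(offset).toLocalDate();
    }

    public static void main(String[] args) {
        Instant zero = zeroInstant();
        System.out.println(zero.equals(Instant.EPOCH));           // same point as the epoch
        System.out.println(dateAt(zero, ZoneOffset.UTC));         // date in UTC
        System.out.println(dateAt(zero, ZoneOffset.ofHours(-8))); // date eight hours west of UTC
    }
}
```

An uninitialized timestamp would be indistinguishable from a real event at the epoch, which is the objection to calling this a "reasonable" default.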
More generally:
- The language is schizoid about uninitialized variables. DA analysis
requires that we always initialize locals (even when we want to
initialize `count` to `0`), but doesn't require it for fields. This is
because we know that there are windows of unfortunateness where the
default value is still observable -- inside the ctor, or if `this`
escapes the ctor.
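A minimal sketch of the asymmetry, using today's Java and hypothetical names (`Counter`, `peek`, `DefaultWindowDemo`): the local must be assigned before use, while the field's default is visible to anything that runs before the constructor assigns it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this only illustrates the "window" described above.
class Counter {
    // Records what an observer saw while the constructor was still running.
    static final List<Integer> observed = new ArrayList<>();

    private final int start;

    Counter(int start) {
        // A method call on `this` before the field is assigned sees the default, 0.
        observed.add(this.peek());
        this.start = start;
        // After the assignment the intended value is visible.
        observed.add(this.peek());
    }

    int peek() {
        return start;
    }
}

class DefaultWindowDemo {
    static int sumTo(int n) {
        int count = 0; // DA requires this, even though 0 is what a field would default to
        for (int i = 1; i <= n; i++) {
            count += i;
        }
        return count;
    }

    public static void main(String[] args) {
        new Counter(42);
        System.out.println(Counter.observed); // first entry is the default, second is 42
        System.out.println(sumTo(4));
    }
}
```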
> Option J: JVM treats default instance as 'null'
Implementation note: when we explored this a while back, we were
interested in identifying a "pivot field" where the programmer committed
(or analysis proved) that all properly initialized instances would have
a non-default value for this field, as would be the case if any field
had an unconditional `foo = new Foo()` assignment in the constructor.
This makes detection of the default value much faster, since you only
have to check the pivot field.
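A rough sketch of the pivot-field idea in plain Java. This is a simulation, not how the VM would do it: inline classes are not modeled, `DEFAULT` stands in for the all-zero instance, and every name (`Account`, `isDefaultByPivot`, and so on) is hypothetical.

```java
import java.util.Objects;

// Plain-Java simulation only: real inline classes and VM-level default
// detection are not modeled. All names here are hypothetical.
final class Account {
    final String id;     // pivot field: every constructed instance has a non-null id
    final long balance;  // 0 is a legitimate value
    final int flags;     // 0 is a legitimate value

    // Stand-in for the all-zero instance the VM would hand out by default.
    static final Account DEFAULT = new Account();

    private Account() {
        this.id = null;
        this.balance = 0L;
        this.flags = 0;
    }

    Account(String id, long balance, int flags) {
        // The programmer's commitment: the pivot field is never left at its default.
        this.id = Objects.requireNonNull(id);
        this.balance = balance;
        this.flags = flags;
    }

    // Full check: compares every field against its default.
    boolean isDefaultByAllFields() {
        return id == null && balance == 0L && flags == 0;
    }

    // Pivot check: one comparison, valid only because of the constructor's guarantee.
    boolean isDefaultByPivot() {
        return id == null;
    }
}

class PivotFieldDemo {
    public static void main(String[] args) {
        Account real = new Account("a-1", 0L, 0); // every non-pivot field at its default
        System.out.println(real.isDefaultByPivot());
        System.out.println(Account.DEFAULT.isDefaultByPivot());
    }
}
```

The two checks agree on every instance the public constructor can produce, so only the pivot field needs to be inspected.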
(Peter raises this in his "what about" query later.)
We were initially excited about this approach but later realized it was
feeding the "optimization dopamine receptor" rather than actually
solving a problem :)
> It sounds like this debate is between `null` and a value which really
> is the /moral equivalent/ of `null`. You basically would have two
> kinds of nullability that look different from each other.
John has made an impassioned plea for "no new nulls". Accordingly, we
did explore a variant of J where a `withfield` that set the pivot field
to its default value _actually put a null on the stack_. (We backed off.)
> And here's another option that has been previously discarded, but might be worth picking back up. This one to address Bucket #2:
>
> ---
>
> Option K: JVM initializes fields/arrays to a designated default
John has in the past pushed back on this, in part because of the problem
identified above (can't close the window 100%, only 99.5%, and that 0.5%
is where security bugs come from), and in part because of the
cost/complexity in the JVM.
That said, doing so in the language is potentially more viable. It
would mean, for classes that opt into this treatment:
- Ensuring that `C.default` evaluates to the right thing
- Preventing `this` from escaping the constructor (which might be a good
thing to enforce for inline classes anyway)
- Ensuring all fields are DA (which we do already), and that the values
assigned to fields in ctors are not their default values
- Translating `new Foo[n]` (and reflective equivalent) with something
that initializes the array elements
The goal is to keep default instances from being observed. If we lock
down `this` from constructors, the major cost here is instantiating
arrays of these things, but we already optimize array initialization
loops like this pretty well.
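As a sketch of what the array translation might amount to, here is a plain-Java approximation. `Month`, `DEFAULT`, and `newArray` are hypothetical names, an ordinary class stands in for an inline class, and `Month.DEFAULT` plays the role of `C.default`; the real translation would be done by the compiler.

```java
import java.util.Arrays;

// Plain-Java approximation; an ordinary class stands in for an inline class.
// Month, DEFAULT and newArray are hypothetical names.
final class Month {
    final int value; // valid values are 1..12; 0 is the all-zero state to keep hidden

    // The designated default that the class author chose.
    static final Month DEFAULT = new Month(1);

    Month(int value) {
        // The constructor refuses the all-zero state.
        if (value < 1 || value > 12) {
            throw new IllegalArgumentException("month out of range: " + value);
        }
        this.value = value;
    }

    // What `new Month[n]` would be translated to: allocate, then fill with the default.
    static Month[] newArray(int n) {
        Month[] months = new Month[n];
        Arrays.fill(months, DEFAULT);
        return months;
    }
}

class DesignatedDefaultDemo {
    public static void main(String[] args) {
        Month[] months = Month.newArray(3);
        for (Month m : months) {
            System.out.println(m.value); // each element holds the designated default
        }
    }
}
```

The fill loop is the cost mentioned above; it is paid only by code that creates arrays of such classes.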
Overall this doesn't seem terrible. It means that the cost of this is
borne by the users of classes that opt into this treatment, and keeps
the complexity out of the VM. It does mean that "attackers" can
craft bytecode that produces bad instances (a problem we already have
through multiple vectors today).
Call this "L".
I'd suggest, though, we back away from implementation techniques (you've
got a good menu going already), and focus more on "what language do we
want to build." You claim:
> I don't think totally excluding Buckets #2 and #3 is a very good
> outcome.
Which I think is a reasonable hypothesis, but I suggest we focus the
discussion on whether we believe this or not, and what we might want to
do about it (and when), first.
On 7/10/2020 2:46 PM, Kevin Bourrillion wrote:
> This response is not to the main topic; not trying to send us down a
> rabbit-hole but this point is very important to me (as will be clear :-)).
>
>
> On Fri, Jul 10, 2020 at 11:23 AM Dan Smith <daniel.smith at oracle.com> wrote:
>
> > Bucket #1: Have a reasonable default, as declared.
> > - wrapper classes (the primitive zeros)
> > - Optional & friends (empty)
> > - From java.time: Instant (start of 1970-01-01), LocalTime
> >   (midnight), Duration (0s), Period (0d), Year (1 BC, if that's
> >   acceptable)
>
>
> Duration and Period: sure.
>
> Instant and the others: please, please put these in a separate bucket.
> They can have a /default/, but it is absolutely /not/ a "reasonable"
> default. In fact many tens (hundreds?) of thousands of bug reports in
> the last 50 years of computing have been "why in the world did
> 1970-01-01 or 1969-12-31 show up on this screen??"
>
> (Source: my team at Google has invested literally multiple
> person-years in an effort to stamp out bugs with how users use
> java.time, which I kicked off and have stayed peripherally involved
> in. I feel this should make our perspective worth listening to.)
>
> Realize that primitive types having default values /already/ causes
> some number of bugs today even though we know they are the least-bad
> category and that risk is acceptable.
>
> My reason for complaining here is not just about the java.time types
> themselves, but to argue that this is an important 4th bucket we
> should be concerned about. In some ways it is a bigger problem than
> Bucket #3 "no good default", since it is an /actively harmful/ default.
>
> For all of these types, there is one really fantastic default value
> that does everything you would want it to do: null. That is why these
> types should not become inline types, or /certainly/ not val-default
> inline types, and why Error Prone will have to ban usage of `.val` if
> they do.
>
> (Tangent of tangent: midnight is an interesting choice of default
> value for LocalTime, since I think there are some LocalTimes that so
> far have /always happened/ in every date and location in history and
> that's not one of them. That's not to say any other choice would work,
> but just to highlight how wrong it is to have any default value at all.)
>
> > Bucket #2: Could have a reasonable default after re-interpreting
> > fields.
> > - From java.time: LocalDate, YearMonth, MonthDay, LocalDateTime,
> >   ZonedDateTime, OffsetTime, OffsetDateTime, ZoneOffset, ZoneRegion,
> >   MinguoDate, HijrahDate, JapaneseDate, ThaiBuddhistDate (months and
> >   days should be nonzero; null Strings, ZoneIds, HijrahChronologies,
> >   and JapaneseEras require special handling)
>
>
> Echoing... default seems harmful in every one of these.
>
>
> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com