<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<font size="4"><font face="monospace">Some further thoughts on the
nature of bang, question, ref, and val. <br>
<br>
The model outlined in my mail from yesterday accounted for the
distinction between class and type, but left something important
out: carriers. Adding these into the mix clarifies, I think, why
`.val` and `!` are different, and why `!` and `?` are not pure
inverses. <br>
<br>
The user declares _classes_, which include identity and value
classes. Ignoring generics for the moment, we derive _types_
from classes. Identity classes give rise to a single principal
type (whose name is written the same as the class, but let's
call this `C.ref` for clarity); value classes give rise to two
principal types, `C.ref` and `C.val`. <br>
<br>
So `val` and `ref` are functions from Class to Type (val is
partial): <br>
<br>
val :: ValueClass -> Type<br>
ref :: Class -> Type<br>
<br>
What's missing is Carrier. Ignoring the legacy primitive
carriers (I, J, F, D), we have two carriers, L and Q. Every
type has a carrier. For the "ref" types, the carrier is L; for
the "val" types, the carrier is Q:<br>
<br>
carrier ref T = L<br>
carrier val T = Q<br>
<br>
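As a rough illustration of the model so far, here is a small Java
sketch. Every name in it (`CarrierSketch`, `ClassKind`, `Projection`,
`Type`) is invented for this example and corresponds to no real or
proposed API; it only encodes the two functions and the carrier rule
stated above. <br>
<br>
<pre>
```java
// Illustrative model only: none of these names come from the JDK or any Valhalla draft.
class CarrierSketch {
    enum ClassKind { IDENTITY, VALUE }   // what the user declares
    enum Projection { REF, VAL }         // which principal type was derived
    enum Carrier { L, Q }                // legacy primitive carriers (I, J, F, D) ignored

    // A principal type: a class name, the kind of class, and the projection taken.
    static final class Type {
        final String className;
        final ClassKind kind;
        final Projection projection;

        Type(String className, ClassKind kind, Projection projection) {
            this.className = className;
            this.kind = kind;
            this.projection = projection;
        }
    }

    // ref :: Class -> Type (total: defined for identity and value classes)
    static Type ref(String className, ClassKind kind) {
        return new Type(className, kind, Projection.REF);
    }

    // val :: ValueClass -> Type (partial: rejects identity classes)
    static Type val(String className, ClassKind kind) {
        if (kind != ClassKind.VALUE) {
            throw new IllegalArgumentException(className + " is not a value class");
        }
        return new Type(className, kind, Projection.VAL);
    }

    // carrier ref T = L; carrier val T = Q
    static Carrier carrier(Type t) {
        return t.projection == Projection.REF ? Carrier.L : Carrier.Q;
    }

    public static void main(String[] args) {
        System.out.println(carrier(ref("String", ClassKind.IDENTITY))); // L
        System.out.println(carrier(ref("Point", ClassKind.VALUE)));     // L
        System.out.println(carrier(val("Point", ClassKind.VALUE)));     // Q
    }
}
```
</pre>
The partiality of `val` shows up here as the exception thrown for an
identity class. <br>
<br>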
Now, bang and question. These are operators on types. Bang
restricts the value set; question (potentially) augments the
value set to include null. Question is best described as
yielding a union type: `T? === T|Null`. (Note that for all
reference types T, T|Null == T, because Null <: T.)<br>
<br>
What are the carriers for bang and question types? We define
the carrier on union types by taking the stronger of the two
carriers: <br>
<br>
carrier T|U = max (carrier T) (carrier U)<br>
<br>
which means that<br>
<br>
carrier question T = L<br>
<br>
since we need an L carrier to represent null. But for "bang",
we can preserve the carrier, since we're representing fewer
values: <br>
<br>
carrier bang T = carrier T<br>
<br>
(Why wouldn't we downgrade the carrier of `Point!` to Q?
Because the carrier means more than nullity; it affects
atomicity, layout, initialization strategy, etc.)<br>
<br>
What this means is that `question` is always information-losing,
and that:<br>
<br>
carrier bang question T = L<br>
carrier question bang T = L<br>
<br>
So, the ugly fact here is that "bang" and "question" are not
inverses; `T!?` is not always T, nor is `T?!`. <br>
<br>
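The non-inverse behaviour can be checked mechanically. The sketch
below is a toy Java model, not a proposal: it reduces a type to just
its carrier and whether null is in its value set, and the names
(`BangQuestionSketch`, `sameAs`, `NULL`) are made up for
illustration. It assumes L is the "stronger" carrier in the max rule,
which is what `carrier question T = L` implies. <br>
<br>
<pre>
```java
// Illustrative model only: a type is reduced to (carrier, nullable).
class BangQuestionSketch {
    enum Carrier { Q, L }

    static final class Type {
        final Carrier carrier;
        final boolean nullable;   // is null in the value set?

        Type(Carrier carrier, boolean nullable) {
            this.carrier = carrier;
            this.nullable = nullable;
        }

        boolean sameAs(Type other) {
            return carrier == other.carrier ? nullable == other.nullable : false;
        }
    }

    // The Null type needs an L carrier.
    static final Type NULL = new Type(Carrier.L, true);

    // carrier T|U = max (carrier T) (carrier U), with L assumed stronger than Q
    static Carrier max(Carrier a, Carrier b) {
        return (a == Carrier.L || b == Carrier.L) ? Carrier.L : Carrier.Q;
    }

    // question T = T|Null: null joins the value set, carrier becomes L
    static Type question(Type t) {
        return new Type(max(t.carrier, NULL.carrier), true);
    }

    // bang T: null leaves the value set, carrier is preserved
    static Type bang(Type t) {
        return new Type(t.carrier, false);
    }

    public static void main(String[] args) {
        Type pointVal = new Type(Carrier.Q, false);  // stands in for Point.val
        Type string = new Type(Carrier.L, true);     // stands in for String

        System.out.println(bang(question(pointVal)).sameAs(pointVal)); // false
        System.out.println(question(bang(pointVal)).sameAs(pointVal)); // false
        System.out.println(bang(question(string)).sameAs(string));     // false
        System.out.println(question(bang(string)).sameAs(string));     // true
    }
}
```
</pre>
In this model the round trip only comes back to the starting type for
a nullable L-carrier type under `!` followed by `?`; once a Q-carrier
type has gone through `?`, nothing restores the Q. <br>
<br>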
But what I want to know is this: how do we want to denote "T or
null", when T is a type variable? This turns out to be the only
place we currently have to utter `.ref`. And uttering `.ref`
here feels like asking the user to do the language's job; what
the user wants is to describe the union type "T|Null". (Since
the only sensible representation for this is a reference type,
the language will translate it as such anyway, but that's the
language's job.) <br>
<br>
This is related to how we ask people to describe "nullable
int". There are three choices: `int?`, `int.ref`, and
`Integer`. I would argue that the first is closest to what the
user wants: a statement about value sets. `int.ref` brings in
carriers, which is unrelated to what the user really wants here;
`Integer` is even worse because the relationship between int and
Integer is ad-hoc. Of course, they will all translate the same
way (the L carrier), but that's the compiler's job. <br>
<br>
For the only remaining use of `.ref` (returning V.ref from
Map::get and friends), I think we want the same; Map::get wants
to return "V or null". Again, ref-ness is a dependent thing,
not the essence; the essence is "T|Null". (Also there's a
connection with type patterns, where we may want to expand a
null-rejecting type pattern to a null-including one.) <br>
<br>
The problem, of course, is that once people see `?`, they will
think it is "obvious" that we left out "!" by mistake, because
of course they go together. But they don't, really; they're
different things. But let's set bang aside, and turn to Kevin's
next question, which is: if `?` is a union type with the null
type, what does that say about `String?`? This seems to be on a
collision course, in that null-analysis efforts would want to
treat `String?` as "String, with explicit nullness", but the
union interpretation will collapse to just `String`. <br>
<br>
Which points the way towards what seems the proper role for bang
and question in the surface syntax, if any: to *modify* types
with respect to their inclusion of null. So `String?` and
`int!` should probably be errors, since String is already
nullable and int is already non-nullable. <br>
<br>
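One way to picture this "modifiers must actually modify" reading is a
toy checker like the Java sketch below. It is purely hypothetical:
`StrictModifierSketch` and its methods are invented names, and the
nullability of each type is passed in by hand rather than derived
from anything. <br>
<br>
<pre>
```java
// Illustrative only: a toy checker for the strict reading of `?` and `!`.
class StrictModifierSketch {
    // `?` adds null, so it is only accepted on a type that lacks null.
    static String question(String typeName, boolean nullable) {
        if (nullable) {
            throw new IllegalArgumentException(typeName + "? adds nothing: already nullable");
        }
        return typeName + "?";
    }

    // `!` removes null, so it is only accepted on a type that has null.
    static String bang(String typeName, boolean nullable) {
        if (!nullable) {
            throw new IllegalArgumentException(typeName + "! removes nothing: already non-nullable");
        }
        return typeName + "!";
    }

    public static void main(String[] args) {
        System.out.println(question("int", false));   // int?
        System.out.println(bang("String", true));     // String!
        try {
            question("String", true);                 // the `String?` case from the text
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```
</pre>
<br>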
Bottom line: as we've discovered half a dozen times already in
this project, nearly every time we think that nullity is
perfectly correlated to something, we discover it is not.
Bang/question are not val/ref; we might be able to get away with
using `int.ref` to describe nullable ints, but that doesn't help
us at all with nullable or non-nullable type patterns; and none
of these are the same as "known vs unknown nullity" (or known vs
unknown initialization status.) <br>
<br>
<br>
<br>
<br>
</font></font><br>
<div class="moz-cite-prefix">On 6/27/2022 2:48 PM, Brian Goetz
wrote:<br>
</div>
<blockquote type="cite" cite="mid:4e1e09aa-2ec8-6141-3b52-d0c39ea6965a@oracle.com">
I've been bothered by an uncomfortable feeling that .val and ! are
somehow different in nature, but haven't been able to put my
finger on it. Let me make another attempt. <br>
<br>
The "bang" and "question" operators operate on types. In the
strictest form, the bang operator takes a type that has null in
its value set, and returns a type whose value set is the same,
except for null. But observe that if the value set contains
null, then the type has to be a reference type. And the resulting
type also has to be a reference type (except maybe for weird
classes like Void) because we're preserving the remaining values,
which are references. So we could say:<br>
<br>
bang :: RefType -> RefType<br>
<br>
Bang doesn't change the ref-ness, or id-ness, of a type, it just
excludes a specific value from the value set. <br>
<br>
Now, what do ref and val do? They don't operate on types, they
operate on _classes_, to produce a type. Val can only be applied
to value classes, and produces a value type. In the strictest
interpretation (for consistency with bang), ref also only operates
on value classes. So:<br>
<br>
val :: ValClass -> ValType<br>
ref :: ValClass -> RefType<br>
<br>
Now, we've been strict with bang and ref to say they only work
when they have a nontrivial effect, and could totalize them in the
obvious way (ref is a no-op on an id class; bang is a no-op on a
value type.) Which would give us:<br>
<br>
bang :: Type -> Type<br>
val :: ValClass -> ValType<br>
ref :: Class -> RefType<br>
<br>
with the added invariant that bang preserves
id-ness/val-ness/ref-ness of types. <br>
<br>
But still, bang and ref operate on different things, and
produce different things; one takes a type and yields a slightly
refined type with similar characteristics, the other takes a class
and yields a type with highly specific characteristics. We can
conclude a lot from `val` (it's a value type, which already says a
lot), but we cannot conclude anything other than non-nullity from
`bang`; it might be a ref or a val type, it might come from an
identity or value class. <br>
<br>
What this says to me is "val is a subtype of bang"; all vals are
bangs, but not all bangs are vals. <br>
<br>
A harder problem is what to do about `question`. The strict
interpretation says we can only apply `question` to a type that is
already non-null. In our world, that's ValType. <br>
<br>
question :: ValType -> Type<br>
<br>
Or we could totalize as we did with bang, and we get an invariant
that question preserves id-ness, val-ness, ref-ness. But, what
does `question` really mean? Null is a reference. So there are
two interpretations: that question always yields a reference type
(which means non-references need to be lifted/boxed), or that
question yields a union type. <br>
<br>
It turns out that the latter is super-useful on the stack but kind
of sucks in the heap. The return value of `Map::get`, which we've
been calling `T.ref`, really wants a union type (T or Null);
similarly, many difficult questions in pattern matching might be
made less difficult with a `T or Null` Type. But there is no
efficient heap-based representation for such a union type; we
could use tagged unions (blech) or just fall back to boxing.
Which leaves us with the asymmetry that bang is
representation-preserving (as well as other things), but question
is not. (Which makes sense in that one is subtractive and the
other is additive.) <br>
<br>
So, to your question: is this permanently gross? I think not, if
we adopt the strictest interpretations:<br>
<br>
- bang is only allowed on types that are already nullable<br>
- question is only allowed on types that are not nullable (or on
type variables)<br>
- val is only allowed on value classes<br>
- ref is only allowed on value classes (or on type variables)<br>
<br>
(And we can possibly boil away the last one, since if we can say
`T?`, there is no need for `T.ref` anywhere.) <br>
<br>
What this means is that you can say `String!`, but not
`Optional!`, because Optional is already null-free. Which means
there is never any question whether you say `X.val` or `X!` or
`X.val!` (or `X.ref!` if we exclude ref entirely). So then,
rather than two ways to say the same thing, there are two ways to
say two different things, which have different absolute
strengths. <br>
<br>
This is somewhat unfortunate, but not "permanently gross." <br>
<br>
If we drop `ref` in favor of `?` (not necessarily a slam-dunk), we
can consider finding another way to spell `.val` which is less
intrusive, though there are not too many options that don't look
like line noise. <br>
<br>
<br>
<br>
<br>
<br>
<div class="moz-cite-prefix">On 6/15/2022 12:41 PM, Kevin
Bourrillion wrote:<br>
</div>
<blockquote type="cite" cite="mid:CAGKkBkttZ35rvuZ7Exfe6Ozh1CJS0NSmTFKMrQ-K+sW1N_66Vg@mail.gmail.com">
<div><br>
</div>
<div>* I still am saddled with the deep feeling that ultimate
victory here looks like "we don't need a val type, because by
capturing the nullness bit and tearability info alone we will
make <i>enough</i> usage patterns always-optimizable, and we
can live with the downsides". To me the upsides of this
simplification are enormous, so if we really must reject it, I
may need some help understanding why. It's been stated that a
non-null value type means something slightly different from a
non-null reference type, but I'm not convinced of this; it's
just that sometimes you have the technical ability to conjure
a "default" instance and sometimes you don't, but nullness of
the type means what it means either way.</div>
<div><br>
</div>
<blockquote style="margin:0 0 0 40px;border:none;padding:0px">
<div>* I think if we plan to go this way (.val), and then we
one day have a nullable types feature, some things will then
be permanently gross that I would hope we can avoid. For
example, nullness *also* demands the concept of
bidirectional projection of type variables, and for very
overlapping reasons. This puts things in a super weird
place.</div>
<div><br>
</div>
</blockquote>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>