<html><head>

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body>

    <font size="4"><font face="monospace">Some further thoughts on the

        nature of bang, question, ref, and val.  <br>

        <br>

        The model outlined in my mail from yesterday accounted for the

        distinction between class and type, but left something important

        out: carriers.  Adding these into the mix, I think, clarifies

        why `.val` and `!` are different, and why `!` and `?`

        are not pure inverses.  <br>

        <br>

        The user declares _classes_, which include identity and value

        classes.  Ignoring generics for the moment, we derive _types_

        from classes.  Identity classes give rise to a single principal

        type (whose name is written the same as the class, but let's

        call this `C.ref` for clarity); value classes give rise to two

        principal types, `C.ref` and `C.val`.  <br>

        <br>

        So `val` and `ref` are functions from Class to Type (val is

        partial): <br>

        <br>

            val :: ValueClass -> Type<br>

            ref :: Class -> Type<br>

        <br>
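        To make the partiality concrete, here is a minimal Java sketch of
        `val` and `ref` as functions from classes to types.  The names
        `ClassDecl`, `ClassKind`, `PrincipalType`, and `Projections` are
        hypothetical, invented for illustration; this is a toy model of
        the signatures above, not a description of the compiler.  <br>
<pre>
```java
// Hypothetical model: a declared class is either an identity class or a value class.
enum ClassKind { IDENTITY, VALUE }

record ClassDecl(String name, ClassKind kind) {}

// The two kinds of principal type a class can give rise to.
enum Projection { REF, VAL }

record PrincipalType(ClassDecl decl, Projection projection) {
    @Override public String toString() {
        return decl.name() + (projection == Projection.REF ? ".ref" : ".val");
    }
}

final class Projections {
    private Projections() {}

    // ref :: Class -> Type (total: every class has a ref type)
    static PrincipalType ref(ClassDecl c) {
        return new PrincipalType(c, Projection.REF);
    }

    // val :: ValueClass -> Type (partial: rejects identity classes)
    static PrincipalType val(ClassDecl c) {
        if (c.kind() != ClassKind.VALUE) {
            throw new IllegalArgumentException(
                c.name() + " is an identity class and has no .val type");
        }
        return new PrincipalType(c, Projection.VAL);
    }
}
```
</pre>
        In this sketch, `Projections.val` throws for an identity class,
        which is one way to model "val is partial".  <br>
        <br>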

        What's missing is Carrier.  Ignoring the legacy primitive

        carriers (I, J, F, D), we have two carriers, L and Q.  Every

        type has a carrier.  For the "ref" types, the carrier is L; for

        the "val" types, the carrier is Q:<br>

        <br>

            carrier ref T = L<br>

            carrier val T = Q<br>

        <br>

        Now, bang and question.  These are operators on types.  Bang

        restricts the value set; question (potentially) augments the

        value set to include null.  Question is best described as

        yielding a union type: `T? === T|Null`.  (Note that for all

        reference types T, T|Null == T, because Null <: T.)<br>

        <br>

        What are the carriers for bang and question types?  We define

        the carrier on union types by taking the stronger of the two

        carriers: <br>

        <br>

            carrier T|U = max (carrier T) (carrier U)<br>

        <br>

        which means that<br>

        <br>

            carrier question T = L<br>

        <br>

        since we need an L carrier to represent null.  But for "bang",

        we can preserve the carrier, since we're representing fewer

        values: <br>

        <br>

            carrier bang T = carrier T<br>

        <br>

        (Why wouldn't we downgrade the carrier of `Point!` to Q? 

        Because the carrier means more than nullity; it affects

        atomicity, layout, initialization strategy, etc.)<br>

        <br>
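        The carrier rules so far can be written down as a small Java
        sketch.  The names `Ty`, `Ref`, `Val`, `Bang`, `Question`, and
        `Carriers` are hypothetical, and the sketch assumes that L is the
        "stronger" carrier and that the Null type has carrier L; both are
        my reading of the rules above rather than anything normative.  <br>
<pre>
```java
// Carriers ordered from weaker to stronger (assumption: L is the stronger one).
enum Carrier { Q, L }

// Hypothetical type model: the two principal types plus the bang and question operators.
interface Ty {}
record Ref(String className) implements Ty {}
record Val(String className) implements Ty {}
record Bang(Ty operand) implements Ty {}
record Question(Ty operand) implements Ty {}

final class Carriers {
    private Carriers() {}

    // carrier T|U = max (carrier T) (carrier U)
    static Carrier max(Carrier a, Carrier b) {
        return a.compareTo(b) >= 0 ? a : b;
    }

    static Carrier carrier(Ty t) {
        if (t instanceof Ref) return Carrier.L;               // carrier ref T = L
        if (t instanceof Val) return Carrier.Q;               // carrier val T = Q
        if (t instanceof Bang b) return carrier(b.operand()); // bang preserves the carrier
        if (t instanceof Question q) {
            // T? is T|Null; assuming carrier Null = L, the max is always L
            return max(carrier(q.operand()), Carrier.L);
        }
        throw new IllegalArgumentException("unknown type form: " + t);
    }
}
```
</pre>
        For example, `Carriers.carrier(new Bang(new Ref("Point")))` stays
        L in this model, matching the point that `Point!` is not
        downgraded to Q.  <br>
        <br>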

        What this means is that `question` is always information-losing,

        and that:<br>

        <br>

            carrier bang question T = L<br>

            carrier question bang T = L<br>

        <br>

        So, the ugly fact here is that "bang" and "question" are not

        inverses; `T!?` is not always T, nor is `T?!`.  <br>

        <br>
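        A quick way to see the non-inverse behavior is to summarize each
        type by just two facts: its carrier and whether null is in its
        value set.  The `Shape` record below is a hypothetical
        simplification along those lines (it ignores everything else
        about a type), so treat it as a sketch, not a definition.  <br>
<pre>
```java
enum Carrier { Q, L }

// Hypothetical summary of a type: its carrier, and whether null is in its value set.
record Shape(Carrier carrier, boolean nullable) {
    // The ref type of an identity class such as String.
    static final Shape STRING = new Shape(Carrier.L, true);
    // The val type of a value class such as Point.
    static final Shape POINT_VAL = new Shape(Carrier.Q, false);

    // bang: remove null from the value set, keep the carrier
    Shape bang() {
        return new Shape(carrier, false);
    }

    // question: add null to the value set, which forces the L carrier
    Shape question() {
        return new Shape(Carrier.L, true);
    }
}
```
</pre>
        In this model, `Shape.POINT_VAL.bang().question()` and
        `Shape.POINT_VAL.question().bang()` both end up on the L carrier,
        so neither is equal to `Shape.POINT_VAL`; and
        `Shape.STRING.question().bang()` has lost null, so it is not equal
        to `Shape.STRING` either.  <br>
        <br>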

        But what I want to know is this: how do we want to denote "T or

        null", when T is a type variable?  This turns out to be the only

        place we currently have to utter `.ref`.  And uttering `.ref`

        here feels like asking the user to do the language's job; what

        the user wants is to describe the union type "T|Null".  (Since

        the only sensible representation for this is a reference type,

        the language will translate it as such anyway, but that's the

        language's job.)  <br>

        <br>

        This is related to how we ask people to describe "nullable

        int".  There are three choices: `int?`, `int.ref`, and

        `Integer`.  I would argue that the first is closest to what the

        user wants: a statement about value sets.  `int.ref` brings in

        carriers, which is unrelated to what the user really wants here;

        `Integer` is even worse because the relationship between int and

        Integer is ad-hoc.  Of course, they will all translate the same

        way (the L carrier), but that's the compiler's job.  <br>

        <br>

        For the only remaining use of `.ref` (returning V.ref from

        Map::get and friends), I think we want the same; Map::get wants

        to return "V or null".  Again, ref-ness is a dependent thing,

        not the essence; the essence is "T|Null".  (Also there's a

        connection with type patterns, where we may want to expand a

        null-rejecting type pattern to a null-including one.)  <br>

        <br>
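        For reference, this is how "V or null" surfaces in today's Java,
        where the box type `Integer` stands in for "nullable int".  The
        class `MapGetDemo` and its `lookup` method are made up for
        illustration; `Map.of` and `Map.get` are the standard
        `java.util.Map` methods.  <br>
<pre>
```java
final class MapGetDemo {
    private MapGetDemo() {}

    // Returns the count for a key, or null when the key is absent.
    // The return type has to be Integer, not int, because int cannot hold null.
    static Integer lookup(String key) {
        var counts = java.util.Map.of("a", 1, "b", 2); // a map from String to Integer
        return counts.get(key);
    }
}
```
</pre>
        What the signature of `lookup` is really trying to say is
        "int or null"; the choice of `Integer` is just how that gets
        spelled today.  <br>
        <br>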

        The problem, of course, is that once people see `?`, they will

        think it is "obvious" that we left out "!" by mistake, because

        of course they go together.  But they don't, really; they're

        different things.  But let's set bang aside, and turn to Kevin's

        next question, which is: if `?` is a union type with the null

        type, what does that say about `String?`?  This seems to be on a

        collision course, in that null-analysis efforts would want to

        treat `String?` as "String, with explicit nullness", but the

        union interpretation will collapse to just `String`.  <br>

        <br>

        Which points the way towards what seems the proper role for bang

        and question in the surface syntax, if any: to *modify* types

        with respect to their inclusion of null.  So `String?` and

        `int!` should probably be errors, since String is already

        nullable and int is already non-nullable.  <br>

        <br>

        Bottom line: as we've discovered half a dozen times already in

        this project, nearly every time we think that nullity is

        perfectly correlated to something, we discover it is not. 

        Bang/question are not val/ref; we might be able to get away with

        using `int.ref` to describe nullable ints, but that doesn't help

        us at all with nullable or non-nullable type patterns; and none

        of these are the same as "known vs unknown nullity" (or known vs

        unknown initialization status).  <br>

        <br>

        <br>

        <br>

        <br>

      </font></font><br>

    <div class="moz-cite-prefix">On 6/27/2022 2:48 PM, Brian Goetz

      wrote:<br>

    </div>

    <blockquote type="cite" cite="mid:4e1e09aa-2ec8-6141-3b52-d0c39ea6965a@oracle.com">

      

      I've been bothered by an uncomfortable feeling that .val and ! are

      somehow different in nature, but haven't been able to put my

      finger on it.  Let me make another attempt.  <br>

      <br>

      The "bang" and "question" operators operate on types.  In the

      strictest form, the bang operator takes a type that has null in

      its value set, and returns a type whose value set is the same,

      except for null.   But observe that if the value set contains

      null, then the type has to be a reference type.  And the resulting

      type also has to be a reference type (except maybe for weird

      classes like Void) because we're preserving the remaining values,

      which are references.  So we could say:<br>

      <br>

          bang :: RefType -> RefType<br>

      <br>

      Bang doesn't change the ref-ness, or id-ness, of a type, it just

      excludes a specific value from the value set.  <br>

      <br>

      Now, what do ref and val do?  They don't operate on types, they

      operate on _classes_, to produce a type.  Val can only be applied

      to value classes, and produces a value type.  In the strictest

      interpretation (for consistency with bang), ref also only operates

      on value classes.  So:<br>

      <br>

          val :: ValClass -> ValType<br>

          ref :: ValClass -> RefType<br>

      <br>

      Now, we've been strict with bang and ref to say they only work

      when they have a nontrivial effect, but we could totalize them in the

      obvious way (ref is a no-op on an id class; bang is a no-op on a

      value type.)  Which would give us:<br>

      <br>

          bang :: Type -> Type<br>

          val :: ValClass -> ValType<br>

          ref :: Class -> RefType<br>

      <br>

      with the added invariant that bang preserves

      id-ness/val-ness/ref-ness of types.  <br>

      <br>

      But still, bang and ref operate on different things, and

      produce different things; one takes a type and yields a slightly

      refined type with similar characteristics, the other takes a class

      and yields a type with highly specific characteristics.  We can

      conclude a lot from `val` (it's a value type, which already says a

      lot), but we cannot conclude anything other than  non-nullity from

      `bang`; it might be a ref or a val type, it might come from an

      identity or value class.  <br>

      <br>

      What this says to me is "val is a subtype of bang"; all vals are

      bangs, but not all bangs are vals.  <br>

      <br>

      A harder problem is what to do about `question`.  The strict

      interpretation says we can only apply `question` to a type that is

      already non-null.  In our world, that's ValType.  <br>

      <br>

          question :: ValType -> Type<br>

      <br>

      Or we could totalize as we did with bang, and we get an invariant

      that question preserves id-ness, val-ness, ref-ness.  But, what

      does `question` really mean?  Null is a reference.  So there are

      two interpretations: that question always yields a reference type

      (which means non-references need to be lifted/boxed), or that

      question yields a union type.  <br>

      <br>

      It turns out that the latter is super-useful on the stack but kind

      of sucks in the heap.  The return value of `Map::get`, which we've

      been calling `T.ref`, really wants a union type (T or Null);

      similarly, many difficult questions in pattern matching might be

      made less difficult with a `T or Null` Type.  But there is no

      efficient heap-based representation for such a union type; we

      could use tagged unions (blech) or just fall back to boxing. 

      Which leaves us with the asymmetry that bang is

      representation-preserving (as well as other things), but question

      is not.  (Which makes sense in that one is subtractive and the

      other is additive.)  <br>

      <br>

      So, to your question: is this permanently gross?  I think if we

      adopt the strictest interpretations:<br>

      <br>

       - bang is only allowed on types that are already nullable<br>

       - question is only allowed on types that are not nullable (or on

      type variables)<br>

       - val is only allowed on value classes<br>

       - ref is only allowed on value classes (or on type variables)<br>

      <br>

      (And we can possibly boil away the last one, since if we can say

      `T?`, there is no need for `T.ref` anywhere.)  <br>

      <br>
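      The four strict rules can be encoded as a small legality check.
      This is only a sketch: `Operator`, `Operand`, and `StrictRules` are
      hypothetical names, and the check covers exactly the cases listed
      above and nothing more (anything the list does not mention, such as
      bang on a type variable, is treated as disallowed).  <br>
<pre>
```java
enum Operator { BANG, QUESTION, VAL, REF }

// What the operator is being applied to: a type (for bang/question) or a class (for val/ref).
enum Operand { NULLABLE_TYPE, NON_NULLABLE_TYPE, TYPE_VARIABLE, VALUE_CLASS, IDENTITY_CLASS }

final class StrictRules {
    private StrictRules() {}

    static boolean allowed(Operator op, Operand x) {
        return switch (op) {
            // bang is only allowed on types that are already nullable
            case BANG -> x == Operand.NULLABLE_TYPE;
            // question is only allowed on non-nullable types, or on type variables
            case QUESTION -> x == Operand.NON_NULLABLE_TYPE || x == Operand.TYPE_VARIABLE;
            // val is only allowed on value classes
            case VAL -> x == Operand.VALUE_CLASS;
            // ref is only allowed on value classes, or on type variables
            case REF -> x == Operand.VALUE_CLASS || x == Operand.TYPE_VARIABLE;
        };
    }
}
```
</pre>
      Under this encoding, `allowed(Operator.BANG, Operand.NULLABLE_TYPE)`
      holds (the `String!` case), while
      `allowed(Operator.BANG, Operand.NON_NULLABLE_TYPE)` does not (the
      case of a type that is already null-free).  <br>
      <br>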

      What this means is that you can say `String!`, but not

      `Optional!`, because Optional is already null-free.  Which means

      there is never any question whether you say `X.val` or `X!` or

      `X.val!` (or `X.ref!` if we exclude ref entirely).  So then,

      rather than two ways to say the same thing, there are two ways to

      say two different things, which have different absolute

      strengths.  <br>

      <br>

      This is somewhat unfortunate, but not "permanently gross."  <br>

      <br>

      If we drop `ref` in favor of `?` (not necessarily a slam-dunk), we

      can consider finding another way to spell `.val` which is less

      intrusive, though there are not too many options that don't look

      like line noise.  <br>

      <br>

      <br>

      <br>

      <br>

      <br>

      <div class="moz-cite-prefix">On 6/15/2022 12:41 PM, Kevin

        Bourrillion wrote:<br>

      </div>

      <blockquote type="cite" cite="mid:CAGKkBkttZ35rvuZ7Exfe6Ozh1CJS0NSmTFKMrQ-K+sW1N_66Vg@mail.gmail.com">

        <div><br>

        </div>

        <div>* I still am saddled with the deep feeling that ultimate

          victory here looks like "we don't need a val type, because by

          capturing the nullness bit and tearability info alone we will

          make <i>enough</i> usage patterns always-optimizable, and we

          can live with the downsides". To me the upsides of this

          simplification are enormous, so if we really must reject it, I

          may need some help understanding why. It's been stated that a

          non-null value type means something slightly different from a

          non-null reference type, but I'm not convinced of this; it's

          just that sometimes you have the technical ability to conjure

          a "default" instance and sometimes you don't, but nullness of

          the type means what it means either way.</div>

        <div><br>

        </div>

        <blockquote style="margin:0 0 0 40px;border:none;padding:0px">

          <div>* I think if we plan to go this way (.val), and then we

            one day have a nullable types feature, some things will then

            be permanently gross that I would hope we can avoid. For

            example, nullness *also* demands the concept of

            bidirectional projection of type variables, and for very

            overlapping reasons. This puts things in a super weird

            place.</div>

          <div><br>

          </div>

        </blockquote>

      </blockquote>

      <br>

    </blockquote>

    <br>

  </body>

</html>