Equality for values -- new analysis, same conclusion

Victor Nazarov asviraspossible at gmail.com
Thu Aug 29 16:55:37 UTC 2019


Hi,

I'd like to mention some other options here since the expert group started
this discussion.

I think the approach presented below is quite natural, and it doesn't burn
any useful names.

Basically what I suggest is to use different package names: the same way we
have the java.io and java.nio packages.

The migration can become mostly hassle-free, and the world after migration
can remain almost free of migration debris.

Here is a more detailed description of the migration plan. We can make the
.Box scheme a standard translation generated by javac. So for new classes
we'll get

 * java.lang.Long128 and
 * java.lang.Long128.Box

classes.
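
To make the naming concrete, here is a rough emulation in today's Java. It is
only a sketch under assumptions: real inline classes would use dedicated
syntax and javac would generate the Box, whereas here both are written by
hand. The fields hi and lo and the methods box() and unbox() are hypothetical
names chosen for illustration.

```java
// Hand-written stand-in for an inline class; fields and methods are
// hypothetical and only illustrate the Long128 / Long128.Box naming.
final class Long128 {
    private final long hi;
    private final long lo;

    Long128(long hi, long lo) {
        this.hi = hi;
        this.lo = lo;
    }

    long hi() { return hi; }
    long lo() { return lo; }

    // Wrap this value in its companion reference type.
    Box box() { return new Box(this); }

    @Override
    public boolean equals(Object o) {
        return o instanceof Long128
                && ((Long128) o).hi == hi
                && ((Long128) o).lo == lo;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(hi) + Long.hashCode(lo);
    }

    // Stand-in for the mechanically derived "Long128.Box" type:
    // a nullable reference that carries a Long128 value.
    static final class Box {
        private final Long128 value;

        private Box(Long128 value) { this.value = value; }

        Long128 unbox() { return value; }
    }
}
```

The point is only that the box lives under the value's own name, so no
separate top-level name has to be invented for it.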

For migrated types we'll get a new package like java.time2 or something
like that. So we'll define

 * java.time2.LocalDate
 * java.time2.LocalDate.Box

Like this:

    package java.time2;
    inline class LocalDate implements java.time.LocalDate {
    }

Having these, we can provide a migration option by defining
java.time.LocalDate as an interface:

    package java.time;

    sealed interface LocalDate permits java.time2.LocalDate {
        ...
    }
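
The shape of this can be tried out in current Java, with some loud
assumptions: user code cannot declare java.* packages, so the holder classes
OldTime and NewTime below stand in for java.time and java.time2; a final
class stands in for the inline class; and the accessors, the of() factory and
LegacyClient are hypothetical. On Java 17 or later the interface could
additionally be declared sealed, as in the snippet above.

```java
// Stand-in for the old package: the old name survives as an interface.
final class OldTime {
    private OldTime() {}

    interface LocalDate {
        int year();
        int month();
        int day();

        // Hypothetical factory: old call sites keep compiling and
        // receive the new implementation behind the old name.
        static LocalDate of(int year, int month, int day) {
            return new NewTime.LocalDate(year, month, day);
        }
    }
}

// Stand-in for the new package holding the value-like implementation.
final class NewTime {
    private NewTime() {}

    static final class LocalDate implements OldTime.LocalDate {
        private final int year;
        private final int month;
        private final int day;

        LocalDate(int year, int month, int day) {
            this.year = year;
            this.month = month;
            this.day = day;
        }

        @Override public int year() { return year; }
        @Override public int month() { return month; }
        @Override public int day() { return day; }
    }
}

// Legacy client code written against the old name only.
final class LegacyClient {
    private LegacyClient() {}

    static String describe(OldTime.LocalDate d) {
        return d.year() + "-" + d.month() + "-" + d.day();
    }
}
```

Code written against the old name keeps working unchanged, while new code can
refer to the type in the new package directly.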

--
Victor Nazarov

Thu, 29 Aug 2019, 18:12 Brian Goetz <brian.goetz at oracle.com>:

> > Braindump thoughts went as follows:
> >
> > LocalDate.Inline
> > LocalDate.Val
> > VLocalDate
> > LocalDateV
> > LDate
> > LocDate
>
> Sounds about right.  The first two have a big advantage over the rest, in
> that they are suitable for mechanical derivation by language-level type
> operations (it would be reasonable for a language to know that the “box”
> type of V is a class called “V.Box”, but it would be kind of unreasonable
> to burn in a naming convention like BoxOfV.)
>
> > Do migrated inline classes have different names to highlight the
> > migration?
>
> I see where you’re going — colonize the lower-case name syntax for
> values.  We’ve had this discussion before, and set it aside then, but
> there’s new information now, so it’s fair to dust it off and reexamine.  The
> last 36 times it was suggested “let’s just spell value types with lower
> case letters, like primitives”, we said “way way too early to have that
> discussion”, in part because making that syntax choice (tail) would have
> thrown the possibility of compatible in-place migration out the window
> (dog).  (This is one of many reasons why we don’t commit to syntax too
> early (or even, allow it to take up too much airtime); it constrains the
> possible feature space in ways that are hard to even see.)  And at the
> time, we were quite interested in supporting a compatible migration where
> Optional “just” becomes a value type, and invested a great deal of work in
> understanding the costs of making that possible, since this would be the
> ideal outcome for the users — existing code just gets faster.  (And, as
> comments earlier in this thread suggest, you were not thrilled at the idea
> that we were unable to succeed on that quest, because it means coming up
> with new names for a bunch of perfectly good types with perfectly good
> names.)
>
> But, there’s new information, which is that we seem to be converging on a
> scheme where the “box” is a separate nominal type, which means that this
> suggestion now becomes _possible_.  (It still has challenges, and it’s still
> kind of early to discuss it, because there are still some serious
> object-model issues we really should work out first.  But it’s a good
> reminder to not forget it.)
>
> > Highlighting migration seems like a bad idea in the long
> > term.
>
> I know what you mean, and agree, but let me restate it in a less
> “absolutist” fashion.
>
> Migration compatibility is essential; flag days are often fatal.  And in
> any migration, there are three phases: old world, transition, and new
> world.  Ideally, we would like to arrive at the situation where the new,
> idiomatic code in the new world isn’t stuck with a crappy,
> migration-induced syntax forever, and any migration-induced compromises
> fade out as the transition phase comes to an end.  (This is exactly why,
> for example, in L2 you can’t say `Foo<V>`, you have to say `Foo<nullable
> projection of V>`, because, when Foo becomes specialized, we want the
> default to be “specialized Foo of V”, without requiring extra syntax that
> says “specialize me please.”  We’d rather the pain accrue to the transition
> phase than the new-world phase.)
>
> Sometimes this is possible; sometimes it is not.  We work hard to minimize
> the collateral damage when it is not.
>
> > Specifically, in the long term it seems like the best name
> > should go to the inline class, not the box. This seems to rule out
> > `LocalDate.Inline` as surely we don't want to refer to `Ratio.Inline`
> > or `Long128.Inline` everywhere.
>
> You’ve just mixed two cases, though.  For new classes, like Ratio and
> Long128, we have no compatibility constraint with previous utterances of
> the class name; for existing classes like LocalDate, we do.  Yes, it would
> be nice to have the “polarity” be the same in both cases, but that
> would-be-nice runs smack into the other would-be-nice of the above
> principle.
>
> > For a naive user, this behaviour of == is deeply unhelpful and
>
> (10 yard penalty for gratuitous overuse of “deeply”)
>
> > What if inline classes are divided between "pure" and "impure" (better
> > names needed). A "pure" inline class has (1) a sensible default value,
> > not null/exception (2) a correct == implementation that always works
> > as would be expected of a primitive type (3) no reference type fields.
> > An "impure" inline class may contain reference type fields, may have
> > no meaningful default value and should never be compared using == in
> > normal code.
>
> Yes, you’ve put your finger on the essential discomfort here, which is
> that when values have identity in the transitive closure of their state,
> much hilarity ensues.  And some values are well-behaved in the sense that
> they are “values all the way down” (FYI, for a detailed exploration of the
> values-all-the-way-down imperative in a Java-like language, see the LiMe
> language:
> https://researcher.watson.ibm.com/researcher/files/us-bacon/Huang08Liquid.pdf).
> For values like Complex, Point, Rational, etc, there is no identity
> anywhere in the transitive closure of the state, so there are no places
> where `==` and `equals()` would disagree.  (However, I would point out that
> it is not the case that all such VATWD classes necessarily have great
> default values.)
>
> Early on, it was asked, “why not constrain value types to be values all
> the way down”, and about 30 ms later, someone realized “that means you
> couldn’t have tuples with String in them”, so clearly constraining values
> down to the “well-behaved” ones is a non-starter.
>
> As you point out, another point in the spectrum is to split over
> well-behavedness, and mark them in the type system (“identity-polluted
> value class Foo”), so the compiler could warn when you try to (say) use
> `==` on an identity-polluted value, but I don’t think anyone would really
> thank us for this either.  I get the appeal of splitting here, but I’m
> skeptical that the return on specificity carries its weight.
>
> > Then we say that "pure" inline classes are named using lowerCamelCase,
> > and "impure" inline classes using UpperCamelCase.
>
> That’s an interesting twist on where I thought you were going, which was
> “let’s just use lower case for all value classes” — you’re suggesting
> marking the distinction between VATWD and not via naming convention rather
> than language semantics.  (Besides the fact that I think lumping is better
> than splitting here, I think it’s pretty questionable for the language to
> make such inferences based on a naming convention.)  But the “use lower
> case for values” is workable, and connects to another concern that has been
> raised in the past: managing the cognitive load of asking users to carry
> around a mental database of which classes have identity and which do not.
> So it may well be time to take this idea back off the shelf where we placed
> it many years ago, and give it another look.
>
> > Thus,`localDate` is the inline class for the `LocalDate` box (assuming
> > agreement on a suitable default value).
>
> But, ignoring the immediately above, and as pointed out way above, now
> that we’ve given up on compatible in-place migration, this again becomes a
> possibility.  (Again, as mentioned above, I don’t want to dive any deeper
> on it now, because we have more important object-model concerns (such as
> “what is the nature of the conversion from localDate to LocalDate — sub
> typing, boxing, widening, something else?”) which again we should nail down
> first.)
>
> > The overall user model is simpler I think. All types that are
> > lowerCamelCase behave like primitives wrt == and null, and all types
> > that are UpperCamelCase behave like reference types wrt == and null.
>
> I get where you’re going, but I’m not sure it is spending the complexity
> budget where it gets the most return.  Implicitly, this proposal says “the
> biggest problem with values is the equality semantics, so let’s spend all
> our syntax budget highlighting the difference.”  But I am not convinced
> that’s our biggest problem, and if it’s not, this is an awfully big chunk
> of syntax budget to spend on it.
>
> Good thoughts!
>
> Cheers,
> -Brian
>
>
>



More information about the valhalla-dev mailing list