Updated State of the Specialization
Brian Goetz
brian.goetz at oracle.com
Sat Dec 20 02:43:08 UTC 2014
> Regarding rules about null and any-T values¹:
> ---------------------------------------------
>
> Am I correct to assume that casting a null to a type is supported inside
> specialized code and results in the zero value of that type (null for
> references, 0/0.0/... for primitive types and e.g. Point(0,0) for value
> type Point(Int, Int))?
We considered that approach, but it seemed a pretty bad idea. Null
means "no object is there". But zero is a very useful and common
integer; using zero to also mean "no number" seems infeasible.
Returning zero from Map.get() in an int-valued Map when the element is
not mapped seems even more error-prone than returning null in a
ref-valued map. So we think null should retain its current meaning, and
not define conversion from null to value types.
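To make the ambiguity concrete, here is a minimal sketch in today's Java.
It uses a boxed Map<String, Integer> as a stand-in for an int-valued map;
the class and method names are hypothetical and are not part of any
proposal. getOrZero() behaves the way a null-to-zero conversion would, and
find() keeps "not mapped" distinct from every int value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

final class MissingVersusZero {
    // Hypothetical helper: collapses "absent" into 0, the way a
    // null-to-zero conversion would.
    static int getOrZero(Map<String, Integer> counts, String key) {
        Integer boxed = counts.get(key);
        return boxed == null ? 0 : boxed;
    }

    // Hypothetical helper: keeps "absent" separate from any int value.
    static OptionalInt find(Map<String, Integer> counts, String key) {
        Integer boxed = counts.get(key);
        return boxed == null ? OptionalInt.empty() : OptionalInt.of(boxed);
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("present", 0);

        // Both calls return 0, so the caller cannot tell the cases apart.
        System.out.println(getOrZero(counts, "present"));
        System.out.println(getOrZero(counts, "missing"));

        // Here the difference between "mapped to 0" and "not mapped" survives.
        System.out.println(find(counts, "present").isPresent());
        System.out.println(find(counts, "missing").isPresent());
    }
}
```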
> Otherwise, what's the recommended way to "summon" a zero value for an
> arbitrary type?
Yes, there needs to be some way to do this; at this point it is largely
a syntax issue. Suffice it to say there will be some way to do it.
> Boxing – value boxes vs. wrapper classes:
> -----------------------------------------
>
> The draft seems to be very silent on the expected semantics of how value
> types are handled outside specialized code and how wrapper classes are
> handled inside specialized code.
>
> I think this is an important point, because every kind of mistake will
> directly affect thousands of users. Avoiding quirks here seems to be
> absolutely crucial to me (for instance, floating point numbers are hard
> enough to deal with even without letting users figure out the subtle
> differences between primitives, primitives-as-value-types,
> primitives-as-value-boxed-value-types and classic wrapped primitives).
This is indeed an important area to specify more carefully. It's on the
list, but for now we've been mostly focusing on the mechanics of
specialization. (Not because this is the end goal, but because without
a working prototype, we can't do realistic experiments about the effect
of these features on actual libraries, which is the real test.)
> Primitive equality on value types/inside specialized code:
> ----------------------------------------------------------
>
> Is there any further progress or additional insight into how this
> issue will be addressed, especially when boxing is involved?
Not yet, that's also on the list.
> Partially specialized type hierarchies:
> ---------------------------------------
>
> Although it isn't spelt out in the draft it seems that it is possible to
> go from specialized to erased, but not from erased to specialized:
>
> Super<any T> ---> Sub< T> // ok
> Super< T> ---> Sub<any T> // not ok
Correct. Hierarchies need to be any-fied from the top down.
> If this is correct, what's the story for types which are currently erased
> in byte code, because they can't be expressed with Java Generics (HKTs
> come to mind)?
>
> With that approach, all of that code would permanently be stuck in
> no-specialization land (in the sense that even if the HKT couldn't be
> expressed with any-T, individual instantiations could ... but not under
> the scheme proposed in this draft).
Today, classes have two choices:
- Non-generic
- Erased generics
We'll add a third choice:
- Any-fied generics
but we're not taking away the first two choices or changing anything
about them. So at the very least, you have all the tools you currently
have. If you want to suggest some use cases that you are hoping we can
support, please do; we can't make any promises, but we're happy to have
the input.
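For reference, the two existing choices look like this in today's Java.
This is only a sketch with hypothetical class names; the third choice is
not shown because its syntax is not available in any released compiler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Choice 1: non-generic. Stores ints directly, but works for int only.
final class IntArrayStack {
    private int[] items = new int[8];
    private int size;

    void push(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        items[size++] = value;
    }

    int pop() {
        if (size == 0) throw new IllegalStateException("empty stack");
        return items[--size];
    }
}

// Choice 2: erased generic. Works for any reference type; ints are boxed.
final class ErasedStack<T> {
    private final List<T> items = new ArrayList<>();

    void push(T value) {
        items.add(value);
    }

    T pop() {
        if (items.isEmpty()) throw new IllegalStateException("empty stack");
        return items.remove(items.size() - 1);
    }
}

final class TwoChoicesDemo {
    public static void main(String[] args) {
        IntArrayStack ints = new IntArrayStack();
        ints.push(42);
        System.out.println(ints.pop());

        ErasedStack<Integer> boxed = new ErasedStack<>();
        boxed.push(42); // autoboxed to Integer
        System.out.println(boxed.pop());

        // Erasure: both instantiations share a single runtime class.
        System.out.println(
            new ErasedStack<String>().getClass() == boxed.getClass());
    }
}
```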
> Manual control over specialization
> ----------------------------------
>
> Maybe this chapter has some merit, but I feel that the reasons put forward
> are not very convincing:
>
> "Implementation by parts" argues a point which I'd consider a neat hack
> for a single use-case. In a language/world where Option was a value type,
> why would we even bother to pass Option(null)/Option(value) instead of
> null/value (in those simple cases)?
It's more than a single use case. When you look at the code for
Collections, it's rife with examples where implementation by parts is
likely to be your only compatible choice. Further, this is something C#
programmers complain about; you can't write a different implementation
for value instantiations than for reference instantiations, and this is
limiting.
Reference implementations have tools (nulls, synchronization on object
parameters) that value implementations don't have, and value
implementations have tools (known non-polymorphism) that reference
implementations don't have.
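As a rough illustration of that asymmetry, consider two hand-written
single-element containers in today's Java. The names are hypothetical and
this is not how specialized code would be written; it only shows that the
reference version can lean on null while the int version needs separate
state for "empty".

```java
import java.util.NoSuchElementException;
import java.util.Objects;

// Reference version: null doubles as the "empty" marker.
final class RefSlot<T> {
    private T value;

    boolean isEmpty() {
        return value == null;
    }

    void set(T newValue) {
        value = Objects.requireNonNull(newValue);
    }

    T get() {
        if (value == null) throw new NoSuchElementException("empty slot");
        return value;
    }

    void clear() {
        value = null;
    }
}

// int version: there is no null, so presence is tracked explicitly.
final class IntSlot {
    private int value;
    private boolean present;

    boolean isEmpty() {
        return !present;
    }

    void set(int newValue) {
        value = newValue;
        present = true;
    }

    int get() {
        if (!present) throw new NoSuchElementException("empty slot");
        return value;
    }

    void clear() {
        present = false;
    }
}

final class SlotDemo {
    public static void main(String[] args) {
        IntSlot slot = new IntSlot();
        slot.set(0); // zero is an ordinary value here, not "nothing"
        System.out.println(slot.isEmpty());
        slot.clear();
        System.out.println(slot.isEmpty());
    }
}
```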
> "Overriding specific instantiations of specialized classes" might have some
> merit, but the example of ArrayList<boolean> seems to be an incredibly poor
> choice to drive home the argument:
>
> [...] the implementation should pack the elements so that every bool
> only uses one bit of memory. This is widely considered a mistake.
> –– Wikipedia on C++'s vector<bool>
Here's another example, then: the problem of "Maps from int to int" is a
well-studied special case, with many clever implementation tricks that
are not available to more general-purpose maps.
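One such trick, sketched under simplifying assumptions: open addressing
with linear probing over parallel int arrays, so that no entry objects or
boxes are allocated. IntIntMap is a hypothetical class written for this
message, it omits removal, and it is not taken from any library.

```java
final class IntIntMap {
    // Parallel arrays; capacity is always a power of two.
    private int[] keys = new int[16];
    private int[] values = new int[16];
    private boolean[] used = new boolean[16];
    private int size;

    // Cheap multiplicative hash to spread out clustered keys.
    private static int mix(int key) {
        int h = key * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    // Index of the slot holding key, or of the free slot where it belongs.
    // Terminates because the load factor never exceeds one half.
    private int slotFor(int key) {
        int mask = keys.length - 1;
        int i = mix(key) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    void put(int key, int value) {
        if ((size + 1) * 2 > keys.length) {
            grow();
        }
        int i = slotFor(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    boolean containsKey(int key) {
        return used[slotFor(key)];
    }

    // The caller picks the fallback, so "absent" is never silently zero.
    int getOrDefault(int key, int fallback) {
        int i = slotFor(key);
        return used[i] ? values[i] : fallback;
    }

    int size() {
        return size;
    }

    // Doubles the capacity and reinserts every live entry.
    private void grow() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                put(oldKeys[i], oldValues[i]);
            }
        }
    }
}
```

A general-purpose Map<K, V> cannot assume that keys are ints that fit in a
flat array, which is why a layout like this would have to be written as a
separate implementation.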
But, your point is taken: "be careful of extrapolating from a small
number of use cases."
> The "Peeling" Technique⁴:
> -------------------------
>
> To be honest, this feels like the draft is trying to fix one case of poor
> library design by adding a hugely complex language feature.
This is a fair concern; we've asked ourselves this multiple times. (The
problems we face with any-fying collections are consequences of the
corners that had to be cut in order to generify collections in the first
place -- should we really be beholden to decisions that were made in
1997 after several major language evolutions?) And the prospect of
adding a language feature that we hope no one has to use is not
something to be taken lightly.
But, we're also convinced that if our libraries have this problem, other
libraries will too (and it's way more than one or two methods -- just
try any-fying java.util.concurrent!). So I don't think it's "one case of
poor library design". We'll know more once we have a more complete
implementation and can do some corpus analysis on existing generic code.
> Especially from a user perspective, this is probably as complicated as the
> rest of specialized generics plus value types combined.
Yes, and herein is one of the most serious challenges of evolving a
broad-spectrum language -- you can intend for a feature to be used only
in certain situations or by certain users, but it is nearly impossible
to make that guidance stick.
> Having just read it again ... really, this is so absurdly complex for
> basically trying to work around two poor API decisions. That's a huge
> chunk from the complexity budget which I'd either spend on something more
> worthwhile, or not spend at all.
I wish it was just two. Perhaps a bet! I give you $1000 now, and you
pay me $500 for every library out there that can't any-fy cleanly. I
think you will fund my retirement... :)
But, as I said, your point is by no means lost on us; we've asked
ourselves this too, and will continue to do so. Right now, we're in the
"can we make it work at all" phase; if we can make it work at all, we
can then put effort towards simplifying.