Updated State of the Specialization

Brian Goetz brian.goetz at oracle.com
Sat Dec 20 02:43:08 UTC 2014


> Regarding rules about null and any-T values¹:
> ---------------------------------------------
>
> Am I correct to assume that casting a null to a type is supported inside
> specialized code and results in the zero value of that type (null for
> references, 0/0.0/... for primitive types and e.g. Point(0,0) for value
> type Point(Int, Int))?

We considered that approach, but it seemed a pretty bad idea.  Null 
means "no object is there".  But zero is a very useful and common 
integer; using zero to also mean "no number" seems infeasible. 
Returning zero from Map.get() in an int-valued Map when the element is 
not mapped seems even more error-prone than returning null in a 
ref-valued map.  So we think null should retain its current meaning, and 
that we should not define a conversion from null to value types.
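
To make the Map.get() concern concrete, here is a minimal sketch in today's
Java (not Valhalla syntax). The class ZeroAsAbsent and the helper getOrZero
are hypothetical names invented for this illustration; the helper only mimics
what an int-valued get() would do if "not mapped" were converted to zero.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: a lookup that reports "absent" as zero.
final class ZeroAsAbsent {
    // Mimics an int-valued get() that returns 0 for an unmapped key.
    static int getOrZero(Map<String, Integer> map, String key) {
        Integer boxed = map.get(key);   // null here means "no mapping"
        return boxed == null ? 0 : boxed;
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = new HashMap<>();
        stock.put("apples", 0);         // mapped, and the value really is zero

        // Both calls print 0, so the caller cannot tell the cases apart.
        System.out.println(getOrZero(stock, "apples"));
        System.out.println(getOrZero(stock, "pears"));

        // The ref-valued map still distinguishes them via null.
        System.out.println(stock.get("apples") == null);  // false
        System.out.println(stock.get("pears") == null);   // true
    }
}
```

With the reference-valued map a careless caller gets a NullPointerException
on unboxing; with the zero-returning variant the same mistake silently yields
a plausible number, which is the sense in which it is more error-prone.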

> Otherwise, what's the recommended way to "summon" a zero value for an
> arbitrary type?

Yes, there needs to be some way to do this, and there will be; what 
remains is largely a question of syntax.

> Boxing – value boxes vs. wrapper classes:
> -----------------------------------------
>
> The draft seems to be very silent on the expected semantics of how value
> types are handled outside specialized code and how wrapper classes are
> handled inside specialized code.
>
> I think this is an important point, because every kind of mistake will
> directly affect thousands of users. Avoiding quirks here seems to be
> absolutely crucial to me (for instance, floating point numbers are hard
> enough to deal with even without letting users figure out the subtle
> differences between primitives, primitives-as-value-types,
> primitives-as-value-boxed-value-types and classic wrapped primitives).

This is indeed an important area to specify more carefully.  It's on the 
list, but for now we've been mostly focusing on the mechanics of 
specialization.  (Not because this is the end goal, but because without 
a working prototype, we can't do realistic experiments about the effect 
of these features on actual libraries, which is the real test.)

> Primitive equality on value types/inside specialized code:
> ----------------------------------------------------------
>
> Is there any further progress or additional insight into how this
> issue will be addressed, especially when boxing is involved?

Not yet, that's also on the list.

> Partially specialized type hierarchies:
> ---------------------------------------
>
> Although it isn't spelt out in the draft it seems that it is possible to
> go from specialized to erased, but not from erased to specialized:
>
>    Super<any T> ---> Sub<    T>   // ok
>    Super<    T> ---> Sub<any T>   // not ok

Correct.  Hierarchies need to be any-fied from the top down.

> If this is correct, what's the story for types which are currently
> erased in byte code, because they can't be expressed with Java Generics
> (HKTs come to mind)?
>
> With that approach, all of that code would be permanently stuck in
> no-specialization land (in the sense that even if the HKT couldn't be
> expressed with any-T, individual instantiations could ... but not under
> the scheme proposed in this draft).

Today, classes have two choices:
  - Non-generic
  - Erased generics

We'll add a third choice:
  - Any-fied generics

but we're not taking away the first two choices or changing anything 
about them.  So at the very least, you have all the tools you currently 
have.  If you want to suggest some use cases that you are hoping we can 
support, please do; we can't make any promises, but we're happy to have 
the input.

> Manual control over specialization
> ----------------------------------
>
> Maybe this chapter has some merit, but I feel that the reasons put
> forward are not very convincing:
>
> "Implementation by parts" argues a point which I'd consider a neat hack
> for a single use-case. In a language/world where Option was a value type,
> why would we even bother to pass Option(null)/Option(value) instead of
> null/value (in those simple cases)?

It's more than a single use case.  When you look at the code for 
Collections, it's rife with examples where implementation by parts is 
likely to be your only compatible choice.  Further, this is something C# 
programmers complain about; you can't write a different implementation 
for value instantiations than for reference instantiations, and this is 
limiting.  Reference implementations have tools (nulls, synchronization 
on object parameters) that value implementations don't have, and value 
implementations have tools (known non-polymorphism) that reference 
implementations don't have.
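
As a rough illustration of how the available tools differ, the sketch below
writes the two flavors by hand as separate classes in today's Java. RefSlot
and IntSlot are hypothetical names; this is not the syntax proposed in the
draft, only the kind of divergence an "implementation by parts" mechanism
would let a single generic class express.

```java
import java.util.NoSuchElementException;
import java.util.Objects;

// Reference flavor: null is free to serve as the "empty" marker.
final class RefSlot<T> {
    private T value;                       // null means "empty"

    void set(T v) { value = Objects.requireNonNull(v); }
    boolean isEmpty() { return value == null; }
    T get() {
        if (value == null) throw new NoSuchElementException();
        return value;
    }
}

// int flavor: every bit pattern is a legal int, so emptiness needs its own flag.
final class IntSlot {
    private int value;
    private boolean present;

    void set(int v) { value = v; present = true; }
    boolean isEmpty() { return !present; }
    int get() {
        if (!present) throw new NoSuchElementException();
        return value;
    }
}
```

The reference version gets its emptiness check for free from null; the int
version has to carry extra state, and the two bodies cannot be produced from
one another by simple substitution of the type argument.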

> "Overriding specific instantiations of specialized classes" might have some
> merit, but the example of ArrayList<boolean> seems to be an incredibly poor
> choice to drive home the argument:
>
>    [...] the implementation should pack the elements so that every bool
>    only uses one bit of memory. This is widely considered a mistake.
>                                          –– Wikipedia on C++'s vector<bool>

Here's another example, then: the problem of "Maps from int to int" is a 
well-studied special case, with many clever implementation tricks that 
are not available to more general-purpose maps.
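
For a flavor of what such a special case can look like, here is a minimal
sketch of an int-to-int map using open addressing with linear probing over
primitive arrays. The class name IntIntMap, its methods, and the hash mixing
constant are choices made for this example, not anything taken from the draft
or from an existing library; the point is only that no boxing and no
per-entry node objects are needed.

```java
// Minimal open-addressing int-to-int map (linear probing), for illustration only.
final class IntIntMap {
    private int[] keys = new int[16];
    private int[] values = new int[16];
    private boolean[] used = new boolean[16];  // no int value can mark a slot as free
    private int size;

    // Index of the slot holding key, or of the free slot where it would go.
    private int slot(int key) {
        int mask = keys.length - 1;            // capacity is always a power of two
        int h = key * 0x9E3779B9;              // cheap multiplicative mixing
        int i = (h ^ (h >>> 16)) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;                // probe the next slot, wrapping around
        }
        return i;
    }

    void put(int key, int value) {
        if (2 * (size + 1) > keys.length) grow();  // keep the table at most half full
        int i = slot(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    boolean containsKey(int key) { return used[slot(key)]; }

    // The caller supplies the fallback, so zero is never overloaded as "absent".
    int getOrDefault(int key, int fallback) {
        int i = slot(key);
        return used[i] ? values[i] : fallback;
    }

    int size() { return size; }

    // Doubles the capacity and reinserts every existing entry.
    private void grow() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) put(oldKeys[i], oldValues[i]);
        }
    }
}
```

A general-purpose Map<K,V> cannot assume any of this about its keys and
values, which is why a dedicated implementation for this instantiation can be
attractive.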

But, your point is taken: "be careful of extrapolating from a small 
number of use cases."

> The "Peeling" Technique⁴:
> -------------------------
>
> To be honest, this feels like the draft is trying to fix one case of
> poor library design by adding a hugely complex language feature.

This is a fair concern; we've asked ourselves this multiple times.  (The 
problems we face with any-fying collections are consequences of the 
corners that had to be cut in order to generify collections in the first 
place -- should we really be beholden to decisions that were made in 
1997 after several major language evolutions?)  And the prospect of 
adding a language feature that we hope no one has to use is not 
something to be taken lightly.

But, we're also convinced that if our libraries have this problem, other 
libraries will too (and it's way more than one or two methods -- just 
try any-fying java.util.concurrent!).  So I don't think it's "one case of 
poor library design".  We'll know more once we have a more complete 
implementation and can do some corpus analysis on existing generic code.

> Especially from a user perspective, this is probably as complicated as
> the rest of specialized generics plus value types combined.

Yes, and herein is one of the most serious challenges of evolving a 
broad-spectrum language -- you can intend for a feature to be used only 
in certain situations or by certain users, but it is nearly impossible 
to make that guidance stick.

> Having just read it again ... really, this is so absurdly complex for
> basically trying to work around two poor API decisions. That's a huge
> chunk from the complexity budget which I'd either spend on something more
> worthwhile–or not spend at all.

I wish it was just two.  Perhaps a bet!  I give you $1000 now, and you 
pay me $500 for every library out there that can't any-fy cleanly.  I 
think you will fund my retirement...  :)

But, as I said, your point is by no means lost on us; we've asked 
ourselves this too, and will continue to do so.  Right now, we're in the 
"can we make it work at all" phase; if we can make it work at all, we 
can then put effort towards simplifying.



