Notes about Valhalla from a non-Java perspective
Simon Ochsenreither
simon at ochsenreither.de
Tue Sep 30 15:10:22 UTC 2014
Hi,
it seems that no one from the Scala team found enough time/interest to
participate on valhalla-dev (Martin mentioned that most of the Scala
team is on holidays/in a move/in recovery) yet ... I would have
preferred to not get involved (bad impressions from other OpenJDK
lists), but I'll just post my personal notes (which I sent to
scala-internals a few weeks earlier) here directly before they fall
completely out of date.
Please note that it's written from a Scala development perspective, so
"we" == "arbitrary group of Scala devs/users".
Here are my (slightly improved) notes:
==============
* On the general approach of using class file attributes with tuples
of (bytecode index -> type)
It's kind of funny, because that's exactly the approach I took
almost 18 months earlier when I was thinking about this topic. I
considered that to be quite a hack at that time and thought "if the
JVM ever gets this, they will surely come up with a more principled
way of doing this".
* vcmp
This feels a bit ad-hoc currently. I think it would be considerably
more useful if they tried to come up with a design which would work
across all types, and not yet-another special case.
Scala's == implementation for instance is around 100 lines of code
with dozens of branches to work around Java's/the JVM's idea of
equality. It would be nice if, when adding another
comparison operator, they didn't repeat the mistakes of equals
& friends.
They could be on the right track, but it's hard to tell without looking at
it more closely.
My suggestion would be to have a notion of "primitive" equality
which is defined as "do the most basic comparison available":
Compare the bits of the underlying value, which means ...
o the bits of value types, taking types into account
o the bits of the reference for reference types
o wrappers are unboxed
o Double.NaN is equal to itself, 0.0 and -0.0 are not equal
This would pretty much be in line with our earlier debate about
supporting eq on value types. If this could be encoded as a single
vcmp operation, it would be a huge win.
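To make the "primitive" rules above concrete, here is a rough Java sketch of what such a comparison would do for doubles, boxed doubles and references. The class and method names (PrimitiveEquality, bitsEqual) are made up for illustration and only Double is handled among the wrappers; this is not how vcmp is specified in the draft.

```java
final class PrimitiveEquality {
    // Hypothetical helper: compares two doubles by their raw bit patterns.
    // NaN values with different payload bits would compare as unequal here.
    static boolean bitsEqual(double a, double b) {
        return Double.doubleToRawLongBits(a) == Double.doubleToRawLongBits(b);
    }

    // Hypothetical helper: boxed doubles are unboxed first,
    // everything else is compared by reference identity.
    static boolean bitsEqual(Object a, Object b) {
        if (a instanceof Double && b instanceof Double) {
            return bitsEqual(((Double) a).doubleValue(), ((Double) b).doubleValue());
        }
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(bitsEqual(Double.NaN, Double.NaN)); // true
        System.out.println(bitsEqual(0.0, -0.0));              // false
        // Java's built-in == gives the opposite answers:
        System.out.println(Double.NaN == Double.NaN);          // false
        System.out.println(0.0 == -0.0);                       // true
    }
}
```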
Additionally, one could also consider a corresponding operation for
"semantic" equality:
o Use "equals" implementation for value types
o Use Java-style "==" for primitives
o Wrappers are unboxed
o Use "equals" implementation for references
o Double.NaN is not equal to itself, 0.0 and -0.0 are equal
o (Optionally) Compare arrays by comparing their elements
That would vastly simplify our == implementation, but that's not really
the point ... it would be possible to do that, but I think priority
should be to prevent vcmp from being either artificially limited to
value types or ending up incoherent like the rest of equality stuff in Java.
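For comparison, the "semantic" rules listed above could be sketched in Java roughly as follows. SemanticEquality and semanticEquals are hypothetical names, only Double is unboxed, and only Object[] arrays are compared element-wise, so treat this as an illustration of the intended behaviour rather than a proposal for the actual instruction.

```java
final class SemanticEquality {
    // Hypothetical helper implementing the "semantic" rules for a few cases.
    static boolean semanticEquals(Object a, Object b) {
        if (a == null || b == null) {
            return a == b;
        }
        // Wrappers are unboxed and compared with Java-style ==,
        // so NaN != NaN and 0.0 == -0.0.
        if (a instanceof Double && b instanceof Double) {
            return ((Double) a).doubleValue() == ((Double) b).doubleValue();
        }
        // Optional rule: arrays are compared element by element.
        if (a instanceof Object[] && b instanceof Object[]) {
            Object[] xs = (Object[]) a;
            Object[] ys = (Object[]) b;
            if (xs.length != ys.length) {
                return false;
            }
            for (int i = 0; i < xs.length; i++) {
                if (!semanticEquals(xs[i], ys[i])) {
                    return false;
                }
            }
            return true;
        }
        // Everything else uses its own equals implementation.
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(semanticEquals(Double.NaN, Double.NaN)); // false
        System.out.println(semanticEquals(0.0, -0.0));              // true
        // Double.equals disagrees with == on both counts:
        System.out.println(Double.valueOf(Double.NaN).equals(Double.NaN)); // true
        System.out.println(Double.valueOf(0.0).equals(-0.0));              // false
    }
}
```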
In the end, I think how to deal with primitive wrappers is still
uncharted territory. Retrofitting those wrapper classes as value boxes
very likely won't work, so maybe there is more specification required on
how "T=Integer" for "any T" is treated (or int in need of a box in
specialized code).
* No reification for reference types (Reified in the sense of "the
type is available at runtime", not "it gets a specialized class".)
I'm split on this. On the one hand, this could give us the escape
hatch for types not expressible with the new Generics, but on the
other hand it would really suck because it would mean we couldn't
just drop ClassTags altogether, but would need to drag them around
for every type even if only references would actually need them.
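For readers less familiar with the ClassTag issue, the closest Java analogue is passing an explicit Class<T> token wherever the erased type is needed at runtime. The sketch below uses made-up names (ErasureDemo, newArray) and is only meant to show why such tokens have to be dragged around as long as reference types stay erased.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.List;

final class ErasureDemo {
    // With erasure, T is not known at runtime, so the caller has to hand in
    // a token explicitly (roughly the job a ClassTag does in Scala).
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> token, int length) {
        return (T[]) Array.newInstance(token, length);
    }

    public static void main(String[] args) {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        // Both lists share one runtime class; the type argument is gone.
        System.out.println(strings.getClass() == ints.getClass()); // true

        String[] arr = newArray(String.class, 3);
        System.out.println(arr.getClass() == String[].class);      // true
    }
}
```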
Additionally, the split between value types/reference types is very
likely not similar to the split we would need to erase Scala's types
to the JVM's level of expressiveness.
It feels like they are again trying to cut corners here, trading
implementation ease for additional language complexity.
(As far as I remember, there were some concerns about what happens
with Java's .class/getClass() when reference types are reified ...
but imho, Java problems should stay Java problems and shouldn't make
the JVM approach worse than necessary.) Also note how this fits
with my work on making classOf[T] ClassTag-aware.
Additionally, we already have an escape hatch with the existing
erased Generics, so having yet-another different style of Generics
doesn't feel like the right way ... which brings us to the next point:
* Reified/Erased Generics interop
This seems to be a really dark corner. The draft is pretty silent on
this. It currently looks like you can't have a type hierarchy where A
is erased at the top but reified in a subclass (or the other way
around) ... I think this will make it very hard to use erased
Generics as an escape hatch. Combined with the earlier point, I'd
prefer better reified/erased interop and having reified reference
types with reified Generics.
Otherwise, it feels like tons of people will try to wrap random
reference types in value types to get around that limitation.
* No variance for value types
This is a big conceptual problem with heterogeneous translation and
my personal conclusion 1.5 years ago was the same: It can't really
be done without a huge explosion in runtime complexity.
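Java arrays already show the same effect on a small scale, so they can serve as an analogy (this is not Valhalla code, and VarianceDemo/sum are made-up names): the boxed representation takes part in subtyping, while the flat primitive representation does not.

```java
final class VarianceDemo {
    // Accepts any array whose element type is a subtype of Number.
    static double sum(Number[] xs) {
        double total = 0;
        for (Number x : xs) {
            total += x.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Integer[] boxed = {1, 2, 3};
        int[] flat = {1, 2, 3};

        // Integer[] is a subtype of Number[], so this is accepted.
        System.out.println(sum(boxed)); // 6.0

        // sum(flat) would not compile: int[] has a different layout and
        // is not a subtype of Number[] or even Object[].
        Object o = flat;
        System.out.println(o instanceof Object[]); // false
    }
}
```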
* Poor man's typeclasses
There was a short mention of "conditional methods", e.g. enabling
methods only for a subset of some generic type. That smells a lot
like a crippled version of typeclasses. Might be useful to watch
what happens in that space.
* Members on any T
There seems to be a debate about whether/how "any T" should expose
methods by itself. While this seems to be nicely in line with what a
lot of people in Scala would like to do, I think Java will not be
able to properly add constraints to T to describe required methods.
They currently only have upper bounds to express those things, but
even in Scala where bounds are less horrible than in Java, people
have pretty much abandoned bounds based on subtyping in favor of
context bounds.
So even if Any/T didn't get any members, I think we should be aware
that Java's options for adding constraints are very, very primitive
and probably not what we would want to use in Scala. Let's take that
into account before arguing for or against this ...
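To illustrate the two constraint styles, here is a small Java sketch with made-up names (ConstraintStyles, maxBounded, maxWith). The first method uses an upper bound, which is all Java offers today; the second passes the required operations as a separate value, which is roughly what a context bound desugars to in Scala.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ConstraintStyles {
    // Subtype bound: T itself has to implement Comparable, and there can be
    // only one such ordering per type.
    static <T extends Comparable<T>> T maxBounded(List<T> xs) {
        T best = xs.get(0);
        for (T x : xs) {
            if (x.compareTo(best) > 0) {
                best = x;
            }
        }
        return best;
    }

    // Evidence passing: the ordering is supplied from outside, so T needs no
    // members at all and several orderings can coexist.
    static <T> T maxWith(List<T> xs, Comparator<T> ord) {
        T best = xs.get(0);
        for (T x : xs) {
            if (ord.compare(x, best) > 0) {
                best = x;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana");
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        System.out.println(maxBounded(words));        // pear
        System.out.println(maxWith(words, byLength)); // banana
    }
}
```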
My conclusion regarding Scala:
* Exposing two kinds of Generics to the user is highly undesirable
from a language design POV. F# did the same with a much better
starting point than Java (well-designed runtime support for reified
Generics) and even for them I think the increase of complexity just
wasn't worth the benefit. Java might not have a choice here, but I
think we should make sure to not leak these "implementation details"
to the user. We will have to support both kinds of Generics for
interop anyway.
* We should think about how we would like to see T/Any and eq/==
evolve in the long-term and communicate that clearly to the Valhalla
people. It's probably much more painful if they e.g. decide to have
no members on any T, but decide on an incompatible constraint
mechanism (which would very likely be reified in the bytecode). That
way, keeping the status quo and implementing member-free Any/T as a
scalac fiction might work better.
* We should start preparing for the variance/value type breakage.
People who need variance can migrate to boxed representations easily
(replace Vector[Int] with Vector[Integer] for instance), but trying
to keep things as-is would mean just not supporting JVM value types,
and I think that would be a terrible decision. From a type system
POV, instantiating a type parameter with a value type could be seen
as collapsing all the bounds of that type parameter to make it invariant.
* We should really think about having a scalac prototype which tries
to emit the new class file format and leverage the new semantics.
There is already an ASM fork out there with preliminary support, but
I don't know how stable/complete that code is.
Experience has shown (JRuby) that this is the most effective way to
actually influence the design.
That's just a short overview; I can expand on these topics as
necessary, but I expect that people have read the draft proposal and the
complete mailing list already so that everyone is on roughly the same level.
==============
I hope this is helpful.
Thanks,
Simon