Variants/case classes/algebraic data types/sums/oh my!

Sat Jun 11 17:25:35 UTC 2016

'Ello!

Responses inline.

On 2016-06-10T19:17:35 -0700
John Rose <john.r.rose at oracle.com> wrote:
>
> We are concentrating on stuff which is probably a prerequisite to doing a good job with ADTs.

Yep, that's good to know. On my end, at least, I would not want to
implement something that required all ADTish things to be value types,
although 90% of the declarations are likely to be value types.

My main concern was in implementing something that was actually
compatible with the value type work. Simple example: An ADTish
definition would probably have a "sealed abstract" root, but as I
understand it, value types must be final. It'd be pretty rough to
implement ADTs like that and then find out at the last minute "Oh, you
can't actually make them value types!".

  Can a concrete value class extend another value type? No. 
    (Because concrete value classes are final.) 
  Can a value class be abstract or non-final? No. 
    (Abstract value classes do not presently seem worthwhile.)

That seems like a fairly large conflict there...

> 
> I like your review (in adt-jvm) of existing attempts to do them on
> the JVM.
> 
> To me as a JVM geek the most interesting paragraph was this:
> 
> > The library implementations of algebraic types in Java all suffer
> > from the same weaknesses due to lack of support two concepts in
> > Java: The ability to make a set of classes closed, and the ability
> > to efficiently select one class from a closed set in O(1) time and
> > without allocating memory.  
> 
> That gives me an idea about how the JVM enhancements we are talking
> about might play out when ADTs are added.
> 
> For example, the proposed nestmate mechanism can be used to create
> small sealed class hierarchies.  You make the related classes
> nestmates, and mark their constructors private.  

I hadn't heard of nestmates, but a search turned up this:

http://mail.openjdk.java.net/pipermail/valhalla-spec-experts/2016-January/000060.html

That actually sounds like a much more general form of what I had in
mind, and I'm glad it exists!

To implement basic case analysis over a type T, I believe you only need
access to the immediate children of T. Actual structural pattern
matching can be implemented easily using nested case analysis.

In other words (again making up syntax on the spot, sorry!):

  class T { 
    class A extends T { }
    class B extends T { }
    class C extends T { 
      class D extends C { }
      class E extends C { }
    }
  }

  T x;
  switch (x) {
    case (A a): { ... }
    case (B b): { ... }
    case (C (D d)): { ... }
    case (C (E e)): { ... }
  }

  // would obviously compile down to something like:
  switch (x) {
    case (A a): { ... }
    case (B b): { ... }
    case (C c): {
      switch (c) {
        case (D d): { ... }
        case (E e): { ... }
      }
    }
  }

It's not clear to me from the nestmates proposal if children of a
NestTop can also be NestTops (I think this would be needed, or at least
another level of indirection would be needed to use the nestmates data
directly for implementing matching as shown above).

> We can also do things to seal interfaces, if we need to.  It's all a
> question of effort, priority, and resource.

Is there a lot of difference in implementation between sealing
interfaces and sealing classes? I don't know the mechanics. Sealed
interfaces are definitely something I'd like as they obviously enable
the very powerful notion of being able to select types from
disjoint categories (making up syntax on the spot):

  sealed interface T { }
  sealed interface U { }

  class A implements T { }
  class B implements T { }
  class C implements T, U { }
  class D implements U { }

  T x;
  switch (x) {
    case (A a): { ... }
    case (B b): { ... }
    case (C c): { ... }
    // case (D d):  *compile time error*
  }

  U x;
  switch (x) {
    // case (A a): *compile time error*
    // case (B b): *compile time error*
    case (C c): { ... }
    case (D d): { ... }
  }

This is something that Scala already implements, so it's at least
possible without JVM support (if you don't care that your concept of
being sealed doesn't make it into other languages that consume your
types), although I've no idea if there are soundness issues as it's
been a couple of years since I've touched the language.

> The O(1) thing is tricky to evaluate.  I think the folks who use enums as kind-tags are on the right path there, since enums allow both separate compilation and efficient (O(1)) switching.

Yes, I think so. I was leaning towards basically copying the
implementation of enums, until you mentioned nestmates...

> Elsewhere (note 4) you mention heap allocation as a design constraint, and it's a big one for Java API designs, and even bigger for language feature design.  The problem of allocation is one reason we are working on value types first, because successful value types will allow programmers and library designers to say what they mean with values and not worry about GC load.
> 
> In general, Valhalla is about reducing the number of pointers and objects, in favor of richer APIs involving primitives and inlined structures (values).

It does sound quite wonderful. This actually came up in conversation
with someone recently, but do you think that lambdas will benefit from
this work? It seems that with the combination of escape analysis and
value types, you could stack allocate closures.

I've spent a good part of two years working on two open-source 3D
renderers written in Java. A lot of effort was spent on the second
implementation transforming all capturing lambdas to non-capturing
forms by threading explicit context parameters through the types. This
obviously doesn't work too well if you're trying to use other APIs that
use lambdas and that don't provide a means to thread a context value.

  http://io7m.github.io/r1/
  https://github.com/io7m/r2

It would be great if I could move back to capturing lambdas, simplify
the API, and not worry about the volume of heap allocations occurred by
using lambdas (N * 60) times a second!

> Also, do keep thinking about specific, low-level, generic JVM mechanisms that could support ADTs.  By "low-level" I mean defined in terms of JVM operations, not language features (of any language), and by "generic" I mean JVM facilities that are logically complete and consistent without reference to a particular application (such as ADTs).  For example, sealed JVM types should be definable without reference to ADTs, yet be useful for ADTs and several other use cases.

Trust me, I never stop thinking about ADTs. ;)

M