Variants/case classes/algebraic data types/sums/oh my!

forax at univ-mlv.fr forax at univ-mlv.fr
Sun Jun 12 00:12:24 UTC 2016


----- Mail original -----

> De: "John Rose" <john.r.rose at oracle.com>
> À: "Rémi Forax" <forax at univ-mlv.fr>
> Cc: "org openjdk" <org.openjdk at io7m.com>, valhalla-dev at openjdk.java.net
> Envoyé: Dimanche 12 Juin 2016 01:02:52
> Objet: Re: Variants/case classes/algebraic data types/sums/oh my!
Hi John, 

> On Jun 11, 2016, at 4:18 AM, Remi Forax < forax at univ-mlv.fr > wrote:

> > because K is reified, there will be two methods eval() at runtime, one
> > specialized for K = 0 and an other for K = 1,
> 
> > which means that the switch will be constant fold (the real 'switch' is
> > done
> > when selecting the specialization).
> 

> That is an interesting use case for non-type class template parameters!

Here is another one, 
class ArrayList<any E> { 
final E[] array; 
int size; 

public <Consumer<? super E> U> void forEach(U consumer) { 
for(int i = 0; i < size; i++) { 
consumer.accept(array[i]); 
} 
} 
} 

In the snippet above, forEach is specialized by the value of the lambda taken as argument, which is another way to say that he code of forEach and the code of the lambda should be inlined together. 

> (The C++ world is full of non-type template parameters; the question there is
> which small slice of that complexity we might want to adopt.)

For now, i would say 'try to come with a specialization mechanism as general as possible' and see if it's possible, 
a user defined SpecializeDynamic being the ultimate golden hammer (Mjölnir ?). 
All the specializations by a value can be implemented by doing constant pool patching (as this is done by unsafe.defineAnonymousClass), 
but here we don't want the anonymous part (and we don't need the host class access because it's superseded by the nestmate idea). 

> Even without int-typed or enum-typed template parameters, you could do what
> you need for ADTs by using enums and a sealed top type.

> Here's a variation that works today:

> abstract class Expr {
> private Expr() {} // sealed hierarchy

> enum $Kind { k$Value, k$Add };
> abstract $Kind $kind(); // direct dispatch hook

> static final class Value extends Expr {
> $Kind $kind() { return $Kind.k$Value; }
> final int value;
> Value(int value) { this.value = value; }
> }

> static final class Add extends Expr {
> $Kind $kind() { return $Kind.k$Add; }
> final Expr left;
> final Expr right;
> Add(Expr left, Expr right) { this.left = left; this.right = right; }
> }

> }

> class Client {
> static int eval(Expr expr) {
> switch(expr.$kind()) {
> case k$Value:
> return ((Expr.Value)expr).value;
> case k$Add:
> Expr.Add add = (Expr.Add)expr;
> return eval(add.left) + eval(add.right);
> default:
> throw new AssertionError();
> }
> }
> public static void main(String... av) {
> System.out.println(eval(new Expr.Add(new Expr.Value(40), new
> Expr.Value(2)))); // => 42
> }
> }

> /* Does not compile:
> Expr bad() {
> class Bad extends Expr { // error: Expr() has private access in Expr
> Expr.$Kind $kind() { return Expr.$Kind.k$Value; }
> }
> return new Bad();
> }
> */

yes, but it's less valhalla-ish no :) 

> We will want to give users a way to code such things with values also. But
> since values can cannot take part in subtype relations abstract classes, the
> pattern above would have to be reformulated to use wildcards or interfaces.
> (Abstract classes are pretty old fashioned anyway.)

since we have default method in interface now, designing with a public abstract class is an error, an abstract class is a class so it leaks implementation details. 

> With wildcards, you could have Expr<Value>, Expr<Add>, and Expr<?>, where
> Value and Add would be contained in Expr instead of deriving from Expr; all
> would be values.

being an Expr instead of being a subtype of an Expr is important, but i don't know if it's not just for type safety, so it can be a problem for the compiler and not the runtime. 

> With interfaces, you'd have Value, Add, and Expr, where Expr is an interface
> and the others are values.

> interface Expr {

> enum $Kind { k$Value, k$Add };
> $Kind $kind(); // direct dispatch hook
> }

> value class Value implements Expr {
> $Kind $kind() { return $Kind.k$Value; }
> final int value;
> Value(int value) { this.value = value; }
> }

> value class Add implements Expr {
> $Kind $kind() { return $Kind.k$Add; }
> final Expr left;
> final Expr right;
> Add(Expr left, Expr right) { this.left = left; this.right = right; }
> }

> In either case you'd want a way to seal the top type against unwanted
> subtypes (specializations or implementations). Here's a straw man for the
> interface version. I have zero desire to discuss source code syntax, and a
> range of options in mind for bytecode syntaxes. A bytecode attribute would
> be straightforward, but a specially marked pseudo-constructor is also a
> possibility.

> __Sealed({Value, Add}) interface Expr { // sealed hierarchy
> __Sealed({Value, Add}) Expr() {} // empty pseudo-constructor accessible only
> to two subtypes
>> }

> In all of the sample code above, the use of enums and accessors, instead of
> lambdas and visitors, is a old fashioned "retro" style. It is exactly
> analogous to the external iterators of early Java instead of the
> stream-shaped internal iterators of Java 8. I like to work with the retro
> external decoding style because it leads to simple bytecode shapes, and can
> be sugared pretty well (cf. Java extended-for).

For a human, in term of Java code, the internal iteration is easier to write (at least until we have some ways to define a coroutine or a generator), but it's harder for Hotspot. 
I want to see that as an implementation issue more than a fundamental rule.If Hotspot was able to not share the profiles between different inline blobs of the same code by c1, then c2 will have less polymorphic/megamorphic callsites to deal with (i beleive that both TurboFan(v8/chrome) and FTL(nitro/safari) doesn't have that issue). 

> But in the end we might want to use a more modern lambda-driven style
> (unapply or some other "morphism"). The main design problem with internal
> iterators or matchers is they tend to require lots of auxiliary types to
> express the functional APIs.

but less functional types means a way to define structural types, no ? 

> — John

Rémi 



More information about the valhalla-dev mailing list