Variants/case classes/algebraic data types/sums/oh my!
John Rose
john.r.rose at oracle.com
Sun Jun 12 01:34:48 UTC 2016
On Jun 11, 2016, at 5:12 PM, forax at univ-mlv.fr wrote:
>
>
> From: "John Rose" <john.r.rose at oracle.com>
> To: "Rémi Forax" <forax at univ-mlv.fr>
> Cc: "org openjdk" <org.openjdk at io7m.com>, valhalla-dev at openjdk.java.net
> Sent: Sunday, June 12, 2016 01:02:52
> Subject: Re: Variants/case classes/algebraic data types/sums/oh my!
> Hi John,
>
>>> On Jun 11, 2016, at 4:18 AM, Remi Forax <forax at univ-mlv.fr> wrote:
>>>
>>> because K is reified, there will be two methods eval() at runtime, one specialized for K = 0 and another for K = 1,
>>> which means that the switch will be constant-folded (the real 'switch' is done when selecting the specialization).
>>>
>> That is an interesting use case for non-type class template parameters!
>>
> Here is another one,
> class ArrayList<any E> {
>   final E[] array;
>   int size;
>
>   public <Consumer<? super E> U> void forEach(U consumer) {
>     for (int i = 0; i < size; i++) {
>       consumer.accept(array[i]);
>     }
>   }
> }
I think there are two simple ways to interpret your suggestion. One is to split and specialize forEach by concrete consumer *type*, and one is to specialize by consumer *instance*.
public <__SpecializePerType U extends Consumer<? super E>> void forEach(U consumer) {
  for (int i = 0; i < size; i++) {
    consumer.accept(array[i]);
  }
}

public <__SpecializePerInstanceOf U extends Consumer<? super E>> void forEach(U consumer) {
  for (int i = 0; i < size; i++) {
    consumer.accept(array[i]);
  }
}
> In the snippet above, forEach is specialized by the value of the lambda taken as argument, which is another way to say that the code of forEach and the code of the lambda should be inlined together.
Yes; function parameters to templates are an important organizing feature for C++ STL algorithms. The sort method is also a good candidate for such treatment. C++ expresses per-type specialization using type parameters and per-instance specialization using non-type parameters.
The general problem here is what I call the loop customization problem, which is to code loops once and compile them into enough specialized versions to unlock a desirable level of optimization. The JVM ought to create and maintain a library of specializations of forEach (or sort), one for each significantly different "loop shape syndrome" determined by a specific consumer (or comparator, for sort).
A syndrome must specify all the information needed to produce well optimized native code for the algorithm loop, but it should not over-specify details not needed to optimize the loop. For example, the consumer may be packing results into an array; the syndrome should express that fact, but not over-specify by working only for one specific array instance. Should the syndrome specify the exact type of the array? That's an open question.
What if the consumer is packing away only those elements that match some filter predicate? In that case the filter predicate is part of the syndrome too, since the loop needs to open-code it. What if it's a bounds-checking predicate; should the exact bounds be part of the syndrome? Probably not, but that's an open question too. Or, if the collection has a non-unit stride (into the sink array), should the stride be syndrome or not? Again, more open questions, to be solved by a mixture of static guidance from the programmer and online feedback from profiles and the type hierarchy.
You can see how streams intensify the loop customization problem by mixing together syndrome and non-syndrome data into a single stream; the terminal operation needs to sort it out and select from a repertoire of efficient loop bodies. This is not soluble by lots of inlining (and so it's not exactly what some call the "inlining problem"), because the control flow might spread across multiple threads (for parallel streams).
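To make the syndrome/non-syndrome split concrete, here is a small sketch of my own (the names and framing are mine, not from the thread): the pipeline *shape*, filter-then-sum, is syndrome-like and deserves a customized loop, while the bound captured by the predicate is plain data the loop need not bake in.

```java
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

public class Syndromes {
    // The pipeline shape (filter, then sum) is the syndrome; the bound
    // captured by the lambda is ordinary data that can vary per call
    // without forcing a new specialization of the loop.
    static int sumBelow(int[] data, int bound) {
        IntPredicate p = x -> x < bound;   // bound is captured, not part of the shape
        return IntStream.of(data).filter(p).sum();
    }

    public static void main(String[] args) {
        System.out.println(sumBelow(new int[]{1, 5, 9, 3}, 6)); // prints 9
    }
}
```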
If you use the C++ approach, each function is a distinct syndrome triggering a distinct specialization of the algorithm. I think the STL libraries are organized carefully so that template parameters are *always* used as syndromes, and normal parameters are *never* used as syndromes. This makes sense, but the cost is a static decision between what's shared and what's split for optimization, with a bias towards splitting and code bloat. But in the JVM we like to give the runtime environment a share in decisions about sharing vs. splitting for optimization, so it's a more delicate question. The JVM likes to start with code sharing (even interpretation) and then "ramp up" splitting and optimization where there is clear evidence that the investment makes sense. We get to good loops even though there is no static splitting or optimization.
So in the case of List.forEach either formulation of specialization (by type, or by instance) is only an imperfect starting point, if our goal is to get the best loops.
In our current setup, we get per-type specialization *if* the provided consumer is a value type. If it's a legacy reference type, we will use erasure. Perhaps the closest we can get in today's draft design space is this:
public <val U extends Consumer<? super E>> void forEach(U consumer) {
  for (int i = 0; i < size; i++) {
    consumer.accept(array[i]);
  }
}
This would seem to say, "I need a value type which implements the given interface U, please, and I want you to create a specialization for each such type." (To be clear: Such a use of "val" is only a suggestion, not any plan of record. But it does fall out of some of our intensive conversations last month, in which it appears that interfaces may have "val", "ref", and "any" specializations, with respect to their "this" type.)
In that case, the JVM has to activate some sort of library mechanism for maintaining specializations of forEach, which is good. Initially, there will be a separate specialization only per *type* of consumer, which will probably capture the presence of a filter predicate (if there is one) but not the nature of the predicate. So this goes a certain way towards defining a syndrome that will solve the loop customization problem, but not all the way.
>> (The C++ world is full of non-type template parameters; the question there is which small slice of that complexity we might want to adopt.)
>
> For now, I would say 'try to come up with a specialization mechanism as general as possible' and see if it's possible,
> a user defined SpecializeDynamic being the ultimate golden hammer (Mjölnir ?).
> All the specializations by a value can be implemented by doing constant pool patching (as is done by unsafe.defineAnonymousClass),
> but here we don't want the anonymous part (and we don't need the host class access, because it's superseded by the nestmate idea).
>
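As an aside (this example is mine, not from the thread): today's java.lang.invoke API can approximate specialization-by-value by binding one particular instance into a method handle, which lets the JIT treat the receiver as a constant and inline through it, much as constant pool patching does for an anonymous class. The helper name specializeFor is hypothetical.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.IntUnaryOperator;

public class Specialize {
    // Hypothetical helper (my name, not a real API): bake one particular
    // operator instance into a method handle. Once the receiver is bound,
    // the JIT can treat it as a constant and inline through it.
    static MethodHandle specializeFor(IntUnaryOperator op) throws ReflectiveOperationException {
        MethodHandle apply = MethodHandles.lookup().findVirtual(
                IntUnaryOperator.class, "applyAsInt",
                MethodType.methodType(int.class, int.class));
        return apply.bindTo(op);   // now a handle of type (int)int with op baked in
    }

    static int demo() throws Throwable {
        MethodHandle doubler = specializeFor(x -> x * 2);
        return (int) doubler.invokeExact(21);
    }

    public static void main(String[] args) throws Throwable {
        System.out.println(demo()); // prints 42
    }
}
```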
>> Even without int-typed or enum-typed template parameters, you could do what you need for ADTs by using enums and a sealed top type.
>>
>> Here's a variation that works today:
>> ...
> yes, but it's less valhalla-ish, no? :)
Totally. But it's also a heuristic for better code shapes in the Valhalla future.
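For concreteness, here is a sketch of my own of the kind of works-today pattern under discussion: a top type sealed by a private constructor, an enum tag, and retro external decoding. (The names Expr, Value, and Add follow the thread; the rest is illustrative, not the elided code from the earlier message.)

```java
// A sealed-ish top class: the private constructor means only the
// nested subclasses can extend it, and an enum tag drives decoding.
public abstract class Expr {
    public enum Kind { VALUE, ADD }

    private Expr() {}              // seals against outside subclassing
    public abstract Kind kind();

    public static final class Value extends Expr {
        public final int v;
        public Value(int v) { this.v = v; }
        public Kind kind() { return Kind.VALUE; }
    }

    public static final class Add extends Expr {
        public final Expr left, right;
        public Add(Expr left, Expr right) { this.left = left; this.right = right; }
        public Kind kind() { return Kind.ADD; }
    }

    // Retro external decoding: switch on the tag, cast, recurse.
    public static int eval(Expr e) {
        switch (e.kind()) {
            case VALUE: return ((Value) e).v;
            case ADD:   Add a = (Add) e; return eval(a.left) + eval(a.right);
            default:    throw new AssertionError();
        }
    }
}
```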
>> We will want to give users a way to code such things with values also. But since values cannot take part in subtype relationships with abstract classes, the pattern above would have to be reformulated to use wildcards or interfaces. (Abstract classes are pretty old-fashioned anyway.)
> since we have default methods in interfaces now, designing with a public abstract class is an error; an abstract class is a class, so it leaks implementation details.
Yep. An abstract class is not abstract enough, and interfaces are getting concrete enough, by stealing features from abstract classes. Which is why I am toying with the idea of putting pseudo-constructors in interfaces, in order to steal away the ability of a supertype to control its subtypes by restricting access to its "constructor". Why can't an interface have a nullary no-op constructor, just like Object? (We can make it abstract to avoid questions of behavior.) Anyway, abstract classes can seal their subtypes, and that's what I want to steal away from them.
>> With wildcards, you could have Expr<Value>, Expr<Add>, and Expr<?>, where Value and Add would be contained in Expr instead of deriving from Expr; all would be values.
>
> being an Expr instead of being a subtype of Expr is important, but I don't know if it's just for type safety; if so, it's a problem for the compiler and not the runtime.
Expr becomes a box for either a Value or an Add (but not both, and not anything else). That models the categorical mathematics well enough, and everything else is sugar. (From a certain distance.)
>
>> With interfaces, you'd have Value, Add, and Expr, where Expr is an interface and the others are values.
>> ...
>> In all of the sample code above, the use of enums and accessors, instead of lambdas and visitors, is an old-fashioned "retro" style. It is exactly analogous to the external iterators of early Java, as opposed to the stream-shaped internal iterators of Java 8. I like to work with the retro external decoding style because it leads to simple bytecode shapes, and can be sugared pretty well (cf. Java's extended-for).
>>
> For a human, in terms of Java code, internal iteration is easier to write (at least until we have some way to define a coroutine or a generator), but it's harder for HotSpot.
Yes. That's why extended-for is such a brilliant bit of sugar. Codes like a catamorphism, JITs like Fortran. (I also look forward to for-comprehensions and "extended-do" some day, but sequential-code monads come after streams, values, and ADTs.)
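As a reminder of the sugar in question (my example, not from the thread): extended-for reads like internal iteration, but javac desugars it to exactly the retro external-iterator loop.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class ExtendedFor {
    static int sum(List<Integer> xs) {
        // Extended-for: reads almost like internal iteration...
        int total = 0;
        for (int x : xs) total += x;

        // ...but javac desugars it to the retro external-iterator loop:
        int total2 = 0;
        for (Iterator<Integer> it = xs.iterator(); it.hasNext(); ) {
            total2 += it.next();
        }

        if (total != total2) throw new AssertionError();  // same result, same loop
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1, 2, 3, 4))); // prints 10
    }
}
```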
> I want to see that as an implementation issue more than a fundamental rule. If Hotspot were able not to share the profiles between different inlined blobs of the same code by C1, then C2 would have fewer polymorphic/megamorphic call sites to deal with (I believe that neither TurboFan (V8/Chrome) nor FTL (Nitro/Safari) has that issue).
That will help, but IMO split profiles are not a full solution, not even an 80% solution. It's too automagic, depending on obscure differences between the inlining policies of the tiers. The problem is that the programmer can't easily predict and control the splits. It will work for demos and benchmarks, but when it fails there's no recourse. We need some way to (gently, provisionally) distinguish syndrome from non-syndrome, and mark the points where syndromes have to be baked into customized loops.
>> But in the end we might want to use a more modern lambda-driven style (unapply or some other "morphism"). The main design problem with internal iterators or matchers is they tend to require lots of auxiliary types to express the functional APIs.
> but fewer functional types means a way to define structural types, no?
I guess I want to say I think we should reach for both. A "syndrome" as I've been describing it is really a bit of structural typing, and it can "hide" under a simple-looking function or stream. It's like syntax sugar for loops that the JIT expands into simple code.
— John