Feedback on Sealed Types

Mon Apr 29 21:01:17 UTC 2019

It would be nice if we could "just" overload enum itself to support a 
record-like option:

     enum Node {
         AddNode(Node a, Node b),
         MulNode(Node a, Node b),
         ...;

         ...
     }

but unfortunately that syntax looks confusingly close to something else 
:(  It also doesn't really scale to multi-level hierarchies, though that 
might be OK.

Its worth thinking about, though.  The `data` construct from Haskell is 
surely more direct than modeling a sum with sealed interfaces (though, 
the latter is also more flexible than `data`.)

On the other hand, declaring a sum of records as:

     sealed interface I {
         record A(int a) implements I { }
         record B(long b) implements I { }
         record C(String c) implements I { }
     }

isn't so bad.  The main redundancy here is primarily "implements I", and 
secondarily "record".  We can surely compress that away like this:

     enum interface I {
         A(int a),
         B(long b),
         C(String c);

         // I methods
     }

but I am not sure it carries its weight, given as it adds little 
additional concision (and no additional semantics.)

On 4/29/2019 4:33 PM, Alan Malloy wrote:
> Thanks, Brian. I indeed didn't think of some of your proposed benefits 
> of sealing non-sum types, as I was focused mostly on things you 
> mentioned explicitly in the JEP, which is somewhat light on the 
> expected benefits.
>
> I think the first two items in your "challenges" solve each other: I 
> don't intend sum types to be the only kind of sealed type, but just a 
> good way to declare the simplest kind. I left out the "record" keyword 
> from the declaration with the idea that it would be implicit: if you 
> want the convenient sum-of-products declaration style, you have to use 
> records. If you want something more complicated, you declare a sealed 
> interface (or superclass), and N permitted subclasses, declared 
> separately in whatever way you want. This restriction helps by making 
> the semantics clearer, and I had also hoped that it would lead to a 
> syntax error if you leave out the comma. Looking more closely, I see 
> this is somewhat precarious: a record declaration looks enough like a 
> method signature that they may be ambiguous in an interface, if you 
> don't require the "record" keyword, or if you use a semicolon instead 
> of a comma. I think it can still work if we require each nested record 
> to use {...} instead of ; even if it's empty. This way, your two 
> examples look like
>
>     interface X {
>         class X1 { … }
>         class X2 { … }
>     }
>
> and
>
>     enumerated interface Y {
>         Y1 { … }, <— don’t forget this comma!
>         Y2 { … }
>     }
>
> The latter would become illegal if you dropped the comma, even if you 
> also forgot the "enumerated" keyword, because the braces make no sense 
> in an ordinary interface.
>
> On Mon, Apr 29, 2019 at 12:35 PM Brian Goetz <brian.goetz at oracle.com 
> <mailto:brian.goetz at oracle.com>> wrote:
>
>     Thanks Alan, for this nice exploration.  There’s a lot to respond
>     to.  I’ll start with some general comments about sealing, and then
>     move on to your alternate proposal for exposing it.
>
>     I can think of several main reasons why you would want to seal a
>     hierarchy.
>
>      - To say something about the _interface itself_. That is, that it
>     was not designed as a general purpose contract between arms-length
>     entities, but that it exists only as a common super type for a
>     fixed set of classes that are part of the library.  In other
>     words, “please don’t implement me.”
>
>      - To say something about the semantics of the type. Several of
>     the examples in your report fall into this category: “a DbResult
>     is either a NoRowsFound or a Rows(List<Row>)”.  This tells users
>     exactly what the reasonable range of results to expect are when
>     doing a query.  Of course, the spec could say the same thing, but
>     that involves reading and interpreting the spec. Easier if this
>     conclusion can be driven by types (and IDEs can help more here too.)
>
>      - To strengthen typing by simulating unions.  If my method is
>     going to return either a String or a Number, the common super type
>     is Object.  (Actually, it’s some variant of Serializable &
>     Comparable<? extends….>.). Sums-of-products allow library authors
>     to make a stronger statement about types in the presence of
>     unions.  Exposing a sum of StringHolder(String) and
>     NumberHolder(Number), using records and sealed types, is not so
>     ceremonious, so some library developers might choose to do this
>     instead of Object.
>
>      - Security.  Some libraries may want to know that the code they
>     are calling is part of their library, rather than an arbitrary
>     implementation of some interface.
>
>      - To aid in exhaustiveness.  We’ve already discussed this at
>     length; your point is that this one doesn’t come up as often as
>     one might hope.
>
>     Not only is there an obvious synergy between sums and products (as
>     many languages have demonstrated), but there is a third factor,
>     which is “if you make it easy enough, people will use it more.”
>      Clearly records are pretty easy to use; your point is that if
>     there were a more streamlined sum-of-products idiom, the third
>     factor would be even stronger here.  I think algebraic data types
>     is one of those things that will take some time for developers to
>     learn to appreciate; the easier we make it, of course the faster
>     that will happen.
>
>
>     Now, to your syntax suggestion.  Overall, I like the idea, but I
>     have some concerns.  First, the good parts:
>
>      - The connection with enums is powerful.  Users already
>     understand enums, so this will help them understand sums. Enums
>     have special treatment in switch; we want the same treatment for
>     sealed type patterns. Enums have special treatment for
>     exhaustiveness; we want the same for sealed type patterns.  So
>     tying these together with some more general enum-ness leans on
>     what people already know.
>
>      - While sums and products are theoretically independent features,
>     sums-of-products are expected to be quite common.  So it might be
>     reasonable to encourage this syntactically.
>
>      - The current proposal has some redundancy, in that the subtypes
>     have to say “implements Node”, even if they are nested within
>     Node.  With a stronger mechanism for declaring them, as you
>     propose, then that can safely be left implicit.
>
>      - I confess that I too like the simplicity of Haskell’s `data`
>     declaration, and this brings us closer.
>
>     Now, the challenges:
>
>      - The result is still a little busy.  We need a modifier for
>     “enumerated type”, and we would also need to be able to have child
>     types be not only records, but ordinary classes and interfaces. 
>     So we’d have to have a place for “record”, “class”, or “interface”
>     with the declaration of the enumerated classes (as well as other
>     modifiers.). That busies up the result a bit.
>
>      - Once we do this, I worry that it will be hard to tell the
>     difference between:
>
>         interface X {
>             class X1 { … }
>             class X2 { … }
>         }
>
>     and
>
>         enumerated interface Y {
>             class Y1 { … }, <— don’t forget this comma!
>             class Y2 { … }
>         }
>
>     and that users will forever be making mistakes like forgetting the
>     comma, or putting it where it doesn’t belong.
>
>      - This mechanism addresses the very common case of
>     sum-of-product, but leaves more esoteric sums out of the picture.
>      (Consider the types in java.lang.constant, which really want to
>     be sealed.). There, because they are not co-declared, we’d need
>     something more like
>
>          sealed interface ConstantDesc
>              permits ClassDesc, MethodTypeDesc, …. { }
>
>     It's possible that such a mechanism can be grafted on to your
>     proposal, or there is a shuffling that supports it.
>
>
>
>
>
>>     On Apr 29, 2019, at 2:28 PM, Alan Malloy <amalloy at google.com
>>     <mailto:amalloy at google.com>> wrote:
>>
>>     Hello again, amber-spec-experts. I have another report from the
>>     Google codebase, this time focusing on sealed types. It is
>>     viewable in full Technicolor HTML at
>>     http://cr.openjdk.java.net/~cushon/amalloy/sealed-types-report.html (thanks
>>     again to Liam for hosting), and included below as plain text:
>>
>>     Author: Alan Malloy (amalloy at google.com <mailto:amalloy at google.com>)
>>     Published: 2019-04-29
>>
>>     Feedback on Sealed Types
>>
>>     Hello again, amber-spec-experts. I’m back with a second Google
>>     codebase research project. I’m looking again at the Records &
>>     Sealed Types proposal (which has now become JDK-8222777), but
>>     this time I’m focusing on sealed types instead of records, as
>>     promised in my RFC of a few weeks ago. My goal was to investigate
>>     Google’s codebase to guess what developers might have done
>>     differently if they had access to sealed types. This could help
>>     us understand what works in the current proposal and what to
>>     consider changing.
>>
>>     Unlike my previous report, this one contains more anecdotes than
>>     statistics. It wound up being difficult to build static analysis
>>     to categorize the interesting cases, so I mostly examined
>>     promising candidates by hand.
>>
>>     Summary and Recommendations
>>
>>     For those who don’t care to read through all my anecdotes, I
>>     first provide a summary of my findings, and one suggested addition.
>>
>>     Sealed types, as proposed so far, are a good idea in theory: Java
>>     already has product types and open polymorphism, and sealed types
>>     give us closed polymorphism. However, I could not find many cases
>>     of code being written today that would be greatly enhanced if
>>     sealed types were available. The main selling point of sealed
>>     types for application authors is getting help from the compiler
>>     with exhaustiveness checking, but in practice developers almost
>>     always have a default case, because they are only interested in a
>>     subset of the possible subclasses, and want to ignore cases they
>>     don’t understand. This means that exhaustiveness-checking for
>>     pattern matches would mostly go unused if developers rewrote
>>     their existing code using sealed types.
>>
>>     Pattern matching is great, and can replace visitors in many
>>     cases, but this does not depend on sealed types except for
>>     exhaustiveness checks (which, again, would go mostly unused in
>>     code written today). The class hierarchies for which people
>>     define visitors today are just too large to write an exhaustive
>>     pattern match, and so a default case would be very common.
>>
>>     The other audience for sealed types is library authors. While in
>>     practice most developers have no great need to forbid subclasses,
>>     perhaps it would be a boon for authors of particularly popular
>>     libraries, who need to expose a non-final class as an
>>     implementation detail but don’t intend for consumers to create
>>     their own subclasses. Those authors can already include
>>     documentation saying “you really should not extend this”, but
>>     there is always some weirdo out there who will ignore your
>>     warnings and then write an angry letter when the next version of
>>     your library breaks his program (see: sun.misc.Unsafe). Authors
>>     of such libraries would welcome the opportunity to make it truly
>>     impossible to create undesirable subclasses.
>>
>>     Sealed Types As a Vehicle For Sum Types
>>
>>     So, sealed types as-is would be an improvement, but a niche one,
>>     used by few. I think we can get substantially more mileage out of
>>     them if we also include a more cohesive way to explicitly define
>>     a sum type and all its subtypes in one place with minimal
>>     ceremony. Such a sum type could be sealed, implicitly or
>>     explicitly. A tool like this takes what I see as the
>>     “theoretical” advantage of sum types (closed polymorphism), and
>>     makes it “practical” by putting it front and center. Making sums
>>     an actual language element instead of something “implied” by
>>     sealing a type and putting its subclasses nearby could help in a
>>     lot of ways:
>>
>>     * Developers might more often realize that a sealed/sum type is a
>>     good model for their domain. Currently it’s a “pattern” external
>>     to the language instead of a “feature”, and many don’t realize it
>>     could be applied to their domain. Putting it in the language
>>     raises its profile, addressing the problem that people don’t
>>     realize they want it.
>>     * The compiler could provide help for defining simple
>>     sums-of-products, while making it possible to opt into more
>>     complicated subclasses, in much the way that enums do: the
>>     typical enum just has bare constants like EAST, but you can add
>>     constructor arguments or override methods when necessary.
>>     * The ability to more easily model data in this way may result in
>>     developers writing more classes that are amenable to
>>     sealing/sums, as they do in other languages with explicit sum
>>     types (Haskell, Kotlin, Scala). Then, the exhaustiveness-checking
>>     feature that sealed types provide would pull more weight.
>>
>>     Since enum types are “degenerate sum types”, the syntax for
>>     defining sums can borrow heavily from enums. A sketch of the
>>     syntax I imagine for such things (of course, I am not married to it):
>>     public type-enum interface BinaryTree<T> {
>>       Leaf {
>>         @Override public Stream<T> elements() {return Stream.empty();}
>>       },
>>       Node(T data, BinaryTree<T> left, BinaryTree<T> right) {
>>         @Override public Stream<T> elements() {
>>           return Stream.concat(left.elements(),
>>     Stream.concat(Stream.of(data), right.elements()));
>>         }
>>       };
>>
>>
>>       public Stream<T> elements();
>>     }
>>
>>     Like enums, you can use a bare identifier for simple types that
>>     exist only to be pattern-matched against, but you can add fields
>>     and/or override blocks as necessary. The advantage over declaring
>>     a sealed type separately from its elements is both concision (the
>>     compiler infers visible records, superclass, and all type
>>     parameters) and clarity: you state your intention firmly. I think
>>     a convenient syntax like this will encourage developers to use
>>     the powerful tool of sealed types to model their data.
>>
>>     Evidence in Google’s Codebase
>>
>>     If you are just interested in recommendations, you can stop
>>     reading now: they are all included in the summary. What follows
>>     is a number of anecdotes, or case studies if you prefer, that led
>>     me to the conclusions above. Each shows a type that might have
>>     been written as a sealed type, and hopefully highlights a
>>     different facet of the question of how sealed types can be useful.
>>
>>     The first thing I looked for was classes which are often involved
>>     in instanceof checks. As language authors, we imagine people
>>     writing stuff like this[1] all the time:
>>
>>     interface Expr {int eval(Scope s);}
>>     record Var(String name) implements Expr {
>>       public int eval(Scope s) {return s.get(name);}
>>     }
>>     record Sum(Expr left, Expr right) implements Expr {
>>       public int eval(Scope s) {return left.eval(s) + right.eval(s);}
>>     }
>>     class Analyzer {
>>       Stream<String> variablesUsed(Expr e) {
>>         if (e instanceof Var) return Stream.of(((Var)e).name);
>>         if (e instanceof Sum) {
>>           return variablesUsed(((Sum)e).left)
>>     .concat(variablesUsed(((Sum)e).right));
>>        }
>>         throw new IllegalArgumentException();
>>       }
>>     }
>>
>>     Here, the Expr interface captures some of the functionality
>>     shared by all expressions, but later a client (Analyzer) came
>>     along and invented some other polymorphic operations to perform
>>     on an Expr, which Expr did not support. So Analyzer needed to do
>>     instanceof checks instead, externalizing the polymorphism. The
>>     principled approach would have been for Expr to export a visitor
>>     to begin with, but perhaps it wasn’t seen as worth the trouble at
>>     the time.
>>
>>     To try to find this pattern in the wild, I searched for method
>>     bodies which perform multiple instanceof checks against the same
>>     variable. Notably, this excludes the typical equals(Object)
>>     method, which only performs a single check. For each such
>>     variable, I noted:
>>
>>     1. Its declared type
>>     2. The set of subtypes it was checked for with instanceof
>>     3. The common supertype of those subtypes.
>>
>>     I guessed that (3) would usually be the same as (1), but in
>>     practice 55% of the time they were different. Often, the declared
>>     type was Object, or some generic type variable which erases to
>>     Object, while the common supertype being tested was something
>>     like Number, Event, or Node. For example, a Container<T> knows it
>>     will be used in some context where NaN is unsuitable, so it
>>     checks whether its contents are Float or Double, and if so
>>     ensures NaN is not stored. As a second example, a
>>     serialize(Object) method checks whether its input is String or
>>     ByteString, and throws an exception otherwise.
>>
>>     Bad sealed types found looking at instanceof checks
>>
>>     I looked through the most popular declared types of these
>>     candidates, to investigate which types are often involved in such
>>     checks. Most of them are not good candidates for a sealed type.
>>     Object was the most common type, followed by Exception and Throwabe.
>>
>>     Next up is an internal DOMObject class, which sounds promising
>>     until I tell you it has thousands of direct subclasses. Nobody is
>>     doing exhaustive switches on this, of course. Instead, many uses
>>     iterate over a Collection<DOMObject>, or receive a DOMObject in
>>     some way, and just check whether it is of one or two specific
>>     subtypes they care about. This turned out to be a very common
>>     pattern, not just for DOMObject, but for many candidate sealed
>>     types I found: nobody does exhaustive case analysis. They just
>>     look for objects they understand in some much larger hierarchy,
>>     and ignore the rest.
>>
>>     Some more humorous types that are often involved in instanceof
>>     checks: java.net.InetAddress (everyone wants to know if it’s v4
>>     or v6) and com.sun.source.tree.Tree, in our static-analysis
>>     tools. Tree is an interesting case: here we do exactly what I
>>     mentioned previously for DOMObject. On the surface it seems that
>>     Tree would be a good candidate for a sealed interface with record
>>     subtypes, but in practice I’m not sure what sealing would buy us.
>>     We would effectively opt out of exhaustiveness-checking by having
>>     a large default case, or by extending a visitor with
>>     default-empty methods. Of course, sometimes we define a new
>>     visitor to do some polymorphic operation over a Tree, but more
>>     often we just look for one or two subtypes we care about. For
>>     example, DataFlow inspects a Tree, but knows from context that it
>>     is either a LambdaExpressionTree, MethodTree, or an initializer.
>>
>>     Plausible sealed types found looking at instanceof checks
>>
>>     The previous section notwithstanding, I did dig deep enough into
>>     the results to find a few classes that could make good sealed
>>     types. The most prominent, and most interesting, was another AST.
>>     There is an abstract Node class for representing HTML documents.
>>     It has just 4 subclasses defined in the same file: Text, Comment,
>>     Tag, and EndTag. This spartan definition suggests it’s used for
>>     something like SAX parsing, but I didn’t confirm this. It does
>>     everything you could hope for from a type like this: it exposes a
>>     Visitor, it provides an accept(Visitor) method, and the
>>     superclass specifies abstract methods for a couple of the most
>>     common things you would want to do, such as a String toHtml() method.
>>
>>     However, recall that I found this class by looking for classes
>>     often involved in instanceof checks! Some people use the visitor,
>>     but why doesn’t everyone? The first reason I found is one I’ve
>>     mentioned several times already: clients only care about one of
>>     the 4 cases, and may have felt creating an anonymous visitor is
>>     too much ceremony. Would they be happy with a switch and a
>>     default clause? Probably, but it’s hard to know for sure. The
>>     second reason surprised me a bit: I found clients doing analysis
>>     that isn’t really amenable to any one visitor, or a simple
>>     pattern-match. They’ve written this:
>>
>>     if (mode1) { if (x instanceof Tag) {...} }
>>     else if (mode2) { if (x instanceof Text) {...}}
>>
>>     The same use site cares about different subclasses at different
>>     times, depending on some other flag(s) controlling its behavior.
>>     Even if we offered a pattern-match on x, it’s difficult to encode
>>     the flags correctly. They would have to match on a tuple of
>>     (mode1, mode2, x), with a case for (true, _, Tag name) and
>>     another for (false, true, Text text). Technically possible, but
>>     not really prettier than what they already have, especially since
>>     you would need to use a local record instead of an anonymous tuple.
>>
>>     Even so, I think this would have benefited from being a sealed
>>     type. Recall that earlier I carefully said “4 subclasses defined
>>     in the same file”. This is because some jokester in a different
>>     package altogether has defined their own fifth subclass, Doctype.
>>     They have their own sub-interface of Visitor that knows about
>>     Doctype nodes. I can’t help but feel that the authors of Node
>>     would have preferred to make this illegal, if they had been able to.
>>
>>     The second good sealed type I found is almost an enum, except
>>     that one of the instances has per-instance data. This is not
>>     exactly a surprise, since an enum is a degenerate sum type, and
>>     one way to think of sealed types is as a way to model sums. It
>>     looks something like this[2]:
>>
>>     public abstract class DbResult<T> {
>>       public record NoDatabase<T>() extends DbResult<T>;
>>       public record RowNotFound<T>() extends DbResult<T>;
>>       // Four more error types ...
>>       public record EmptySuccess<T>() extends DbResult<T>;
>>       public record SuccessWithData<T>(T data) extends DbResult<T>;
>>
>>       public T getData() {
>>         if (!(this instanceof SuccessWithData))
>>           throw new DbException();
>>         return ((SuccessWithData<T>)this).data;
>>       }
>>       public <U> DbResult<U> transform(Function<T, U> f) {
>>         if (!(this instanceof SuccessWithData)) {
>>           return (DbResult<U>)this;
>>         }
>>         return new SuccessWithData<U>(f.apply(
>>     ((SuccessWithData<T>)this).data));
>>     }
>>
>>     Reading this code made me yearn for Haskell: here is someone who
>>     surely wanted to write
>>
>>     data DbResult t = NoDatabase | NoRow | EmptySuccess | Success t
>>
>>     but had to spend 120 lines defining their sum-of-products (the
>>     extra verbosity is because really they made the subclasses
>>     private, and defined private static singletons for each of the
>>     error types, with a static getter to get the type parameter
>>     right). This seems like a potential win for records and for
>>     sealed types. Certainly my snippet was much shorter than the
>>     actual source file because the proposed record syntax is quite
>>     concise, so that is a real win. But what do we really gain from
>>     sealing this type? Still nobody does exhaustive analysis even of
>>     this relatively small type: they just use functions like getData
>>     and transform to work with the result generically, or spot-check
>>     a couple interesting subtypes with instanceof. Forbidding
>>     subclassing from other packages hardly matters: nobody was
>>     subclassing it anyway, and nor would they be tempted to. Really
>>     the improvements DbResult benefits most from are records, and
>>     pattern-matching on records. It would be much nicer to replace
>>     the instanceof/cast pattern with a pattern-match that extracts
>>     the relevant field.
>>
>>     This is the use case that inspired my idea of a type-enum, in the
>>     Summary section above. Rewriting it as a type-enum eliminates
>>     many of the problems: all the instanceof checks are gone, we
>>     don’t need a bunch of extra keywords for each case, and we’re
>>     explicit about the subclasses “belonging to” the sealed parent,
>>     which means we get stuff like extends and <T> for free. We get
>>     improved clarity by letting the definition of the class hierarchy
>>     reflect its “nature” as a sum.
>>
>>     public abstract type-enum DbResult<T> {
>>       NoDatabase,
>>       RowNotFound,
>>       EmptySuccess,
>>       SuccessWithData(T data) {
>>         @Override public T getData() {
>>           return data;
>>         }
>>         @Override public <U> DbResult<U> transform(Function<T, U> f) {
>>           return new SuccessWithData<U>(f.apply(data));
>>         }
>>       }
>>
>>       public T getData() {
>>         throw new DbException();
>>       }
>>       public <U> DbResult<U> transform(Function<T, U> f) {
>>         return (DbResult<U>)this;
>>       }
>>     }
>>
>>     Visitors
>>
>>     Instead of doing a bunch of instanceof checks, the
>>     “sophisticated” way to interact with a class having a small,
>>     known set of subtypes is with a visitor. I considered doing some
>>     complicated analysis to characterize what makes a class a
>>     visitor, and trying to automatically cross-reference visitors to
>>     the classes they visit...but in practice simply looking for
>>     classes with “Visitor” in their name was a strong enough signal
>>     that a more complicated approach was not needed. Having
>>     identified visitors, I looked at those visitors with the most
>>     subclasses, since each distinct subclass corresponds to one
>>     “interaction” with the sealed type that it visits, and well-used
>>     visitors suggest both popularity and good design.
>>
>>     One common theme I found: developers aren’t good at applying the
>>     visitor pattern. Many cases I found had some weird and
>>     inexplicable quirk compared to the “standard” visitor. These
>>     developers will be relieved to get pattern-matching syntax so
>>     they can stop writing visitors.
>>
>>     The Visiting Object
>>
>>     The first popular visitor I found was a bit odd to me. It’s
>>     another tree type, but with a weird amalgam of several visitors,
>>     and an unusual approach to its double dispatch. I have to include
>>     a relatively lengthy code snippet to show all of its facets:
>>
>>     public static abstract class Node {
>>       public interface Visitor {
>>         boolean process(Node node);
>>       }
>>       public boolean visit(Object v) {
>>         return v instanceof Visitor
>>             && ((Visitor)v).process(this);
>>       }
>>       // Other methods common to all Nodes ...
>>     }
>>
>>     public static final class RootNode extends Node {
>>       public interface Visitor {
>>         boolean processRoot(RootNode node);
>>       }
>>       @Override
>>       public boolean visit(Object v) {
>>         return v instanceof Visitor
>>             ? ((Visitor)v).processRoot(this)
>>             : super.visit(v);
>>       }
>>       // Other stuff about root nodes ...
>>     }
>>
>>     public static abstract class ValueNode extends Node {
>>       public interface Visitor {
>>         boolean processValue(ValueNode node);
>>       }
>>       @Override
>>       public boolean visit(Object v) {
>>         return v instanceof Visitor
>>             ? ((Visitor)v).processValue(this)
>>             : super.visit(v);
>>       }
>>     }
>>
>>     public static final class BooleanNode extends ValueNode {
>>       public interface Visitor {
>>         boolean processBool(BooleanNode node);
>>       }
>>       @Override
>>       public boolean visit(Object v) {
>>         return v instanceof Visitor
>>             ? ((Visitor)v).processBool(this)
>>             : super.visit(v);
>>       }
>>       // Other stuff about booleans ...
>>     }
>>
>>     public static final class StringNode extends ValueNode {
>>       // Much the same as BooleanNode
>>     }
>>
>>     This goes on for some time: there is a multi-layered hierarchy of
>>     dozens of node types, each with a boolean visit(Object) method,
>>     and their own distinct Visitor interface, in this file. I should
>>     note that this code is actually not written by a human, but
>>     rather generated by some process (I didn’t look into how). I
>>     still think it is worth mentioning here for two reasons: first,
>>     whoever wrote the code generator would probably do something
>>     similar if writing it by hand, and second because these visitors
>>     are used often by hand-written code.
>>
>>     Speaking of hand-written code, visitor subclasses now get to
>>     declare ahead of time exactly which kinds of nodes they care
>>     about, by implementing only the appropriate Visitor interfaces:
>>
>>     private class FooVisitor implements StringNode.Visitor,
>>      BooleanNode.Visitor, RootNode.Visitor {
>>       // ...
>>     }
>>
>>     This isn’t how I would have written things, but I can sorta see
>>     the appeal, if you don’t have to write it all by hand: a visitor
>>     can choose to handle any one subclass of ValueNode, or all
>>     ValueNodes, or just RootNode and StringNode, et cetera. They get
>>     to pick and choose what sub-trees of the inheritance tree they
>>     work with.
>>
>>     Would Node be a good sealed class? Maybe. It clearly intends to
>>     enumerate all subclasses, but the benefit it gets from enforcing
>>     that is minimal. As in my previous examples, the main advantage
>>     for Node implementors would come from records, and the main
>>     advantage for clients would come from pattern-matching, obviating
>>     their need for this giant visitor.
>>
>>     The Enumerated Node
>>
>>     Another AST, this time for some kind of query language,
>>     explicitly declares an enum of all subclasses it can have, and
>>     uses this enum instead of using traditional double-dispatch:
>>
>>     public interface Node {
>>       enum Kind {EXPR, QUERY, IMPORT /* and 9 more */}
>>       Kind getKind();
>>       Location getLocation();
>>     }
>>
>>     public abstract record AbstractNode(Location l) implements Node {}
>>
>>     public class Expr extends AbstractNode {
>>       public Kind getKind() {return EXPR;}
>>       // ...
>>     }
>>     // And so on for other Kinds ...
>>
>>     public abstract class Visitor {
>>       // Empty default implementations, not abstract.
>>       public Expr visitExpr(Expr e) {}
>>       public Query visitQuery(Query q) {}
>>       public Import visitImport(Import i) {}
>>       public Node visit(Node n) {
>>         switch (n.getKind()) {
>>           case EXPR: return visitExpr((Expr)n);
>>           case QUERY: return visitQuery((Query)n);
>>           case IMPORT: return visitImport((Import)n);
>>           // ...
>>         }
>>       }
>>     }
>>
>>     It’s not really clear to me why they do it this way, instead of
>>     putting an accept(Visitor) method on Node. They gain the ability
>>     to return different types for each Node subtype, but are hugely
>>     restricted in what visitors can do: they must return a Node,
>>     instead of performing an arbitrary computation. It seems like the
>>     idea is visitors must specialize to tree rewriting, but I still
>>     would have preferred to parameterize the visitor by return type.
>>
>>     Would this be better as a sealed type? I feel sure that if sealed
>>     types existed, the authors of this class would have used one. We
>>     could certainly do away with the enum, and use an
>>     expression-switch instead to pattern-match in the implementation
>>     of visit(Node). But I think the Visitor class would still exist,
>>     and still have separate methods for each Node subtype, because
>>     they developer seemed to care about specializing the return type.
>>     The only place where an exhaustiveness check helps would be in
>>     the visit(Node) method, inside the visitor class itself. All
>>     other dispatch goes through visit(Node), or through one of the
>>     specialized visitor methods if the type is known statically. It
>>     seems like overall this would be an improvement, but again, the
>>     improvement comes primarily from pattern-matching, not sealing.
>>
>>     Colocated interface implementations
>>
>>     Finally, I looked for interfaces having all of their
>>     implementations defined in the same file. On this I do have some
>>     statistical data[3]. A huge majority (98.5%) of public interfaces
>>     have at least one implementation in a different source file.
>>     Package-private interfaces also tend to have implementations in
>>     other files: 85% of them are in this category. For protected
>>     interfaces it’s much closer: only 53% have external
>>     implementations. Of course, all private interfaces have all
>>     implementations in a single file.
>>
>>     Next, I looked at interfaces that share a source file with all
>>     their implementations, to see whether they’d make good sealed
>>     types. First was this Entry class:
>>
>>     public interface Entry {
>>       enum Status {OK, PENDING, FAILED}
>>       Status getStatus();
>>       int size();
>>       String render();
>>     }
>>
>>     public class UserEntry implements Entry {
>>       private User u;
>>       private Status s;
>>       public UserEntry(User u, Status s) {
>>         this.u = u;
>>         this.s = s;
>>       }
>>       @Override String render() {return u.name <http://u.name/>();}
>>       @Override int size() {return 1;}
>>       @Override Status getStatus() {return s;}
>>     }
>>
>>     public class AccountEntry implements Entry {
>>       private Account a;
>>       private Status s;
>>       public UserEntry(Account a, Status s) {
>>         this.a = a;
>>         this.s = s;
>>       }
>>       @Override String render() {return a.render();}
>>       @Override int size() {return a.size();}
>>       @Override Status getStatus() {return s;}
>>     }
>>
>>     A huge majority of the clients of this Entry interface treat it
>>     polymorphically, just calling its interface methods. In only one
>>     case is there an instanceof check made on an Entry, dispatching
>>     to different methods depending on which subclass is present.
>>
>>     Is this a good sealed type? I think not, really. There are two
>>     implementations now, but perhaps there will be a GroupEntry
>>     someday. Existing clients should continue to work in that case:
>>     the polymorphic Entry interface provides everything clients are
>>     “intended” to know.
>>
>>     Another candidate for sealing:
>>
>>     public interface Request {/* Empty */}
>>     public record RequestById(int id) implements Request;
>>     public record RequestByValue(String owner, boolean historic)
>>     implements Request;
>>
>>     public class RequestFetcher {
>>       public List<Item> fetch(Iterable<Request> requests) {
>>         List<RequestById> idReqs = Lists.newArrayList();
>>         List<RequestByValue> valueReqs = Lists.newArrayList();
>>         List<Query> queries = Lists.newArrayList();
>>         for (Request req : requests) {
>>           if (req instanceof RequestById) {
>>             idReqs.add((RequestById)req);
>>           } else if (req instanceof RequestByValue) {
>>     valueReqs.add((RequestByValue)req);
>>           }
>>         }
>>     queries.addAll(prepareIdQueries(idReqs));
>>     queries.addAll(prepareValueQueries(valueReqs));
>>         return runQueries(queries);
>>       }
>>     }
>>
>>     Interestingly, since the Request interface is empty, the only way
>>     to do anything with this class is to cast it to one
>>     implementation type. In fact, the RequestFetcher I include here
>>     is the only usage of either of these classes (plus, of course,
>>     helpers like prepareIdQueries).
>>
>>     So, clients need to know about specific subclasses, and want to
>>     be sure they’re doing exhaustive pattern-matching. Seems like a
>>     great sealed class to me. Except...actually each of the two
>>     subclasses has been extended by a decorator adding a source[4]:
>>
>>     public record SourcedRequestById(Source source) extends RequestById;
>>     public record SourcedRequestByValue(Source source) extends
>>     RequestByValue;
>>
>>     Does this argue in favor of sealing, or against? I don’t really
>>     know. The owners of Request clearly intended for all four of
>>     these subclasses to exist (they’re in the same package), so they
>>     could include them all in the permitted subtype list, but it
>>     seems like a confusing API to expose to clients.
>>
>>     A third candidate for sealing is another simple sum type:
>>
>>     public interface ValueOrAggregatorException<T> {
>>       T get();
>>       public static <T> ValueOrAggregatorException<T>
>>         of(T value) {
>>         return new OfValue<T>(value);
>>       }
>>       public static <T> ValueOrAggregatorException<T>
>>           ofException(AggregatorException err) {
>>         return new OfException<T>(err);
>>       }
>>       private record OfValue<T>(T value)
>>           implements ValueOrAggregatorException<T> {
>>         @Override T get() {return value;}
>>       }
>>       private record OfException<T>(AggregatorException err)
>>           implements ValueOrAggregatorException<T> {
>>         @Override T get() {throw err;}
>>       }
>>     }
>>
>>     It has only two subtypes, and it seems unimaginable there could
>>     ever be a third, so why not seal it? However, the subtypes are
>>     intentionally hidden: it is undesirable to let people see whether
>>     there’s an exception, except by having it thrown at you. In fact
>>     AggregatorException is documented as “plugins may throw this, but
>>     should never catch it”: there is some higher-level thing
>>     responsible for catching all such exceptions. So, this type gains
>>     no benefit from exhaustiveness checks in pattern-matching. The
>>     type is intended to be used polymorphically, through its
>>     interface method, even though its private implementation is
>>     amenable to sealing.
>>     ________________
>>     [1] Throughout this document I will use record syntax as if it
>>     were already in the language. This is merely for brevity, and to
>>     avoid making the reader spend a lot of time reading code that
>>     boils down to just storing a couple fields. In practice, of
>>     course the code in Google’s codebase either defines the records
>>     by hand, or uses an @AutoValue.
>>     [2] Recall that @AutoValue, Google’s “record”, allows extending a
>>     class, which is semantically okay here: DbResult has no state,
>>     only behavior.
>>     [3]This data is imperfect. While the Google codebase strongly
>>     discourages having more than one version of a module checked in,
>>     there is still some amount of “vendoring” or checking in multiple
>>     versions of some package, e.g. for supporting external clients of
>>     an old version of an API. As a result, two “different files”
>>     which are really copies of each other may implement interfaces
>>     with the same fully-qualified name; I did not attempt to control
>>     for this case, and so such cases may look like they were in the
>>     same file, or not.
>>     [4] Of course in the record proposal it is illegal to extend
>>     records like this; in real life these plain data carriers are
>>     implemented by hand as ordinary classes, so the subtyping is legal.
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20190429/09cba626/attachment-0001.html>