Feedback on Sealed Types
Brian Goetz
brian.goetz at oracle.com
Mon Apr 29 21:01:17 UTC 2019
It would be nice if we could "just" overload enum itself to support a
record-like option:
enum Node {
AddNode(Node a, Node b),
MulNode(Node a, Node b),
...;
...
}
but unfortunately that syntax looks confusingly close to something else
:( It also doesn't really scale to multi-level hierarchies, though that
might be OK.
Its worth thinking about, though. The `data` construct from Haskell is
surely more direct than modeling a sum with sealed interfaces (though,
the latter is also more flexible than `data`.)
On the other hand, declaring a sum of records as:
sealed interface I {
record A(int a) implements I { }
record B(long b) implements I { }
record C(String c) implements I { }
}
isn't so bad. The main redundancy here is primarily "implements I", and
secondarily "record". We can surely compress that away like this:
enum interface I {
A(int a),
B(long b),
C(String c);
// I methods
}
but I am not sure it carries its weight, given as it adds little
additional concision (and no additional semantics.)
On 4/29/2019 4:33 PM, Alan Malloy wrote:
> Thanks, Brian. I indeed didn't think of some of your proposed benefits
> of sealing non-sum types, as I was focused mostly on things you
> mentioned explicitly in the JEP, which is somewhat light on the
> expected benefits.
>
> I think the first two items in your "challenges" solve each other: I
> don't intend sum types to be the only kind of sealed type, but just a
> good way to declare the simplest kind. I left out the "record" keyword
> from the declaration with the idea that it would be implicit: if you
> want the convenient sum-of-products declaration style, you have to use
> records. If you want something more complicated, you declare a sealed
> interface (or superclass), and N permitted subclasses, declared
> separately in whatever way you want. This restriction helps by making
> the semantics clearer, and I had also hoped that it would lead to a
> syntax error if you leave out the comma. Looking more closely, I see
> this is somewhat precarious: a record declaration looks enough like a
> method signature that they may be ambiguous in an interface, if you
> don't require the "record" keyword, or if you use a semicolon instead
> of a comma. I think it can still work if we require each nested record
> to use {...} instead of ; even if it's empty. This way, your two
> examples look like
>
> interface X {
> class X1 { … }
> class X2 { … }
> }
>
> and
>
> enumerated interface Y {
> Y1 { … }, <— don’t forget this comma!
> Y2 { … }
> }
>
> The latter would become illegal if you dropped the comma, even if you
> also forgot the "enumerated" keyword, because the braces make no sense
> in an ordinary interface.
>
> On Mon, Apr 29, 2019 at 12:35 PM Brian Goetz <brian.goetz at oracle.com
> <mailto:brian.goetz at oracle.com>> wrote:
>
> Thanks Alan, for this nice exploration. There’s a lot to respond
> to. I’ll start with some general comments about sealing, and then
> move on to your alternate proposal for exposing it.
>
> I can think of several main reasons why you would want to seal a
> hierarchy.
>
> - To say something about the _interface itself_. That is, that it
> was not designed as a general purpose contract between arms-length
> entities, but that it exists only as a common super type for a
> fixed set of classes that are part of the library. In other
> words, “please don’t implement me.”
>
> - To say something about the semantics of the type. Several of
> the examples in your report fall into this category: “a DbResult
> is either a NoRowsFound or a Rows(List<Row>)”. This tells users
> exactly what the reasonable range of results to expect are when
> doing a query. Of course, the spec could say the same thing, but
> that involves reading and interpreting the spec. Easier if this
> conclusion can be driven by types (and IDEs can help more here too.)
>
> - To strengthen typing by simulating unions. If my method is
> going to return either a String or a Number, the common super type
> is Object. (Actually, it’s some variant of Serializable &
> Comparable<? extends….>.). Sums-of-products allow library authors
> to make a stronger statement about types in the presence of
> unions. Exposing a sum of StringHolder(String) and
> NumberHolder(Number), using records and sealed types, is not so
> ceremonious, so some library developers might choose to do this
> instead of Object.
>
> - Security. Some libraries may want to know that the code they
> are calling is part of their library, rather than an arbitrary
> implementation of some interface.
>
> - To aid in exhaustiveness. We’ve already discussed this at
> length; your point is that this one doesn’t come up as often as
> one might hope.
>
> Not only is there an obvious synergy between sums and products (as
> many languages have demonstrated), but there is a third factor,
> which is “if you make it easy enough, people will use it more.”
> Clearly records are pretty easy to use; your point is that if
> there were a more streamlined sum-of-products idiom, the third
> factor would be even stronger here. I think algebraic data types
> is one of those things that will take some time for developers to
> learn to appreciate; the easier we make it, of course the faster
> that will happen.
>
>
> Now, to your syntax suggestion. Overall, I like the idea, but I
> have some concerns. First, the good parts:
>
> - The connection with enums is powerful. Users already
> understand enums, so this will help them understand sums. Enums
> have special treatment in switch; we want the same treatment for
> sealed type patterns. Enums have special treatment for
> exhaustiveness; we want the same for sealed type patterns. So
> tying these together with some more general enum-ness leans on
> what people already know.
>
> - While sums and products are theoretically independent features,
> sums-of-products are expected to be quite common. So it might be
> reasonable to encourage this syntactically.
>
> - The current proposal has some redundancy, in that the subtypes
> have to say “implements Node”, even if they are nested within
> Node. With a stronger mechanism for declaring them, as you
> propose, then that can safely be left implicit.
>
> - I confess that I too like the simplicity of Haskell’s `data`
> declaration, and this brings us closer.
>
> Now, the challenges:
>
> - The result is still a little busy. We need a modifier for
> “enumerated type”, and we would also need to be able to have child
> types be not only records, but ordinary classes and interfaces.
> So we’d have to have a place for “record”, “class”, or “interface”
> with the declaration of the enumerated classes (as well as other
> modifiers.). That busies up the result a bit.
>
> - Once we do this, I worry that it will be hard to tell the
> difference between:
>
> interface X {
> class X1 { … }
> class X2 { … }
> }
>
> and
>
> enumerated interface Y {
> class Y1 { … }, <— don’t forget this comma!
> class Y2 { … }
> }
>
> and that users will forever be making mistakes like forgetting the
> comma, or putting it where it doesn’t belong.
>
> - This mechanism addresses the very common case of
> sum-of-product, but leaves more esoteric sums out of the picture.
> (Consider the types in java.lang.constant, which really want to
> be sealed.). There, because they are not co-declared, we’d need
> something more like
>
> sealed interface ConstantDesc
> permits ClassDesc, MethodTypeDesc, …. { }
>
> It's possible that such a mechanism can be grafted on to your
> proposal, or there is a shuffling that supports it.
>
>
>
>
>
>> On Apr 29, 2019, at 2:28 PM, Alan Malloy <amalloy at google.com
>> <mailto:amalloy at google.com>> wrote:
>>
>> Hello again, amber-spec-experts. I have another report from the
>> Google codebase, this time focusing on sealed types. It is
>> viewable in full Technicolor HTML at
>> http://cr.openjdk.java.net/~cushon/amalloy/sealed-types-report.html (thanks
>> again to Liam for hosting), and included below as plain text:
>>
>> Author: Alan Malloy (amalloy at google.com <mailto:amalloy at google.com>)
>> Published: 2019-04-29
>>
>> Feedback on Sealed Types
>>
>> Hello again, amber-spec-experts. I’m back with a second Google
>> codebase research project. I’m looking again at the Records &
>> Sealed Types proposal (which has now become JDK-8222777), but
>> this time I’m focusing on sealed types instead of records, as
>> promised in my RFC of a few weeks ago. My goal was to investigate
>> Google’s codebase to guess what developers might have done
>> differently if they had access to sealed types. This could help
>> us understand what works in the current proposal and what to
>> consider changing.
>>
>> Unlike my previous report, this one contains more anecdotes than
>> statistics. It wound up being difficult to build static analysis
>> to categorize the interesting cases, so I mostly examined
>> promising candidates by hand.
>>
>> Summary and Recommendations
>>
>> For those who don’t care to read through all my anecdotes, I
>> first provide a summary of my findings, and one suggested addition.
>>
>> Sealed types, as proposed so far, are a good idea in theory: Java
>> already has product types and open polymorphism, and sealed types
>> give us closed polymorphism. However, I could not find many cases
>> of code being written today that would be greatly enhanced if
>> sealed types were available. The main selling point of sealed
>> types for application authors is getting help from the compiler
>> with exhaustiveness checking, but in practice developers almost
>> always have a default case, because they are only interested in a
>> subset of the possible subclasses, and want to ignore cases they
>> don’t understand. This means that exhaustiveness-checking for
>> pattern matches would mostly go unused if developers rewrote
>> their existing code using sealed types.
>>
>> Pattern matching is great, and can replace visitors in many
>> cases, but this does not depend on sealed types except for
>> exhaustiveness checks (which, again, would go mostly unused in
>> code written today). The class hierarchies for which people
>> define visitors today are just too large to write an exhaustive
>> pattern match, and so a default case would be very common.
>>
>> The other audience for sealed types is library authors. While in
>> practice most developers have no great need to forbid subclasses,
>> perhaps it would be a boon for authors of particularly popular
>> libraries, who need to expose a non-final class as an
>> implementation detail but don’t intend for consumers to create
>> their own subclasses. Those authors can already include
>> documentation saying “you really should not extend this”, but
>> there is always some weirdo out there who will ignore your
>> warnings and then write an angry letter when the next version of
>> your library breaks his program (see: sun.misc.Unsafe). Authors
>> of such libraries would welcome the opportunity to make it truly
>> impossible to create undesirable subclasses.
>>
>> Sealed Types As a Vehicle For Sum Types
>>
>> So, sealed types as-is would be an improvement, but a niche one,
>> used by few. I think we can get substantially more mileage out of
>> them if we also include a more cohesive way to explicitly define
>> a sum type and all its subtypes in one place with minimal
>> ceremony. Such a sum type could be sealed, implicitly or
>> explicitly. A tool like this takes what I see as the
>> “theoretical” advantage of sum types (closed polymorphism), and
>> makes it “practical” by putting it front and center. Making sums
>> an actual language element instead of something “implied” by
>> sealing a type and putting its subclasses nearby could help in a
>> lot of ways:
>>
>> * Developers might more often realize that a sealed/sum type is a
>> good model for their domain. Currently it’s a “pattern” external
>> to the language instead of a “feature”, and many don’t realize it
>> could be applied to their domain. Putting it in the language
>> raises its profile, addressing the problem that people don’t
>> realize they want it.
>> * The compiler could provide help for defining simple
>> sums-of-products, while making it possible to opt into more
>> complicated subclasses, in much the way that enums do: the
>> typical enum just has bare constants like EAST, but you can add
>> constructor arguments or override methods when necessary.
>> * The ability to more easily model data in this way may result in
>> developers writing more classes that are amenable to
>> sealing/sums, as they do in other languages with explicit sum
>> types (Haskell, Kotlin, Scala). Then, the exhaustiveness-checking
>> feature that sealed types provide would pull more weight.
>>
>> Since enum types are “degenerate sum types”, the syntax for
>> defining sums can borrow heavily from enums. A sketch of the
>> syntax I imagine for such things (of course, I am not married to it):
>> public type-enum interface BinaryTree<T> {
>> Leaf {
>> @Override public Stream<T> elements() {return Stream.empty();}
>> },
>> Node(T data, BinaryTree<T> left, BinaryTree<T> right) {
>> @Override public Stream<T> elements() {
>> return Stream.concat(left.elements(),
>> Stream.concat(Stream.of(data), right.elements()));
>> }
>> };
>>
>>
>> public Stream<T> elements();
>> }
>>
>> Like enums, you can use a bare identifier for simple types that
>> exist only to be pattern-matched against, but you can add fields
>> and/or override blocks as necessary. The advantage over declaring
>> a sealed type separately from its elements is both concision (the
>> compiler infers visible records, superclass, and all type
>> parameters) and clarity: you state your intention firmly. I think
>> a convenient syntax like this will encourage developers to use
>> the powerful tool of sealed types to model their data.
>>
>> Evidence in Google’s Codebase
>>
>> If you are just interested in recommendations, you can stop
>> reading now: they are all included in the summary. What follows
>> is a number of anecdotes, or case studies if you prefer, that led
>> me to the conclusions above. Each shows a type that might have
>> been written as a sealed type, and hopefully highlights a
>> different facet of the question of how sealed types can be useful.
>>
>> The first thing I looked for was classes which are often involved
>> in instanceof checks. As language authors, we imagine people
>> writing stuff like this[1] all the time:
>>
>> interface Expr {int eval(Scope s);}
>> record Var(String name) implements Expr {
>> public int eval(Scope s) {return s.get(name);}
>> }
>> record Sum(Expr left, Expr right) implements Expr {
>> public int eval(Scope s) {return left.eval(s) + right.eval(s);}
>> }
>> class Analyzer {
>> Stream<String> variablesUsed(Expr e) {
>> if (e instanceof Var) return Stream.of(((Var)e).name);
>> if (e instanceof Sum) {
>> return variablesUsed(((Sum)e).left)
>> .concat(variablesUsed(((Sum)e).right));
>> }
>> throw new IllegalArgumentException();
>> }
>> }
>>
>> Here, the Expr interface captures some of the functionality
>> shared by all expressions, but later a client (Analyzer) came
>> along and invented some other polymorphic operations to perform
>> on an Expr, which Expr did not support. So Analyzer needed to do
>> instanceof checks instead, externalizing the polymorphism. The
>> principled approach would have been for Expr to export a visitor
>> to begin with, but perhaps it wasn’t seen as worth the trouble at
>> the time.
>>
>> To try to find this pattern in the wild, I searched for method
>> bodies which perform multiple instanceof checks against the same
>> variable. Notably, this excludes the typical equals(Object)
>> method, which only performs a single check. For each such
>> variable, I noted:
>>
>> 1. Its declared type
>> 2. The set of subtypes it was checked for with instanceof
>> 3. The common supertype of those subtypes.
>>
>> I guessed that (3) would usually be the same as (1), but in
>> practice 55% of the time they were different. Often, the declared
>> type was Object, or some generic type variable which erases to
>> Object, while the common supertype being tested was something
>> like Number, Event, or Node. For example, a Container<T> knows it
>> will be used in some context where NaN is unsuitable, so it
>> checks whether its contents are Float or Double, and if so
>> ensures NaN is not stored. As a second example, a
>> serialize(Object) method checks whether its input is String or
>> ByteString, and throws an exception otherwise.
>>
>> Bad sealed types found looking at instanceof checks
>>
>> I looked through the most popular declared types of these
>> candidates, to investigate which types are often involved in such
>> checks. Most of them are not good candidates for a sealed type.
>> Object was the most common type, followed by Exception and Throwabe.
>>
>> Next up is an internal DOMObject class, which sounds promising
>> until I tell you it has thousands of direct subclasses. Nobody is
>> doing exhaustive switches on this, of course. Instead, many uses
>> iterate over a Collection<DOMObject>, or receive a DOMObject in
>> some way, and just check whether it is of one or two specific
>> subtypes they care about. This turned out to be a very common
>> pattern, not just for DOMObject, but for many candidate sealed
>> types I found: nobody does exhaustive case analysis. They just
>> look for objects they understand in some much larger hierarchy,
>> and ignore the rest.
>>
>> Some more humorous types that are often involved in instanceof
>> checks: java.net.InetAddress (everyone wants to know if it’s v4
>> or v6) and com.sun.source.tree.Tree, in our static-analysis
>> tools. Tree is an interesting case: here we do exactly what I
>> mentioned previously for DOMObject. On the surface it seems that
>> Tree would be a good candidate for a sealed interface with record
>> subtypes, but in practice I’m not sure what sealing would buy us.
>> We would effectively opt out of exhaustiveness-checking by having
>> a large default case, or by extending a visitor with
>> default-empty methods. Of course, sometimes we define a new
>> visitor to do some polymorphic operation over a Tree, but more
>> often we just look for one or two subtypes we care about. For
>> example, DataFlow inspects a Tree, but knows from context that it
>> is either a LambdaExpressionTree, MethodTree, or an initializer.
>>
>> Plausible sealed types found looking at instanceof checks
>>
>> The previous section notwithstanding, I did dig deep enough into
>> the results to find a few classes that could make good sealed
>> types. The most prominent, and most interesting, was another AST.
>> There is an abstract Node class for representing HTML documents.
>> It has just 4 subclasses defined in the same file: Text, Comment,
>> Tag, and EndTag. This spartan definition suggests it’s used for
>> something like SAX parsing, but I didn’t confirm this. It does
>> everything you could hope for from a type like this: it exposes a
>> Visitor, it provides an accept(Visitor) method, and the
>> superclass specifies abstract methods for a couple of the most
>> common things you would want to do, such as a String toHtml() method.
>>
>> However, recall that I found this class by looking for classes
>> often involved in instanceof checks! Some people use the visitor,
>> but why doesn’t everyone? The first reason I found is one I’ve
>> mentioned several times already: clients only care about one of
>> the 4 cases, and may have felt creating an anonymous visitor is
>> too much ceremony. Would they be happy with a switch and a
>> default clause? Probably, but it’s hard to know for sure. The
>> second reason surprised me a bit: I found clients doing analysis
>> that isn’t really amenable to any one visitor, or a simple
>> pattern-match. They’ve written this:
>>
>> if (mode1) { if (x instanceof Tag) {...} }
>> else if (mode2) { if (x instanceof Text) {...}}
>>
>> The same use site cares about different subclasses at different
>> times, depending on some other flag(s) controlling its behavior.
>> Even if we offered a pattern-match on x, it’s difficult to encode
>> the flags correctly. They would have to match on a tuple of
>> (mode1, mode2, x), with a case for (true, _, Tag name) and
>> another for (false, true, Text text). Technically possible, but
>> not really prettier than what they already have, especially since
>> you would need to use a local record instead of an anonymous tuple.
>>
>> Even so, I think this would have benefited from being a sealed
>> type. Recall that earlier I carefully said “4 subclasses defined
>> in the same file”. This is because some jokester in a different
>> package altogether has defined their own fifth subclass, Doctype.
>> They have their own sub-interface of Visitor that knows about
>> Doctype nodes. I can’t help but feel that the authors of Node
>> would have preferred to make this illegal, if they had been able to.
>>
>> The second good sealed type I found is almost an enum, except
>> that one of the instances has per-instance data. This is not
>> exactly a surprise, since an enum is a degenerate sum type, and
>> one way to think of sealed types is as a way to model sums. It
>> looks something like this[2]:
>>
>> public abstract class DbResult<T> {
>> public record NoDatabase<T>() extends DbResult<T>;
>> public record RowNotFound<T>() extends DbResult<T>;
>> // Four more error types ...
>> public record EmptySuccess<T>() extends DbResult<T>;
>> public record SuccessWithData<T>(T data) extends DbResult<T>;
>>
>> public T getData() {
>> if (!(this instanceof SuccessWithData))
>> throw new DbException();
>> return ((SuccessWithData<T>)this).data;
>> }
>> public <U> DbResult<U> transform(Function<T, U> f) {
>> if (!(this instanceof SuccessWithData)) {
>> return (DbResult<U>)this;
>> }
>> return new SuccessWithData<U>(f.apply(
>> ((SuccessWithData<T>)this).data));
>> }
>>
>> Reading this code made me yearn for Haskell: here is someone who
>> surely wanted to write
>>
>> data DbResult t = NoDatabase | NoRow | EmptySuccess | Success t
>>
>> but had to spend 120 lines defining their sum-of-products (the
>> extra verbosity is because really they made the subclasses
>> private, and defined private static singletons for each of the
>> error types, with a static getter to get the type parameter
>> right). This seems like a potential win for records and for
>> sealed types. Certainly my snippet was much shorter than the
>> actual source file because the proposed record syntax is quite
>> concise, so that is a real win. But what do we really gain from
>> sealing this type? Still nobody does exhaustive analysis even of
>> this relatively small type: they just use functions like getData
>> and transform to work with the result generically, or spot-check
>> a couple interesting subtypes with instanceof. Forbidding
>> subclassing from other packages hardly matters: nobody was
>> subclassing it anyway, and nor would they be tempted to. Really
>> the improvements DbResult benefits most from are records, and
>> pattern-matching on records. It would be much nicer to replace
>> the instanceof/cast pattern with a pattern-match that extracts
>> the relevant field.
>>
>> This is the use case that inspired my idea of a type-enum, in the
>> Summary section above. Rewriting it as a type-enum eliminates
>> many of the problems: all the instanceof checks are gone, we
>> don’t need a bunch of extra keywords for each case, and we’re
>> explicit about the subclasses “belonging to” the sealed parent,
>> which means we get stuff like extends and <T> for free. We get
>> improved clarity by letting the definition of the class hierarchy
>> reflect its “nature” as a sum.
>>
>> public abstract type-enum DbResult<T> {
>> NoDatabase,
>> RowNotFound,
>> EmptySuccess,
>> SuccessWithData(T data) {
>> @Override public T getData() {
>> return data;
>> }
>> @Override public <U> DbResult<U> transform(Function<T, U> f) {
>> return new SuccessWithData<U>(f.apply(data));
>> }
>> }
>>
>> public T getData() {
>> throw new DbException();
>> }
>> public <U> DbResult<U> transform(Function<T, U> f) {
>> return (DbResult<U>)this;
>> }
>> }
>>
>> Visitors
>>
>> Instead of doing a bunch of instanceof checks, the
>> “sophisticated” way to interact with a class having a small,
>> known set of subtypes is with a visitor. I considered doing some
>> complicated analysis to characterize what makes a class a
>> visitor, and trying to automatically cross-reference visitors to
>> the classes they visit...but in practice simply looking for
>> classes with “Visitor” in their name was a strong enough signal
>> that a more complicated approach was not needed. Having
>> identified visitors, I looked at those visitors with the most
>> subclasses, since each distinct subclass corresponds to one
>> “interaction” with the sealed type that it visits, and well-used
>> visitors suggest both popularity and good design.
>>
>> One common theme I found: developers aren’t good at applying the
>> visitor pattern. Many cases I found had some weird and
>> inexplicable quirk compared to the “standard” visitor. These
>> developers will be relieved to get pattern-matching syntax so
>> they can stop writing visitors.
>>
>> The Visiting Object
>>
>> The first popular visitor I found was a bit odd to me. It’s
>> another tree type, but with a weird amalgam of several visitors,
>> and an unusual approach to its double dispatch. I have to include
>> a relatively lengthy code snippet to show all of its facets:
>>
>> public static abstract class Node {
>> public interface Visitor {
>> boolean process(Node node);
>> }
>> public boolean visit(Object v) {
>> return v instanceof Visitor
>> && ((Visitor)v).process(this);
>> }
>> // Other methods common to all Nodes ...
>> }
>>
>> public static final class RootNode extends Node {
>> public interface Visitor {
>> boolean processRoot(RootNode node);
>> }
>> @Override
>> public boolean visit(Object v) {
>> return v instanceof Visitor
>> ? ((Visitor)v).processRoot(this)
>> : super.visit(v);
>> }
>> // Other stuff about root nodes ...
>> }
>>
>> public static abstract class ValueNode extends Node {
>> public interface Visitor {
>> boolean processValue(ValueNode node);
>> }
>> @Override
>> public boolean visit(Object v) {
>> return v instanceof Visitor
>> ? ((Visitor)v).processValue(this)
>> : super.visit(v);
>> }
>> }
>>
>> public static final class BooleanNode extends ValueNode {
>> public interface Visitor {
>> boolean processBool(BooleanNode node);
>> }
>> @Override
>> public boolean visit(Object v) {
>> return v instanceof Visitor
>> ? ((Visitor)v).processBool(this)
>> : super.visit(v);
>> }
>> // Other stuff about booleans ...
>> }
>>
>> public static final class StringNode extends ValueNode {
>> // Much the same as BooleanNode
>> }
>>
>> This goes on for some time: there is a multi-layered hierarchy of
>> dozens of node types, each with a boolean visit(Object) method,
>> and their own distinct Visitor interface, in this file. I should
>> note that this code is actually not written by a human, but
>> rather generated by some process (I didn’t look into how). I
>> still think it is worth mentioning here for two reasons: first,
>> whoever wrote the code generator would probably do something
>> similar if writing it by hand, and second because these visitors
>> are used often by hand-written code.
>>
>> Speaking of hand-written code, visitor subclasses now get to
>> declare ahead of time exactly which kinds of nodes they care
>> about, by implementing only the appropriate Visitor interfaces:
>>
>> private class FooVisitor implements StringNode.Visitor,
>> BooleanNode.Visitor, RootNode.Visitor {
>> // ...
>> }
>>
>> This isn’t how I would have written things, but I can sorta see
>> the appeal, if you don’t have to write it all by hand: a visitor
>> can choose to handle any one subclass of ValueNode, or all
>> ValueNodes, or just RootNode and StringNode, et cetera. They get
>> to pick and choose what sub-trees of the inheritance tree they
>> work with.
>>
>> Would Node be a good sealed class? Maybe. It clearly intends to
>> enumerate all subclasses, but the benefit it gets from enforcing
>> that is minimal. As in my previous examples, the main advantage
>> for Node implementors would come from records, and the main
>> advantage for clients would come from pattern-matching, obviating
>> their need for this giant visitor.
>>
>> The Enumerated Node
>>
>> Another AST, this time for some kind of query language,
>> explicitly declares an enum of all subclasses it can have, and
>> uses this enum instead of using traditional double-dispatch:
>>
>> public interface Node {
>> enum Kind {EXPR, QUERY, IMPORT /* and 9 more */}
>> Kind getKind();
>> Location getLocation();
>> }
>>
>> public abstract record AbstractNode(Location l) implements Node {}
>>
>> public class Expr extends AbstractNode {
>> public Kind getKind() {return EXPR;}
>> // ...
>> }
>> // And so on for other Kinds ...
>>
>> public abstract class Visitor {
>> // Empty default implementations, not abstract.
>> public Expr visitExpr(Expr e) {}
>> public Query visitQuery(Query q) {}
>> public Import visitImport(Import i) {}
>> public Node visit(Node n) {
>> switch (n.getKind()) {
>> case EXPR: return visitExpr((Expr)n);
>> case QUERY: return visitQuery((Query)n);
>> case IMPORT: return visitImport((Import)n);
>> // ...
>> }
>> }
>> }
>>
>> It’s not really clear to me why they do it this way, instead of
>> putting an accept(Visitor) method on Node. They gain the ability
>> to return different types for each Node subtype, but are hugely
>> restricted in what visitors can do: they must return a Node,
>> instead of performing an arbitrary computation. It seems like the
>> idea is visitors must specialize to tree rewriting, but I still
>> would have preferred to parameterize the visitor by return type.
>>
>> Would this be better as a sealed type? I feel sure that if sealed
>> types existed, the authors of this class would have used one. We
>> could certainly do away with the enum, and use an
>> expression-switch instead to pattern-match in the implementation
>> of visit(Node). But I think the Visitor class would still exist,
>> and still have separate methods for each Node subtype, because
>> they developer seemed to care about specializing the return type.
>> The only place where an exhaustiveness check helps would be in
>> the visit(Node) method, inside the visitor class itself. All
>> other dispatch goes through visit(Node), or through one of the
>> specialized visitor methods if the type is known statically. It
>> seems like overall this would be an improvement, but again, the
>> improvement comes primarily from pattern-matching, not sealing.
>>
>> Colocated interface implementations
>>
>> Finally, I looked for interfaces having all of their
>> implementations defined in the same file. On this I do have some
>> statistical data[3]. A huge majority (98.5%) of public interfaces
>> have at least one implementation in a different source file.
>> Package-private interfaces also tend to have implementations in
>> other files: 85% of them are in this category. For protected
>> interfaces it’s much closer: only 53% have external
>> implementations. Of course, all private interfaces have all
>> implementations in a single file.
>>
>> Next, I looked at interfaces that share a source file with all
>> their implementations, to see whether they’d make good sealed
>> types. First was this Entry class:
>>
>> public interface Entry {
>> enum Status {OK, PENDING, FAILED}
>> Status getStatus();
>> int size();
>> String render();
>> }
>>
>> public class UserEntry implements Entry {
>> private User u;
>> private Status s;
>> public UserEntry(User u, Status s) {
>> this.u = u;
>> this.s = s;
>> }
>> @Override String render() {return u.name <http://u.name/>();}
>> @Override int size() {return 1;}
>> @Override Status getStatus() {return s;}
>> }
>>
>> public class AccountEntry implements Entry {
>> private Account a;
>> private Status s;
>> public UserEntry(Account a, Status s) {
>> this.a = a;
>> this.s = s;
>> }
>> @Override String render() {return a.render();}
>> @Override int size() {return a.size();}
>> @Override Status getStatus() {return s;}
>> }
>>
>> A huge majority of the clients of this Entry interface treat it
>> polymorphically, just calling its interface methods. In only one
>> case is there an instanceof check made on an Entry, dispatching
>> to different methods depending on which subclass is present.
>>
>> Is this a good sealed type? I think not, really. There are two
>> implementations now, but perhaps there will be a GroupEntry
>> someday. Existing clients should continue to work in that case:
>> the polymorphic Entry interface provides everything clients are
>> “intended” to know.
>>
>> Another candidate for sealing:
>>
>> public interface Request {/* Empty */}
>> public record RequestById(int id) implements Request;
>> public record RequestByValue(String owner, boolean historic)
>> implements Request;
>>
>> public class RequestFetcher {
>> public List<Item> fetch(Iterable<Request> requests) {
>> List<RequestById> idReqs = Lists.newArrayList();
>> List<RequestByValue> valueReqs = Lists.newArrayList();
>> List<Query> queries = Lists.newArrayList();
>> for (Request req : requests) {
>> if (req instanceof RequestById) {
>> idReqs.add((RequestById)req);
>> } else if (req instanceof RequestByValue) {
>> valueReqs.add((RequestByValue)req);
>> }
>> }
>> queries.addAll(prepareIdQueries(idReqs));
>> queries.addAll(prepareValueQueries(valueReqs));
>> return runQueries(queries);
>> }
>> }
>>
>> Interestingly, since the Request interface is empty, the only way
>> to do anything with this class is to cast it to one
>> implementation type. In fact, the RequestFetcher I include here
>> is the only usage of either of these classes (plus, of course,
>> helpers like prepareIdQueries).
>>
>> So, clients need to know about specific subclasses, and want to
>> be sure they’re doing exhaustive pattern-matching. Seems like a
>> great sealed class to me. Except...actually each of the two
>> subclasses has been extended by a decorator adding a source[4]:
>>
>> public record SourcedRequestById(Source source) extends RequestById;
>> public record SourcedRequestByValue(Source source) extends
>> RequestByValue;
>>
>> Does this argue in favor of sealing, or against? I don’t really
>> know. The owners of Request clearly intended for all four of
>> these subclasses to exist (they’re in the same package), so they
>> could include them all in the permitted subtype list, but it
>> seems like a confusing API to expose to clients.
>>
>> A third candidate for sealing is another simple sum type:
>>
>> public interface ValueOrAggregatorException<T> {
>> T get();
>> public static <T> ValueOrAggregatorException<T>
>> of(T value) {
>> return new OfValue<T>(value);
>> }
>> public static <T> ValueOrAggregatorException<T>
>> ofException(AggregatorException err) {
>> return new OfException<T>(err);
>> }
>> private record OfValue<T>(T value)
>> implements ValueOrAggregatorException<T> {
>> @Override T get() {return value;}
>> }
>> private record OfException<T>(AggregatorException err)
>> implements ValueOrAggregatorException<T> {
>> @Override T get() {throw err;}
>> }
>> }
>>
>> It has only two subtypes, and it seems unimaginable there could
>> ever be a third, so why not seal it? However, the subtypes are
>> intentionally hidden: it is undesirable to let people see whether
>> there’s an exception, except by having it thrown at you. In fact
>> AggregatorException is documented as “plugins may throw this, but
>> should never catch it”: there is some higher-level thing
>> responsible for catching all such exceptions. So, this type gains
>> no benefit from exhaustiveness checks in pattern-matching. The
>> type is intended to be used polymorphically, through its
>> interface method, even though its private implementation is
>> amenable to sealing.
>> ________________
>> [1] Throughout this document I will use record syntax as if it
>> were already in the language. This is merely for brevity, and to
>> avoid making the reader spend a lot of time reading code that
>> boils down to just storing a couple fields. In practice, of
>> course the code in Google’s codebase either defines the records
>> by hand, or uses an @AutoValue.
>> [2] Recall that @AutoValue, Google’s “record”, allows extending a
>> class, which is semantically okay here: DbResult has no state,
>> only behavior.
>> [3]This data is imperfect. While the Google codebase strongly
>> discourages having more than one version of a module checked in,
>> there is still some amount of “vendoring” or checking in multiple
>> versions of some package, e.g. for supporting external clients of
>> an old version of an API. As a result, two “different files”
>> which are really copies of each other may implement interfaces
>> with the same fully-qualified name; I did not attempt to control
>> for this case, and so such cases may look like they were in the
>> same file, or not.
>> [4] Of course in the record proposal it is illegal to extend
>> records like this; in real life these plain data carriers are
>> implemented by hand as ordinary classes, so the subtyping is legal.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20190429/09cba626/attachment-0001.html>
More information about the amber-spec-experts
mailing list