Fwd: Record pattern and side effects
Brian Goetz
brian.goetz at oracle.com
Tue Apr 19 19:29:28 UTC 2022
This came in on the amber-spec-comments list, but it's a useful
discussion to bring here.
> While it's pretty easy to say that record deconstruction should never have
> side effects (or generally be stateful beyond the record), would you also
> extend that to all custom patterns?
Yes, though not all side effects are created equal.
Pattern matching is about fusing asking a question with conditional
extraction, in a way that is composable (so that patterns can be
composed, just as method calls can be composed). Let me address
exceptions separately from "ordinary" side effects, but the answer is
mostly the same for both.
First, let's get exceptions out of the way. Pattern matching is about
asking a question, like "If I cast you to Foo, would you throw?"
Having the "if I did this, would you throw" answer by throwing is not
... helpful. The whole point of pattern matching is that you can easily
express "is it this? is it that?" logic; if "is it this" prevented you
from asking "is it that", that would be rude.
This doesn't mean that language constructs that use pattern matching
can't fail. If a switch is supposed to be exhaustive but somehow is not
(e.g., because of separate compilation), the process of trying to match
each of the N cases should not throw; but if we get to the end and none
of the cases has matched, then the _switch_ itself is entitled to throw.
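As a concrete sketch (Shape, Circle, and Square are made-up names, and
the exact exception thrown on the fallback path has varied across
previews):

```
sealed interface Shape permits Circle, Square {}
record Circle(double radius) implements Shape {}
record Square(double side) implements Shape {}

static double area(Shape s) {
    return switch (s) {
        case Circle(var r) -> Math.PI * r * r;
        case Square(var d) -> d * d;
        // No default needed: the switch is exhaustive today. If a new Shape
        // subtype appears later without this code being recompiled, neither
        // case above throws while being tried; it is the synthetic fallback
        // the compiler inserts that throws.
    };
}
```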
> It seems to me that stateful, effectful
> patterns could be useful if explicit enough.
Yes, but mere usefulness is not the measure of whether a language
feature is wise, or even beneficial as a whole. We routinely give up
flexibility in order to obtain global benefits. The real question is,
is the language better with the incremental flexibility, or worse? Very
often, the answer is worse, because it undermines safety properties or
optimizations for comparatively little benefit.
> The behaviour I would expect here is essentially "mimicking an equivalent
> if/else chain", ensuring that I can always refactor between a switch and
> ifs without new behaviour, always evaluating top-to-bottom left-to-right.
> But that's also bad for the majority of patterns that are expected to be
> pure.
This is indeed one of the tradeoffs. Let's imagine 99% of patterns are
pure; I think in a sensible world, the number is much higher. But let's
imagine further that we constrained execution to work as you describe,
which forces O(n) dispatch, even though many switches could be executed
in O(1) without this constraint. This is terrible; for the sake of a
tiny fraction of (questionable) patterns, we cripple the performance of
every switch. That seems a manifestly bad tradeoff.
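For instance (the Json* names are invented), because every pattern in
this switch is pure, nothing observable depends on trying the cases in
source order, so the compiler is free to lower it to a direct type
dispatch rather than being forced to walk the cases one by one:

```
sealed interface Json permits JsonNull, JsonBool, JsonNumber, JsonString {}
record JsonNull() implements Json {}
record JsonBool(boolean value) implements Json {}
record JsonNumber(double value) implements Json {}
record JsonString(String value) implements Json {}

static String kind(Json j) {
    return switch (j) {
        case JsonNull()        -> "null";
        case JsonBool(var b)   -> "boolean " + b;
        case JsonNumber(var n) -> "number " + n;
        case JsonString(var s) -> "string " + s;
    };
}
```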
But performance is not the only motivation here. Suppose we have a
pattern which matches if the current second on the clock is odd and, as
its extraction, reads a byte from an InputStream. Now, we have no
clue what
case Foo(var theByte): ...
means; the question has a random answer, and the extraction has not only
a random value, but may affect the result of other computations
(including later pattern matches). Does this really make sense as a
pattern? That seems pretty far outside of any reasonable description of
a pattern to me. A pattern is not just a combination of a predicate
with an arbitrary set of code thunks to produce values; it is asking a
coherent question.
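Written out as an ordinary method, the incoherence is plain
(oddSecondByte is invented for illustration; it is roughly what such a
"pattern" would be hiding):

```
import java.io.IOException;
import java.io.InputStream;
import java.time.LocalTime;
import java.util.Optional;

// The "question" depends on the wall clock and the "extraction" consumes
// input: calling this twice may answer differently, and each call changes
// what later reads of the stream will see.
static Optional<Byte> oddSecondByte(InputStream in) throws IOException {
    if (LocalTime.now().getSecond() % 2 != 0) {
        return Optional.of((byte) in.read());
    }
    return Optional.empty();
}
```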
But you hit upon another reason not to like this: composition. If
patterns are pure, then they are freely composable, and the order in
which we evaluate, say, the subpatterns of a given record pattern
doesn't matter. Not only does this give us the flexibility to enable
optimization, but it seems awful that
case Foo(Bar(var x), Baz(var y)):
would behave differently if `Baz(var y)` were evaluated first. If
there's a data flow dependency, it should be explicit; arbitrary
side-effects undermine composition.
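A minimal sketch of how the problem shows up even with record patterns
(Counted and Pair are invented names; the side effect lives in an
explicitly declared, deliberately impure accessor):

```
import java.util.ArrayList;
import java.util.List;

record Counted(int value) {
    static final List<Integer> reads = new ArrayList<>();
    // Deliberately impure accessor: every destructuring is now observable.
    public int value() { reads.add(value); return value; }
}
record Pair(Counted left, Counted right) {}

static int sum(Object o) {
    return switch (o) {
        // With pure accessors it would not matter whether the left or right
        // subpattern is evaluated first; with impure ones, 'reads' exposes
        // (and depends on) whatever order the compiler happens to choose.
        case Pair(Counted(var x), Counted(var y)) -> x + y;
        default -> 0;
    };
}
```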
I guess it comes down to a philosophical question: is a pattern just an
arbitrary bag of imperative code, with switch as a weird syntax for
invoking it, or is there a higher-level concept here? I think we give
up a lot and gain very little by sticking to the "it's just a weird
syntax for calling a method" interpretation.
> I'd suggest providing an annotation for impure patterns
Every bit of flexibility has costs, both in terms of development
bandwidth and the complexity of the language. In a world where we had
an infinite development budget and users had infinite complexity
tolerance, I guess I could imagine this, but even then I'm skeptical,
because simply putting the indicator at the declaration site of the
pattern doesn't help people read a switch and discover if it has hidden
landmines in it. And having something at the use site would surely be
clutter. The benefit seems really tiny, and the cost seems huge, so I
have a hard time imagining how that could balance.
If you want to provide mutative methods on your APIs, provide them as
methods, and call them on the RHS of the case.
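For example (hypothetical types, echoing the AST example in the quoted
message below), the patterns stay pure and the stateful work lands in
the case bodies:

```
interface Ast {}
sealed interface Ref permits DirectRef, KeyRef {}
record DirectRef(Ast ast) implements Ref {}
record KeyRef(String key) implements Ref {}

static Ast resolve(Ref ref, java.util.Map<String, Ast> cache) {
    return switch (ref) {
        case DirectRef(var ast) -> ast;
        // The stateful step (populating the cache) is an ordinary method
        // call on the RHS, not something smuggled into the pattern.
        case KeyRef(var key)    -> cache.computeIfAbsent(key, k -> parse(k));
    };
}

static Ast parse(String key) { throw new UnsupportedOperationException("hypothetical"); }
```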
(And by the way, there's no such thing as annotations that affect
language semantics. You're asking for a language feature.)
> While it's pretty easy to say that record deconstruction should never have
> side effects (or generally be stateful beyond the record), would you also
> extend that to all custom patterns? It seems to me that stateful, effectful
> patterns could be useful if explicit enough.
>
> Given some IntelliJ-like AST system, we could have code like
>
> ```
> AstElementReference elem = ...;
> switch(elem){
>     case DirectRef(var ast) -> ...
>     case AstCache.cacheOf(var ast) -> ...
>     case Stubs.stubOf(var ast) -> ...
>     case FileIndex.refToUnparsed(var ast) -> ...
>     default -> throw ...
> }
> ```
> where accessing (or creating) an underlying AST element might be stateful,
> and where proper polymorphism may not be appropriate (e.g. I don't control
> these types, or what I'm doing with them is not meaningful for all
> subtypes, or...).
>
> An equivalent if/else chain might look like
>
> ```
> if(elem instanceof DirectReference(var ast)){
>     ...
> }else if(elem.isCache()){
>     var ast = AstCache.get(elem.key());
>     ...
> }else if(elem.isStub()){
>     var ast = Stubs.createMirror(elem.key());
>     ...
> }else if(elem.isFileRef()){
>     var ast = FileIndex.parse(elem.key());
>     ...
> }
> ```
> or something. An enum might be more appropriate for representing types, but
> that's irrelevant to what the patterns are doing; they're moving out the
> "obvious" step of data extraction into the conditional, making it clearer
> what the "actual" logic is, similar to type patterns but more domain
> specific.
>
> In this example, it's clear that side effects are only appropriate on a
> successful match. Stateful failures *may* be required if a stateful pattern
> is nested within another pattern, or guarded by a when clause, though;
>
> ```
> switch(elem){
>     case Stubs.stubOf(AstClass.classAst(var clss))
>         -> ...
>     case Stubs.stubOf(var ast)
>         when ast.isPhysical()
>         -> ...
>     default
>         -> throw new IllegalArgumentException();
> }
> ```
>
> Factoring out a common head would be the "correct"/more efficient behaviour
> in this case, but as pointed out already, it's not possible to do that for
> all duplicate occurrences of a pattern.
>
> The behaviour I would expect here is essentially "mimicking an equivalent
> if/else chain", ensuring that I can always refactor between a switch and
> ifs without new behaviour, always evaluating top-to-bottom left-to-right.
> But that's also bad for the majority of patterns that are expected to be
> pure.
>
> I'd suggest providing an annotation for impure patterns, then, which
> prevents the compiler from optimizing the switch in "unexpected" ways, and
> allows warning when an impure pattern is repeated (in the compiler or by an
> IDE), alongside making it clearly documented and explicit.
>
> If the annotation is not present in a switch, the compiler gets to reorder
> and factor out any part it wants.
>
> For the purposes of JDK 19? record patterns, where the dtor cannot be
> explicitly written out, the annotation would have to be applied to the
> whole type, or a particular accessor.