RFR: 8262891: Compiler implementation for Pattern Matching for switch (Preview)

Maurizio Cimadamore mcimadamore at openjdk.java.net
Tue May 11 13:42:50 UTC 2021


On Tue, 4 May 2021 16:41:44 GMT, Jan Lahoda <jlahoda at openjdk.org> wrote:

> This is a preview of a patch implementing JEP 406: Pattern Matching for switch (Preview):
> https://bugs.openjdk.java.net/browse/JDK-8213076
> 
> The current draft of the specification is here:
> http://cr.openjdk.java.net/~gbierman/jep406/jep406-20210430/specs/patterns-switch-jls.html
> 
> A summary of notable parts of the patch:
> -to support cases expressions and patterns in cases, there is a new common superinterface for expressions and patterns, `CaseLabelTree`, which expressions and patterns implement, and a list of case labels is returned from `CaseTree.getLabels()`.
> -to support `case default`, there is an implementation of `CaseLabelTree` that represents it (`DefaultCaseLabelTree`). It is used also to represent the conventional `default` internally and in the newly added methods.
> -in the parser, parenthesized patterns and expressions need to be disambiguated when parsing case labels.
> -Lower has been enhanced to handle `case null` for ordinary (boxed-primitive, String, enum) switches. This is a bit tricky for boxed primitives, as there is no value that is not part of the input domain so that could be used to represent `case null`. Requires a bit shuffling with values.
> -TransPatterns has been enhanced to handle the pattern matching switch. It produces code that delegates to a new bootstrap method, that will classify the input value to the switch and return the case number, to which the switch then jumps. To support guards, the switches (and the bootstrap method) are restartable. The bootstrap method as such is written very simply so far, but could be much more optimized later.
> -nullable type patterns are `case String s, null`/`case null, String s`/`case null: case String s:`/`case String s: case null:`, handling of these required a few tricks in `Attr`, `Flow` and `TransPatterns`.
> 
> The specdiff for the change is here (to be updated):
> http://cr.openjdk.java.net/~jlahoda/8265981/specdiff.preview.01/overview-summary.html

Good work. There's a lot to take in here. I think overall, it holds up well - I like how case labels have been extended to accommodate for patterns. In the front-end, I think there are some questions over the split between Attr and Flow - maybe it is unavoidable, but I'm not sure why some control-flow analysis happens in Attr instead of Flow and I left some questions to make sure I understand what's happening.

In the backend it's mostly good work - but overall the structure of the code, both in Lower and in TransPattern is getting very difficult to follow, given there are many different kind of switches each with its own little twist attached to it. It would be nice, I think (maybe in a separate cleanup?) if the code paths serving the different switch kinds could be made more separate, so that, when reading the code we can worry only about one possible code shape. That would make the code easier to document as well.

src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 100:

> 98:      *                       the {@code NameAndType} of the {@code InvokeDynamic}
> 99:      *                       structure and is stacked automatically by the VM.
> 100:      * @param labels non-null case labels - {@code String} and {@code Integer} constants

Can these types be mixed? E.g. can I pass, as static args: `42`, `Runnable.class`, `"hello"` ? If not, should be document, and throw?

src/java.base/share/classes/java/lang/runtime/SwitchBootstraps.java line 102:

> 100:      * @param labels non-null case labels - {@code String} and {@code Integer} constants
> 101:      *                        and {@code Class} instances
> 102:      * @return the index into {@code labels} of the target value, if the target

Doesn't return an index. Returns a CallSite which, when invoked with an argument of type E (where E is the type of the target expression), returns the index into...

src/jdk.compiler/share/classes/com/sun/source/tree/TreeVisitor.java line 306:

> 304: 
> 305:     /**
> 306:      * Visits an GuardPatternTree node.

Suggestion:

     * Visits a GuardPatternTree node.

src/jdk.compiler/share/classes/com/sun/source/tree/TreeVisitor.java line 316:

> 314: 
> 315:     /**
> 316:      * Visits an AndPatternTree node.

Is this comment correct?

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Attr.java line 1685:

> 1683:                         log.error(DiagnosticFlag.SOURCE_LEVEL, selector.pos(),
> 1684:                                   Feature.PATTERN_SWITCH.error(this.sourceName));
> 1685:                         allowPatternSwitch = true;

I assume this logic is to show only one error in case we're compiling multiple methods with pattern switches and preview features are not enabled? Is this consistent with what happens with other preview features though?

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Attr.java line 1725:

> 1723:                                     log.error(DiagnosticFlag.SOURCE_LEVEL, expr.pos(),
> 1724:                                               Feature.CASE_NULL.error(this.sourceName));
> 1725:                                     allowCaseNull = true;

Same here about error recovery and minimize messages - if this is what we'd like to do, perhaps centralizing the check in Preview would be better (so that e.g. we'd return that a given preview feature is not supported only once).

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Attr.java line 1790:

> 1788:                         if (isTotal) {
> 1789:                             if (hasTotalPattern) {
> 1790:                                 log.error(pat.pos(), Errors.DuplicateTotalPattern);

Hard to explain - but a lot of the code here feels more like it belongs to `Flow` than to `Attr`. E.g. these "duplicateXYZ" labels really want to say "unreachable code", I think. Is there a strong reason as to why all this code shouldn't (apart from code computing bindings) move to Flow? Seems that the logic is partially duplicated anyway...

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Attr.java line 1861:

> 1859:         return null;
> 1860:     }
> 1861:     private Pair<Type, Boolean> primaryType(JCPattern pat) {

Maybe a record here would lead for better code (no boxing of Boolean, but, more importantly, better field names for fst/snd). More generally, now that we have records I think we should think twice before using Pairs :-)

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Flow.java line 757:

> 755:         }
> 756: 
> 757:         private TypeSymbol patternType(JCPattern p) {

If we need to keep duplicated code, this could move to TreeInfo

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Flow.java line 757:

> 755:         }
> 756: 
> 757:         private TypeSymbol patternType(JCPattern p) {

If we need to keep duplicated code, this could move to TreeInfo

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Flow.java line 2917:

> 2915:         void reportEffectivelyFinalError(DiagnosticPosition pos, Symbol sym) {
> 2916:             String subKey = switch (currentTree.getTag()) {
> 2917:                 case LAMBDA -> "lambda";

Can we make the switch return the fragment constant (e.g. Fragments.XYZ) rather than a String which is then re-wrapped?

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Lower.java line 3649:

> 3647:         if (!enumSwitch && !stringSwitch && !selector.type.isPrimitive() &&
> 3648:             cases.stream().anyMatch(c -> TreeInfo.isNull(c.labels.head))) {
> 3649:             //a switch over a boxed primitive, with a null case. Pick two constants that are

Is this comment correct? If we're here, can't this be just any pattern switch?

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/Lower.java line 3729:

> 3727:         if (cases.stream().anyMatch(c -> TreeInfo.isNull(c.labels.head))) {
> 3728:             //for enum switches with case null, do:
> 3729:             //switch ($selector != null ? $mapVar[$selector.ordinal()] : -1) {...}

Very  clever trick to handle away null treatment with just an extra default label. I wonder if this logic isn't big enough to deserve its own `visitXYZSwitch` method, as we do for strings and enums in switch. It seems asymmetric that we do some bits of null handling here (in certain cases) but then enum switches will deal with null in a different way. We should probably test what kind of switch we're dealing with earlier, and then branch out to completely disjoint pieces of code, each with its own null handling

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/MatchBindingsComputer.java line 124:

> 122:         // e.T = union(x.T, y.T)
> 123:         // e.F = intersection(x.F, y.F) (error recovery)
> 124:         List<BindingSymbol> bindingsWhenTrue =

Is this duplicate code for MatchBindingComputer::binary(AND) ? If so, should at least the impl delegate to that?

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/TransPatterns.java line 193:

> 191:                     : tree.expr.type;
> 192:             VarSymbol prevCurrentValue = currentValue;
> 193:             try {

IIUC, the changes here are to handle recursion of pattern in the instanceof case - e.g. `x instanceof (Foo foo)`. If so, it would be nice to have that reflected in the comment above.

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/TransPatterns.java line 305:

> 303:                 selector = make.Ident(temp);
> 304:             } else {
> 305:                 List<Type> staticArgTypes = List.of(syms.methodHandleLookupType,

I guess strategy here is to convert the various labels into loadable constants and then pass them up to the bootstrap method which will give us back an int, which we'll use for a plain switch. If so, all this needs to be documented somehow.

src/jdk.compiler/share/classes/com/sun/tools/javac/comp/TransPatterns.java line 368:

> 366:                         JCContinue continueSwitch = make.Continue(null);
> 367:                         continueSwitch.target = tree;
> 368:                         c.stats = c.stats.prepend(make.If(makeUnary(Tag.NOT, test).setType(syms.booleanType),

I assume this is code which "resumes" to the following `case` if the pattern doesn't match. I guess in most cases the bootstrap method would do the check anyway - but I suppose that with guards, the bootstrap method, alone, cannot guarantee the match. Also, it seems like this requires backend support for `continue` in switch. Again, all this needs to be better documented, it's pretty hard to infer all this by looking at the code.

src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavacParser.java line 776:

> 774:         } else {
> 775:             JCPattern pattern;
> 776:             JCExpression e = parsedType == null ? term(EXPR | TYPE | NOLAMBDA/* | NOINVOCATION*/) : parsedType;

what's up with NO_INVOCATION?

src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavacParser.java line 3004:

> 3002:                 JCCaseLabel label = parseCaseLabel();
> 3003: 
> 3004:                 //TODO: final

What does this `todo` refers to?

src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavacParser.java line 3078:

> 3076:                 }
> 3077:                 Token twoBack;
> 3078:                 boolean pattern = S.token(lookahead - 1).kind == IDENTIFIER && ((twoBack = S.token(lookahead - 2)).kind == IDENTIFIER || twoBack.kind == GT || twoBack.kind == GTGT || twoBack.kind == GTGTGT);

tricky - but I think correct.

src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavacParser.java line 3085:

> 3083:                 }
> 3084:             }
> 3085:             //XXX: modifiers!

Another todo here

src/jdk.compiler/share/classes/com/sun/tools/javac/resources/compiler.properties line 501:

> 499: 
> 500: compiler.err.pattern.dominated=\
> 501:     this pattern is dominated by a preceding pattern

Not sure whether the concept of "dominance" is the one that will work best here. As I said elsewhere, this is, morally, unreachable code.

src/jdk.compiler/share/classes/com/sun/tools/javac/resources/compiler.properties line 520:

> 518: 
> 519: compiler.err.multiple.patterns=\
> 520:     multiple pattern declarations

This text is very terse and doesn't really say what's wrong with the code. I think the point here is that we don't want code like this:


case String s, Integer i: ...

because this is morally:


case String s:
case Integer i: ...

which is, again, fall-through to a pattern. Maybe, just reusing the same "fall-through" message would be ok here?

-------------

PR: https://git.openjdk.java.net/jdk/pull/3863


More information about the core-libs-dev mailing list