Fwd: Feedback on the current state of switch pattern matching and record patterns
Brian Goetz
brian.goetz at oracle.com
Fri Jun 24 19:56:30 UTC 2022
Received on the -comments list.
Summary:
- Question regarding the asymmetry of true vs false boolean constants.
- Questions regarding matching of nulls in switch (fundamental
confusion: whether `case null, String s` is one thing or two)
- DA treatment of assignments inside guards
- questionable boxing in record patterns
-------- Forwarded Message --------
Subject: Feedback on the current state of switch pattern matching and
record patterns
Date: Sun, 19 Jun 2022 03:30:24 +0000
From: Robbe Pincket <robbepincket at live.be>
To: amber-spec-comments at openjdk.org <amber-spec-comments at openjdk.org>
Hello amber project experts.
I’ve been quite interested in recent java enhancement projects, and
while checking
out some of the details of both switch pattern matching and record patterns,
I’ve noticed some things I want to address.
Looking around, this seemed to be the latest version of the spec, so
I’ll be basing
my questions off of this:
https://cr.openjdk.java.net/~gbierman/jep427%2B405/jep427%2B405-20220601/specs/patterns-switch-record-patterns-jls.html
<https://cr.openjdk.java.net/~gbierman/jep427%2b405/jep427%2b405-20220601/specs/patterns-switch-record-patterns-jls.html>
Switch pattern matching
Switch syntax
The syntax section shows that a |CaseElement| can be |Pattern {Guard}|.
This would imply that there can be multiple |when| guards on a single
pattern
Constant boolean expressions in the guard expression
The specifications mention that unlike with most guards, if the
“expression has
the value true”, the dominance checking treats it just as if it were
an unguarded pattern case.
This raised two concerns:
* What does ‘has the value true’ mean? In my eyes, that can mean quite
a few things:
o The boolean literal ‘true’
o Any constant expression whose value is true
o Any boolean expression that always evaluates to true
* This treatment has no equivalent with guards that are false.
Considering guards that “have the value true” as a branch that is
always taken feels akin to how constant expressions are handled in
while loops, whilst |while (false)| always fails to compile as it
results in the body to be unreachable. However, when the guard is
false, the resultant behavior mirrors a traditional if, where it
considers the body to still be reachable, even though they can never
get executed.
Null labels
I wasn’t sure whether this was an issue at first, but after looking at
the spec again and realizing I misinterpreted part of the spec, I’m very
unsure about whether the compiler is technically following the spec or
not; nonetheless, it’s very counterintuitive to me. (This section is a
bit more ‘ranty’ than I’d like, but it’s getting late, and I keep
discovering new weird unexpected cases while typing)
Let’s look at some examples.
// example 1||
switch(|o|)||{||
||case||null,||String|s when s|.length()||==||0||->||{||
||System.|out|.println("o: "||+|o|);||
||}||
||...||
}||
I was very surprised when this gave me a npe when getting the length,
as I assumed that the guard would only be evaluated when the pattern
matched,
and didn’t expect the null to fall through to the guard.
So I tested the following:
// example 2||
switch(|o|)||{||
||case||String|s when s|.length()||==||0,||null||->||{||
||System.|out|.println("o: "||+|o|);||
||}||
||...||
}||
This still throws an npe. This where I decided to check the spec in
depth again.
If the resulting transformed case label has both a null case element
and a pattern
case element p where p is a pattern declaring a pattern variable x
of type U,
then p is replaced with an any pattern that declares x of type U
(14.30.1).
This explains the second case, even though the |null| label appears after.
Let’s use |String? s| as a notation for an any pattern. The way I see
the translation of the second example is like this:
switch(|o|)||{||
||case||String?|s when s|.length()||==||0:||
||case||null:||{||
||System.|out|.println("o: "||+|o|);||
||break;||
||}||
||...||
}||
I’m keeping the |case null| here cause the spec doesn’t mention this
getting removed, which seems like a mistake to me cause it implies that
the first example gets
translated as this:
switch(|o|)||{||
||case||null:||
||case||String?|s when s|.length()||==||0:||
||{||
||System.|out|.println("o: "||+|o|);||
||break;||
||}||
||...||
}||
where |case null| would match null first and therefore |String?
s| wouldn’t match the null.
These 2 translations bring up the question: what if there are 2 case
labels, one of
which is null?
// Example 3||
switch(|o|)||{||
||case||String|s when s|.length()||==||0:||
||case||null:||{||
||System.|out|.println("o: "||+|o|);||
||break;||
||}||
||...||
}||
Example 3 still throws an NPE when given null. This, just blew my mind,
and I think this behavior is even wrong according to the spec. The spec
mentions that if a *case label* contains both a null case label element
and a pattern case element, the pattern
gets converted to a any pattern. However in this example there are 2
separate *case labels* (but only one switch label). Note that adding a
single |;|
after the first |:| results in a variant that does work as expected,
most likely
cause they are now 2 separate *switch labels*.
One last example:
static||void||example4(Object|a|,||Object|b|)||{||
||switch(|a|)||{||
||case||null,||String|s when |Objects.equal(|s|,|b|)||->||{||
||System.|out|.println("a & b: "||+|s|);||
||}||
||case||null||->||{||
||System.|out|.println("a is NULL, but b isn't");||
||}||
||default||->||{||
||System.|out|.println("Objects: "||+|a |+||", "||+|b|);||
||}||
||}||
}||
This fails to compile, citing a “duplicate case label”, however I can’t
find anything in the spec that disallows 2 null cases in the same
switch, even without the guard. The closest thing I see is:
A switch label that supports a case constant dominates another
switch label supporting the same case constant.
|null|, however, is not considered a case constant - as the guard is
acting on the null case, it means the second null case shouldn’t be
dominated by the first one. Perhpas should be a rule that doesn’t allow
this, but nothing in the current spec should disallow this, even though
javac does, in fact, disallow it.
Definite assignment
It seems the part on definite assignment doesn’t take into account the
possible assignments in the guard of switch cases.
I thought this was gonna be a small thing to fix as javac seemed to
behave as expected despite the specification, but then I noticed
something else and ended up writing a some code where I was able to
reassign a final var.
Issue 1:
* V is [un]assigned before the switch rule expression, switch rule
block, or switch rule throw statement introduced by a switch
rule in the switch block iff V is [un]assigned after the
selector expression of the switch statement.
* V is [un]assigned before the first block statement of a switch
labeled statement group in the switch block iff both of the
following are true:
o V is [un]assigned after the selector expression of the
switch statement.
o …
If the case is guarded, this shouldn’t be checking the assignment status
after the selector expression, but after the guard expression when true.
It seems javac already does this.
Issue 2:
The naive assumption, which javac seems to be using, is that a variable
V is [un]assigned before a guard expression iff it is [un]assigned after
the selector expression. This however is not correct. See the following
snippet:
public||class|test |{||
||static||void||test(Object|o|)||{||
||final||String|res|;||
||switch(|o|)||{||
||case||String|s when |(|res |=|s|)||!=||null||&&||log(|res|)||&&|s|.length()|| >||3||->||{||
||System.|out|.println("res: "||+|res|);||
||}||
||case||String|i when |(|res |=||"???")||!=||null||&&||log(|res|)||->||{||
||System.|out|.println("res: "||+|res|);||
||}||
||default||->||{||
||System.|out|.println("Object: "||+|o|);||
||}||
||}||
||}||
||
||static||boolean||log(String|res|)||{||
||System.|out|.println("res: "||+|res|);||
||return||true;||
||}||
||
||public||static||void||main(String[]|args|){||
||System.|out|.println("Trial 1: ");||
||test("Hello");||
||System.|out|.println("Trial 2: ");||
||test("Huh");||
||}||
}||
Which outputs:
|Trial 1:|
|res: Hello|
|res: Hello|
|Trial 2:|
|res: Huh|
|res: ???|
|res: ???|
This code compiles, even though it is reassigning a final var.
A correct definition for definite (un)assignment in this case would have
to take all the pattern cases that could have matched before it into
account, and the maybe also the fact that (some of) those cases could
cover all possible pathways to this label, like:
case||A|a when |(|res |=||"Hi")||==||null||->||{}||
case||B|b when |(|res |=||"Hello")||==||null||->||{}||
case||SealedAOrB||->||{||
||// res is now definitely assigned?||
}||
Record Patterns
Integer components.
It seems that when accessing |int| fields, it boxes them, to then
immediately unbox them again.
Example
switch(|o|)||{||
||case||Point(var|x|,||var|y|)||->||System.|out|.println(|x |+|y|);||
||...||
compiles into something like:
|var1 |=|o|
switch(|indy typeswitch|)||{||
||case||0||->||{||
||int|var5 |=|$proxy$|x((Point)|var1|);||
||int|var3 |=||Integer.valueOf(|var5|).intValue();||
| var5 |=|$proxy$|y((Point)|var1|);||
||int|var4 |=||Integer.valueOf(|var5|).intValue();||
||
||System.|out|.println(|var3 |+|var4|)||
||}||
I gave a quick look to the code in |TransPattern|, but couldn’t actually
find anything about it. Seems like something that gets triggered by
accident?
Hope all of this helps with improving the language!
Greetings
Robbe Pincket
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-spec-experts/attachments/20220624/0c5e1181/attachment-0001.htm>
More information about the amber-spec-experts
mailing list