<div dir="ltr">I think I did a really poor job expressing my thoughts in the first message. I will try to be more clear now, along with some situations I have encountered myself.<div><br></div><div>Assume we have a stateless math expressions parser. For simplicity, let's assume that we split expressions into several types: "computation-heavy", "computation-lightweight", "erroneous" (permits null as input) and "computation-remote" (delegate parsing to another service via some communication protocol), and types can be assigned heuristically (speculatively). For some tasks, we need to perform some pre- and postprocessing around core parsing logic, like result caching, wrapping parsing into virtual thread and registering it in phaser etc., for others - log warning or error, or fail parsing with exception.</div><div><br></div><div>These types could be considered "abstract types", because they are "bins" that real token types fall into, exclusively into one. Lets omit class hierarchies this time, we will be good just under assumption that null value is considered erroneous token, and polynomial considered computation-heavy.</div><div><br></div><div>Now, in parser class, we define following patterns:</div><div><br></div><div><div>static case pattern matchesHeavy(String s) { .. }</div><div>static case pattern
matchesLight(String s) { .. }<br></div><div>static case pattern
matchesErroneous(String s) { .. } // covers null and unresolved</div></div><div>static case pattern matchesRemote(String s) { .. }<br></div><div>static case pattern matchesPolynomial(String s) { .. } // considered heavy<br></div><div>// other patterns that are either subcases of mentioned or aren't for String type...</div><div><br></div><div>Note: last time I have been kind of misled by an example I have found in jdk mails with Integer.parseInt pattern, when effectively what I meant was matching pattern, not parsing.</div><div><br></div><div>Now, we decide to use switch pattern matching when building a syntax tree like so:</div><div><br></div><div>switch (expression /*expression is of type String*/) {</div><div> case matchesHeavy(String heavy) -> ...</div><div> case matchesLight(String heavy) -> ...<br></div><div> case matchesRemote(String heavy) -> ...<br></div><div> case matchesErroneous(String heavy) -> ... // covers null and any unresolved (default brach)<br></div><div>}</div><div><br></div><div>Now, what we see is that effectively, every possible pattern for String type (I make an emphasis here and will return to this later) is covered, including null and default. But, as I understand, the following won't compile.</div><div><br></div><div>As I see, there are two layers for this problem, one of which was not even mentioned by me in the previous letter,</div><div><br></div><div>Layer 1 (not mentioned in first letter): patterns are exhausted ONLY for String type (and types assignable to it if there were such). If we are trying to match Object-typed value, patterns are not exhausted. You were really right to note that there weren't any real sources of exhaustiveness in my initial letter. This leads me to a thought that to declare a finite states list for some type, there will have to be introduced some kind of companion object that lists all possible states for some type, or at least I haven't come up with something better. This kind of state object would be a really complex concept to introduce (complex for users in the first place), however, this can also be a very powerful tool in the right hands.</div><div><br></div><div>Layer 2: even if there are some "patterns object" or any workaround, it will contain matchesPolynomial pattern. This pattern is a more specific version of matchesHeavy pattern, and therefore, covered by it. However, if we leave it as it is now, the compiler will think this branch is not implemented, and, therefore, will demand the default branch in switch, even if it is effectively unreachable. This is what I have tried to propose fix for with this matchesPolynomial extends matchesHeavy syntax. Also, matchesErroneous covers null (that is what i referred to in "covers null" syntax section), and default is covered too (but I don't think matchesErroneous covers default should be a thing). Also, there could be patterns that cover any non-null value (like potential Objects.nonNull), and the example I have provided in the first letter demonstrates that Objects.nonNull and null patterns are also exhausting for any non-primitive object.</div><div><br></div><div>I hope now my thoughts are more clear. Summarizing, what I want to tell by this thoughts is that there are potential to make member patterns much more powerful if there would be a way to a) assert that there are a finite amount of patterns for type and each value falls at least in one and b) declare that one pattern is subcase of another pattern. This is for sure would be complex for understanding of users, but could potentially offer a very powerful instrument in return. I'm sure that you could think of countless ways to apply this in various projects.</div><div><br></div><div>Regards</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Apr 29, 2024 at 3:09 PM Brian Goetz <<a href="mailto:brian.goetz@oracle.com">brian.goetz@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>
<br>
On 4/26/2024 3:00 PM, Olexandr Rotan wrote:<br>
> I read through some messages about member patterns design and <br>
> accumulated some thoughts to share.<br>
><br>
> It happened so that all of my recent projects were linked to parsing <br>
> some data:equations, java statements etc., so I have been on close <br>
> terms with switch statements during tokenization and syntax tree <br>
> construction.<br>
><br>
> It is common practice in parser development to build either type <br>
> hierarchies. For older solutions, it is common to just informally <br>
> "imply" that one type of tokens is a subtype of another and use some <br>
> field like "kind" as a discriminator. The thing in common in both <br>
> approaches, is that one "state" of token or ex[ression could be not <br>
> mutually exclusive to another. Consider following:<br>
><br>
> Token<br>
> | - KeywordToken<br>
> | | - ExtendsToken<br>
> | | - SuperToken<br>
> | | ....<br>
> | - LiteralToken<br>
> | | - NumberToken<br>
> | | - StringToken<br>
> ...<br>
> Also, we have some factory that receives String and returns token:<br>
> class Tokens {<br>
> Token parse(String s) { .. }<br>
> static case pattern parseKeyword(String s) { .. } // hope I <br>
> understood syntax correctly<br>
> static case pattern parseLiteral(String s) { .. }<br>
> static case pattern parseNumberLiteral(String s) { .. }<br>
> }<br>
<br>
I am not sure what semantics you have in mind, but given the name <br>
(`parseXxx`), you might be on very dangerous territory here. If these <br>
parse methods are intended to actually mutate the internal state of the <br>
parser, then declaring them as patterns is going to be pretty wrong. <br>
Patterns are not merely "conditional methods with multiple return". So <br>
its quite possible this whole question rests on a faulty base.<br>
<br>
Your patterns are missing the declaration of what the match candidate <br>
is, so I will assume that these are patterns whose match candidate is <br>
Token. You also don't show whether Token actually has a hierarchy like <br>
you describe, or whether that is only on the whiteboard.<br>
<br>
> Now, somewhere in the processing pipeline, I decided to use member <br>
> pattern matching:<br>
> switch (str) {<br>
> case Tokens.parseLiteral(String literalStr) -> ...<br>
> case Tokens.parseKeyword(String kwdStr) -> ...<br>
> case null -> ...<br>
> }<br>
><br>
> At this point, all possible states are already exhausted. However, If <br>
> i got everything correctly, this won't compile, as the compiler still <br>
> thinks parseNumberLiteral is not exhausted, while effectively it is.<br>
<br>
Why is it effectively exhausted here? What hidden source of <br>
exhaustiveness information is there in your hiearchy? And, could it be <br>
made explicit?<br>
<br>
There are many questions here, but my strong impression here is that you <br>
are using the wrong tool in a few places, and so not surprisingly are at <br>
a dead end.<br>
<br>
One way to make this all explicit would be:<br>
<br>
sealed interface Token { ... }<br>
sealed interface KeywordToken extends Token { ... }<br>
sealed interface LiteralToken extends Token { ... }<br>
// records for each token type<br>
<br>
Now, you would get record patterns for free (case LiteralToken(...)) and <br>
exhaustiveness for free also. And if you were switching only on a <br>
LiteralToken, you would only have to cover the subtypes of LIteralToken.<br>
<br>
> I think such situations could become pretty common. One solution I can <br>
> think of is to add an option to specify a supercase like static case <br>
> pattern parseNumberLiteral(String s) extends parseLiteral { .. }.<br>
<br>
I think we're pretty far from identifying the problem, so probably best <br>
to hold off suggesting solutions at this point.Let's back up and expose <br>
the assumptions here first.<br>
</blockquote></div>