Array patterns (and varargs patterns)
Brian Goetz
brian.goetz at oracle.com
Wed Sep 7 22:15:04 UTC 2022
> For me, Arrays.of() is a named pattern with a vararg list of bindings, no ?
It's a named pattern, but to work, it would need varargs patterns -- and
array patterns are the underpinnings of varargs, just as array creation
is the underpinning of varargs invocation. We're not going to do
varargs patterns differently than we do varargs invocation, just to
avoid doing array patterns -- that would be silly.
> I see also other reasons to not specify the array pattern now,
> - records with varargs are quite rare, so people are not desperately in need of the corresponding pattern,
> - the deconstruction of collections / map pattern is also a dependency of the array pattern. It would be sad if the array pattern and the List pattern did not have the same way to specify the length/size (specifying the length of the array pattern inside the [] seems too ad hoc, but maybe I'm wrong).
As I've said, the fact that people are not desperate for this yet
(though obviously you and Tagir want varargs patterns, so there is some
demand for it) is not the primary reason to do this now. The symmetry
between aggregation and deconstruction is very, very, very important to
people understanding properly how pattern matching fits into the
language. I am trying to button up the sources of asymmetry in the
patterns we have before moving on to cool new patterns. Otherwise we
leave a trail of accidental complexity behind us, where certain things
are reversible and others are not, for no apparent reason. (Primitives
in type patterns are in this category too, and we'll be returning to
them very soon.)
So I'm not going to hold up the discussion of named patterns for array
patterns (I'm working on a document for named patterns too, but it's much
longer), but I'm also not going to hold up array patterns until we get
named patterns done either. I want to close up the holes in what we've
already built before laying the next layer.
(As to List and Map patterns, these will have to be co-designed with
List and Map literals, which will likely require some additional
groundwork. They're a ways away; we're building a tower layer by layer.)
>
> Rémi
>
>>> With best regards,
>>> Tagir Valeev.
>>>
>>> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>>>> We dropped this out of the record patterns JEP, but I think it is time to
>>>> revisit this.
>>>>
>>>> The concept of array patterns was pretty straightforward; they mimic the nesting
>>>> and exhaustiveness rules of record patterns, they are just a different sort of
>>>> container for nested patterns. And they have an obvious duality with array
>>>> creation expressions.
>>>>
>>>> The main open question here was how we distinguish between "match an array of
>>>> length exactly N" (where there are N nested patterns) and "match an array of
>>>> length at least N". We toyed with the idea of a "..." indicator to mean "more
>>>> elements", but this felt a little forced and opened new questions.
>>>>
>>>> It later occurred to me that there is another place to nest a pattern in an
>>>> array pattern -- to match (and bind) the length. In the following, assume for
>>>> sake of exposition that "_" is the "any" pattern (matches everything, binds
>>>> nothing) and that we have some way to denote a constant pattern, which I'll
>>>> denote here with a constant literal.
>>>>
>>>> There is an obvious place to put this (optional) pattern: in between the
>>>> brackets. So:
>>>>
>>>>     case String[1] { P }:
>>>>                 ^ a constant pattern
>>>>
>>>> would match string arrays of length 1 whose sole element matches P. And
>>>>
>>>> case String[] { P, Q }
>>>>
>>>> would match string arrays of length exactly 2, whose first two elements match P
>>>> and Q respectively. (If the length pattern is not specified, we infer a
>>>> constant pattern whose constant is equal to the length of the nested pattern
>>>> list.)
>>>>
>>>> Matching a target to `String[L] { P0, .., Pn }` means
>>>>
>>>>     x instanceof String[] arr
>>>>         && arr.length matches L
>>>>         && arr.length >= n + 1
>>>>         && arr[0] matches P0
>>>>         && arr[1] matches P1
>>>>         ...
>>>>         && arr[n] matches Pn
>>>>
>>>> More examples:
>>>>
>>>> case String[int len] { P }
>>>>
>>>> would match string arrays of length >= 1 whose first element matches P, and
>>>> further binds the array length to `len`.
>>>>
>>>> case String[_] { P, Q }
>>>>
>>>> would match string arrays of any length whose first two elements match P and Q.
>>>>
>>>>     case String[3] { }
>>>>                 ^ constant pattern
>>>>
>>>> matches all string arrays of length 3.
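
The matching rule and the examples quoted above can be emulated in today's Java. What follows is only a minimal sketch: array patterns are not an existing Java feature, and the names `ArrayPatternSketch`, `matches`, and `matchesExact` are hypothetical. The length pattern `L` is modeled as an `IntPredicate`, and each nested pattern as a `Predicate<String>` that tests without binding.

```java
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.Predicate;

// Hypothetical helper that emulates the proposed array-pattern semantics.
final class ArrayPatternSketch {

    // Emulates `String[L] { P0, ..., Pk }`: lengthPattern stands in for L,
    // elementPatterns for the nested patterns (tests only, no bindings).
    static boolean matches(Object target, IntPredicate lengthPattern,
                           List<Predicate<String>> elementPatterns) {
        if (!(target instanceof String[])) {
            return false;                       // x instanceof String[] arr
        }
        String[] arr = (String[]) target;
        if (!lengthPattern.test(arr.length)) {
            return false;                       // arr.length matches L
        }
        if (arr.length < elementPatterns.size()) {
            return false;                       // enough elements for every nested pattern
        }
        for (int i = 0; i < elementPatterns.size(); i++) {
            if (!elementPatterns.get(i).test(arr[i])) {
                return false;                   // arr[i] matches Pi
            }
        }
        return true;
    }

    // Emulates omitting the length pattern: the inferred constant is the
    // number of nested patterns, so the match is exact.
    static boolean matchesExact(Object target, List<Predicate<String>> elementPatterns) {
        int n = elementPatterns.size();
        return matches(target, len -> len == n, elementPatterns);
    }

    public static void main(String[] args) {
        Predicate<String> any = s -> true;
        Predicate<String> isA = "a"::equals;
        String[] arr = {"a", "b", "c"};

        System.out.println(matchesExact(arr, List.of(isA, any)));          // String[] { P, Q }
        System.out.println(matches(arr, len -> true, List.of(isA, any)));  // String[_] { P, Q }
        System.out.println(matches(arr, len -> len == 3, List.of()));      // String[3] { }
    }
}
```

With a three-element array, the exact form fails on length while the `_` form succeeds as a prefix match, which is the distinction the length pattern is meant to express.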
>>>>
>>>>
>>>> This is a more principled way to do it, because the length is a part of the
>>>> array and deserves a chance to match via nested patterns, just as with the
>>>> elements, and it avoids trying to give "..." a new meaning.
>>>>
>>>> The downside is that it might be confusing at first (though people will learn
>>>> quickly enough) how to distinguish between an exact match and a prefix match.
>>>>
>>>>
>>>>
>>>>
>>>> On 1/5/2021 1:48 PM, Brian Goetz wrote:
>>>>
>>>> As we get into the next round of pattern matching, I'd like to opportunistically
>>>> attach another sub-feature: array patterns. (This also bears on the question
>>>> of "how would varargs patterns work", which I'll address below, though they
>>>> might come later.)
>>>>
>>>> ## Array Patterns
>>>>
>>>> If we want to create a new array, we do so with an array construction
>>>> expression:
>>>>
>>>> new String[] { "a", "b" }
>>>>
>>>> Since each form of aggregation should have its dual in destructuring, the
>>>> natural way to represent an array pattern (h/t to AlanM for suggesting this)
>>>> is:
>>>>
>>>> if (arr instanceof String[] { var a, var b }) { ... }
>>>>
>>>> Here, the applicability test is: "are you an instance of String[], with length
>>>> == 2", and if so, we cast to String[], extract the two elements, and match them
>>>> to the nested patterns `var a` and `var b`. This is the natural analogue of
>>>> deconstruction patterns for arrays, complete with nesting.
>>>>
>>>> Since an array can have more elements, we likely need a way to say "length >= 2"
>>>> rather than simply "length == 2". There are multiple syntactic ways to get
>>>> there; for now I'm going to write
>>>>
>>>> if (arr instanceof String[] { var a, var b, ... })
>>>>
>>>> to indicate "more". The "..." matches zero or more elements and binds nothing.
>>>>
>>>> <digression>
>>>> People are immediately going to ask "can I bind something to the remainder"; I
>>>> think this is mostly an "attractive distraction", and would prefer to not have
>>>> this dominate the discussion.
>>>> </digression>
>>>>
>>>> Here's an example from the JDK that could use this effectively:
>>>>
>>>>     String[] limits = limitString.split(":");
>>>>     try {
>>>>         switch (limits.length) {
>>>>             case 2: {
>>>>                 if (!limits[1].equals("*"))
>>>>                     setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>>>>             }
>>>>             case 1: {
>>>>                 if (!limits[0].equals("*"))
>>>>                     setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>>>>             }
>>>>         }
>>>>     }
>>>>     catch(NumberFormatException ex) {
>>>>         setMultilineLimit(MultilineLimit.DEPTH, -1);
>>>>         setMultilineLimit(MultilineLimit.LENGTH, -1);
>>>>     }
>>>>
>>>> becomes (eventually)
>>>>
>>>>     switch (limitString.split(":")) {
>>>>         case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>>>>         case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>>>>         default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>>>>     }
>>>>
>>>> Note how not only does this become more compact, but the unchecked
>>>> "NumberFormatException" is folded into the match, rather than being a separate
>>>> concern.
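
To illustrate what "folded into the match" could mean, here is a rough sketch in current Java. It assumes that a partial pattern such as `Integer.parseInt(var i)` behaves like a function that either yields a binding or reports no match. The names `ParsePatternSketch`, `tryParseInt`, and `classify` are hypothetical, and the method returns a description string instead of calling `setMultilineLimit`. The "*" handling of the original JDK code is not reproduced, as in the rewritten switch above.

```java
import java.util.OptionalInt;

// Hypothetical emulation of the pattern-based switch quoted above.
final class ParsePatternSketch {

    // Stand-in for a partial pattern: empty means "does not match".
    static OptionalInt tryParseInt(String s) {
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    static String classify(String limitString) {
        String[] parts = limitString.split(":");

        // case String[] { var _, Integer.parseInt(var i) }
        if (parts.length == 2) {
            OptionalInt i = tryParseInt(parts[1]);
            if (i.isPresent()) {
                return "DEPTH=" + i.getAsInt();
            }
        }
        // case String[] { Integer.parseInt(var i) }
        if (parts.length == 1) {
            OptionalInt i = tryParseInt(parts[0]);
            if (i.isPresent()) {
                return "LENGTH=" + i.getAsInt();
            }
        }
        // default: a failed parse simply falls through to here
        return "DEFAULT";
    }

    public static void main(String[] args) {
        System.out.println(classify("10:20"));
        System.out.println(classify("10"));
        System.out.println(classify("10:oops"));
    }
}
```

A parse failure is treated as an ordinary non-match, so the caller never sees a `NumberFormatException`; the next case is tried instead.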
>>>>
>>>>
>>>> ## Varargs patterns
>>>>
>>>> Having array patterns offers us a natural way to interpret deconstruction
>>>> patterns for varargs records. Assume we have:
>>>>
>>>> void m(X... xs) { }
>>>>
>>>> Then a varargs invocation
>>>>
>>>> m(a, b, c)
>>>>
>>>> is really sugar for
>>>>
>>>> m(new X[] { a, b, c })
>>>>
>>>> So the dual of a varargs invocation, a varargs match, is really a match to an
>>>> array pattern. So for a record
>>>>
>>>> record R(X... xs) { }
>>>>
>>>> a varargs match:
>>>>
>>>> case R(var a, var b, var c):
>>>>
>>>> is really sugar for an array match:
>>>>
>>>> case R(X[] { var a, var b, var c }):
>>>>
>>>> And similarly, we can use our "more arity" indicator:
>>>>
>>>> case R(var a, var b, var c, ...):
>>>>
>>>> to indicate that there are at least three elements.
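
The varargs duality described above can be approximated with features Java already has (records and `instanceof` type patterns). This is only a sketch of the desugaring, not the proposed syntax: `VarargsPatternSketch`, `describeExact`, and `describeAtLeast` are hypothetical names, and `String` stands in for the element type `X`.

```java
// Hypothetical emulation of varargs patterns via an explicit array check.
final class VarargsPatternSketch {

    // A varargs record: new R("x", "y") is sugar for new R(new String[] { "x", "y" }).
    record R(String... xs) { }

    // Emulates `case R(var a, var b, var c)`,
    // i.e. `case R(String[] { var a, var b, var c })`: exactly three elements.
    static String describeExact(Object o) {
        if (o instanceof R r && r.xs() != null && r.xs().length == 3) {
            String a = r.xs()[0], b = r.xs()[1], c = r.xs()[2];
            return a + "," + b + "," + c;
        }
        return "no match";
    }

    // Emulates `case R(var a, var b, var c, ...)`: at least three elements.
    static String describeAtLeast(Object o) {
        if (o instanceof R r && r.xs() != null && r.xs().length >= 3) {
            String a = r.xs()[0], b = r.xs()[1], c = r.xs()[2];
            return a + "," + b + "," + c;
        }
        return "no match";
    }

    public static void main(String[] args) {
        Object three = new R("x", "y", "z");
        Object four = new R(new String[] { "x", "y", "z", "w" });
        System.out.println(describeExact(three));
        System.out.println(describeExact(four));
        System.out.println(describeAtLeast(four));
    }
}
```

Both construction forms produce a record holding an array, which is why the match side reduces to an array-length test plus element extraction.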
>>>>
>>>>