Array patterns (and varargs patterns)
Remi Forax
forax at univ-mlv.fr
Wed Sep 7 22:01:04 UTC 2022
----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Tagir Valeev" <amaembo at gmail.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, September 7, 2022 4:32:34 PM
> Subject: Re: Array patterns (and varargs patterns)
> I understand where this sentiment comes from. But the motivation is
> somewhat more indirect than "people are falling over themselves to
> deconstruct arrays today".
>
> Because deconstruction is the dual of aggregation, it is desirable for
> each of the forms of aggregation -- constructors, factories, etc -- to
> have pattern counterparts. Not doing so creates asymmetries that make
> the whole thing seem more ad-hoc. Many of the "not as important"
> pattern features we're working on now are in the realm of "completing"
> the feature.
>
> More importantly, array patterns are how we fully support varargs in
> records. If we have a varargs record:
>
> record VA(String... strings) { }
>
> we can construct it with a varargs invocation
>
> new VA("a", "b")
>
> which is sugar for
>
> new VA(new String[] { "a", "b" })
>
> But we cannot yet deconstruct it with:
>
> case VA(var a, var b)
>
> and analogously, for a varargs record, the above is sugar for
>
> case VA(String[] { var a, var b })
>
> So it is not just about arrays.
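
For reference, here is roughly what has to be written without array patterns, using only records and type patterns in instanceof (both final since Java 16). This is a minimal sketch: `VarargsToday` and `describe` are names made up for illustration; only the record `VA` comes from the message above.

```java
// Only the record VA is taken from the quoted message; the rest is hypothetical.
record VA(String... strings) { }

class VarargsToday {
    // Hand-written equivalent of what `case VA(var a, var b)` is meant to express.
    static String describe(Object o) {
        if (o instanceof VA va) {
            String[] arr = va.strings();
            // The length test is a separate, manual step (the array may also be null).
            if (arr != null && arr.length == 2) {
                String a = arr[0];
                String b = arr[1];
                return a + "," + b;
            }
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new VA("a", "b")));
        System.out.println(describe(new VA("a")));
    }
}
```

The null check, the length test and the two index reads are the parts that a varargs pattern would fold into the pattern itself.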
>
> I agree that named patterns are more useful, and we are working on them
> too. But they are also a bigger feature (bringing in overload
> selection, reflection, translation, etc), so they will take longer.
> Whereas array patterns are really a remix of things we've already worked
> out -- nested patterns, exhaustiveness, etc. In any case I would like
> to avoid leaving a trail of unfinished work, so cleaning up the loose
> ends on basic patterns first seems preferable before adding bigger new
> pattern features.
>
>> Hello!
>>
>> Honestly, to me this whole feature looks not very important. It's a
>> rare case in modern Java applications that business logic operates
>> with arrays directly. They are mostly used in low-level system code
>> where performance matters more than code elegance. Custom defined
>> named patterns for lists would be much more useful. Moreover, if named
>> patterns are supported, then array deconstruction could be implemented
>> in a library, without complicating the language specification (like `x
>> instanceof Arrays.of(String first, String next, String last)`).
>
> I'm not sure how this Arrays.of pattern is going to work, unless we're
> willing to have overloads for every arity up to, say, 22? Otherwise, we
> need varargs, and varargs is sugar for an array pattern.
For me, Arrays.of() is a named pattern with a varargs list of bindings, no?
So I agree with Tagir: let's figure out how named patterns work first.
I also see other reasons not to specify the array pattern now:
- records with a varargs component are quite rare, so people are not desperately in need of the corresponding pattern,
- the deconstruction of collections / the map pattern is also a dependency of the array pattern. It would be sad if the array pattern and the List pattern did not share the same way to specify the length/size (specifying the length of the array pattern inside the [] seems too ad hoc, but maybe I'm wrong).
Rémi
>
>>
>> With best regards,
>> Tagir Valeev.
>>
>> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>>> We dropped this out of the record patterns JEP, but I think it is time to
>>> revisit this.
>>>
>>> The concept of array patterns was pretty straightforward; they mimic the nesting
>>> and exhaustiveness rules of record patterns, they are just a different sort of
>>> container for nested patterns. And they have an obvious duality with array
>>> creation expressions.
>>>
>>> The main open question here was how we distinguish between "match an array of
>>> length exactly N" (where there are N nested patterns) and "match an array of
>>> length at least N". We toyed with the idea of a "..." indicator to mean "more
>>> elements", but this felt a little forced and opened new questions.
>>>
>>> It later occurred to me that there is another place to nest a pattern in an
>>> array pattern -- to match (and bind) the length. In the following, assume for
>>> sake of exposition that "_" is the "any" pattern (matches everything, binds
>>> nothing) and that we have some way to denote a constant pattern, which I'll
>>> denote here with a constant literal.
>>>
>>> There is an obvious place to put this (optional) pattern: in between the
>>> brackets. So:
>>>
>>> case String[1] { P }:
>>>             ^ a constant pattern
>>>
>>> would match string arrays of length 1 whose sole element matches P. And
>>>
>>> case String[] { P, Q }
>>>
>>> would match string arrays of length exactly 2, whose first two elements match P
>>> and Q respectively. (If the length pattern is not specified, we infer a
>>> constant pattern whose constant is equal to the length of the nested pattern
>>> list.)
>>>
>>> Matching a target to `String[L] { P0, .., Pn }` means
>>>
>>> x instanceof String[] arr
>>> && arr.length matches L
>>> && arr.length >= n + 1
>>> && arr[0] matches P0
>>> && arr[1] matches P1
>>> ...
>>> && arr[n] matches Pn
>>>
>>> More examples:
>>>
>>> case String[int len] { P }
>>>
>>> would match string arrays of length >= 1 whose first element matches P, and
>>> further binds the array length to `len`.
>>>
>>> case String[_] { P, Q }
>>>
>>> would match string arrays of any length whose first two elements match P and Q.
>>>
>>> case String[3] { }
>>>             ^ constant pattern
>>>
>>> matches all string arrays of length 3.
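
The matching rule and the examples above can be approximated in current Java with predicates standing in for the nested patterns. This is only a sketch of the semantics as described in the quoted proposal, not a proposed API: `ArrayPatternSketch`, `matches`, `matchesExact` and `eq` are hypothetical names, and bindings (such as `int len`) are not modeled, only the yes/no outcome of the match.

```java
import java.util.function.IntPredicate;
import java.util.function.Predicate;

// Hypothetical helper: predicates play the role of the nested patterns.
class ArrayPatternSketch {

    // Models `String[L] { P0, .., Pn }` with an explicit length pattern L.
    @SafeVarargs
    static boolean matches(Object x, IntPredicate lengthPattern, Predicate<String>... elementPatterns) {
        if (!(x instanceof String[] arr)) return false;          // x instanceof String[] arr
        if (!lengthPattern.test(arr.length)) return false;       // arr.length matches L
        if (arr.length < elementPatterns.length) return false;   // enough elements for every Pi
        for (int i = 0; i < elementPatterns.length; i++) {
            if (!elementPatterns[i].test(arr[i])) return false;  // arr[i] matches Pi
        }
        return true;
    }

    // Models `String[] { P0, .., Pn }`: with no length pattern, the length
    // must equal the number of nested patterns.
    @SafeVarargs
    static boolean matchesExact(Object x, Predicate<String>... elementPatterns) {
        int expected = elementPatterns.length;
        return matches(x, len -> len == expected, elementPatterns);
    }

    // Stand-in for a constant pattern on an element.
    static Predicate<String> eq(String expected) {
        return expected::equals;
    }

    public static void main(String[] args) {
        String[] abc = { "a", "b", "c" };
        // String[_] { "a", "b" }: prefix match, any length
        System.out.println(matches(abc, len -> true, eq("a"), eq("b")));
        // String[] { "a", "b" }: exact match, length must be 2
        System.out.println(matchesExact(abc, eq("a"), eq("b")));
        // String[3] { }: any array of length 3
        System.out.println(matches(abc, len -> len == 3));
    }
}
```

Written this way, the only difference between an exact match and a prefix match is the length predicate, which is the point of putting a pattern between the brackets.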
>>>
>>>
>>> This is a more principled way to do it, because the length is a part of the
>>> array and deserves a chance to match via nested patterns, just as with the
>>> elements, and it avoids trying to give "..." a new meaning.
>>>
>>> The downside is that it might be confusing at first (though people will learn
>>> quickly enough) how to distinguish between an exact match and a prefix match.
>>>
>>>
>>>
>>>
>>> On 1/5/2021 1:48 PM, Brian Goetz wrote:
>>>
>>> As we get into the next round of pattern matching, I'd like to opportunistically
>>> attach another sub-feature: array patterns. (This also bears on the question
>>> of "how would varargs patterns work", which I'll address below, though they
>>> might come later.)
>>>
>>> ## Array Patterns
>>>
>>> If we want to create a new array, we do so with an array construction
>>> expression:
>>>
>>> new String[] { "a", "b" }
>>>
>>> Since each form of aggregation should have its dual in destructuring, the
>>> natural way to represent an array pattern (h/t to AlanM for suggesting this)
>>> is:
>>>
>>> if (arr instanceof String[] { var a, var b }) { ... }
>>>
>>> Here, the applicability test is: "are you an instance of String[], with length
>>> = 2", and if so, we cast to String[], extract the two elements, and match them
>>> to the nested patterns `var a` and `var b`. This is the natural analogue of
>>> deconstruction patterns for arrays, complete with nesting.
>>>
>>> Since an array can have more elements, we likely need a way to say "length >= 2"
>>> rather than simply "length == 2". There are multiple syntactic ways to get
>>> there; for now I'm going to write
>>>
>>> if (arr instanceof String[] { var a, var b, ... })
>>>
>>> to indicate "more". The "..." matches zero or more elements and binds nothing.
>>>
>>> <digression>
>>> People are immediately going to ask "can I bind something to the remainder"; I
>>> think this is mostly an "attractive distraction", and would prefer to not have
>>> this dominate the discussion.
>>> </digression>
>>>
>>> Here's an example from the JDK that could use this effectively:
>>>
>>> String[] limits = limitString.split(":");
>>> try {
>>>     switch (limits.length) {
>>>         case 2: {
>>>             if (!limits[1].equals("*"))
>>>                 setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>>>         }
>>>         case 1: {
>>>             if (!limits[0].equals("*"))
>>>                 setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>>>         }
>>>     }
>>> }
>>> catch(NumberFormatException ex) {
>>>     setMultilineLimit(MultilineLimit.DEPTH, -1);
>>>     setMultilineLimit(MultilineLimit.LENGTH, -1);
>>> }
>>>
>>> becomes (eventually)
>>>
>>> switch (limitString.split(":")) {
>>>     case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>>>     case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>>>     default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>>> }
>>>
>>> Note how not only does this become more compact, but the unchecked
>>> "NumberFormatException" is folded into the match, rather than being a separate
>>> concern.
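
To make the "folded into the match" point concrete, the pattern switch above can be imitated in current Java by treating a failed parse as "no match" rather than as an exception. This is a sketch under assumptions: `ParsePatternSketch`, `tryParseInt` and `classify` are hypothetical names, and `classify` returns a string describing which arm would be taken instead of calling `setMultilineLimit`. It follows the three arms of the pattern switch, not the fall-through behavior of the original JDK code.

```java
import java.util.OptionalInt;

class ParsePatternSketch {

    // Hypothetical stand-in for an `Integer.parseInt(var i)` pattern:
    // an empty result means "the pattern does not match".
    static OptionalInt tryParseInt(String s) {
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Mirrors the three arms of the pattern switch, in order.
    static String classify(String limitString) {
        String[] limits = limitString.split(":");
        // case String[] { var _, Integer.parseInt(var i) }
        if (limits.length == 2) {
            OptionalInt depth = tryParseInt(limits[1]);
            if (depth.isPresent()) return "DEPTH=" + depth.getAsInt();
        }
        // case String[] { Integer.parseInt(var i) }
        if (limits.length == 1) {
            OptionalInt length = tryParseInt(limits[0]);
            if (length.isPresent()) return "LENGTH=" + length.getAsInt();
        }
        // default
        return "DEPTH=-1,LENGTH=-1";
    }

    public static void main(String[] args) {
        System.out.println(classify("10:20"));
        System.out.println(classify("10"));
        System.out.println(classify("abc"));
    }
}
```

A string that fails to parse simply falls through to the next arm, so no try/catch is needed around the dispatch itself.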
>>>
>>>
>>> ## Varargs patterns
>>>
>>> Having array patterns offers us a natural way to interpret deconstruction
>>> patterns for varargs records. Assume we have:
>>>
>>> void m(X... xs) { }
>>>
>>> Then a varargs invocation
>>>
>>> m(a, b, c)
>>>
>>> is really sugar for
>>>
>>> m(new X[] { a, b, c })
>>>
>>> So the dual of a varargs invocation, a varargs match, is really a match to an
>>> array pattern. So for a record
>>>
>>> record R(X... xs) { }
>>>
>>> a varargs match:
>>>
>>> case R(var a, var b, var c):
>>>
>>> is really sugar for an array match:
>>>
>>> case R(X[] { var a, var b, var c }):
>>>
>>> And similarly, we can use our "more arity" indicator:
>>>
>>> case R(var a, var b, var c, ...):
>>>
>>> to indicate that there are at least three elements.
>>>
>>>