Array patterns (and varargs patterns)

Remi Forax forax at univ-mlv.fr
Wed Sep 7 22:01:04 UTC 2022


----- Original Message -----
> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Tagir Valeev" <amaembo at gmail.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Wednesday, September 7, 2022 4:32:34 PM
> Subject: Re: Array patterns (and varargs patterns)

> I understand where this sentiment comes from.  But the motivation is
> somewhat more indirect than "people are falling over themselves to
> deconstruct arrays today".
> 
> Because deconstruction is the dual of aggregation, it is desirable for
> each of the forms of aggregation -- constructors, factories, etc -- to
> have pattern counterparts.  Not doing so creates asymmetries that make
> the whole thing seem more ad-hoc.  Many of the "not as important"
> pattern features we're working on now are in the realm of "completing"
> the feature.
> 
> More importantly, array patterns are how we fully support varargs in
> records.  If we have a varargs record:
> 
>     record VA(String... strings) { }
> 
> we can construct it with a varargs invocation
> 
>     new VA("a", "b")
> 
> which is sugar for
> 
>     new VA(new String[] { "a", "b" })
> 
> But we cannot yet deconstruct it with:
> 
>     case VA(var a, var b)
> 
> and analogously, for a varargs record, the above is sugar for
> 
>     case VA(String[] { var a, var b })
> 
> So it is not just about arrays.
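
For comparison, here is a minimal sketch of what deconstructing the varargs record has to look like today, assuming Java 16 or later (records and `instanceof` type patterns only). The record `VA` is taken from the message above; the class `VarargsRecordSketch` and the method `joinPair` are hypothetical names used only for illustration.

```java
import java.util.Optional;

// Hypothetical demo class; only the record VA comes from the quoted message.
final class VarargsRecordSketch {
    record VA(String... strings) { }

    // Hand-written stand-in for the proposed `case VA(var a, var b)`:
    // it matches only when the varargs array holds exactly two elements.
    static Optional<String> joinPair(Object o) {
        if (o instanceof VA va && va.strings() != null && va.strings().length == 2) {
            String a = va.strings()[0];
            String b = va.strings()[1];
            return Optional.of(a + "+" + b);
        }
        return Optional.empty(); // corresponds to "pattern did not match"
    }
}
```

The length check and the two indexed reads are exactly the parts that the proposed sugar would generate from the nested patterns.
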
> 
> I agree that named patterns are more useful, and we are working on them
> too.  But they are also a bigger feature (bringing in overload
> selection, reflection, translation, etc), so they will take longer.
> Whereas array patterns are really a remix of things we've already worked
> out -- nested patterns, exhaustiveness, etc.  In any case I would like
> to avoid leaving a trail of unfinished work, so cleaning up the loose
> ends on basic patterns first seems preferable before adding bigger new
> pattern features.
> 
>> Hello!
>>
>> Honestly, to me this whole feature looks not very important. It's a
>> rare case in modern Java applications that business logic operates
>> with arrays directly. They are mostly used in low-level system code
>> where performance matters more than code elegance. Custom defined
>> named patterns for lists would be much more useful. Moreover, if named
>> patterns are supported, then array deconstruction could be implemented
>> in a library, without complicating the language specification (like `x
>> instanceof Arrays.of(String first, String next, String last)`).
> 
> I'm not sure how this Arrays.of pattern is going to work, unless we're
> willing to have overloads for every arity up to, say, 22? Otherwise, we
> need varargs, and varargs is sugar for an array pattern.

For me, Arrays.of() is a named pattern with a varargs list of bindings, no?
So I agree with Tagir: let's figure out how named patterns work first.

I also see other reasons not to specify the array pattern now:
- records with varargs are quite rare, so people are not desperately in need of the corresponding pattern,
- patterns for deconstructing collections / maps are also a dependency of the array pattern. It would be sad if the array pattern and the List pattern did not share the same way to specify the length/size (specifying the length of the array pattern inside the [] seems too ad hoc, but maybe I'm wrong).

Rémi 

> 
>>
>> With best regards,
>> Tagir Valeev.
>>
>> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>>> We dropped this out of the record patterns JEP, but I think it is time to
>>> revisit this.
>>>
>>> The concept of array patterns was pretty straightforward; they mimic the nesting
>>> and exhaustiveness rules of record patterns, they are just a different sort of
>>> container for nested patterns.  And they have an obvious duality with array
>>> creation expressions.
>>>
>>> The main open question here was how we distinguish between "match an array of
>>> length exactly N" (where there are N nested patterns) and "match an array of
>>> length at least N".  We toyed with the idea of a "..." indicator to mean "more
>>> elements", but this felt a little forced and opened new questions.
>>>
>>> It later occurred to me that there is another place to nest a pattern in an
>>> array pattern -- to match (and bind) the length.  In the following, assume for
>>> sake of exposition that "_" is the "any" pattern (matches everything, binds
>>> nothing) and that we have some way to denote a constant pattern, which I'll
>>> denote here with a constant literal.
>>>
>>> There is an obvious place to put this (optional) pattern: in between the
>>> brackets.  So:
>>>
>>>      case String[1] { P }:
>>>                  ^ a constant pattern
>>>
>>> would match string arrays of length 1 whose sole element matches P.  And
>>>
>>>      case String[] { P, Q }
>>>
>>> would match string arrays of length exactly 2, whose first two elements match P
>>> and Q respectively.  (If the length pattern is not specified, we infer a
>>> constant pattern whose constant is equal to the length of the nested pattern
>>> list.)
>>>
>>> Matching a target to `String[L] { P0, .., Pn }` means
>>>
>>>      x instanceof String[] arr
>>>          && arr.length matches L
>>>          && arr.length >= n + 1
>>>          && arr[0] matches P0
>>>          && arr[1] matches P1
>>>          ...
>>>          && arr[n] matches Pn
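
The expansion above can be emulated in today's Java as a rough sketch, assuming Java 16 or later. Nested patterns are modeled as predicates, so nothing is bound; `ArrayPatternSketch`, `matches` and `matchesExact` are hypothetical names, not part of any proposal.

```java
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.Predicate;

// Hypothetical emulation of matching `String[L] { P0, .., Pn }`.
final class ArrayPatternSketch {
    // lengthPattern plays the role of L; elementPatterns play the nested patterns.
    static boolean matches(Object x, IntPredicate lengthPattern,
                           List<Predicate<String>> elementPatterns) {
        if (!(x instanceof String[] arr)) return false;          // x instanceof String[] arr
        if (!lengthPattern.test(arr.length)) return false;       // arr.length matches L
        if (arr.length < elementPatterns.size()) return false;   // enough elements for every nested pattern
        for (int i = 0; i < elementPatterns.size(); i++) {
            if (!elementPatterns.get(i).test(arr[i])) return false; // arr[i] matches Pi
        }
        return true;
    }

    // When no length pattern is written, the inferred one is a constant
    // equal to the number of nested patterns, which gives an exact match.
    static boolean matchesExact(Object x, List<Predicate<String>> elementPatterns) {
        int n = elementPatterns.size();
        return matches(x, len -> len == n, elementPatterns);
    }
}
```

In this sketch, `String[_] { P, Q }` corresponds to passing `len -> true`, and `String[3] { }` to passing `len -> len == 3` with an empty list.
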
>>>
>>> More examples:
>>>
>>>      case String[int len] { P }
>>>
>>> would match string arrays of length >= 1 whose first element matches P, and
>>> further binds the array length to `len`.
>>>
>>>      case String[_] { P, Q }
>>>
>>> would match string arrays of any length whose first two elements match P and Q.
>>>
>>>      case String[3] { }
>>>                  ^constant pattern
>>>
>>> matches all string arrays of length 3.
>>>
>>>
>>> This is a more principled way to do it, because the length is a part of the
>>> array and deserves a chance to match via nested patterns, just as with the
>>> elements, and it avoids trying to give "..." a new meaning.
>>>
>>> The downside is that it might be confusing at first (though people will learn
>>> quickly enough) how to distinguish between an exact match and a prefix match.
>>>
>>>
>>>
>>>
>>> On 1/5/2021 1:48 PM, Brian Goetz wrote:
>>>
>>> As we get into the next round of pattern matching, I'd like to opportunistically
>>> attach another sub-feature: array patterns.  (This also bears on the question
>>> of "how would varargs patterns work", which I'll address below, though they
>>> might come later.)
>>>
>>> ## Array Patterns
>>>
>>> If we want to create a new array, we do so with an array construction
>>> expression:
>>>
>>>      new String[] { "a", "b" }
>>>
>>> Since each form of aggregation should have its dual in destructuring, the
>>> natural way to represent an array pattern (h/t to AlanM for suggesting this)
>>> is:
>>>
>>>      if (arr instanceof String[] { var a, var b }) { ... }
>>>
>>> Here, the applicability test is: "are you an instance of String[], with length
>>> = 2", and if so, we cast to String[], extract the two elements, and match them
>>> to the nested patterns `var a` and `var b`.   This is the natural analogue of
>>> deconstruction patterns for arrays, complete with nesting.
>>>
>>> Since an array can have more elements, we likely need a way to say "length >= 2"
>>> rather than simply "length == 2".  There are multiple syntactic ways to get
>>> there; for now I'm going to write
>>>
>>>      if (arr instanceof String[] { var a, var b, ... })
>>>
>>> to indicate "more".  The "..." matches zero or more elements and binds nothing.
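
To make the two applicability tests concrete, here is a small sketch of the hand-written equivalents, assuming Java 16 or later. The class and method names are hypothetical, and the string results simply stand in for "matched with these bindings" versus "did not match".

```java
// Hypothetical hand-expansion of the two array pattern forms discussed above.
final class PrefixMatchSketch {
    // Stand-in for `arr instanceof String[] { var a, var b }`: length must be exactly 2.
    static String exactPair(Object o) {
        if (o instanceof String[] arr && arr.length == 2) {
            String a = arr[0];
            String b = arr[1];
            return a + "," + b;
        }
        return "no match";
    }

    // Stand-in for `arr instanceof String[] { var a, var b, ... }`:
    // length must be at least 2, and the remaining elements are ignored.
    static String atLeastPair(Object o) {
        if (o instanceof String[] arr && arr.length >= 2) {
            String a = arr[0];
            String b = arr[1];
            return a + "," + b;
        }
        return "no match";
    }
}
```

The only difference between the two methods is `==` versus `>=`, which is the distinction the "..." marker is meant to express.
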
>>>
>>> <digression>
>>> People are immediately going to ask "can I bind something to the remainder"; I
>>> think this is mostly an "attractive distraction", and would prefer to not have
>>> this dominate the discussion.
>>> </digression>
>>>
>>> Here's an example from the JDK that could use this effectively:
>>>
>>> String[] limits = limitString.split(":");
>>> try {
>>>      switch (limits.length) {
>>>          case 2: {
>>>              if (!limits[1].equals("*"))
>>>                  setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>>>          }
>>>          case 1: {
>>>              if (!limits[0].equals("*"))
>>>                  setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>>>          }
>>>      }
>>> }
>>> catch(NumberFormatException ex) {
>>>      setMultilineLimit(MultilineLimit.DEPTH, -1);
>>>      setMultilineLimit(MultilineLimit.LENGTH, -1);
>>> }
>>>
>>> becomes (eventually)
>>>
>>>      switch (limitString.split(":")) {
>>>          case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>>>          case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>>>          default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>>>      }
>>>
>>> Note how not only does this become more compact, but the unchecked
>>> "NumberFormatException" is folded into the match, rather than being a separate
>>> concern.
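
A rough approximation of that folding is possible today, assuming Java 16 or later, by turning the parse into a partial function. In the sketch below, `tryParseInt` is a hypothetical stand-in for a pattern like `Integer.parseInt(var i)` (an empty result means "no match"), and `classify` returns a description instead of calling `setMultilineLimit`, which is not defined here. It mirrors the shape of the pattern switch above, where the first matching case wins.

```java
import java.util.OptionalInt;

// Hypothetical emulation of the pattern switch over limitString.split(":").
final class LimitParsingSketch {
    // Partial "pattern": empty instead of a NumberFormatException.
    static OptionalInt tryParseInt(String s) {
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException ex) {
            return OptionalInt.empty();
        }
    }

    static String classify(String limitString) {
        String[] limits = limitString.split(":");
        if (limits.length == 2) {
            // case String[] { var _, Integer.parseInt(var i) }
            OptionalInt depth = tryParseInt(limits[1]);
            if (depth.isPresent()) return "DEPTH=" + depth.getAsInt();
        } else if (limits.length == 1) {
            // case String[] { Integer.parseInt(var i) }
            OptionalInt length = tryParseInt(limits[0]);
            if (length.isPresent()) return "LENGTH=" + length.getAsInt();
        }
        // default: no case matched, including a failed parse
        return "DEPTH=-1,LENGTH=-1";
    }
}
```

A failed parse falls through to the default branch in the same way a wrong array length does, so there is no separate error path.
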
>>>
>>>
>>> ## Varargs patterns
>>>
>>> Having array patterns offers us a natural way to interpret deconstruction
>>> patterns for varargs records.  Assume we have:
>>>
>>>      void m(X... xs) { }
>>>
>>> Then a varargs invocation
>>>
>>>      m(a, b, c)
>>>
>>> is really sugar for
>>>
>>>      m(new X[] { a, b, c })
>>>
>>> So the dual of a varargs invocation, a varargs match, is really a match to an
>>> array pattern.  So for a record
>>>
>>>      record R(X... xs) { }
>>>
>>> a varargs match:
>>>
>>>      case R(var a, var b, var c):
>>>
>>> is really sugar for an array match:
>>>
>>>      case R(X[] { var a, var b, var c }):
>>>
>>> And similarly, we can use our "more arity" indicator:
>>>
>>>      case R(var a, var b, var c, ...):
>>>
>>> to indicate that there are at least three elements.
>>>
>>>

