RFR: 8291769: Translation of switch with record patterns could be improved

Thu Aug 4 17:23:07 UTC 2022

----- Original Message -----
> From: "jan lahoda" <jan.lahoda at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>, "Jan Lahoda" <jlahoda at openjdk.org>, "Brian Goetz" <brian.goetz at oracle.com>
> Cc: "compiler-dev" <compiler-dev at openjdk.org>
> Sent: Thursday, August 4, 2022 6:35:18 PM
> Subject: Re: RFR: 8291769: Translation of switch with record patterns could be improved

> On 04. 08. 22 17:43, Remi Forax wrote:
> 
>>
>> ----- Original Message -----
>>> From: "Jan Lahoda" <jlahoda at openjdk.org>
>>> To: compiler-dev at openjdk.org
>>> Sent: Thursday, August 4, 2022 5:04:52 PM
>>> Subject: RFR: 8291769: Translation of switch with record patterns could be
>>> improved
>>> This is an attempt to improve the performance and scalability of switches with
>>> record patterns.
>>>
>>> There are two main parts of this patch:
>>> 1. for cases of consecutive runs of cases with the same record pattern, these
>>> are replaced with a single case and a nested switch. E.g.:
>>>
>>> switch (obj) {
>>>      case Box(String s) -> {}
>>>      case Box(Integer i) -> {}
>>>      case Box(Number n) -> {}
>>>      ...
>>> }
>>> =>
>>> switch (obj) {
>>>      case Box b ->
>>>           switch (b.o()) {
>>>               case String s -> {}
>>>               case Integer i -> {}
>>>               case Number n -> {}
>>>               default -> continue-with-outer-switch;
>>>           };
>>>     ...
>>> }
>>>
>>>
>>> This is done by first unrolling the record patterns into simple binding patterns
>>> a guards, and then by finding cases with the common binding pattern as a label
>>> and a viable guard. Binding patterns are reused as much as possibly,
>>> eliminating specialized handling of record patterns as much as possible.
>> How it works when you have a record pattern followed by a type pattern with the
>> same type ?
>> Something like:
>>
>>    switch (obj) {
>>      case Box(String s) -> {}
>>      case Box(Integer i) -> {}
>>      
>>      case Box b -> {}
>>    }
>>
>> because you now have twice the same type pattern on the outer switch but the
>> first one should not NPE.
> 
> 
> Currently the `case Box b -> {}` is kept separate. It would be possible
> to merge (it would simply be a default in the nested switch), but it is
> an additional complexity in a fairly complex code already.
> 
> 
>>
>>
>>> 2. When a record accessor method fails with an exception, we need to wrap this
>>> exception with a `MatchException`. Currently this is being done by introducing
>>> a special proxy method which catches the exception and re-throws. The proposed
>>> patch here eliminates the need for the accessor methods by producing an
>>> appropriate `ExceptionTable` in the classfile, and a separate catch handler.
>>> This handler is attached to the innermost usable block, which is either the
>>> method block, lambda body, (static or non-static) initializer or a try block.
>>> This should ensure correct semantics, while not producing too many unnecessary
>>> catch handlers.
>> Yes !
>>
>> As said above, the semantics for null is slightly changed, because now you may
>> throw a NPE instead of a MatchException.
> 
> 
> The idea here is that the semantics should not be changed, and changes
> to semantics are bugs. But thanks to your comment, I realized null
> record components are not handled properly, so I've attempted to fix that:
> 
> https://github.com/openjdk/jdk/pull/9746/commits/7ddeafab1d1ea23285e60765ec0e61d7a509e442
> 
> 
>> You can add explicit nullchecks to throw a MatchException instead but i think we
>> can do better for our users.
>>
>> I believe that with this transformation, we can now make the difference between
>> the 3 kinds of errors,
>> - the call to the accessor / destructor fails throwing an error similar to
>> ExceptionInTheInitializerError (ExceptionInTheDestructorError ?)
>> - the tested value is null throwing a NPE because most of them are implicit when
>> b.o() is called. This means that the compiler has to insert a requireNonNull if
>> the nullcheck is not implicit.
>> - when there are more permitted subtypes at runtime throwing an
>> IncompatibleClassChangeError (like with an switch expression on an enum)
> 
> 
> (FWIW, I believe the spec is saying what should happen in specific
> cases, so I don't think this is something javac should decide on its
> own. javac can decide on how the behavior is implemented, but not the
> behavior.)

yes, i want to change the spec because as a user it's hard to know how to fix the code when a MatchException is thrown.

> 
> 
>>
>> for the last one, the idea is that if there is a sealed type, instead of
>> generating an if instanceof for all cases, the last one should not be an
>> instanceof but just a cast with a cacth handler that catch the
>> ClassCastException and throw an IncompatibleClassChangeError wrapping the CCE
>> instead.
> 
> 
> Note that when switching over a type we don't generate instanceof for
> the subtypes of that type. The typeSwitch bootstrap/indy does the
> checks, and the bytecode produced by javac only produces checkcasts. (It
> may produce instanceofs for the nested patterns, though.) It would be
> possible to use a default instead of the last case, and add catch
> handler for the checkcast, but generating cases for all the checked
> types, and adding a default which throws (which I believe is the current
> implementation) feels much easier and cleaner to implement, and I don't
> really see a significant disadvantage in that.

It's a performance issue because you need to recheck the same type pattern multiple times if the switch alternate a record pattern followed by a type pattern followed by a record pattern ...

> 
> 
> Thanks,
> 
>     Jan
> 
> 

Rémi