It's the data, stupid !

Mon May 30 17:43:27 UTC 2022

> From: "Brian Goetz" <brian.goetz at oracle.com>
> To: "Remi Forax" <forax at univ-mlv.fr>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Sent: Monday, May 30, 2022 6:40:22 PM
> Subject: Re: It's the data, stupid !

>> First, I've overlooked the importance of the record pattern as a check of the
>> shape of the data.

>> Then if we say that data are more important than code and that the aim of the
>> pattern matching is to detect changes of the shapes of the data,
>> it changes the usefulness of some features/patterns.

> OK, now that I see what argument you are really winding up for, I think I'm
> going to disagree. Yes, data-as-data is a huge benefit; it is something we were
> not so good at before, and something that has become more important over time.
> That has motivated us to *prioritize* the data-centric features of pattern
> matching over more general ones, because they deliver direct value the soonest.
> But if you're trying to leverage that into a "this is the only benefit" (or
> even the main benefit), I think that's taking it too far.

> The truly big picture here is that pattern matching is the dual of aggregation.
> Java gives us lots of ways to put things together (constructors, factories,
> builders, maybe some day collection literals), but the reverse of each of these
> is ad-hoc, different, and usually harder-to-use / more error-prone. The big
> picture here is that pattern matching *completes the object model*, by
> providing the missing reverse link. (In mathematical terms, a constructor and
> deconstructor (or factory and static pattern, or builder and "unbuilder", or
> collection literal and collection pattern) form an *embedding-projection
> pair*.)
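
As a concrete illustration of the constructor/deconstruction-pattern pair described above, here is a minimal sketch. It assumes a JDK where record patterns are available (Java 21 or later); the names RoundTrip, Point and describe are hypothetical and only serve the example.

```java
public class RoundTrip {
    // A record: the canonical constructor aggregates (x, y) into a Point.
    record Point(int x, int y) {}

    // The record pattern takes the Point apart again, in the same shape
    // in which the constructor put it together.
    static String describe(Object o) {
        if (o instanceof Point(int x, int y)) {
            return "Point with x=" + x + ", y=" + y;
        }
        return "not a Point";
    }

    public static void main(String[] args) {
        Object o = new Point(1, 2);
        System.out.println(describe(o));
        // Round trip: deconstruct, then reconstruct an equal value.
        if (o instanceof Point(int x, int y)) {
            System.out.println(new Point(x, y).equals(o));
        }
    }
}
```

For a record, constructing from the extracted components yields an equal value, which is the round trip the "embedding-projection" wording refers to.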

> Much of this was laid out in Pattern Matching in the Java Object Model:

> https://github.com/openjdk/amber-docs/blob/master/site/design-notes/patterns/pattern-match-object-model.md

The problem is that what you propose is a leaky abstraction, because pattern matching works on classes and not on types, so it's not a reverse link. 

Let's say we have a class with two shapes/deconstructors:

class A {
  deconstructor (B) { ... }
  deconstructor (C) { ... }
}

With the pattern A(D d), D is a runtime class, not a type, so you have no idea whether it means
instanceof A a && B b = a.deconstructor() && b instanceof D 
or 
instanceof A a && C c = a.deconstructor() && c instanceof D 

Unlike with a method call (or constructor call), where the types of the arguments are available, with pattern matching you do not have the types of the arguments, only runtime classes to match.

So while a deconstructor can be seen as the inverse of a constructor, a type pattern does not give you the type information that would allow you to do method selection among the deconstructors at compile time.
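
The ambiguity can be sketched in today's Java by emulating the two deconstructors with ordinary methods. This is only an approximation under stated assumptions: deconstructAsB and deconstructAsC are hypothetical stand-ins for the two overloaded deconstructors, and B, C, D, OtherB are made-up types where D is a subtype of both B and C.

```java
public class DeconstructorAmbiguity {
    interface B {}
    interface C {}
    // D is a subtype of both B and C, so "D d" is a legal test against either.
    record D(String origin) implements B, C {}
    record OtherB() implements B {}

    // Hypothetical class with two "shapes"; the two methods below stand in
    // for the overloaded deconstructors (B) and (C).
    static final class A {
        private final B b;
        private final C c;
        A(B b, C c) { this.b = b; this.c = c; }
        B deconstructAsB() { return b; }
        C deconstructAsC() { return c; }
    }

    // Reading 1 of the pattern A(D d): go through the B-shaped deconstructor.
    static String viaB(Object o) {
        return (o instanceof A a && a.deconstructAsB() instanceof D d)
                ? "B-shape: " + d.origin() : "no match";
    }

    // Reading 2 of the pattern A(D d): go through the C-shaped deconstructor.
    static String viaC(Object o) {
        return (o instanceof A a && a.deconstructAsC() instanceof D d)
                ? "C-shape: " + d.origin() : "no match";
    }

    public static void main(String[] args) {
        A a = new A(new OtherB(), new D("c"));
        System.out.println(viaB(a)); // the B component is not a D
        System.out.println(viaC(a)); // the C component is a D
    }
}
```

Both readings type-check, and on the same object they give different answers; nothing in "D d" by itself says which deconstructor was meant.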

>> it makes the varargs pattern kind of harmful, because it matches data of
>> several shapes, so the code may still compile if the shape of the
>> record/data-type changes.

> I think you've stretched your argument to the breaking point. No one said that
> each pattern can only match *one* structure of data.
I think it's a very good question. The reason we may want to match several structures of data is backward compatibility, but it does not make a lot of sense to offer backward compatibility on data if, at the same time, the data are more important than the code, i.e. if the data drive the code.

As I said, it's a question on which OOP and DOD (data-oriented design?) disagree with each other.

And this problem is specific to deconstructors; for named pattern methods there is no such problem, since a user can obviously add as many pattern methods as he/she wants.

> But for each way of putting together the data, there should be a corresponding
> way to take it apart.
If pattern matching were a real inverse link, yes, maybe.

>> - the varargs pattern can be emulated by an array pattern and it's even better
>> because an array pattern checks that the shape is an array and

> Well, we don't have array patterns yet either, but just as varargs invocation is
> shorthand for a manually created array, varargs patterns are shorthand for an
> explicit array pattern.
The problem is that a varargs pattern can also recognize a record with no varargs, or a class with a deconstructor with no varargs.
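
Since neither varargs patterns nor array patterns exist yet, the two points can only be sketched on the invocation side, by analogy. In the hypothetical class below, fixed and variable are made-up methods: the same argument list (1, 2) is accepted by both a fixed-arity and a variable-arity signature, and the varargs call is shorthand for passing an explicit array.

```java
import java.util.Arrays;

public class VarargsShapes {
    // Fixed-arity shape: exactly two components.
    static String fixed(int x, int y) {
        return "fixed arity: [" + x + ", " + y + "]";
    }

    // Variable-arity shape: any number of components, received as an array.
    static String variable(int... coords) {
        return "variable arity: " + Arrays.toString(coords);
    }

    public static void main(String[] args) {
        // The same argument list compiles against both shapes.
        System.out.println(fixed(1, 2));
        System.out.println(variable(1, 2));
        // The varargs call is shorthand for an explicitly created array.
        System.out.println(variable(new int[] {1, 2}));
        // Only the variable-arity shape accepts a different count.
        System.out.println(variable(1, 2, 3));
    }
}
```

A call site written as (1, 2) therefore does not tell the reader, or force the compiler to check, which of the two shapes the declaration has; the concern above is that a varargs pattern would behave the same way on the matching side.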

>> The result is that I'm not sure the varargs pattern is a target worth pursuing.

> I think it's fine to be "not sure", and it's doubly fine to say "I'm not sure the
> cost-benefit is so compelling, maybe there are other features that we should do
> first" (like array patterns.) But if you're trying to make the argument that
> varargs patterns are actually harmful, you've got a much bigger uphill battle.

> And don't forget, records are just the first vehicle here; this is coming for
> arbitrary classes too. And being able to construct things via varargs
> construction, but not take them apart by varargs patterns, seems a gratuitous
> inconsistency. (Again, maybe we decide that better type inference is worth
> doing first, but the lack of varargs will still be a wart.)
You think in terms of inverse functions: we have varargs constructors, so we should have varargs patterns. But a pattern is not an inverse function.
We have the freedom to provide a simpler model.

>> Deconstructors of a class also become a kind of battleground between OOP
>> and pattern matching: OOP says that the API is important, and pattern matching
>> says it's OK to change the data, changing the API, because the compiler will
>> point out where the code should be updated.
>> We still want encapsulation because it's a class, but we want to detect if its
>> shape changes, so having a class with several shapes becomes not as useful as I
>> first envisioned.

> No, these are not in conflict at all. The biggest tool OOP offers us is
> encapsulation; it gives us a way to decide how much state we want to expose, in
> what form, etc, fully decoupled from the representation. (Records don't have
> this option for decoupling representation from API, which is what makes it so
> easy to deliver these features first for records.) Most classes still choose to
> give clients _some_ way to access most of the state we pass into the
> constructor and other API points; it's just that this part of the API is usually
> gratuitously different (e.g., accessors, wrapping with Optional) from the part
> where state goes in. Which means that we *do* expose the state to readers, just
> in a gratuitously different way than we do to writers. What pattern matching
> does is give us exactly the same control we have today over what to expose,
> and in what form, but lets us do it in a way that is structurally related to
> how we put state into objects. It does so by combining multiple return,
> conditionality, and flow analysis in an integrated way, so we don't have to
> reinvent these in an ad-hoc way in every class.
You are choosing the OOP view here. I'm not sure whether I disagree or not, I don't know; I'm just saying that this is a choice, and it is a choice that is far from obvious to me.

> So while we agree that records + sealed classes + pattern matching enable a nice
> form of data-oriented programming, and that was indeed a big goal, I think the
> model you're trying to extrapolate about what the "point" of pattern matching
> is may be missing its mark. There's a bigger picture here.
Maybe or maybe not; I don't want to invent something nobody will use in the future except on slides.

Rémi 