My experience with Sealed Types and Data-Oriented Programming

Fri Sep 9 17:24:44 UTC 2022

It's been a while.  One parser I used allowed for writing a grammar.  It
then parsed the grammar and turned it into Java code.  The grammar also
specified how to hook the generated code into glue code that, in turn,
hooked into the rest of the Java code.  If the grammar fell out of sync
with the rest of the Java code, then the build would flag errors in the
grammar.

Instead of "of()", use "parse()".  The name is more descriptive.
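
A rough sketch of what such a parse() might look like, to make the idea
concrete.  Everything named here (IsA, ParseResult, Parsed, Failed, the
regex) is hypothetical and not taken from the RulesEngine repo.  The point
is only that the regex and the constructor call live in one method, so
changing the record's components breaks parse() at compile time.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParseSketch {
    // Hypothetical Optional-like result: a parsed rule or a reason for failure.
    sealed interface ParseResult permits Parsed, Failed {}
    record Parsed(IsA rule) implements ParseResult {}
    record Failed(String reason) implements ParseResult {}

    // Hypothetical rule record. The regex sits right next to the constructor call.
    record IsA(String subject, String category) {
        private static final Pattern REGEX =
            Pattern.compile("(?<subject>\\w+) is an? (?<category>\\w+)");

        static ParseResult parse(String input) {
            Matcher m = REGEX.matcher(input);
            if (!m.matches()) {
                return new Failed("not of the form 'X is a Y': " + input);
            }
            // Adding or removing a record component breaks this line at compile time.
            return new Parsed(new IsA(m.group("subject"), m.group("category")));
        }
    }

    // Unwraps the result with instanceof patterns; a pattern-matching switch
    // would also work on Java versions that support it.
    static String describe(ParseResult result) {
        if (result instanceof Parsed p) {
            return "OK " + p.rule();
        }
        if (result instanceof Failed f) {
            return "FAILED " + f.reason();
        }
        throw new AssertionError("unreachable: ParseResult is sealed");
    }

    public static void main(String[] args) {
        System.out.println(describe(IsA.parse("David is a programmer")));
        System.out.println(describe(IsA.parse("Every programmer is an engineer")));
    }
}
```

Named capture groups are used instead of a List<String> so that a group
cannot silently land in the wrong field by index.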

On Fri, Sep 9, 2022 at 10:15 AM David Alayachew <davidalayachew at gmail.com>
wrote:

> Hello,
>
> (Sorry for resending, I forgot to include amber-dev as one of the
> recipients)
>
> Thank you for your response! I am having trouble visualizing what the code
> would look like, but I think I understand the spirit of your request.
>
> In short, it sounds like you are addressing Problem 1 by finding a way to
> make the regex follow the record's structure. Almost as if the record's
> body/components are the regex itself. That is a very good idea on paper,
> and I am trying to think of ways to put it into practice.
>
> Maybe instead of having the constructor and regex be separate, we could
> combine them both into a method that returns some variant of an Optional
> type? The method would attempt to parse the String into an object. If it
> succeeded, it would return a wrapped value, which we would have to use
> pattern matching to fetch. And if not, we can have a dedicated type to
> explain why it failed, or an Empty variant of optional if that's too much
> (though at that point, just use Optional).
>
> And the nice part of the solution is that it would respond to refactoring
> like you mentioned above. If I change the fields in my record, I am forced
> to change this method too because this method constructs the record. You
> can't construct a new record using old fields. That would make it easier to
> build and maintain. I could call the method ::of(String) and have it return
> that Optional-like thing I mentioned above.
>
> While I don't think this solution addresses Problem 2, we have further
> dealt with Problem 1. I think your suggestion is a good idea, and I will
> try to implement it tonight. Let me know if you think this should be done
> differently.
>
> Thank you for your help!
> David Alayachew
>
> On Fri, Sep 9, 2022 at 11:31 AM Nathan Reynolds <numeralnathan at gmail.com>
> wrote:
>
>> I am not even an amateur with parsers.  It seems like you need a
>> backtracking parser that converts strings into records.  When the parser
>> finds a construct that matches a record, it creates the record.  The code
>> that links the grammar to the records needs to be linked at the Java code
>> level so that a change in a record breaks the grammar and vice versa.  I
>> don't know of anything like this, but that doesn't say much since writing this
>> was pushing the boundaries of my knowledge.
>>
>> On Fri, Sep 9, 2022 at 7:41 AM David Alayachew <davidalayachew at gmail.com>
>> wrote:
>>
>>> Hello Amber Team,
>>>
>>> I just wanted to share my experiences with Sealed Types and
>>> Data-Oriented Programming. Specifically, I wanted to show how things turned
>>> out when I used them in a project. This project was built from the ground
>>> up to use Data-Oriented Programming to its fullest extent. If you want an
>>> in-depth look at the project itself, here is a link to the GitHub. If you
>>> clone it, you can run it right now using Java 18 + preview features.
>>>
>>> The GitHub repo = https://github.com/davidalayachew/RulesEngine
>>>
>>> Current version =
>>> https://github.com/davidalayachew/RulesEngine/commit/0e7fa42db4bbebaa3aa30f882645226d28e63ff4
>>>
>>> The project I am building is essentially an Inference Engine. This is
>>> very similar to Prolog, where you can give the language a set of rules, and
>>> then the language can use those rules to make logical deductions if you ask
>>> it a question. The only difference is that my version accepts plain English
>>> sentences, as opposed to requiring you to know syntax beforehand.
>>>
>>> Here is a snippet from the logs to show how things work.
>>>
>>> David is a programmer
>>> -------- OK
>>> Every programmer is an engineer
>>> -------- OK
>>> Every engineer is an artist
>>> -------- OK
>>> Is David an artist?
>>> -------- CORRECT
>>>
>>> As you can see, it takes in natural English and gleans rules from it,
>>> then uses those rules to perform logical deductions when a query is later
>>> made.
>>>
>>> Sealed types made this really powerful to work with because it helped me
>>> ensure that I was covering every edge case. I used a sealed interface to
>>> hold all of the possible rule types, then would use switch expressions to
>>> ensure that all possible branches of my code were handled if the parameter
>>> is of that sealed type. For the most part, it was a pleasant experience
>>> where the code more or less wrote itself.
>>>
>>> The part that I enjoyed the most about this was the ease of refactoring
>>> that sealed types, records, and switch expressions allowed. This project
>>> grew in difficulty very quickly, so I found myself refactoring my solution
>>> many times. Records automatically update all of their methods when you
>>> realize that that record needs to/shouldn't have a field. And switch
>>> expressions combined with sealed types ensured that if I added a new
>>> permitted subclass, I would have to update all of my methods that used
>>> switch expressions. That fact especially made me gravitate to using switch
>>> expressions to get as much totality as possible. When refactoring your
>>> code, totality is a massive time-saver and bug-preventer. Combine that with
>>> the pre-existing fact that interfaces force all subclasses to have the
>>> instance methods defined, and I had some powerful tools for refactoring
>>> that allowed me to iterate very quickly. I found that to be especially
>>> powerful because, when dealing with software that is exposed to the outside
>>> world, making that code easy to refactor is a must. The outside world is
>>> constantly changing, so it is important that we can change quickly too.
>>> Therefore, I really want to congratulate you all on creating such a
>>> powerful and expressive feature. I really enjoyed building this project,
>>> and I'm excited to add a lot more functionality to it.
>>>
>>> However, while I found working with sealed types and their permitted
>>> subclasses to be a smooth experience, I found the process of turning data
>>> from untyped Strings into one of the permitted subclasses to be a rather
>>> frustrating and difficult experience.
>>>
>>> At first glance, the solution looks really simple - just make a simple
>>> parse method like this.
>>>
>>> public static MySealedType parse(String sanitizedUserInput)
>>> {
>>>
>>>     // if string can safely be turned into Subclass1, then store into
>>>     // Subclass1 and return
>>>     // else, attempt for all other subclasses
>>>     // else, fail because string must be invalid to get here
>>>     throw new IllegalArgumentException("Unrecognized input: " + sanitizedUserInput);
>>>
>>> }
>>>
>>> Just like that, I should have my gateway into the world of strongly
>>> typed, expressive data-oriented programming, right? Unfortunately,
>>> maintaining that method got ugly fast. For starters, I don't have a small
>>> handful of permitted subclasses, I have many of them. Currently, there are
>>> 11, but I'm expecting my final design to have a little over 30 subclasses
>>> total. On top of that, since my incoming strings are natural English, each
>>> of my if branches carries non-trivial amounts of logic so that I can
>>> perform the necessary validation against all edge cases.
>>>
>>> To better explain the complexity, I had created a complex regex with
>>> capture groups for each permitted subclass, and then used that to validate
>>> the incoming String. If the regex matches, pass the values contained in the
>>> capture groups onto the constructor as a List<String>, then return the
>>> subclass containing the data of the string.
>>>
>>> At first, this worked well, but as the number of subclasses grew, this
>>> got very difficult to maintain as well. This difficulty was twofold.
>>>
>>> Problem 1 - I found that my regex would frequently be misaligned with my
>>> constructors during refactoring. If I decided that a record needed a new
>>> field, or that a field should be removed, I would update the record but not
>>> the regex, and then find errors during runtime. In fact, I sometimes didn't
>>> find errors during runtime because List<String> had the same number of
>>> elements as the constructor was expecting, but the fields were not aligned
>>> to the right index. This cost me a lot of development time.
>>>
>>> Problem 2 - I found that there wasn't an easy way to make sure that all
>>> of my subclasses followed all the rules that they were supposed to, and
>>> thus, I kept forgetting to implement those rules in one way or another
>>> every time I refactored. For example, for problem 1, I said that every
>>> subclass must have a regex. However, I couldn't find a compiler-enforced
>>> way to require this.
>>>
>>> * Interfaces are only able to enforce instance methods. However, I can't
>>> have my regex be an instance method. That would be putting the cart before
>>> the horse - I am using the regex to create an instance, so the instance
>>> method is not helpful here.
>>>
>>> * If I used a sealed abstract class instead and had permitted subclasses
>>> instead of permitted records, I still couldn't store my regex as a final
>>> instance field for the above reason.
>>>
>>> * In Java, static methods cannot be overridden, so I can't use a static
>>> method on my sealed interface. The static method would belong to the
>>> interface, not to the child subclasses.
>>>
>>> * And a static final field would not work for the same reason as above.
>>>
>>> I ran into similar troubles when creating the alternative constructors
>>> for each permitted subrecord. Almost all of the above bulleted points
>>> apply, with the only exception being that for an abstract class, you can
>>> *technically* force your subclasses to call the super constructor. However,
>>> that did very little to help me solve my problem. Maybe I'm wrong and this
>>> is the silver bullet I am looking for, but it certainly doesn't seem like
>>> it. Therefore, I stuck to my original solution of a sealed interface with
>>> permitted subrecords.
>>>
>>> But back to my original point. I had 2 problems - misalignment and no
>>> enforcement of my abstract rules. Since I kept changing and creating and
>>> recreating more and more subclasses, these 2 pain points became bigger and
>>> bigger thorns in my side. Worse yet, I actually wanted to add more rules to
>>> make these classes even easier to work with, but decided not to after
>>> seeing the above difficulty.
>>>
>>> To alleviate problem 1, I stored my regexes in the records themselves,
>>> so that I would be forced to see the regex each time I looked at the
>>> record. For the most part, that solution seems to be good enough to deal
>>> with regex misalignment.
>>>
>>> To alleviate problem 2, I decided to brute force some totality and
>>> enforcement of my own. I fully admit, the solution I came up with here is a
>>> bad practice and something no one should imitate, but I found this to be
>>> the most effective way for me to enforce the rules I needed.
>>>
>>> I used reflection on my sealed interface. I got the sealed type class,
>>> called Class::getPermittedSubclasses, looped through the subclasses, did an
>>> unsafe cast from Class<?> to Class<MySealedInterface> (because
>>> ::getPermittedSubclasses doesn't do that on its own for some reason???),
>>> called Class::getConstructor with the parameter being List.class (to
>>> represent the list of strings), and then used that to construct a
>>> Map<Pattern, Function<List<String>, MySealedInterface>>. I didn't do the
>>> same for the regex because that monstrosity of code included a Map::put
>>> which would take in the regex and the constructor. Therefore, it was pretty
>>> easy to remember both since they were right next to each other, and the JVM
>>> will error out on startup if I forget to include my constructor. So, I have
>>> effectively solved both of my problems, but in less than desirable ways.
>>>
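A minimal sketch of that kind of startup check, for anyone who wants to see
the shape of it.  This is not the code from the repo: Rule, IsA, EveryIs,
and missingParsers are hypothetical names, and it looks for a static
parse(String) method rather than a List-based constructor.  It only relies
on Class::getPermittedSubclasses, which returns null for a non-sealed type.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class RegistryCheckSketch {
    // Hypothetical sealed hierarchy standing in for the real rule types.
    sealed interface Rule permits IsA, EveryIs {}

    record IsA(String subject, String category) implements Rule {
        static Optional<Rule> parse(String input) {
            String[] parts = input.split(" is an? ");
            if (parts.length != 2) {
                return Optional.empty();
            }
            return Optional.of(new IsA(parts[0], parts[1]));
        }
    }

    // Deliberately has no parse(String), so the check below reports it.
    record EveryIs(String category, String supercategory) implements Rule {}

    // Returns the simple names of permitted subclasses lacking a static parse(String).
    static List<String> missingParsers(Class<?> sealedType) {
        Class<?>[] permitted = sealedType.getPermittedSubclasses();
        if (permitted == null) {
            throw new IllegalArgumentException(sealedType + " is not sealed");
        }
        List<String> missing = new ArrayList<>();
        for (Class<?> sub : permitted) {
            try {
                Method m = sub.getDeclaredMethod("parse", String.class);
                if (!Modifier.isStatic(m.getModifiers())) {
                    missing.add(sub.getSimpleName());
                }
            } catch (NoSuchMethodException e) {
                missing.add(sub.getSimpleName());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println("Missing parse(String): " + missingParsers(Rule.class));
    }
}
```

Running a check like this from a unit test or a static initializer turns a
forgotten parse method into a failure at startup instead of a surprise
later, though it is still a runtime check and not a compile-time one.
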
>>> For problem 2, one analogy that kept popping into my head was the idea
>>> of there being 2 islands. The island on the right has strong types,
>>> totality, pattern matching, and more. Meanwhile the island on the left is
>>> where everything is untyped and just strings. There does exist a bridge
>>> between the 2, but it's either difficult to make, doesn't scale very well,
>>> or isn't very flexible.
>>>
>>> This analogy really helped me pin down my frustration because it
>>> actually showed why I like Java enums so much. You can use the same analogy
>>> as above. The island on the right has ::ordinal, ::name, ::values, enums
>>> having their own instance fields and methods, and even some powerful tools
>>> like EnumSet and EnumMap. But what really ties it all together is that
>>> there is a very clear and defined bridge between the left and the right -
>>> the ::valueOf method. Having this centralized pathway between the 2 made
>>> working with enums a pleasure and something I always liked to use when
>>> dealing with my code's interactions with the outside world. That ::valueOf
>>> enforced a constraint against all incoming Strings. And therefore, it
>>> allowed me to just perform some sanitizations along the way to make sure
>>> that that method could do its job (uppercase all strings, remove
>>> non-identifier characters, etc). If it weren't for JEP 301, I would call
>>> enums perfect.
>>>
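For readers less familiar with that enum bridge, it can be sketched as
follows.  The Command enum and the helper names are hypothetical, chosen
only to show the sanitize-then-valueOf path described above.

```java
import java.util.Locale;
import java.util.Optional;

class EnumBridgeSketch {
    // Hypothetical enum standing in for whatever the outside world sends us.
    enum Command { ADD_RULE, ASK_QUESTION, QUIT }

    // Uppercases the input and removes every character that cannot appear
    // in one of the constant names above.
    static String sanitize(String raw) {
        return raw.toUpperCase(Locale.ROOT).replaceAll("[^A-Z0-9_]", "");
    }

    // The single bridge from untyped strings to the enum: valueOf either
    // returns a constant or throws IllegalArgumentException.
    static Optional<Command> toCommand(String raw) {
        try {
            return Optional.of(Command.valueOf(sanitize(raw)));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(toCommand(" quit. "));
        System.out.println(toCommand("add_rule"));
        System.out.println(toCommand("dance"));
    }
}
```
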
>>> I just wish that there was some similar centralized pathway between
>>> data-oriented programming and the outside world. Some way for me to define
>>> on my sealed type, a method to give me a pathway to all of the permitted
>>> subclasses. Obviously, I can build it on my own, but that is where most of
>>> my pain points came from. Really, having some way to enforce that all of my
>>> subclasses have a similar class level validation logic and a similar class
>>> level factory/constructor method is what I am missing.
>>>
>>> That is the summary of my thoughts. Please do not misinterpret the
>>> extended discussion on the negatives to mean that I found the negatives to
>>> be even equal to, let alone more than, the positives. I found this to be an
>>> overwhelmingly pleasant experience. Once I got my data turned into a type,
>>> everything flowed perfectly. It was just difficult to get it into a type in
>>> the first place, and it took a lot of words for me to explain why.
>>>
>>> Thank you all for your time and your help!
>>> David Alayachew
>>>
>>