My experience with Sealed Types and Data-Oriented Programming

Fri Sep 9 15:31:21 UTC 2022

I am not even an amateur with parsers.  It seems like you need a
backtracking parser that converts strings into records.  When the parser
finds a construct that matches a record, it creates the record.  The code
that links the grammar to the records needs to be linked at the Java code
level so that a change in a record breaks the grammar and vice versa.  I
don't know of anything like this, but that doesn't say much since writing this
was pushing the boundaries of my knowledge.

On Fri, Sep 9, 2022 at 7:41 AM David Alayachew <davidalayachew at gmail.com>
wrote:

> Hello Amber Team,
>
> I just wanted to share my experiences with Sealed Types and Data-Oriented
> Programming. Specifically, I wanted to show how things turned out when I
> used them in a project. This project was built from the ground up to use
> Data-Oriented Programming to its fullest extent. If you want an in-depth
> look at the project itself, here is a link to the GitHub. If you clone it,
> you can run it right now using Java 18 + preview features.
>
> The GitHub repo = https://github.com/davidalayachew/RulesEngine
>
> Current version =
> https://github.com/davidalayachew/RulesEngine/commit/0e7fa42db4bbebaa3aa30f882645226d28e63ff4
>
> The project I am building is essentially an Inference Engine. This is very
> similar to Prolog, where you can give the language a set of rules, and then
> the language can use those rules to make logical deductions if you ask it a
> question. The only difference is that my version accepts plain English
> sentences, as opposed to requiring you to know syntax beforehand.
>
> Here is a snippet from the logs to show how things work.
>
> David is a programmer
> -------- OK
> Every programmer is an engineer
> -------- OK
> Every engineer is an artist
> -------- OK
> Is David an artist?
> -------- CORRECT
>
> As you can see, it takes in natural English and gleans rules from it, then
> uses those rules to perform logical deductions when a query is later made.
>
> Sealed types made this really powerful to work with because it helped me
> ensure that I was covering every edge case. I used a sealed interface to
> hold all of the possible rule types, then would use switch expressions to
> ensure that all possible branches of my code were handled if the parameter
> is of that sealed type. For the most part, it was a pleasant experience
> where the code more or less wrote itself.
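
A minimal sketch of the sealed-interface-plus-switch pattern described above. The rule types (IsA, EveryIsA, Query) and the describe method are hypothetical stand-ins, not the project's actual types, and the type patterns in switch assume Java 21 or later (or preview features on earlier releases).

```java
class SealedSwitchDemo {
    // Hypothetical rule types; the real project defines its own.
    sealed interface Rule {}

    record IsA(String subject, String category) implements Rule {}
    record EveryIsA(String narrower, String broader) implements Rule {}
    record Query(String subject, String category) implements Rule {}

    // No default branch: if a new permitted record is added to Rule,
    // this switch stops compiling until a matching case is written.
    static String describe(Rule rule) {
        return switch (rule) {
            case IsA r      -> r.subject() + " is a " + r.category();
            case EveryIsA r -> "every " + r.narrower() + " is a " + r.broader();
            case Query r    -> "is " + r.subject() + " a " + r.category() + "?";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(new IsA("David", "programmer")));
        System.out.println(describe(new EveryIsA("programmer", "engineer")));
    }
}
```

Leaving out the default branch is the design choice that matters here: a default would silence the compiler when a new subtype appears.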
>
> The part that I enjoyed the most about this was the ease of refactoring
> that sealed types, records, and switch expressions allowed. This project
> grew in difficulty very quickly, so I found myself refactoring my solution
> many times. Records automatically update all of their methods when you
> realize that a record needs a new field or shouldn't have one. And switch
> expressions combined with sealed types ensured that if I added a new
> permitted subclass, I would have to update all of my methods that used
> switch expressions. That fact especially made me gravitate to using switch
> expressions to get as much totality as possible. When refactoring your
> code, totality is a massive time-saver and bug-preventer. Combine that with
> the pre-existing fact that interfaces force all subclasses to have the
> instance methods defined, and I had some powerful tools for refactoring
> that allowed me to iterate very quickly. I found that to be especially
> powerful because, when dealing with software that is exposed to the outside
> world, making that code easy to refactor is a must. The outside world is
> constantly changing, so it is important that we can change quickly too.
> Therefore, I really want to congratulate you all on creating such a
> powerful and expressive feature. I really enjoyed building this project,
> and I'm excited to add a lot more functionality to it.
>
> However, while I found working with sealed types and their permitted
> subclasses to be a smooth experience, I found the process of turning data
> from untyped Strings into one of the permitted subclasses to be a rather
> frustrating and difficult experience.
>
> At first glance, the solution looks really simple - just make a simple
> parse method like this.
>
> public static MySealedType parse(String sanitizedUserInput)
> {
>
>     //if string can safely be turned into Subclass1, then store into
>     //Subclass1 and return
>     //else, attempt for all other subclasses
>     //else, fail because string must be invalid to get here
>
> }
>
> Just like that, I should have my gateway into the world of strongly typed,
> expressive data-oriented programming, right? Unfortunately, maintaining
> that method got ugly fast. For starters, I don't have a small handful of
> permitted subclasses, I have many of them. Currently, there are 11, but I'm
> expecting my final design to have a little over 30 subclasses total. On top
> of that, since my incoming strings are natural English, each of my if
> branches carries non-trivial amounts of logic so that I can perform the
> necessary validation against all edge cases.
>
> To better explain the complexity, I had created a complex regex with
> capture groups for each permitted subclass, and then used that to validate
> the incoming String. If the regex matches, pass the values contained in the
> capture groups onto the constructor as a List<String>, then return the
> subclass containing the data of the string.
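
The regex approach described above might look roughly like the following sketch. The patterns, the record names, and the captures helper are all assumptions made for illustration; the real project's regexes are surely more involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexParseDemo {
    sealed interface Rule {}

    record IsA(String subject, String category) implements Rule {
        // Hypothetical pattern: "<subject> is a/an <category>"
        static final Pattern REGEX = Pattern.compile("(\\w+) is an? (\\w+)");

        // Alternative constructor fed by the capture groups, in order.
        IsA(List<String> groups) { this(groups.get(0), groups.get(1)); }
    }

    record EveryIsA(String narrower, String broader) implements Rule {
        // Hypothetical pattern: "Every <narrower> is a/an <broader>"
        static final Pattern REGEX = Pattern.compile("Every (\\w+) is an? (\\w+)");

        EveryIsA(List<String> groups) { this(groups.get(0), groups.get(1)); }
    }

    // Copies capture groups 1..n into a list (group 0 is the whole match).
    static List<String> captures(Matcher m) {
        List<String> groups = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            groups.add(m.group(i));
        }
        return groups;
    }

    // One hand-written branch per subclass; this is the part that grows.
    static Rule parse(String input) {
        Matcher m = EveryIsA.REGEX.matcher(input);
        if (m.matches()) return new EveryIsA(captures(m));
        m = IsA.REGEX.matcher(input);
        if (m.matches()) return new IsA(captures(m));
        throw new IllegalArgumentException("Unrecognized sentence: " + input);
    }

    public static void main(String[] args) {
        System.out.println(parse("David is a programmer"));
        System.out.println(parse("Every programmer is an engineer"));
    }
}
```

Note that nothing ties the order of the capture groups to the order of the record components except the index arithmetic in each List<String> constructor.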
>
> At first, this worked well, but as the number of subclasses grew, this got
> very difficult to maintain as well. This difficulty was twofold.
>
> Problem 1 - I found that my regex would frequently be misaligned with my
> constructors during refactoring. If I decided that a record needed a new
> field, or that a field should be removed, I would update the record but not
> the regex, and then find errors during runtime. In fact, I sometimes didn't
> find errors during runtime because List<String> had the same number of
> elements as the constructor was expecting, but the fields were not aligned
> to the right index. This cost me a lot of development time.
>
> Problem 2 - I found that there wasn't an easy way to make sure that all of
> my subclasses followed all the rules that they were supposed to, and thus,
> I kept forgetting to implement those rules in one way or another every time
> I refactored. For example, for problem 1, I said that every subclass must
> have a regex. However, I couldn't find some compiler enforced way to
> enforce this.
>
> * Interfaces are only able to enforce instance methods. However, I can't
> have my regex be an instance method. That would be putting the cart before
> the horse - I am using the regex to create an instance, so the instance
> method is not helpful here
>
> * If I used a sealed abstract class instead and had permitted subclasses
> instead of permitted records, I still couldn't store my regex as a final
> instance field for the above reason.
>
> * In Java, static methods cannot be overridden, so I can't use a static
> method on my sealed interface. The static method would belong to the
> interface, not to the child subclasses.
>
> * And a static final field would not work for the same reason above.
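
To make the static-method point in the list above concrete, here is a small sketch with hypothetical names. It shows that a static method on the sealed interface belongs to the interface alone, and that nothing obliges a permitted record to declare its own.

```java
class StaticEnforcementDemo {
    sealed interface Rule {
        // Belongs to Rule itself. Implementing types neither inherit it
        // nor override it.
        static String regex() { return "<declared on the interface>"; }
    }

    record IsA(String subject, String category) implements Rule {
        // Declared voluntarily. This is an unrelated method that merely
        // shares a name with Rule.regex().
        static String regex() { return "(\\w+) is an? (\\w+)"; }
    }

    record Query(String subject, String category) implements Rule {
        // No regex() here, and the compiler raises no objection.
    }

    public static void main(String[] args) {
        System.out.println(Rule.regex());
        System.out.println(IsA.regex());
        // Query.regex();  // would not compile: static interface methods
        //                 // are not inherited by implementing types
    }
}
```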
>
> I ran into similar troubles when creating the alternative constructors for
> each permitted subrecord. Almost all of the above bulleted points apply,
> with the only exception being that for an abstract class, you can
> *technically* force your subclasses to call the super constructor. However,
> that did very little to help me solve my problem. Maybe I'm wrong and this
> is the silver bullet I am looking for, but it certainly doesn't seem like
> it. Therefore, I stuck to my original solution of a sealed interface with
> permitted subrecords.
>
> But back to my original point. I had 2 problems - misalignment and no
> enforcement of my abstract rules. Since I kept changing and creating and
> recreating more and more subclasses, these 2 pain points became bigger and
> bigger thorns in my side. Worse yet, I actually wanted to add more rules to
> make these classes even easier to work with, but decided not to after
> seeing the above difficulty.
>
> To alleviate problem 1, I stored my regexes in the records themselves, so
> that I would be forced to see the regex each time I looked at the record.
> For the most part, that solution seems to be good enough to deal with regex
> misalignment.
>
> To alleviate problem 2, I decided to brute force some totality and
> enforcement of my own. I fully admit, the solution I came up with here is a
> bad practice and something no one should imitate, but I found this to be
> the most effective way for me to enforce the rules I needed.
>
> I used reflection on my sealed interface. I got the sealed type class,
> called Class::getPermittedSubclasses, looped through the subclasses, did an
> unsafe cast from Class<?> to Class<SealedInterface> (because
> ::getPermittedSubclasses doesn't do that on its own for some reason???),
> called Class::getConstructor with the parameter being List.class (to
> represent the list of strings), and then used that to construct a
> Map<Pattern, Function<List<String>, MySealedInterface>>. I didn't do the
> same for the regex because that monstrosity of code included a Map::put
> which would take in the regex and the constructor. Therefore, it was pretty
> easy to remember both since they were right next to each other, and the JVM
> will error out on startup if I forget to include my constructor. So, I have
> effectively solved both of my problems, but in less than desirable ways.
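
A rough sketch of the reflective registry described above, under several assumptions: the record names and patterns are hypothetical, the regex is looked up from a static field named REGEX (the original message says the regex was passed to Map::put by hand instead), and Class::asSubclass stands in for the unsafe cast as a checked alternative.

```java
import java.lang.reflect.Constructor;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReflectiveRegistryDemo {
    sealed interface Rule {}

    record IsA(String subject, String category) implements Rule {
        public static final Pattern REGEX = Pattern.compile("(\\w+) is an? (\\w+)");
        public IsA(List<String> groups) { this(groups.get(0), groups.get(1)); }
    }

    record EveryIsA(String narrower, String broader) implements Rule {
        public static final Pattern REGEX = Pattern.compile("Every (\\w+) is an? (\\w+)");
        public EveryIsA(List<String> groups) { this(groups.get(0), groups.get(1)); }
    }

    // Built once at class initialization, so a record that lacks the
    // List<String> constructor or the REGEX field fails fast at startup.
    static final Map<Pattern, Function<List<String>, Rule>> PARSERS = buildParsers();

    static Map<Pattern, Function<List<String>, Rule>> buildParsers() {
        Map<Pattern, Function<List<String>, Rule>> parsers = new LinkedHashMap<>();
        for (Class<?> raw : Rule.class.getPermittedSubclasses()) {
            Class<? extends Rule> type = raw.asSubclass(Rule.class);
            try {
                Constructor<? extends Rule> ctor = type.getConstructor(List.class);
                Pattern regex = (Pattern) type.getDeclaredField("REGEX").get(null);
                parsers.put(regex, groups -> {
                    try {
                        return ctor.newInstance(groups);
                    } catch (ReflectiveOperationException e) {
                        throw new IllegalStateException(e);
                    }
                });
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(type + " breaks the parsing contract", e);
            }
        }
        return parsers;
    }

    // Tries each registered pattern and builds the first record that matches.
    static Rule parse(String input) {
        for (Map.Entry<Pattern, Function<List<String>, Rule>> entry : PARSERS.entrySet()) {
            Matcher m = entry.getKey().matcher(input);
            if (m.matches()) {
                List<String> groups = new ArrayList<>();
                for (int i = 1; i <= m.groupCount(); i++) {
                    groups.add(m.group(i));
                }
                return entry.getValue().apply(groups);
            }
        }
        throw new IllegalArgumentException("Unrecognized sentence: " + input);
    }

    public static void main(String[] args) {
        System.out.println(parse("David is a programmer"));
    }
}
```

The enforcement here still happens at startup rather than at compile time, which is the limitation the message is describing.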
>
> For problem 2, one analogy that kept popping into my head was the idea of
> there being 2 islands. The island on the right has strong types, totality,
> pattern matching, and more. Meanwhile the island on the left is where
> everything is untyped and just strings. There does exist a bridge between
> the 2, but it's either difficult to make, doesn't scale very well, or isn't
> very flexible.
>
> This analogy really helped me realize my frustration with it because it
> actually showed why I like Java enums so much. You can use the same analogy
> as above. The island on the right has ::ordinal, ::name, ::values, enums
> having their own instance fields and methods, and even some powerful tools
> like EnumSet and EnumMap. But what really ties it all together is that
> there is a very clear and defined bridge between the left and the right -
> the ::valueOf method. Having this centralized pathway between the 2 made
> working with enums a pleasure and something I always liked to use when
> dealing with my code's interactions with the outside world. That ::valueOf
> enforced a constraint against all incoming Strings. And therefore, it
> allowed me to just perform some sanitizations along the way to make sure
> that that method could do its job (uppercase all strings, remove
> non-identifier characters, etc). If it wasn't for JEP 301, I would call
> enums perfect.
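
For comparison, the enum bridge described above can be sketched in a few lines. The Profession enum and the exact sanitization steps are assumptions chosen to mirror the "uppercase, remove non-identifier characters" description.

```java
import java.util.Locale;

class EnumBridgeDemo {
    // Hypothetical enum standing in for any set of known constants.
    enum Profession { PROGRAMMER, ENGINEER, ARTIST }

    // Sanitizes the raw string, then crosses the single bridge: valueOf.
    static Profession parse(String raw) {
        String cleaned = raw.strip()
                .toUpperCase(Locale.ROOT)
                .replaceAll("[^A-Z0-9_]", "");
        // Throws IllegalArgumentException if no constant has this name.
        return Profession.valueOf(cleaned);
    }

    public static void main(String[] args) {
        System.out.println(parse(" Programmer! "));
    }
}
```

Every constant is reachable through the same valueOf call, so adding a new constant requires no change to the parsing code.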
>
> I just wish that there was some similar centralized pathway between
> data-oriented programming and the outside world. Some way for me to define
> on my sealed type, a method to give me a pathway to all of the permitted
> subclasses. Obviously, I can build it on my own, but that is where most of
> my pain points came from. Really, having some way to enforce that all of my
> subclasses have a similar class level validation logic and a similar class
> level factory/constructor method is what I am missing.
>
> That is the summary of my thoughts. Please do not misinterpret the
> extended discussion on the negatives to mean that I found the negative to
> be even equal to, let alone more than, the positives. I found this to be an
> overwhelmingly pleasant experience. Once I got my data turned into a type,
> everything flowed perfectly. It was just difficult to get it into a type in
> the first place, and it took a lot of words for me to explain why.
>
> Thank you all for your time and your help!
> David Alayachew
>