[External] : Re: Question about circular references

Thu Jul 6 19:34:34 UTC 2023

Hello Ron,

Thank you for your response!

> > All of that is to say this. This is a problem that I
> > pay a tax for several times a day, every day. So in my
> > eyes, this is a very large thorn in my side.
>
> What is the tax, though? Is it the lack of access to
> pattern matching? Is it having to manually write accessor
> methods? Is it just boilerplate or is there something
> that’s difficult to do correctly without language support?

In short, it's the boilerplate and indirection combined.

Pattern-matching is not a problem at all. Pattern-matching actually tends
to pair well with STD's.

But back to the problem -- it seems small at first glance, but it adds up
quickly. Let me show you what I mean.

Starting from literal ground zero, let's pick the simplest State Transition
Diagram that has a circular reference -- Rock, Paper, Scissors.

Since my it's literally just 3 states that have a reference to the other
state that they "defeat", then an enum sounds like a perfect fit for this
type of problem.

And btw, here is a stackoverflow post I made about a very similar
situation. I encourage reading it and the accepted answer, since it goes
into way more depth about the problem I am describing.

https://stackoverflow.com/questions/75072937/why-does-my-lambda-get-illegal-forward-reference-but-my-anonymous-class-does-no

enum RPS0
{

Rock(Paper),
Paper(Scissors),
Scissors(Rock),
;

public final losesTo;

RPS0(RPS0 losesTo)
{

this.losesTo = losesTo;

}

}

Plain, simple, obvious. No confusion or difficulty whatsoever.

But this doesn't compile -- I am referencing a value that does not yet
exist. The specific error I get is illegal forward reference.

So, I am forced to dig into a couple of other pots. Following your
suggestion from earlier Ron, I can do a builder-esque pattern of sorts and
go from there.

So here we go.

      enum RPS1
      {

         Rock,
         Paper,
         Scissors,
         ;

         private RPS1 losesTo;

         static //a static initialization block is the simplest idea imo.
         {

            Rock.losesTo = Paper;
            Paper.losesTo = Scissors;
            Scissors.losesTo = Rock;

         }

         public RPS1 losesTo()
         {

            return this.losesTo;

         }

      }

Now, one could argue that this is actually even slightly simpler to
understand than what we saw before. I am willing to concede that. But what
happens if we add a new option to the game, let's say Shotgun? We need to
make sure not to forget to add the Shotgun branch to our static
initialization block. That's a problem.

Let's try and fix that with a switch expression.

      enum RPS2
      {

         Rock,
         Paper,
         Scissors,
         ;

         public static RPS2 losesTo(RPS2 choice)
         {

            return
               switch (choice)
               {

                  case Rock      -> Paper;
                  case Paper     -> Scissors;
                  case Scissors  -> Rock;

               };

         }

      }

Ignore the fact that I didn't include a shotgun, I'm just making a point.

By using a switch expression, not only have we gained exhaustiveness, which
solves our previous problem, but we have actually made our solution a
little simpler! I'd go so far as to say that this strategy is better than
the hypothetical I proposed.

Cool, switch expressions are (from what I can see) the best solution for
representing single transition STD's.

But what about multiple transitions? Let's try that with switch expressions.

      enum RPS3
      {

         Rock,
         Paper,
         Scissors,
         ;

         public static RPS3 losesTo(RPS3 choice)
         {

            return
               switch (choice)
               {

                  case Rock      -> Paper;
                  case Paper     -> Scissors;
                  case Scissors  -> Rock;

               };

         }

         public static RPS3 winsAgainst(RPS3 choice)
         {

            return
               switch (choice)
               {

                  case Rock      -> Scissors;
                  case Paper     -> Rock;
                  case Scissors  -> Paper;

               };

         }

      }

So, since switch expressions only can handle one transition, we create
another method to handle the other transition.

But from there, things start to get a little unideal. For starters, now the
transitions are separate from each other. This means that if a state has to
change significantly, then we need to make sure that we modify all of its
transitions. Currently, that is easy enough -- there's only 2 transitions.
But you see my point here? If there are 4 different transitions are
possible for each state, then this means that maintainability is being
damaged by this strategy. The more transitions you add, the closer we get
to an ugly mess.

Let's address that by combining our 2 solutions above.

      enum RPS4
      {

         Rock,
         Paper,
         Scissors,
         ;

         public static final Map<RPS4, Transitions> lookup;

         record Transitions(RPS4 losesTo, RPS4 winsAgainst) {}

         static
         {

            Map<RPS4, Transitions> tempLookup = new HashMap<>();

            for (RPS4 each : RPS4.values())
            {

               tempLookup
                  .put
                  (
                     each,
                     switch (each)
                     {

                        case Rock      -> new Transitions(Paper, Scissors);
                        case Paper     -> new Transitions(Scissors, Rock);
                        case Scissors  -> new Transitions(Rock, Paper);

                     }
                  );

            }

            lookup = Map.copyOf(tempLookup);

         }

      }

So, now the cracks are starting to form. Already, our business logic is
getting drowned out by a lot of boilerplate and excess. But all of that
boilerplate is "necessary", since it gives us guarantees that we depend
upon in order to write correct code. And to add insult to injury, the
business logic looks very similar to what my original, hypothetical
solution looks like.

But there's room for improvement -- get rid of the map entirely, and just
make the switch expression the entirety of the method body.

      enum RPS4_1
      {

         Rock,
         Paper,
         Scissors,
         ;

         record Transitions(RPS4_1 losesTo, RPS4_1 winsAgainst) {}

         public static Transitions transitions(RPS4_1 choice)
         {

            return
               switch (choice)
               {
                  case Rock      -> new Transitions(Paper, Scissors);
                  case Paper     -> new Transitions(Scissors, Rock);
                  case Scissors  -> new Transitions(Rock, Paper);

               };

         }

      }

Now, I am a believer that premature optimization causes more problems than
it solves. That said, I am a bit uncomfortable with the idea of making a
new Transitions object each time this method is called. I will go ahead and
assume that the JVM can optimize most, if not all of that cost, but the
concern remains. But either way, worst case scenario, we could make private
static final fields that house the Transitions.

Regardless, by using a switch expression, we have essentially created our
own little lookup map, but as a method.

But notice, this Transitions object is returning a lot more than what we
want. Which is not a problem at all at this scale. But what if we want our
Rock Paper Scissors example to have more details than just transitions?

Imagine that each enum had other attributes like weight, size, etc. That
puts us back into the Catch 21 from before.

If we stick the extra metadata into Transitions, then we are constructing
(maybe, maybe JVM saves us) all of this excess data when we only need 1 or
2 of the fields. And even if the JVM saves us, it doesn't change the fact
that we still have to do more hops and unboxing just to get to what I want
- the data. There's that indirection creeping back in.

If we don't stick the extra metadata into Transitions, then we are back to
making 2 or more methods to capture what we need, which has the same
problems as above.

And might I remind you, State Transition Diagrams were originally built for
(and currently, most frequently used with) algorithms, specifically
path-finding algorithms. So performance and memory usage mean a lot to
STD's most common domains.

But, still, there are ways around this too. Java is an incredibly flexible
language, as this response has clearly proven. We can get direct
relationships and exhaustiveness another way in java -- abstract methods.
So let's try that.

      enum RPS5
      {

         Rock{
            public Transitions transitions() { return new
Transitions(Paper, Scissors); }
            public String metadata() { return "whatever"; }
         },
         Paper{
            public Transitions transitions() { return new
Transitions(Scissors, Rock); }
            public String metadata() { return "whatever"; }
         },
         Scissors{
            public Transitions transitions() { return new Transitions(Rock,
Paper); }
            public String metadata() { return "whatever"; }
         },
         ;

         record Transitions(RPS5 losesTo, RPS5 winsAgainst) {}

         public abstract Transitions transitions();

         public abstract String metadata();

      }

For simplicity's sake, I made everything one line. But whether it's one
line or multiple, it's pretty clear here that we still have a lot of
overhead for what boils down to just simple relationships, right?

But there's actually another hidden undesirable. In order to appease the
abstract method constraint, I had to use an anonymous class for each one of
my enum values. Each anonymous class literally creates a new .class file
upon compilation.

For proof, here is my directory of .class files for this example.

'ScratchPad$RPS4$1.class'
'ScratchPad$RPS4$2.class'
'ScratchPad$RPS4$3.class'
'ScratchPad$RPS4.class'

Now, this may not seem so bad. But what happens if my STD has 50 states?
And what if I have multiple STD's like this? This balloons up very quickly,
and can create hundreds or thousands of classes for every single state that
I create. And please recall, I spend a lot of my time working with State
Transition Diagrams.

So, still plenty of boilerplate, and now I have a .class file per state.

If we're going to make a literal .class file per state, why don't we just
go all the way and make it a record and a sealed type? At this point, we'd
be adding only a little extra overhead, but getting back some simplicity in
return and a lot more directness in return. The big reason why going to
records makes sense is because we can escape the Catch 21.

Now, I'll go ahead and skip to the end and say that records and sealed
expressions work well enough for my Rock Paper Scissors problem, but we end
up recreating a lot of enum functionality. When I had enum's, I had the
values() method, EnumSet's, EnumMap's, and more. I have to replace that
with a static final field on the sealed interface, IdentityHashSet, and
IdentityHashMap, respectively.

My entire point in showing all of this is to demonstrate all the work we
had to do each time our requirements change. Rather than adding another
attribute to my enum value's state, I had to rearchitect my solution in yet
another way. And this is because the sensical, obvious solution, kept
bumping into this circular reference thing, forcing me to uproot my
solution and replace it with something different.

Whereas if we go back to my hypothetical solution, each time the problem
changes, the answer becomes obvious, with little to no rearchitecting. We
add a field to our enum, we get to keep our exhaustiveness and directness
for free, while only charging us a fair tax in terms of conciseness. A
clear, obvious solution. And might I add, once we achieve any sort of scale
(5+ values), my solution becomes the most concise and easy to read solution.

But the part that I want to highlight here is this -- each of the solutions
that we highlighted are "simple enough" to read. So, their legibility and
ease of comprehension is really not my problem here. What bothers me is how
maintaining and modifying them is annoying and non-obvious anytime a
functionality change request comes in.

As we travel further and further down the rabbit hole, we find ourselves
recreating the things that we used to have for free. Because I didn't have
exhaustiveness, I needed a switch expression. Because I didn't have
directness, I switched from enums to records and sealed types.

And for those who think we are at the end of the line, thinking that this
is all annoying, but manageable if that is all there is to keep in mind,
please note that I have taken you down exactly one path. We went to that
path's logical conclusion, but only one path. For example, what if my
states are different types because each state needs different fields or a
differing number of transitions? There's a whole laundry list of other
paths that I didn't even touch, and spoiler alert -- there's surprisingly
little overlap between them.

Hopefully this illustrates my pain points better?

Thank you for your time and insight!
David Alayachew
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/amber-dev/attachments/20230706/c632979b/attachment-0001.htm>