Question about circular references

Tue Jul 4 11:16:51 UTC 2023

As Gavin says, you can create what you want with non-record classes, and you can even do it with immutable objects!

jshell> class A {
   ...>     final B b;
   ...>     A () {
   ...>         b = new B(this);
   ...>     }
   ...> }
|  created class A, however, it cannot be referenced until class B is declared

jshell> class B {
   ...>     final A a;
   ...>     B (A a) {
   ...>         this.a = a;
   ...>     }
   ...> }
|  created class B

Using a combination of lambdas or similar to capture the work that needs to be done, and leaking `this` out of the constructor, you can produce bigger cycles. For more complex graphs you’d need to do some work to create everything in the right order (or maybe completely abuse Loom’s threads to yield in the middle of a constructor and pass references around in a queue), but it’s completely evil and probably not worth the effort.
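A minimal sketch of the lambda variant, assuming a hypothetical `Ring` class (the name and shape are illustrative, not from this thread). Each constructor hands `this` to a function that builds the next node, so a three-node cycle comes out with every field final.

```java
import java.util.function.Function;

// Hypothetical class for illustration; the name and shape are assumptions.
final class Ring {
    final String name;
    final Ring next;

    Ring(String name, Function<Ring, Ring> makeNext) {
        this.name = name;
        // Leaks `this` before construction finishes: the callee should only
        // store the reference, because this.next is not assigned yet.
        this.next = makeNext.apply(this);
    }

    // Builds the cycle a -> b -> c -> a.
    static Ring threeCycle() {
        return new Ring("a", a -> new Ring("b", b -> new Ring("c", c -> a)));
    }
}

class RingDemo {
    public static void main(String[] args) {
        Ring a = Ring.threeCycle();
        // Walk three steps around the ring and land back on the start.
        System.out.println(a.next.next.next == a);
    }
}
```

The nesting grows with the size of the cycle, which is one reason this approach gets unwieldy for anything beyond a simple ring.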

You can even pull the above trick with records, but you’ll need an extra level of cunning because you must delegate to the canonical constructor. You’d need to find somewhere else to stash all the state you’re passing around (a private thread local, maybe?) and pass entirely dummy values into that constructor, and it’s even more not worth it. You’ll also find that this sort of thing breaks lots of stuff, such as printing values in jshell, because there is an implicit assumption that nobody would be quite this evil.

If you were going to be this evil then I think it would be easier to just indirect everything through a single array list or something and use indexes rather than refs.
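A rough sketch of that index-based indirection, assuming hypothetical `IndexedState` and `StateTable` records (neither name comes from the thread). Each state stores the list positions of its successors instead of references, so the records stay immutable and a cycle is just a repeated index.

```java
import java.util.List;

// Hypothetical types for illustration; the names are assumptions.
// a, b and c are positions in the table; -1 marks "no outgoing transition".
record IndexedState(String name, int a, int b, int c) {}

record StateTable(List<IndexedState> states) {
    StateTable {
        states = List.copyOf(states); // unmodifiable defensive copy
    }

    IndexedState get(int index) {
        return states.get(index);
    }

    // T=0, U=1, V=2, E=3, including the cyclic edge V --a--> T.
    static StateTable example() {
        return new StateTable(List.of(
                new IndexedState("T", 1, 2, 3),
                new IndexedState("U", 2, 3, 3),
                new IndexedState("V", 0, 3, 3),
                new IndexedState("E", -1, -1, -1)));
    }
}
```

Following an edge is then `table.get(state.a())`, at the cost of carrying the table around wherever the states are used.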

Duncan.

From: amber-dev <amber-dev-retn at openjdk.org> on behalf of Gavin Bierman <gavin.bierman at oracle.com>
Date: Tuesday, 4 July 2023 at 11:14
To: David Alayachew <davidalayachew at gmail.com>
Cc: amber-dev <amber-dev at openjdk.org>
Subject: Re: Question about circular references

________________________________
Hi David,

I think what you are asking for is “why can’t I build cyclic record values”?

This is a reasonable question - it’s a very common question in languages that are built to be immutable from the ground up, e.g. functional languages. For these languages the solution is either to just add pointers (e.g. Standard ML took this approach), or to support something more first class. If you look at F#, you’ll see that they support this - although it’s pretty hidden in the spec! They allow (i) let expressions to be recursive (with the `rec` keyword) and (ii) they provide the `and` keyword to define mutually recursive value bindings and type declarations. This leads to code like this (taken from https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/records )

// Create a Person type and use the Address type that is not defined
type Person =
  { Name: string
    Age: int
    Address: Address }
// Define the Address type which is used in the Person record
and Address =
  { Line1: string
    Line2: string
    PostCode: string
    Occupant: Person }

// Create a Person value whose Address refers back to that same Person
let rec person =
  {
      Name = "Person name"
      Age = 12
      Address =
          {
              Line1 = "line 1"
              Line2 = "line 2"
              PostCode = "abc123"
              Occupant = person
          }
  }

I think this is the sort of thing you are after, right?

The issue is that this would create **all sorts of problems**, e.g. around DA/DU, initialisation semantics, pattern matching semantics, etc. etc.

As Ron has suggested - if you want to build a cyclic structure then we already have fantastic technology for that - use classes. You want all the other fields to be immutable? Use final! Once we support patterns in classes, then you’ll get many of the advantages of using records as a client.
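One way the class-based suggestion could look, as a sketch only. `LinkedState` and its write-once `linkTo` method are hypothetical and not from the thread; the write-once check is an added assumption to keep the link from changing after setup.

```java
import java.util.Objects;

// Hypothetical class for illustration; the name and API are assumptions.
final class LinkedState {
    final String name;          // ordinary data stays final
    private LinkedState next;   // the one field that cannot be final

    LinkedState(String name) {
        this.name = name;
    }

    LinkedState next() {
        return next;
    }

    // Write-once setter: the link is set after all nodes exist, then frozen.
    void linkTo(LinkedState target) {
        if (next != null) {
            throw new IllegalStateException(name + " is already linked");
        }
        next = Objects.requireNonNull(target);
    }

    // Builds the cycle T -> U -> V -> T.
    static LinkedState cycle() {
        LinkedState t = new LinkedState("T");
        LinkedState u = new LinkedState("U");
        LinkedState v = new LinkedState("V");
        t.linkTo(u);
        u.linkTo(v);
        v.linkTo(t);
        return t;
    }
}
```

The cyclic field is still technically mutable, but the only mutation the class permits is the single assignment that closes the cycle.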

With records we have carved out a subset of classes that fit the design, for which we can give a great experience without massive changes to the language and its semantics. We could have gone further but we would have to have made compromises that weren’t worth the price.

Hope this helps,
Gavin

PS We also provide another option: Want to be a hacker? Make your record with array components and mutate the array contents to your heart's content!

record Loop(String head, Loop[] tail){};

Loop[] tl = new Loop[]{null};

var l = new Loop("hello", tl);

// A loop of hellos
tl[0]=l;

var tmp = l;
for(int i=0; i<100; i++){
    System.out.println(i+": "+tmp.head());
    tmp=tmp.tail()[0];
}

// Make l a loop of hello worlds
tl[0]=new Loop("world", new Loop[]{l});

for(int i=0; i<100; i++){
    System.out.println(i+": "+tmp.head());
    tmp=tmp.tail()[0];
}

(Don’t tell anyone I showed you this code :-) )

On 30 Jun 2023, at 23:28, David Alayachew <davidalayachew at gmail.com> wrote:

Hello all,

First off, please let me know if I have CC'd the wrong groups. I've CC'd the Amber Dev Team since this involves records, but it's not specifically about records.

---

For the past couple of weeks, I have been building a program that uses a State Transition Diagram (aka State Machine, Finite Automata, etc. -- https://en.wikipedia.org/wiki/Finite-state_machine) to model part of its control flow. I've been using some Amber features to facilitate this (and having a wonderful time of it), but then I hit a snag.

Here is a(n EXTREMELY) simplified version of my actual problem. Imagine I have code like the following.

```java

sealed interface Node<T> permits StartNode, BranchingNode, EndNode {...unrelated stuff here...}

record StartNode<T> (Node<T> a, Node<T> b, Node<T> c) implements Node<T> {}
record BranchingNode<T> (Node<T> a, Node<T> b, Node<T> c, ...fields unrelated to transitioning...) implements Node<T> {}
record EndNode<T> (...fields unrelated to transitioning...) implements Node<T> {}

```

This type hierarchy is meant to represent a control flow of sorts. Control flow is (imo) best modeled using a State Transition Diagram, so I instinctively reached for that. And since my API needed to be nothing but the data (each Node needed to be tightly coupled to my internal state representation), I realized that this is an ideal use case for records.

Things worked out well enough until I tried to model a circular relationship.

Through chance, all of my control flows up to this point were tree-like, so I could model them by starting from the "leaves," then climbing up until I got to the "roots". To use State Transition Diagram terminology, I started from my exit states and modeled my way up to my entry states.

For example, assume that my State Transition Diagram is as so.

S ---a---> T
S ---b---> U
S ---c---> V
T ---a---> U
T ---b---> V
T ---c---> E
U ---a---> V
U --b|c--> E
V -a|b|c-> E

S is my StartNode, and E is my EndNode.

In this case, modeling with records is easy. It would look like so.

```java

EndNode<UnrelatedStuff> e = new EndNode<>(...unrelated...);
BranchingNode<UnrelatedStuff> v = new BranchingNode<>(e, e, e, ...unrelated...);
BranchingNode<UnrelatedStuff> u = new BranchingNode<>(v, e, e, ...unrelated...);
BranchingNode<UnrelatedStuff> t = new BranchingNode<>(u, v, e, ...unrelated...);
StartNode<UnrelatedStuff> s = new StartNode<>(t, u, v);

return s;

```

But once I hit a circular reference, I could not figure out how to model the code using the same format.

For example, what if I say the following instead?

V ---a---> T

How do I model that using my current representation?

Obviously, I could change my representation, but all of them required me to "taint" my representation in incorrect ways.

For example, I could swap out my records for simple classes where the references to Nodes were mutable. But I strongly disapprove of this strategy because these nodes do NOT have a mutable relationship. Obviously, I could put something in the Javadoc, but I want to fix the incorrect representation, not put a warning sign in front of it.

Also, I could use indirection, like having a separate Map whose values are the actual Node references and the keys would be a record Pair<T>(String nodeId, Branch branch) {} where Branch is enum Branch { a, b, c, ; } and then give each Node an id, changing my record to now be record BranchingNode<T> (String id, ...the same as above...) {}. But ignoring the fact that I now have to deal with managing an id, I've also added a lot of unnecessary bloat and indirection just to get the circular reference I wanted. What should be a direct relationship now requires a Map lookup.
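A simplified sketch of that Map-based indirection. `Pair` and `Branch` follow the description above, while `IdNode` and `TransitionTable` are hypothetical stand-ins; the type parameter, the unrelated fields and the edges to the end state are omitted.

```java
import java.util.Map;

enum Branch { a, b, c }

// Key of the lookup table: which node we are on and which branch we take.
record Pair(String nodeId, Branch branch) {}

// Hypothetical stand-in for a node that carries only its id.
record IdNode(String id) {}

final class TransitionTable {
    static final IdNode T = new IdNode("T");
    static final IdNode U = new IdNode("U");
    static final IdNode V = new IdNode("V");

    // Only a few edges are listed here.
    static final Map<Pair, IdNode> TABLE = Map.of(
            new Pair("T", Branch.a), U,
            new Pair("T", Branch.b), V,
            new Pair("U", Branch.a), V,
            new Pair("V", Branch.a), T); // the cyclic edge

    // Returns null when no transition is listed for this node and branch.
    static IdNode next(IdNode from, Branch branch) {
        return TABLE.get(new Pair(from.id(), branch));
    }
}
```

Every transition now costs a `Pair` allocation and a hash lookup, which is the indirection the paragraph above complains about.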

In that same vein, someone suggested that I use pattern-matching for switch, but that would require me to create a new switch expression for every single state. That's even more verbose and indirect than the Map. At least with the map, I can put them all in one expression. This strategy has an expression for each state!

I've been told that there is another pathway involving reflection, but it basically amounts to breaking the rules of Java. Apparently, you can turn off finality to set the fields you want, and then turn it back on? I liked this idea the least compared to all of the others, so I didn't pursue it any further.

In the end, I decided to go down the Map lookup route. But I just wanted to point out my experience with this because it was a surprising and annoying speed bump along an otherwise smooth road. I didn't think that something as small as a circular reference would require me to uproot my entire solution.

And finally, I want to emphasize that the same blockers above apply no matter what pathway I go down. I had actually tried implementing this first as an enum before I tried a record, since an enum would more accurately represent my state.

```java

enum State
{

V(T, EXIT, EXIT), //FAILURE -- T cannot be referenced yet
U(V, EXIT, EXIT),
T(U, V, EXIT),
;

...public final fields and constructor...

}

```

But even then, the same problem occurred -- I can't reference an enum value until it has been declared. I thought going down the path of records would give me the flexibility I wanted, but no dice.
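For comparison, a hedged sketch of one workaround for the enum case. It is not proposed anywhere in this thread, and the `Supplier` fields are an assumption. Java's forward-reference restriction applies to simple names, so a qualified name inside a lambda compiles, and the lambda is only evaluated after all constants exist.

```java
import java.util.function.Supplier;

enum State {
    // State.T is a qualified name inside a lambda, so it is not rejected
    // as a forward reference; it is resolved lazily on first call.
    V(() -> State.T, () -> State.EXIT, () -> State.EXIT),
    U(() -> State.V, () -> State.EXIT, () -> State.EXIT),
    T(() -> State.U, () -> State.V, () -> State.EXIT),
    EXIT(() -> null, () -> null, () -> null); // no outgoing transitions

    private final Supplier<State> a;
    private final Supplier<State> b;
    private final Supplier<State> c;

    State(Supplier<State> a, Supplier<State> b, Supplier<State> c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }

    State a() { return a.get(); }
    State b() { return b.get(); }
    State c() { return c.get(); }
}
```

This keeps the constants immutable and the cycle intact, but each edge goes through a `Supplier`, so it still gives up on direct references.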

It reminded me of that one programming meme.

> * High Quality
> * Quickly Built
> * Low Cost
>
> You can only pick 2

But instead, it's

* Circular Relationship
* Immutability
* Direct References

What are your thoughts? Is this a problem in your eyes too? Or simply a non-issue?

Thank you for your time and insight!
David Alayachew
