[code-reflection] RFR: Support storing the code that builds the code model [v2]

Thu Feb 27 11:05:04 UTC 2025

On Mon, 10 Feb 2025 12:11:51 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> Mourad Abbay has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Make TEXT the default storage mechanism for code models
>
> Marked as reviewed by mcimadamore (Reviewer).

> @mcimadamore Mourad implemented a transformation of the code model that builds a code model that adds local variables for values with more than one use, which makes it easier to generate the AST nodes. Would use of the internal `LetExpr` help avoid such a transformation, if so we can consider that for follow on work.

I'm not sure I have all the context here. The problem seems to arise when a value results from an operation with potential side effects, e.g. a method call:

%2 = bar(%1)

If `%2` is used multiple times, then javac has only one option -- that is, to hoist `%2` into a local variable, and then replace all references to `%2` with references to the local variable. Inlining the call to `bar` at each use site is not really an option, as that could change the semantics of the program.
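To make the semantic difference concrete, here is a minimal Java sketch. It assumes a hypothetical `bar` whose side effect is a call counter; `HoistDemo`, `hoisted` and `inlined` are illustrative names and are not taken from the PR.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class HoistDemo {
    // Hypothetical stand-in for `bar`: its side effect is bumping a call counter.
    static final AtomicInteger calls = new AtomicInteger();

    static int bar(int x) {
        calls.incrementAndGet();
        return x + 1;
    }

    // %2 = bar(%1), with %2 used twice: hoisted into a local, so bar runs once.
    static int hoisted(int v1) {
        int v2 = bar(v1);
        return v2 + v2;
    }

    // Same two uses, but with the call inlined at each use site: bar runs twice.
    static int inlined(int v1) {
        return bar(v1) + bar(v1);
    }

    public static void main(String[] args) {
        calls.set(0);
        int a = hoisted(1);
        int hoistedCalls = calls.get();

        calls.set(0);
        int b = inlined(1);
        int inlinedCalls = calls.get();

        System.out.println(a + " after " + hoistedCalls + " call(s)"); // 4 after 1 call(s)
        System.out.println(b + " after " + inlinedCalls + " call(s)"); // 4 after 2 call(s)
    }
}
```

Both versions compute the same value here, but the side effect happens a different number of times, which is exactly the observable change that hoisting avoids.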

Let expression nodes are useful when dealing with compact expressions. E.g.

`List l = let x = 42 in List.of(x)`

That is, javac typically uses a let expression when it has to translate a single expression into something more complex, but wants to keep the result as an expression (rather than turning the expression into a statement, which is not possible in all cases, such as in a variable initializer).

It is true that what seems like a linear list of ops in a block can be modelled as something more convoluted, like so:

let op1 = <op1 init> in
    let op2 = <op2 init> in
        let op3 = <op3 init> in
               ....
               <result>

This would mean generating one let expression per op, where the "body" of each let expression is the remainder of the code model block. All this nesting is confusing, but it is also avoidable -- a `LetExpr` node allows more than one declaration per body -- so you can translate the above as follows:

let (op1 = <op1 init> ;
     op2 = <op2 init> ;
     op3 = <op3 init>) in <result>

Doing something like this would probably avoid the need to generate extra local variables -- you now have one var declaration per op in the "statements" part of a `LetExpr`. It looks a bit odd, visually, that the body of the `LetExpr` is just the result of the code model block, i.e. all the interesting parts are in the setup code. But it seems doable.
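The two shapes can be compared with a toy printer. This is only a sketch: `LetShapes` and `Binding` are hypothetical names, and the code prints the pseudo-syntax used in this message rather than building javac's actual `LetExpr` tree nodes.

```java
import java.util.List;

final class LetShapes {
    // Hypothetical model of one op in a block: a name plus the text of its initializer.
    record Binding(String name, String init) {}

    // One let expression per op; the body of each is the remainder of the block.
    static String nested(List<Binding> ops, String result) {
        StringBuilder sb = new StringBuilder();
        for (Binding b : ops) {
            sb.append("let ").append(b.name()).append(" = ").append(b.init()).append(" in ");
        }
        return sb.append(result).toString();
    }

    // A single let expression holding every declaration; the body is just the block result.
    static String flat(List<Binding> ops, String result) {
        StringBuilder sb = new StringBuilder("let (");
        for (int i = 0; i < ops.size(); i++) {
            if (i > 0) sb.append("; ");
            Binding b = ops.get(i);
            sb.append(b.name()).append(" = ").append(b.init());
        }
        return sb.append(") in ").append(result).toString();
    }

    public static void main(String[] args) {
        List<Binding> ops = List.of(
                new Binding("op1", "<op1 init>"),
                new Binding("op2", "<op2 init>"),
                new Binding("op3", "<op3 init>"));
        System.out.println(nested(ops, "<result>"));
        System.out.println(flat(ops, "<result>"));
    }
}
```

In both forms the declarations appear in block order, so any side effects in the initializers keep their original sequencing; only the nesting depth differs.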

At the end of the day either adding extra variables (which can even be done as a pre-processing step, by javac), or using a more functional translation with `LetExpr` should work. 

P.S.
I looked at the code and, at least in some cases (`h`), `TestAddVarsWhenNecessary` seems to add intermediate `Var` ops that look redundant, as they are initialized with a function parameter. I'd expect javac to be able to deal with references to function parameters using a `JCIdent` pointing at the desired parameter.

-------------

PR Comment: https://git.openjdk.org/babylon/pull/305#issuecomment-2687617699