<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><span style="font-family:Arial,Helvetica,sans-serif">On Wed, Jul 27, 2022 at 1:41 PM Brian Goetz <<a href="mailto:brian.goetz@oracle.com">brian.goetz@oracle.com</a>> wrote:</span><br></div></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">


  <div>In ASM, you would use the "tree" API, to materialize the body into a

    random-access data structure.  This is a bit unfortunate, because

    (a) the tree API is much slower than the streaming API, and (b) it

    is also somewhat different from the streaming API.  (And mutable.) <br></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Indeed, I abandoned using ASM for parsing for this reason, in favor of a hand-written ByteBuffer-backed parser. It goes without saying that I would be pleased to use something else though. :-)</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>

    We intend to improve on that, by having the "materialized" API just

    be "put the elements in a list/tree structure".  For

    ClassModel/MethodModel, you can see the idea in play; you can stream

    the elements of a ClassModel, and you'll get methods, fields, etc,

    but you could also just call ClassModel::fields and it will

    materialize (and cache) a List<FieldModel> and return that. 

    What you want is the equivalent for CodeModel, which is conceptually

    similar but we are missing a few things.  <br>

    <br>

    You can of course call CodeModel::elementList and get a

    List<CodeElement> out, which includes the label targets

    inline.  What's missing is the ability to map labels to *list

    indexes*.  We know we want this, we made a stab at it in an early

    prototype, it was a mess (because some other things were a mess),

    but we would like to return to this. <br></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Great! I do especially like the lazy (or at the very least, potentially lazy) materialization, which was another contributing reason for leaving ASM behind on the parsing end.</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>

    This is related to a comment recently from Rafael, in that this

    works when we are traversing a *bound* CodeModel, but not a buffered

    code model (which might result from an intermediate stage of a

    transformation.)  If we are OK with making operations like bci()

    partial, we can address this by, say, defining a refined

    `Iterator<CodeElement>` that also has a bci() accessor.  This

    works when parsing, but not necessarily when transforming, but that

    might be OK. </div></blockquote><div><span style="font-family:arial,helvetica,sans-serif"></span><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">OK, I look forward to seeing whether or how this gets addressed.</div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>I think you may be mixing the Opcode and Instruction abstractions? 

    The `Opcode` abstraction is explicitly about bytecodes and

    bytecode-specific metadata, whereas an Instruction is an

    instantiation of an Opcode + operands.  (Some instructions, of

    course, have no operands (e.g., `iadd`); in this case, you'll

    notice the implementation has a singleton cache.) </div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Maybe I explained poorly but I was specifically thinking of Opcodes and their characteristics, independently of any particular realization of an instruction in a code model.</div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><blockquote type="cite"><div dir="ltr"><div><div style="font-family:arial,helvetica,sans-serif">Would it not

            make sense to make `Opcode` a sealed interface, with an enum

            for each opcode shape? </div>

        </div>

      </div>

    </blockquote>

    <br>

    We tried something like this early on.  It ran into the problem that

    switching over multiple enums in one switch is not supported.  So

    having multiple enums may be more rich in modeling, but clients pay

    a penalty -- multiple switches.  This didn't feel like a good

    trade.  (It is possible the API and implementation have evolved since

    then, to make this less problematic, but that would have to be

    established.)<br></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Ah, I had assumed that switching over multiple enums was addressed in the pattern-matching-switch update. Not having personally kept on top of the latest developments there, I had quickly sketched up a test in IntelliJ and it appeared to work so long as the static type of the switch argument was a sealed interface which permits only enum types *and* you happened to have static-imported the enum constant names (for syntactic reasons I suppose). But I didn't actually verify that this was allowed by spec and sure enough, `javac` rejects it as it does not consider the statically imported enum values to be constant expressions. Oh well.</div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><blockquote type="cite"><div dir="ltr"><div style="font-family:arial,helvetica,sans-serif">Obviously this

          is wandering dangerously close to the bikeshed borderline,

          however one other real-world advantage is that an enum

          constant in a more specific `*Opcode` subtype can store

          more useful information about itself that a consumer could

          use; for example, the opcode constant for `IFEQ` could have a

          method `complement` which yields `IFNE`, which can be useful

          for simplifying some code generators (and I can think of

          specific cases both within qbicc and within Quarkus where this

          would have been useful).</div>

      </div>

    </blockquote>

    <br>

    This method exists in the library as an Opcode -> Opcode method.<br></div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Ah yes, I found it in `BytecodeHelpers`, excellent. That's in the `impl` subpackage though, so it doesn't feel very "public". Perhaps that class could be moved into the `jdk.classfile` package?</div><br></div><div> </div></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr">- DML • he/him<br></div></div></div>
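<div>For illustration, here is a minimal sketch of the kind of `complement` method discussed above. It assumes a hypothetical `BranchOpcode` enum that covers only the single-operand conditional branches; it is not a `jdk.classfile` type and does not reproduce the signature of the helper in `BytecodeHelpers`.</div>

```java
// Hypothetical illustration only: BranchOpcode is NOT part of jdk.classfile.
// It models the six single-operand conditional branch opcodes of the JVM.
enum BranchOpcode {
    IFEQ, IFNE, IFLT, IFGE, IFGT, IFLE;

    /** Returns the opcode that branches exactly when this one does not. */
    BranchOpcode complement() {
        switch (this) {
            case IFEQ: return IFNE;
            case IFNE: return IFEQ;
            case IFLT: return IFGE;
            case IFGE: return IFLT;
            case IFGT: return IFLE;
            case IFLE: return IFGT;
            default: throw new AssertionError(this);
        }
    }
}

// Small driver that lists each opcode next to its complement.
class ComplementDemo {
    public static void main(String[] args) {
        for (BranchOpcode op : BranchOpcode.values()) {
            System.out.println(op + " -> " + op.complement());
        }
    }
}
```

<div>A code generator that wants to swap the taken and fall-through targets of a branch could call `complement()` on the opcode instead of keeping its own lookup table.</div>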