Another experience report
Brian Goetz
brian.goetz at oracle.com
Sun Aug 7 15:46:23 UTC 2022
Thanks again for the thoughtful feedback.
> Because of the high density
> of instanceof checks on instruction types, this was the one place
> where I wished for shorter class names.
We did consider a few alternatives to the XxxInstruction convention,
such as XxxOp. I'm open to rationalizing names at this point; the
current conventions were chosen early on to be clear and consistent, so
we could move on to actually implementing the right idioms, but it is
reasonable to circle back and question naming (for some bounded time)
now that the shape of the API is more settled.
(Irritating confounding factor: if you name the builder methods load(),
store(), etc, when you get to "return" and "instanceof", you have a
problem...)
> Another huge boon is `block()`. I call it just once in my code base,
> but it reduces the work of managing local slots, their lifetimes, and
> their debug information by an enormous amount.
These were added relatively late, especially the local-variable
management, and we could especially use more feedback (and test
coverage!) on that. There are some missing bits here too, since
`allocateLocal` doesn't automatically generate LVT/LVTT entries, and it
should. Feedback and contribution is welcome here.
> With regard to performance, I can only say that it is fast.
I'm sure authors of other frameworks will take that as "fighting words",
but yes, it is likely to be somewhat surprising to many that such an
allocation-heavy approach offers this degree of "fast enough" for most
use cases.
> ### Lower than I like
>
> The bulk of API usage stays on a single level of abstraction, for
> example working with *Desc entry points instead of *Entry. There are
> rare places where this slipped at little.
I'm not following what you're getting at here. Do you mean there are
gaps where we are not consistent about choices of Desc vs Entry? Or
simply that you stuck to the Desc level as a user? And if so, was there
friction here?
> The New*Array family of instructions gave me a hard time. I had
> forgotten that there are three of those, and there was some bumping
> around involved while re-learning this fact. (Btw, anewarray() accepts
> a primitive ClassDesc and converts it into a reference type, e.g. "I"
> to "LI;".) Two of the instructions have no ClassDesc factory, which
> meant I had to wrap the family into a intermediate representation node
> to carry essentially (ClassDesc,int) downstream to the CodeBuilder
> instance. I wonder if it is worthwhile to move the three under an
> umbrella interface, similar to what ConstantInstruction is doing.
Quite possibly. These instructions gave us a hard time too, because
they are so irregular in how they work. It is quite possible that there
is a simplification here. The constant instructions have a similar
abstraction-resistence, and we did go around a few times with how much
to lump and split there too. The main constraint is that the lowest
level modeling should map directly to the classfile spec, which means
any creativity around merging representations has to happen one level up.
When designing the groupings of instructions, there are a lot of forces
pulling in different directions. The main normalizing force is that
transformation is an emergent property of reading+writing, and we want
for the no-op transformation to do as little "lifting and lowering" work
as possible. This is why no XxxInstruction uses XxxDesc in its data
model; converting from CP entry -> descriptor string -> XxxDesc ->
descriptor string -> CP entry for a no-op transform is expensive and
pointless. We have to worry about this because the default
representation is a stream of element; if the elements are "too far"
from what's in the classfile, then the lifting and lowering overhead
will eat you.
If you try to merge semi-related instructions like the constant or
new-array instructions into a common representation, that representation
will not be the natural representation for some or all of the
instructions in the group. When we hit walls like this, we fall back to
splitting, and modeling the classfile spec directly, but there is
probably more we can do atop them. Suggestions welcome.
> The one place where I use labelToBci() is try/catch/finally. There is
> the special case of exceptionCatch() failing for an empty region, a
> condition that in turn can lead to handler blocks becoming
> unreachable. For me, the only robust way to deal with this to a)
> guard against an empty region by inspecting the bcis and b)
> subsequently omitting the invalid/unreachable parts.
This raises a good question about how much the library wants to do to
"fix" questionable bytecode. We already NOP out unreachable bytecode
(otherwise the verifier freaks out). Should we just silently drop catch
clauses associated with empty try blocks? (We won't know that they are
empty until after all the labels are resolved, so we can't usually
detect this at the point of emitting the catch entry.) What about when,
by the time we get to the end of generation, a label used in a
try-catch, or LVT[T], isn't bound? Should we throw, or just drop the
entry? I suspect one size does not fit all here and we have to design
some more options-handling.
> Another single use only is constantPool(), to go from a DMHD instance
> to CodeBuilder's (field|invoke)Instruction. This was a consequence of
> DMHD only providing the lookupDescriptor() as String and not as an MTD
> as well. With hindsight, it may have been better for me to recover
> the MTD from the String regardless, and to stay on the level of *Desc
> throughout.
Is there something missing that would bridge that for you?
> Finally, is there a way to decide between tableswitch and
> lookupswitch? Lacking something better, I'm trying to emulate this
> code here:
> https://github.com/openjdk/jdk-sandbox/blob/master/src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/Gen.java#L1320
Most compilers use a heuristic based on the size of the two
alternatives, comparing `(hi-lo) / count` to some density threshold.
You're mostly optimizing for bytecode size here, since when the JIT gets
its hands on it, it has its own heuristics.
> ### Lost in translation
>
> One feature I cannot duplicate with Classfile is try/catch/finally in
> expression position when the operand stack is not empty. The old
> bytecode generator dealt with this case by unwinding the operand stack
> into locals, evaluating the t/c/f, and then rebuilding the operand
> stack with the result on top. But to do this, one needs to know what
> the operand stack looks like at the point of the `try`.
Interesting point. We do not build stack maps as we go, so we don't
have our hands readily on the stack state. However, I could imagine an
overload of the try-catch builder that would let you feed it a
TypeKind[], and that would use allocateLocal to automate the push/pop
logic. This is something you should be able to build from outside the
library, too; this would be a good experiment try try. (You'd have to
manually compute the stack state.)
Cheers,
-Brian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/classfile-api-dev/attachments/20220807/a7cb1489/attachment-0001.htm>
More information about the classfile-api-dev
mailing list