New JEP: Classfile Processing API

Sun Jun 19 18:21:21 UTC 2022

Hello Per!

Nice to hear from you again.

> An aside (and not telling you anything new I'm sure) but a "fixup"
> post-processing pass is probably preferable to doing the
> code-generation twice: It's hard for the latter to use a mix of short
> and long jumps as needed.

Yeah, it's unfortunate either way.  It turns out that 99+% of methods 
have no long jumps, either because they're shorter than 32K, or are 
longer but lucky.  So the retry approach is based on an optimistic 
assumption; we save having to do the fixup + compression pass almost all 
the time, at the cost of running the generation twice a tiny fraction of 
the time.  (Of course, if you know a method is going large, it might 
also be reasonable to indicate that when you start generating, to skip 
over the first pass -- this is something worth looking at.)
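
To make the retry idea concrete, here is a minimal sketch. It is not the 
API proposed in the JEP: RetryingGenerator, Emitter, OffsetOverflow and 
generate are hypothetical names made up for illustration, and only 
backward gotos are handled so the example stays short. The first pass 
assumes every branch fits in a signed 16-bit offset; if one does not, the 
whole body is run again with wide branches.

```java
import java.io.ByteArrayOutputStream;
import java.util.function.Consumer;

public class RetryingGenerator {
    /** Thrown when a branch offset does not fit in a signed 16-bit field. */
    static class OffsetOverflow extends RuntimeException {}

    /** Hypothetical minimal emitter; supports nop and backward goto only. */
    static class Emitter {
        final boolean wide;
        final ByteArrayOutputStream out = new ByteArrayOutputStream();

        Emitter(boolean wide) { this.wide = wide; }

        int position() { return out.size(); }

        void nop() { out.write(0x00); }

        /** Emits a goto to an already-known (backward) target offset. */
        void gotoTarget(int target) {
            // Branch offsets are relative to the address of the branch opcode.
            int offset = target - position();
            if (wide) {
                out.write(0xC8);            // goto_w: 32-bit offset
                out.write(offset >> 24);
                out.write(offset >> 16);
                out.write(offset >> 8);
                out.write(offset);
            } else {
                if (offset < Short.MIN_VALUE || offset > Short.MAX_VALUE) {
                    throw new OffsetOverflow();
                }
                out.write(0xA7);            // goto: 16-bit offset
                out.write(offset >> 8);
                out.write(offset);
            }
        }
    }

    /** Runs the body optimistically; reruns it with wide branches on overflow. */
    static byte[] generate(Consumer<Emitter> body) {
        try {
            Emitter e = new Emitter(false);
            body.accept(e);
            return e.out.toByteArray();
        } catch (OffsetOverflow overflow) {
            Emitter e = new Emitter(true);  // rare second pass
            body.accept(e);
            return e.out.toByteArray();
        }
    }
}
```

The cost of this shape is that the body has to be safe to run twice, and 
the second pass uses wide branches everywhere rather than a mix of short 
and long ones.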

Even so, there are still fixups needed for forward branches since you 
don't know the offset until later (and have to hope that someone 
actually emits the corresponding label before the method is done). But 
these can be accumulated as you go and patched cheaply in place at the 
end if you commit to a specific branch offset width.
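
A rough sketch of that accumulate-and-patch scheme follows. Again this is 
illustrative only: ForwardBranchPatcher, Label, Fixup, gotoLabel, bind and 
finish are invented names, and the branch width is fixed at 16 bits. Each 
forward goto writes a two-byte placeholder and records where it sits; at 
the end every placeholder is overwritten in place, and an unbound label 
is reported as an error.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ForwardBranchPatcher {
    /** Hypothetical label: unbound (-1) until bind() records its offset. */
    static final class Label {
        int position = -1;
    }

    /** One pending patch: where the branch opcode sits and what it targets. */
    static final class Fixup {
        final int opcodeAt;
        final Label target;
        Fixup(int opcodeAt, Label target) {
            this.opcodeAt = opcodeAt;
            this.target = target;
        }
    }

    private byte[] code = new byte[64];
    private int length = 0;
    private final List<Fixup> fixups = new ArrayList<>();

    private void put(int b) {
        if (length == code.length) {
            code = Arrays.copyOf(code, length * 2);
        }
        code[length++] = (byte) b;
    }

    void nop() { put(0x00); }

    /** Emits goto with a placeholder offset and remembers it for later. */
    void gotoLabel(Label target) {
        fixups.add(new Fixup(length, target));
        put(0xA7);   // goto, committed to a 16-bit offset
        put(0);
        put(0);
    }

    /** Binds the label to the current end of the code. */
    void bind(Label label) { label.position = length; }

    /** Patches every recorded branch in place and returns the final bytes. */
    byte[] finish() {
        for (Fixup f : fixups) {
            if (f.target.position < 0) {
                throw new IllegalStateException("label was never bound");
            }
            int offset = f.target.position - f.opcodeAt;
            if (offset < Short.MIN_VALUE || offset > Short.MAX_VALUE) {
                throw new IllegalStateException("offset does not fit in 16 bits");
            }
            code[f.opcodeAt + 1] = (byte) (offset >> 8);
            code[f.opcodeAt + 2] = (byte) offset;
        }
        return Arrays.copyOf(code, length);
    }
}
```

Because the width is fixed up front, patching never moves any bytes, so 
no other offsets need to be recomputed.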

Cheers,
-Brian

> When gnu.bytecodes generates code the instructions are appended to a
> byte array in optimistic form. In addition "fixup" operations are
> appended to the fixup buffer (the arrays fixup_offset and
> fixup_labels).  Then when done emitting instructions we call
> processFixups, which iterates over the fixups (3 times!) to adjust the
> bytecode and assign final offsets to the labels.
>
> The processFixups method is complicated because it handles a number of
> issues at the same time: various optimizations (including moving some
> code blocks around), as well as assigning offsets to Labels.
>
> Feel free to get inspiration from:
>
> https://gitlab.com/kashell/Kawa/-/blob/master/gnu/bytecode/CodeAttr.java
>
> The code is a bit convoluted, written more for performance than
> readability. However, it is at least somewhat commented - and I'm
> happy to answer any questions.