Re: premain: Possible solutions to use runtime G1 region grain size in AOT code (JDK-8335440)
Andrew Dinn
adinn at redhat.com
Tue Jul 23 16:12:56 UTC 2024
Hi Vladimir,
On 18/07/2024 17:15, Vladimir Kozlov wrote:
> What about allocating word in CodeCache as we do for some intrinsics
> stubs tables? You will need to generate it only once and can use
> runtime_type relocation to access it.
I am looking into that now. I've been working on something else that
might interest you ...
> It is all about loading with existing relocation vs specialized
> relocation for immediate value (Option three).
> I would like to see how complex option three is.
I have an implementation of option 3 in my JDK-8335440-aot-reloc branch.
N.b. it is based on a slightly out of date premain, but the
implementation shows what is involved even without a rebase (I'll do
that soon):
https://github.com/openjdk/leyden/compare/premain...adinn:leyden:JDK-8335440-aot-reloc?expand=1
Basically this solution emits an aot_reloc for the GC barrier right shift
immediate instruction when we are generating AOT code (StoreCachedCode
== true). When loading AOT code (LoadCachedCode == true) any right shift
immediate tagged with an aot_reloc has its operand patched with the log
region grain size from the current JVM. I use a format field to identify
which reloc is required, so the same model will support other AOT load
time relocs if need be.
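
The store-time tagging and load-time patching described above can be sketched with a toy model. This is not the code in the branch: the instruction encoding, the CodeBlob and AotReloc types, and all function names are hypothetical and only illustrate how a format field can select the patch to apply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical format field: selects which load-time patch applies.
enum class AotRelocFormat : uint8_t { LogRegionGrainShift = 0 };

// Hypothetical reloc record: where the instruction is and how to patch it.
struct AotReloc {
    std::size_t offset;     // index of the tagged instruction in the buffer
    AotRelocFormat format;
};

// Toy encoding (not a real ISA): the low 6 bits hold the shift immediate.
constexpr uint32_t kShiftImmMask = 0x3Fu;
constexpr uint32_t kToyShiftOpcode = 0xD3000000u;  // arbitrary placeholder

struct CodeBlob {
    std::vector<uint32_t> code;
    std::vector<AotReloc> relocs;
};

// Store side: emit the shift and, when generating AOT code, tag it.
void emit_grain_shift(CodeBlob& blob, unsigned log_grain, bool store_cached_code) {
    if (store_cached_code) {
        blob.relocs.push_back({blob.code.size(), AotRelocFormat::LogRegionGrainShift});
    }
    blob.code.push_back(kToyShiftOpcode | (log_grain & kShiftImmMask));
}

// Load side: rewrite each tagged immediate with the value of the current JVM.
void patch_aot_relocs(CodeBlob& blob, unsigned runtime_log_grain) {
    for (const AotReloc& r : blob.relocs) {
        assert(r.offset < blob.code.size());
        switch (r.format) {
        case AotRelocFormat::LogRegionGrainShift:
            blob.code[r.offset] = (blob.code[r.offset] & ~kShiftImmMask)
                                | (runtime_log_grain & kShiftImmMask);
            break;
        }
    }
}

// Helper to read back the immediate of a toy shift instruction.
unsigned shift_imm(uint32_t insn) { return insn & kShiftImmMask; }
```

In this sketch an instruction emitted without a tag is never touched at load time, which is why the tag has to be recorded when the code is generated.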
The implementation makes it fairly obvious that we could use the same
technique of tagging with an aot_reloc at StoreCachedCode time and
patching at LoadCachedCode time for any other instruction (or
indivisible instruction sequence) which 1) encodes some runtime constant
from the original JVM and 2) is able to be reset by patching it to use
the value in the new JVM.
It is worth noting that for both C1 and C2 generated code I had to set a
tag on the LIR node (c1_LIROp in C1, URShiftNode in C2) in order to mark
it as a relocatable instruction. Later on, in the generation phase, I
detect the mark and emit a reloc to the code buffer. This works because
none of the LIR transforms modify the right shift node by merging it into
some other operation or by merging another operation into it (I checked
but this is a contingent fact based on the current state of the code).
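
A toy illustration of why this depends on the node surviving untouched, assuming a hypothetical node type with a boolean mark (ToyShiftNode, ToyEmitter and fold_shifts are invented for this sketch and are not C1 or C2 classes): the emitter records a reloc only when it sees the mark, and a transform that builds a fresh node silently drops it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical IR node: a shift by a constant plus the AOT mark.
struct ToyShiftNode {
    unsigned shift_amount;
    bool aot_patchable;  // set when the barrier is built
};

// Hypothetical back end: emits the shift and a reloc if the mark is present.
struct ToyEmitter {
    std::vector<unsigned> code;
    std::vector<std::size_t> reloc_offsets;

    void emit(const ToyShiftNode& n) {
        if (n.aot_patchable) {
            reloc_offsets.push_back(code.size());
        }
        code.push_back(n.shift_amount);
    }
};

// A toy transform that folds two consecutive shifts into one new node.
// The new node is not marked, so the emitter will not emit a reloc for it.
ToyShiftNode fold_shifts(const ToyShiftNode& a, const ToyShiftNode& b) {
    return ToyShiftNode{a.shift_amount + b.shift_amount, false};
}
```

If a transform like fold_shifts ever applied to the marked node, the AOT code would keep the immediate from the original JVM with no reloc to correct it.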
Clearly, in the case of barrier patching we could finesse the above
problem by generating the GC barrier late enough to bypass graph
normalization and back end reductions. However, if we want to use a
similar technique to AOT patch other instructions or sequences then we
will need a more reliable way of ensuring that the relocatable
instructions are not replaced or merged during normalization/final
reduction.
regards,
Andrew Dinn
-----------