Re: premain: Possible solutions to use runtime G1 region grain size in AOT code (JDK-8335440)

Tue Jul 23 17:44:57 UTC 2024

 > The implementation makes it fairly obvious that we could use the same technique of tagging with an aot_reloc at
 > StoreCachedCode time and patching at LoadCachedCode time for any other instruction (or indivisible instruction sequence)
 > which 1) encodes some runtime constant from the original JVM and 2) is able to be reset by patching it to use the value
 > in the new JVM.

I agree. As we discussed at the meeting, we may use it to patch compressed oops/klass code too.
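
As a rough illustration of the tag-at-store / patch-at-load idea, here is a toy model in C++. It is not the Leyden implementation: the code buffer is just a byte vector, and `AotReloc`, `AotConstKind`, `RuntimeConstants` and `patch_cached_code` are hypothetical names invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical: which runtime constant a tagged immediate encodes.
enum class AotConstKind : std::uint8_t { RegionGrainShift, CompressedOopShift };

// Hypothetical record saved next to the cached code at store time.
struct AotReloc {
  std::size_t  imm_offset;  // byte offset of the immediate operand in the code
  AotConstKind kind;        // selects the constant to patch in at load time
};

// Values of the constants in the JVM that is loading the cached code.
struct RuntimeConstants {
  std::uint8_t region_grain_shift;
  std::uint8_t compressed_oop_shift;
};

// Load-time pass: overwrite every tagged immediate with the current value.
// Bytes that were not tagged are left untouched.
inline void patch_cached_code(std::vector<std::uint8_t>& code,
                              const std::vector<AotReloc>& relocs,
                              const RuntimeConstants& now) {
  for (const AotReloc& r : relocs) {
    switch (r.kind) {
      case AotConstKind::RegionGrainShift:
        code.at(r.imm_offset) = now.region_grain_shift;
        break;
      case AotConstKind::CompressedOopShift:
        code.at(r.imm_offset) = now.compressed_oop_shift;
        break;
    }
  }
}
```

The `kind` field plays the role of the format value: one reloc type, with the kind selecting which constant gets written.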

We have the same issue with card_table_address. I actually don't know why the current AOT code works. Maybe it is because
the value is the same in the training and production runs. But we need to fix it too.

It is a constant node which is embedded into an AddP node:
https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/gc/shared/c2/cardTableBarrierSetC2.cpp#L96

I marked it only in platform specific code:
https://github.com/openjdk/leyden/blob/premain/src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp#L68
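
To make the risk concrete, here is a toy C++ model of a card index computed from a baked-in, biased base. It is only a sketch: `kCardShift`, `card_bias` and `card_index` are hypothetical names, the 512-byte card size is an assumption, and indices stand in for the real pointer arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr unsigned kCardShift = 9;  // assumption: 512-byte cards

// Bias folded into the table base so that (addr >> shift) indexes directly.
constexpr std::uint64_t card_bias(std::uint64_t heap_start) {
  return heap_start >> kCardShift;
}

// Card index as seen by code that has `baked_in_bias` embedded as a constant.
constexpr std::uint64_t card_index(std::uint64_t baked_in_bias,
                                   std::uint64_t addr) {
  return (addr >> kCardShift) - baked_in_bias;
}
```

With a training heap starting at 0x10000 and a production heap starting at 0x20000, the address 0x20400 maps to card 2 with the production bias but to card 130 with the stale training bias. If the two runs happen to get the same base, the stale constant is harmless, which would be consistent with the current code appearing to work.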

Be careful with using `format()` in shared code because it depends on the platform-specific setting `format_width` in
`relocInfo_<arch>.hpp` (32-bit ARM has format_width = 0).
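
A small C++ sketch of why a zero format width matters. The layout below is a toy (type, then format, then offset packed into 16 bits) and `ToyRelocLayout` is a hypothetical name; the real bit layout and per-platform widths live in the HotSpot sources and are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Toy reloc word: [ type | format | offset ], with configurable field widths.
struct ToyRelocLayout {
  unsigned format_width;  // bits available for the format field
  unsigned offset_width;  // bits available for the offset field

  // True if `format` fits in the format field without touching the type bits.
  bool can_encode_format(unsigned format) const {
    return format < (1u << format_width);
  }

  std::uint16_t pack(unsigned type, unsigned format, unsigned offset) const {
    assert(can_encode_format(format));
    assert(offset < (1u << offset_width));
    return static_cast<std::uint16_t>(
        (type << (format_width + offset_width)) |
        (format << offset_width) | offset);
  }

  unsigned format_of(std::uint16_t word) const {
    return (word >> offset_width) & ((1u << format_width) - 1u);
  }
};
```

With a width of 0 only the format value 0 can be stored, so shared code that relies on distinct format values to tell relocs apart has nothing to work with on such a platform.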

Thanks,
Vladimir K

On 7/23/24 9:12 AM, Andrew Dinn wrote:
> Hi Vladimir,
> 
> On 18/07/2024 17:15, Vladimir Kozlov wrote:
>> What about allocating word in CodeCache as we do for some intrinsics stubs tables? You will need to generate it only 
>> once and can use runtime_type relocation to access it.
> 
> I am looking into that now. I've been working on something else that might interest you ...
> 
>> It is all about loading with existing relocation vs specialized relocation for immediate value (Option three).
>> I would like to see how complex option three is.
> I have an implementation of option 3 in my JDK-8335440-aot-reloc branch. n.b. it is based on a slightly out-of-date
> premain but the implementation indicates what is involved even without a rebase (I'll do that soon):
> 
> 
> https://github.com/openjdk/leyden/compare/premain...adinn:leyden:JDK-8335440-aot-reloc?expand=1
> 
> Basically this solution emits an aot_reloc for the GC barrier right shift immediate instruction when we are generating
> AOT code (StoreCachedCode == true). When loading AOT code (LoadCachedCode == true) any right shift immediate tagged with
> an aot_reloc has its operand patched with the log region grain size from the current JVM. I use a format field to
> identify what reloc is required so the same model will support other AOT load-time relocs if need be.
> 
> The implementation makes it fairly obvious that we could use the same technique of tagging with an aot_reloc at 
> StoreCachedCode time and patching at LoadCachedCode time for any other instruction (or indivisible instruction sequence) 
> which 1) encodes some runtime constant from the original JVM and 2) is able to be reset by patching it to use the value 
> in the new JVM.
> 
> It is worth noting that for both C1 and C2 generated code I had to set a tag on the LIR node (c1_LIROp in C1,
> URShiftNode in C2) in order to mark it as a relocatable instruction. Later on, in the generation phase, I detect the
> mark and emit a reloc to the code buffer. This works because none of the LIR transforms modify the shift node by
> merging it into some other operation or by merging another operation into it (I checked, but this is a contingent fact
> based on the current state of the code).
> 
> Clearly, in the case of barrier patching we could finesse the above problem by generating the GC barrier late enough to 
> bypass graph normalization and back end reductions. However, if we want to use a similar technique to AOT patch other 
> instructions or sequences then we will need a more reliable way of ensuring that the relocatable instructions are not 
> replaced or merged during normalization/final reduction.
> 
> regards,
> 
> 
> Andrew Dinn
> -----------
>