Re: premain: Possible solutions to use runtime G1 region grain size in AOT code (JDK-8335440)

Vladimir Kozlov vladimir.kozlov at oracle.com
Wed Jul 24 19:43:27 UTC 2024


Thank you for checking the code for card_table_address.

 > Oh, that's a nuisance. I was concerned that I was perhaps using the wrong reloc model here. Should I be encoding info
 > about what/how to patch the target instruction(s) using reloc data?

I am currently thinking about adding an indication of what a relocation points to: blob, stub, external address, or
string. Currently we look through all our address tables in SCCache to find a match. It would be nice if we could get
this information from the relocation. I also thought about using format for it, but realized it is platform dependent
(PD) and reserves bits in the RelocInfo word. On the other hand, we can add another 12 (31 - 18 - 1) new relocations
without using additional bits.

In short, add a new relocation if you need it.
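
To make the idea of tagging the relocation target concrete, here is a rough sketch and not the premain code. All names
(TargetKind, AddressTables, lookup_untagged, lookup_tagged) are hypothetical, and the tables are plain vectors standing
in for the SCCache address tables. It only contrasts searching every table with going straight to the table named by a
kind tag carried in the relocation.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical tag for what a relocation points to (kinds taken from the text above).
enum class TargetKind : std::uint8_t { Blob, Stub, ExternalAddress, String, Count };

struct TableHit {
  TargetKind kind;
  std::size_t index;
};

// Stand-in for the address tables: one flat table per target kind (an assumption).
struct AddressTables {
  std::vector<std::uintptr_t> tables[static_cast<std::size_t>(TargetKind::Count)];
};

inline std::optional<std::size_t> find_in(const std::vector<std::uintptr_t>& t, std::uintptr_t addr) {
  for (std::size_t i = 0; i < t.size(); ++i) {
    if (t[i] == addr) return i;
  }
  return std::nullopt;
}

// Without a tag, every table has to be searched until a match is found.
inline std::optional<TableHit> lookup_untagged(const AddressTables& at, std::uintptr_t addr) {
  for (std::size_t k = 0; k < static_cast<std::size_t>(TargetKind::Count); ++k) {
    if (auto i = find_in(at.tables[k], addr)) return TableHit{static_cast<TargetKind>(k), *i};
  }
  return std::nullopt;
}

// With a tag in the relocation, only the table for that kind is searched.
inline std::optional<TableHit> lookup_tagged(const AddressTables& at, TargetKind kind, std::uintptr_t addr) {
  if (auto i = find_in(at.tables[static_cast<std::size_t>(kind)], addr)) return TableHit{kind, *i};
  return std::nullopt;
}

// The arithmetic quoted above for the number of relocation types that can still be added.
static_assert(31 - 18 - 1 == 12, "free relocation type slots");
```

Whether the kind is carried by a new relocation type or by some other field is left open here; the sketch only shows
the lookup side.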

Thanks,
Vladimir K

On 7/24/24 7:40 AM, Andrew Dinn wrote:
> On 23/07/2024 18:44, Vladimir Kozlov wrote:
>> I agree. As we discussed in the meeting, we may use it to patch compressed oops/klass code too.
> 
> Yes, we could potentially do something similar to adjust the shift and base adjustment employed when generating a narrow 
> oop or klass encode or decode. However, that would only be useful within certain limits.
> 
> 1) We can only cater for a change from one narrow-oop/klass base+shift configuration to another base+shift 
> configuration. Patching code to switch from narrow oops/klass pointers to full width (or vice versa) would require 
> adjusting not just the oop/klass load/store but also other instructions that hard-wire header/payload field sizes and 
> offsets based on the narrow/full width layout assumptions. Likewise it would require resizing and repacking objects 
> stored in the CDS mapped heap section.
> 
> 2) Even if we restrict ourselves to the case where we retain narrow oops and simply allow for a change of base and/or 
> shift (i.e. keep the object layouts the same), that will only work if pointers installed in mapped CDS heap objects are 
> also rewritten to use the runtime encoding.
> 
> 3) When generating AOT code for the encode and decode we would have to allow for the worst case i.e. leave room (nops) 
> for both heap-base adjustment and shift even when they are not needed at generate time.
> 
>> We have the same issue with card_table_address. I actually don't know why the current AOT code works. Maybe because
>> it is the same value in training and production runs. But we need to fix it too.
> 
>> It is a constant node which is embedded into an AddP node:
>> https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/gc/shared/c2/cardTableBarrierSetC2.cpp#L96
> 
> I believe C2 is ok. On AArch64 the C2 back end has an operand called immByteMapBase that matches any use of 
> byte_map_base as a pointer constant. A corresponding rule ensures that any attempt to load that constant pointer is done 
> by calling
> 
>    __ load_byte_map_base()
> 
> and the macro assembler uses an ExternalAddress when StoreCachedCode is set.
> 
>> I marked it only in platform specific code:
>> https://github.com/openjdk/leyden/blob/premain/src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp#L68
> 
> I also think there is no mystery as to why this works on AArch64. The equivalent code in 
> cardTableBarrierSetAssembler_aarch64.cpp also calls
> 
>    __ load_byte_map_base()
> 
> However, that said, looking at the card table write barrier it seems the card table shift is another value that the 
> user might decide to reconfigure between assembly and production runs. The shift is derived from GCCardTableSize, which 
> is a GC global command line option and so is susceptible to being reset.
> 
> We could use an aot_reloc with a different format to tag card table shift lsr instructions and patch them at shared code 
> load to use the current runtime GCCardTableSize. Of course, this is not as urgent a problem as the grain size shift 
> because the card table size only ever changes thanks to an explicit command line request whereas the grain size may be 
> changed ergonomically when -Xmx is passed.
> 
>> Be careful with using `format()` in shared code because it depends on platform specific setting `format_width` in 
>> `relocInfo_<arch>.hpp` (32-bit arm has format_width = 0).
> 
> Oh, that's a nuisance. I was concerned that I was perhaps using the wrong reloc model here. Should I be encoding info 
> about what/how to patch the target instruction(s) using reloc data?
> 
> regards,
> 
> 
> Andrew Dinn
> -----------
> 
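
For the card table shift idea in the quoted message, a minimal sketch of what patching an AArch64 lsr at code load
could look like. It assumes the barrier uses the 64-bit form `lsr Xd, Xn, #imm`, which is an alias of UBFM with
imms = 63, so only the immr field needs rewriting. The function name patch_lsr_shift and the constants are hypothetical
and not part of the premain code; a real implementation would also need the relocation to locate the instruction and
would have to flush the instruction cache after patching.

```cpp
#include <cstdint>
#include <optional>

// 64-bit UBFM layout: sf=1 opc=10 100110 N=1 | immr[21:16] | imms[15:10] | Rn[9:5] | Rd[4:0]
constexpr std::uint32_t kUbfm64Mask = 0xFFC00000u;  // sf, opc, fixed opcode bits and N
constexpr std::uint32_t kUbfm64Bits = 0xD3400000u;
constexpr std::uint32_t kImmsMask   = 0x0000FC00u;  // imms == 63 marks the LSR alias
constexpr std::uint32_t kImmrMask   = 0x003F0000u;  // immr holds the shift amount
constexpr unsigned      kImmrShift  = 16;

// Returns the instruction word with its shift amount replaced, or nullopt if the
// word is not a 64-bit lsr (immediate) or the requested shift is out of range.
inline std::optional<std::uint32_t> patch_lsr_shift(std::uint32_t insn, unsigned new_shift) {
  const bool is_lsr64 = (insn & kUbfm64Mask) == kUbfm64Bits && (insn & kImmsMask) == kImmsMask;
  if (!is_lsr64 || new_shift > 63) return std::nullopt;
  return (insn & ~kImmrMask) | (static_cast<std::uint32_t>(new_shift) << kImmrShift);
}
```

For example, `lsr x0, x1, #9` encodes as 0xD349FC20; patching it to a shift of 10 gives 0xD34AFC20, with the register
fields untouched.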


More information about the leyden-dev mailing list