Re: premain: Possible solutions to use runtime G1 region grain size in AOT code (JDK-8335440)
Vladimir Kozlov
vladimir.kozlov at oracle.com
Wed Jul 24 19:43:27 UTC 2024
Thank you for checking code for card_table_address.
> Oh, that's a nuisance. I was concerned that I was perhaps using the wrong reloc model here. Should I be encoding info
> about what/how to patch the target instruction(s) using reloc data?
I am currently thinking about adding an indication of what a relocation points to: blob, stubs, external address, string.
Currently we look through all our address tables in SCCache to find a match. It would be nice if we could get this
information from the relocation. I also thought about using format for it, but realized it is platform dependent (PD)
and it reserves bits in the RelocInfo word. On the other hand, we can add another 12 (31 - 18 - 1) new relocation types
without using additional bits.
In short, add a new relocation if you need it.
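
The headroom arithmetic above can be sketched as follows. This is only an illustration: the two constants are
assumptions read off the "(31 - 18 - 1)" expression, and all names are hypothetical rather than copied from
relocInfo.hpp.

```cpp
#include <cassert>

// Hypothetical constants inferred from the "(31 - 18 - 1)" arithmetic;
// they are not taken from the HotSpot sources.
constexpr int kTypeMask        = 31;  // assumed: largest value the type field can hold
constexpr int kHighestUsedType = 18;  // assumed: highest relocation type value already assigned

// Number of type values still free if the mask value itself stays reserved.
constexpr int free_reloc_types(int type_mask, int highest_used) {
  return type_mask - highest_used - 1;
}

// Under these assumptions the unassigned values are 19..30.
static_assert(free_reloc_types(kTypeMask, kHighestUsedType) == 12,
              "expected 12 free relocation type values");
```

If the assumptions hold, a new relocation type fits in the existing type field and does not depend on the
per-platform format bits.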
Thanks,
Vladimir K
On 7/24/24 7:40 AM, Andrew Dinn wrote:
> On 23/07/2024 18:44, Vladimir Kozlov wrote:
>> I agree. As we discussed at the meeting, we may use it to patch compressed oops/klass code too.
>
> Yes, we could potentially do something similar to adjust the shift and base adjustment employed when generating a narrow
> oop or klass encode or decode. However, that would only be useful within certain limits.
>
> 1) We can only cater for a change from one narrow-oop/klass base+shift configuration to another base+shift
> configuration. Patching code to switch from narrow oops/klass pointers to full width (or vice versa) would require
> adjusting not just the oop/klass load/store but also other instructions that hard-wire header/payload field sizes and
> offsets based on the narrow/full width layout assumptions. Likewise it would require resizing and repacking objects
> stored in the CDS mapped heap section.
>
> 2) Even if we restrict ourselves to the case where we retain narrow oops and simply allow for a change of base and/or
> shift (i.e. keep the object layouts the same), that will only work if pointers installed in mapped CDS heap objects are
> also rewritten to use the runtime encoding.
>
> 3) When generating AOT code for the encode and decode we would have to allow for the worst case i.e. leave room (nops)
> for both heap-base adjustment and shift even when they are not needed at generate time.
>
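
Point 2 above can be illustrated with a minimal sketch of base+shift compression. The struct and function names
are hypothetical, not HotSpot types. Null handling is omitted, and the sketch assumes the object stays at the same
full-width address, so that only the encoding changes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a narrow-oop configuration; not a HotSpot type.
struct NarrowConfig {
  uint64_t base;   // value subtracted from the address before shifting
  unsigned shift;  // number of low-order alignment bits dropped
};

// Compress a full-width address (assumed to lie at or above base and to be aligned).
inline uint32_t encode(uint64_t addr, const NarrowConfig& c) {
  assert(addr >= c.base);
  return static_cast<uint32_t>((addr - c.base) >> c.shift);
}

// Expand a narrow value back to a full-width address.
inline uint64_t decode(uint32_t narrow, const NarrowConfig& c) {
  return c.base + (static_cast<uint64_t>(narrow) << c.shift);
}

// A narrow pointer stored under the dump-time configuration has to be decoded
// with that configuration and re-encoded with the runtime one.
inline uint32_t reencode(uint32_t narrow, const NarrowConfig& dump,
                         const NarrowConfig& runtime) {
  return encode(decode(narrow, dump), runtime);
}
```

The same narrow bits mean different addresses under different configurations, which is why stored pointers in
mapped objects cannot simply be left untouched when the base or shift changes.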
>> We have the same issue with card_table_address. I actually don't know why the current AOT code works. Maybe it is
>> because it is the same value in training and production runs. But we need to fix it too.
>
>> It is a constant node which is embedded into the AddP node:
>> https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/gc/shared/c2/cardTableBarrierSetC2.cpp#L96
>
> I believe C2 is ok. On AArch64 the C2 back end has an operand called immByteMapBase that matches any use of
> byte_map_base as a pointer constant. A corresponding rule ensures that any attempt to load that constant pointer is done
> by calling
>
> __ load_byte_map_base()
>
> and the macro assembler uses an ExternalAddress when StoreCachedCode is set.
>
>> I marked it only in platform specific code:
>> https://github.com/openjdk/leyden/blob/premain/src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp#L68
>
> I also think there is no mystery as to why this works on AArch64. The equivalent code in
> cardTableBarrierSetAssembler_aarch64.cpp also calls
>
> __ load_byte_map_base()
>
> However, that said, looking at the barrier card table write it seems the card table shift is another value that the user
> might decide to reconfigure between assembly and production runs. The shift is derived from GCCardTableSize, which is a
> GC global command line option and so is susceptible to being reset.
>
> We could use an aot_reloc with a different format to tag card table shift lsr instructions and patch them at shared code
> load to use the current runtime GCCardTableSize. Of course, this is not as urgent a problem as the grain size shift
> because the card table size only ever changes thanks to an explicit command line request whereas the grain size may be
> changed ergonomically when -Xmx is passed.
>
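
A rough sketch of what patching a card table shift could look like, assuming the tagged instruction is a 64-bit
AArch64 `lsr Xd, Xn, #imm` (an alias of `ubfm` with imms = 63) and that the card size is a power of two. The function
names are hypothetical and this is not the premain implementation.

```cpp
#include <cassert>
#include <cstdint>

// log2 of a power-of-two card size in bytes (for example 512 -> 9).
inline unsigned card_shift_for(uint32_t card_size_bytes) {
  assert(card_size_bytes != 0 && (card_size_bytes & (card_size_bytes - 1)) == 0);
  unsigned shift = 0;
  while ((card_size_bytes >> shift) != 1) ++shift;
  return shift;
}

// True if insn encodes a 64-bit "lsr Xd, Xn, #imm" (UBFM with sf=1, N=1, imms=63).
inline bool is_lsr64_imm(uint32_t insn) {
  return (insn & 0xFFC0FC00u) == 0xD340FC00u;
}

// Rewrite the shift amount (the immr field, bits 21..16), keeping both registers.
inline uint32_t patch_lsr64_shift(uint32_t insn, unsigned new_shift) {
  assert(is_lsr64_imm(insn) && new_shift < 64);
  return (insn & ~0x003F0000u) | (static_cast<uint32_t>(new_shift) << 16);
}
```

For example, an instruction generated for a 512-byte card (shift 9) would be rewritten at load time with
`patch_lsr64_shift(insn, card_shift_for(1024))` if the production run uses 1024-byte cards.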
>> Be careful with using `format()` in shared code because it depends on platform specific setting `format_width` in
>> `relocInfo_<arch>.hpp` (32-bit arm has format_width = 0).
>
> Oh, that's a nuisance. I was concerned that I was perhaps using the wrong reloc model here. Should I be encoding info
> about what/how to patch the target instruction(s) using reloc data?
>
> regards,
>
>
> Andrew Dinn
> -----------
>