[External] : Re: premain: Possible solutions to use runtime G1 region grain size in AOT code (JDK-8335440)
Andrew Dinn
adinn at redhat.com
Wed Jul 24 14:40:38 UTC 2024
On 23/07/2024 18:44, Vladimir Kozlov wrote:
> I agree. As we discussed on meeting we may use it to patch compressed
> oops/klass code too.
Yes, we could potentially do something similar to adjust the base and
shift employed when generating a narrow oop or klass encode or decode.
However, that would only be useful within certain limits.
1) We can only cater for a change from one narrow-oop/klass base+shift
configuration to another base+shift configuration. Patching code to
switch from narrow oops/klass pointers to full width (or vice versa)
would require adjusting not just the oop/klass load/store but also other
instructions that hard-wire header/payload field sizes and offsets based
on the narrow/full width layout assumptions. Likewise it would require
resizing and repacking objects stored in the CDS mapped heap section.
2) Even if we restrict ourselves to the case where we retain narrow oops
and simply allow for a change of base and/or shift (i.e. keep the object
layouts the same), that will only work if pointers installed in mapped
CDS heap objects are also rewritten to use the runtime encoding.
3) When generating AOT code for the encode and decode we would have to
allow for the worst case, i.e. leave room (nops) for both the heap-base
adjustment and the shift even when they are not needed at generate time
(see the sketch below).
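For illustration, a minimal sketch of what the generator might do for
the decode, assuming we invent a dedicated reloc to tag the site
(narrow_oop_shift_Relocation and max_decode_size are made-up names, not
anything in the premain tree):

  // Sketch only: always reserve the worst-case decode footprint and tag
  // the site so the cached-code loader can rewrite it for the production
  // base/shift.

  #define __ masm->

  void aot_decode_heap_oop(MacroAssembler* masm, Register dst, Register src) {
    // Illustrative guess at the worst-case footprint (base != 0, shift != 0).
    const int max_decode_size = 3 * NativeInstruction::instruction_size;

    address start = __ pc();
    __ relocate(narrow_oop_shift_Relocation::spec());  // hypothetical reloc
    __ decode_heap_oop(dst, src);
    // If the generate-time configuration emits a shorter sequence (e.g. a
    // plain lsl, or nothing at all, when base == 0) pad with nops so a
    // load-time patch always has room to rewrite the full sequence.
    while (__ pc() < start + max_decode_size) {
      __ nop();
    }
  }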
> We have the same issue with card_table_address. I actually don't know
> why current AOT code works??? May be because it is the same value in
> training and production runs. But we need to fix it too.
> It is constant node which is embedded into AddP node:
> https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/gc/shared/c2/cardTableBarrierSetC2.cpp#L96
I believe C2 is ok. On AArch64 the C2 back end has an operand called
immByteMapBase that matches any use of byte_map_base as a pointer
constant. A corresponding rule ensures that any attempt to load that
constant pointer is done by calling
__ load_byte_map_base()
and the macro assembler uses an ExternalAddress when StoreCachedCode is set.
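For reference, the helper has roughly the following shape (paraphrased
from memory rather than quoted from the premain sources, so treat the
exact flag test and the ExternalAddress form as approximate):

  // Paraphrased sketch of MacroAssembler::load_byte_map_base on AArch64.
  void MacroAssembler::load_byte_map_base(Register reg) {
    CardTable::CardValue* byte_map_base =
      ((CardTableBarrierSet*)(BarrierSet::barrier_set()))->card_table()->byte_map_base();
    if (StoreCachedCode) {
      // Emit a relocatable external address so the cached code can be
      // rebound to the production run's card table when it is loaded.
      lea(reg, ExternalAddress((address)byte_map_base));
    } else {
      // byte_map_base is not necessarily a real address, so just
      // materialise the 64-bit constant.
      mov(reg, (uint64_t)byte_map_base);
    }
  }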
> I marked it only in platform specific code:
> https://github.com/openjdk/leyden/blob/premain/src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp#L68
I also think there is no mystery as to why this works on AArch64. The
equivalent code in cardTableBarrierSetAssembler_aarch64.cpp also calls
__ load_byte_map_base()
That said, looking at the barrier card table write, it seems the card
table shift is another value that the user might decide to reconfigure
between assembly and production runs. The shift is derived from
GCCardTableSize, which is a GC global command line option and so is
susceptible to being reset.
We could use an aot_reloc with a different format to tag card table
shift lsr instructions and patch them when the shared code is loaded to
use the runtime GCCardTableSize. Of course, this is not as urgent a
problem as the grain size shift, because the card table size only ever
changes in response to an explicit command line request, whereas the
grain size may be changed ergonomically when -Xmx is passed.
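A sketch of how that tagging might look in the AArch64 barrier
assembler; the aot_Relocation spec and its card_shift_format value are
invented names, and the surrounding instructions just paraphrase the
existing store_check sequence:

  #define __ masm->

  // Sketch only: tag the card-index shift so the loader can rewrite the
  // lsr immediate when the production card size (and hence card_shift)
  // differs from the assembly-time value.
  void store_check_sketch(MacroAssembler* masm, Register obj) {
    if (StoreCachedCode) {
      __ relocate(aot_Relocation::spec(aot_Relocation::card_shift_format)); // hypothetical
    }
    __ lsr(obj, obj, CardTable::card_shift()); // candidate for load-time patching
    __ load_byte_map_base(rscratch1);          // already relocatable, as above
    __ strb(zr, Address(obj, rscratch1));      // dirty the card
  }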
> Be careful with using `format()` in shared code because it depends on
> platform specific setting `format_width` in `relocInfo_<arch>.hpp`
> (32-bit arm has format_width = 0).
Oh, that's a nuisance. I was concerned that I was perhaps using the
wrong reloc model here. Should I be encoding info about what/how to
patch the target instruction(s) using reloc data?
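For concreteness, a minimal sketch of what I mean by that, modelled on
how oop_Relocation packs its oop_index via pack_data_to()/unpack_data()
(the aot_Relocation name and its kind enum are purely illustrative):

  // Illustrative only: carry the patch kind in the relocation data
  // stream, which works the same on every platform regardless of
  // format_width.
  class aot_Relocation : public Relocation {
   public:
    enum Kind { grain_size_shift = 0, card_table_shift = 1 };

    static RelocationHolder spec(Kind kind) {
      RelocationHolder rh = newHolder();
      new(rh) aot_Relocation(kind);
      return rh;
    }

    // Same packing scheme as oop_Relocation uses for its oop_index.
    void pack_data_to(CodeSection* dest) {
      short* p = (short*) dest->locs_end();
      p = pack_1_int_to(p, (jint) _kind);
      dest->set_locs_end((relocInfo*) p);
    }

    void unpack_data() {
      _kind = (Kind) unpack_1_int();
    }

    Kind kind() const { return _kind; }

   private:
    Kind _kind;
    // The real constructor would pass the new reloc's relocType to the
    // Relocation base class; elided here.
    aot_Relocation(Kind kind) : _kind(kind) { }
  };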
regards,
Andrew Dinn