[External] : Re: premain: Possible solutions to use runtime G1 region grain size in AOT code (JDK-8335440)

Andrew Dinn adinn at redhat.com
Wed Jul 24 14:40:38 UTC 2024


On 23/07/2024 18:44, Vladimir Kozlov wrote:
> I agree. As we discussed on meeting we may use it to patch compressed 
> oops/klass code too.

Yes, we could potentially do something similar to adjust the base and 
shift used when generating a narrow oop or klass encode or decode. 
However, that would only be useful within certain limits.

1) We can only cater for a change from one narrow-oop/klass base+shift 
configuration to another base+shift configuration. Patching code to 
switch from narrow oop/klass pointers to full width (or vice versa) 
would require adjusting not just the oop/klass loads and stores but also 
other instructions that hard-wire header/payload field sizes and offsets 
based on the narrow/full-width layout assumptions. Likewise, it would 
require resizing and repacking objects stored in the CDS mapped heap 
section.

2) Even if we restrict ourselves to the case where we retain narrow oops 
and simply allow for a change of base and/or shift (i.e. keep the object 
layouts the same), that will only work if pointers installed in mapped 
CDS heap objects are also rewritten to use the runtime encoding.

3) When generating AOT code for the encode and decode we would have to 
allow for the worst case, i.e. leave room (nops) for both the heap-base 
adjustment and the shift, even when they are not needed at generate 
time.
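
To make 2) and 3) concrete, the arithmetic involved is just a decode 
with one base+shift pair followed by an encode with another. Below is a 
minimal standalone sketch (the names are illustrative only, not the 
HotSpot implementation): a pointer installed in a mapped CDS heap object 
would be rewritten by decoding with the assembly-time configuration and 
re-encoding with the runtime configuration, and the AOT encode/decode 
sequences have to reserve room for both the base add and the shift even 
when one of them happens to be a no-op at generate time.

   #include <cstdint>

   // Illustrative only -- not the HotSpot implementation.
   static inline uintptr_t decode(uint32_t narrow, uintptr_t base, int shift) {
     return base + ((uintptr_t)narrow << shift);
   }

   static inline uint32_t encode(uintptr_t full, uintptr_t base, int shift) {
     return (uint32_t)((full - base) >> shift);
   }

   // Rewrite a pointer embedded in a mapped CDS heap object from the
   // assembly-time base+shift to the runtime base+shift.
   static inline uint32_t re_encode(uint32_t narrow,
                                    uintptr_t old_base, int old_shift,
                                    uintptr_t new_base, int new_shift) {
     return encode(decode(narrow, old_base, old_shift), new_base, new_shift);
   }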

> We have the same issue with card_table_address. I actually don't know 
> why current AOT code works??? May be because it is the same value in 
> training and production runs. But we need to fix it too.

> It is constant node which is embedded into AddP node:
> https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/gc/shared/c2/cardTableBarrierSetC2.cpp#L96

I believe C2 is ok. On AArch64 the C2 back end has an operand called 
immByteMapBase that matches any use of byte_map_base as a pointer 
constant. A corresponding rule ensures that any attempt to load that 
constant pointer is done by calling

   __ load_byte_map_base()

and the macro assembler uses an ExternalAddress when StoreCachedCode is set.
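
In outline that helper does something like the following -- a sketch 
only, with the StoreCachedCode test and the ExternalAddress/mov split 
shown as I understand the intent rather than as the exact premain code:

   // Sketch only. When StoreCachedCode is set the byte_map_base constant
   // is materialised via an ExternalAddress so that a relocation is
   // recorded and the value can be re-resolved when the AOT code is
   // loaded, rather than being baked in as an immediate.
   void MacroAssembler::load_byte_map_base(Register reg) {
     CardTable::CardValue* byte_map_base =
       ((CardTableBarrierSet*)BarrierSet::barrier_set())->card_table()->byte_map_base();
     if (StoreCachedCode) {
       lea(reg, ExternalAddress((address)byte_map_base));
     } else {
       mov(reg, (uint64_t)byte_map_base);
     }
   }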

> I marked it only in platform specific code:
> https://github.com/openjdk/leyden/blob/premain/src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp#L68

I also think there is no mystery as to why this works on AArch64. The 
equivalent code in cardTableBarrierSetAssembler_aarch64.cpp also calls

   __ load_byte_map_base()

That said, looking at the card table write barrier it seems the card 
table shift is another value that the user might decide to reconfigure 
between assembly and production runs. The shift is derived from 
GCCardTableSize, which is a GC global command-line option and so 
susceptible to being reset.
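
For context, the shift matters because the post-write barrier uses it to 
index the card table. A minimal standalone sketch with illustrative 
names (not the HotSpot code):

   #include <cstdint>

   // Illustrative only: the post-write barrier dirties the card covering
   // the stored-to address. With 512-byte cards the shift is 9; if the
   // card size is reconfigured between assembly and production runs, a
   // shift baked into AOT code indexes the wrong card.
   static void post_write_barrier(uint8_t* byte_map_base, void* field_addr,
                                  int card_shift /* log2(card size) */) {
     byte_map_base[(uintptr_t)field_addr >> card_shift] = 0; // dirty card
   }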

We could use an aot_reloc with a different format to tag card table 
shift lsr instructions and patch them at shared code load time to use 
the current runtime GCCardTableSize. Of course, this is not as urgent a 
problem as the grain size shift, because the card table size only ever 
changes in response to an explicit command-line request, whereas the 
grain size may be changed ergonomically when -Xmx is passed.
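
Whatever reloc format carries it, the patch itself is a small bit-field 
rewrite: on AArch64 an lsr-immediate is an alias of UBFM with the shift 
held in the immr field, bits [21:16] of the instruction word. A sketch 
of just that bit-level step, independent of the reloc plumbing:

   #include <cstdint>

   // Rewrite the shift amount of a 64-bit AArch64 "lsr Xd, Xn, #imm"
   // (UBFM alias: immr = shift, imms = 63). The shift occupies bits [21:16].
   static uint32_t patch_lsr_shift(uint32_t insn, unsigned new_shift) {
     return (insn & ~(0x3Fu << 16)) | ((new_shift & 0x3Fu) << 16);
   }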

> Be careful with using `format()` in shared code because it depends on 
> platform specific setting `format_width` in `relocInfo_<arch>.hpp` 
> (32-bit arm has format_width = 0).

Oh, that's a nuisance. I was concerned that I was perhaps using the 
wrong reloc model here. Should I be encoding info about what/how to 
patch the target instruction(s) using reloc data?

regards,


Andrew Dinn
-----------


