RFR: 8319690: [AArch64] C2 compilation hits offset_ok_for_immed: assert "c2 compiler bug" [v3]
Dean Long
dlong at openjdk.org
Thu Jun 13 23:20:13 UTC 2024
On Thu, 13 Jun 2024 22:14:45 GMT, Dean Long <dlong at openjdk.org> wrote:
>> Thanks for your review, @dean-long .
>>
>> Yes, based on the current implementation of our ADL compiler, even if we match on "DecodeN regn", using `$reg` is safe and perhaps even must.
>>
>> When ADLC is parsing operand interface from `indOffIX`, it always fetches useful information from the **first** match rule `match(AddP reg off)` and does not care about others, even though we have multiple match rules.
>>
>> See https://github.com/openjdk/jdk/blob/1e04ee6d57d5fe84e1d202b16e8d13dc13c002ff/src/hotspot/share/adlc/formssel.cpp#L2461 and https://github.com/openjdk/jdk/blob/1e04ee6d57d5fe84e1d202b16e8d13dc13c002ff/src/hotspot/share/adlc/output_c.cpp#L3025.
>>
>> It searches `reg` in `match(AddP reg off);` and finds that `reg` is the `first` one in all components, which is like `regn` is the `first` in `match(AddP (DecodeN regn) off);`. Then it concludes that the **first** operand starting from `oper_input_base()` is the base address input. In the stage of `emit()`, the node structure has been reduced into like:
>>
>> Load === ctrl mem reg val
>> Load === ctrl mem regn val
>>
>> `off` is saved on Operand field.
>>
>> The final JVM code will be shown as:
>>
>> void loadLNode::emit(C2_MacroAssembler* masm, PhaseRegAlloc* ra_) const {
>> // Start at oper_input_base() and count operands
>> unsigned idx0 = 2;
>> unsigned idx1 = 2; // mem
>> {
>>
>> #line 2914 "/home/feigao02/chelsea/jdk_src/src/hotspot/cpu/aarch64/aarch64.ad"
>>
>> Register dst_reg = as_Register(opnd_array(0)->reg(ra_,this)/* dst */);
>> loadStore(masm, &MacroAssembler::ldr, dst_reg, opnd_array(1)->opcode(),
>> as_Register(opnd_array(1)->base(ra_,this,idx1)), opnd_array(1)->index(ra_,this,idx1), opnd_array(1)->scale(), opnd_array(1)->disp(ra_,this,idx1), 8);
>>
>> #line 999999
>> }
>> }
>>
>>
>>
>> virtual int base(PhaseRegAlloc *ra_, const Node *node, int idx) const {
>> // Replacement variable: reg
>> return (int)ra_->get_encode(node->in(idx));
>> }
>>
>>
>> To be honest, `$reg` here is a little confusing but, IMO, it may represent a relative index. WDYT? Thanks.
>
> Yes, this is confusing. It's unfortunate that we can't optimize the Decode nodes in the IR, but that would require platform-specific C2 code.
>
> This decode optimization reminds me of the iRegIorL2I optimization. Can't we use the same trick here?
Something like this:
// Special operand allowing iRegN to be treated as iRegP
operand iRegN2P(iRegN regn) %{
predicate(CompressedOops::shift() == 0);
op_cost(0);
match(DecodeN regn);
format %{ "decode($regn)" %}
interface(REG_INTER)
%}
// Pointer Register
operand iRegP()
%{
constraint(ALLOC_IN_RC(ptr_reg));
[...]
match(iRegN2P);
[...]
I tried using an opclass like iRegIorL2I, but adlc can't seem to handle an opclass before all operands are defined.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16991#discussion_r1639038426
More information about the hotspot-compiler-dev
mailing list