RFR: 8319690: [AArch64] C2 compilation hits offset_ok_for_immed: assert "c2 compiler bug" [v3]

Dean Long dlong at openjdk.org
Thu Jun 13 22:17:13 UTC 2024


On Fri, 31 May 2024 14:41:10 GMT, Fei Gao <fgao at openjdk.org> wrote:

>> src/hotspot/cpu/aarch64/aarch64.ad line 5193:
>> 
>>> 5191:   constraint(ALLOC_IN_RC(ptr_reg));
>>> 5192:   match(AddP reg off);
>>> 5193:   match(AddP (DecodeN regn) off);
>> 
>> I'm surprised this works.  If we match on "DecodeN regn", is it really safe to use $reg instead?
>
> Thanks for your review, @dean-long .
> 
> Yes, based on the current implementation of our ADL compiler, even if we match on "DecodeN regn", using `$reg` is safe and perhaps even required.
> 
> When ADLC parses the operand interface of `indOffIX`, it always takes the information it needs from the **first** match rule, `match(AddP reg off)`, and ignores the others, even when there are multiple match rules.
> 
> See https://github.com/openjdk/jdk/blob/1e04ee6d57d5fe84e1d202b16e8d13dc13c002ff/src/hotspot/share/adlc/formssel.cpp#L2461 and https://github.com/openjdk/jdk/blob/1e04ee6d57d5fe84e1d202b16e8d13dc13c002ff/src/hotspot/share/adlc/output_c.cpp#L3025. 
> 
> It searches for `reg` in `match(AddP reg off);` and finds that `reg` is the first of all components, just as `regn` is the first in `match(AddP (DecodeN regn) off);`. It then concludes that the **first** operand starting from `oper_input_base()` is the base address input. By the `emit()` stage, the node structure has been reduced to something like:
> 
> Load === ctrl mem reg  val 
> Load === ctrl mem regn val 
> 
> `off` is saved in an operand field.
> 
> The final generated JVM code looks like this:
> 
> void loadLNode::emit(C2_MacroAssembler* masm, PhaseRegAlloc* ra_) const {
>   // Start at oper_input_base() and count operands
>   unsigned idx0 = 2;
>   unsigned idx1 = 2; 	// mem
>   {
> 
> #line 2914 "/home/feigao02/chelsea/jdk_src/src/hotspot/cpu/aarch64/aarch64.ad"
> 
>     Register dst_reg = as_Register(opnd_array(0)->reg(ra_,this)/* dst */);
>     loadStore(masm, &MacroAssembler::ldr, dst_reg, opnd_array(1)->opcode(),
>                as_Register(opnd_array(1)->base(ra_,this,idx1)), opnd_array(1)->index(ra_,this,idx1), opnd_array(1)->scale(), opnd_array(1)->disp(ra_,this,idx1), 8);
>   
> #line 999999
>   }
> }
> 
> 
> 
>   virtual int base(PhaseRegAlloc *ra_, const Node *node, int idx) const { 
>     // Replacement variable: reg
>     return (int)ra_->get_encode(node->in(idx));
>   }
> 
> 
> To be honest, `$reg` here is a little confusing but, IMO, it may represent a relative index. WDYT? Thanks.

Yes, this is confusing.  It's unfortunate that we can't optimize the Decode nodes in the IR, but that would require platform-specific C2 code.

This decode optimization reminds me of the iRegIorL2I optimization.  Can't we use the same trick here?
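
For readers following the quoted explanation, the position-based lookup it describes can be illustrated with a small toy model. This is only a sketch and not ADLC code: `MatchRule`, `position_in_first_rule` and `input_index` are hypothetical names, and the model simplifies by treating each match rule as a flat list of leaf names. It shows why `$reg` resolves to the same node input whether the first or the second match rule was the one that matched.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model, not ADLC code: a match rule reduced to its leaf
// operand names, in the order they appear in the rule.
using MatchRule = std::vector<std::string>;

// Return the position of `name` among the leaves of the FIRST rule only,
// or -1 if it does not occur there. Later rules are never consulted.
int position_in_first_rule(const std::vector<MatchRule>& rules,
                           const std::string& name) {
  if (rules.empty()) return -1;
  const MatchRule& first = rules.front();
  for (std::size_t i = 0; i < first.size(); ++i) {
    if (first[i] == name) return static_cast<int>(i);
  }
  return -1;
}

// Map a leaf position to a node input index by counting from the operand's
// starting input (2 for `mem` in the quoted loadLNode::emit, idx1 = 2).
// Simplification: this is only meaningful for the base register at position 0.
int input_index(int oper_input_base, int position) {
  assert(oper_input_base >= 0 && position >= 0);
  return oper_input_base + position;
}
```

With the two rules of `indOffIX` modelled as `{"reg", "off"}` and `{"regn", "off"}`, looking up `reg` yields position 0, and the leaf at position 0 of the second rule is `regn`. The name is resolved once against the first rule and then used purely as a position, which matches the "relative index" reading suggested in the quoted reply.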

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16991#discussion_r1639003010

