RFR: 8319690: [AArch64] C2 compilation hits offset_ok_for_immed: assert "c2 compiler bug" [v3]

Dean Long dlong at openjdk.org
Thu Jun 13 23:20:13 UTC 2024


On Thu, 13 Jun 2024 22:14:45 GMT, Dean Long <dlong at openjdk.org> wrote:

>> Thanks for your review, @dean-long .
>> 
>> Yes, based on the current implementation of our ADL compiler, even if we match on "DecodeN regn", using `$reg` is safe and perhaps even must.
>> 
>> When ADLC is parsing operand interface from `indOffIX`, it always fetches useful information from the **first** match rule `match(AddP reg off)` and does not care about others, even though we have multiple match rules.
>> 
>> See https://github.com/openjdk/jdk/blob/1e04ee6d57d5fe84e1d202b16e8d13dc13c002ff/src/hotspot/share/adlc/formssel.cpp#L2461 and https://github.com/openjdk/jdk/blob/1e04ee6d57d5fe84e1d202b16e8d13dc13c002ff/src/hotspot/share/adlc/output_c.cpp#L3025. 
>> 
>> It searches `reg` in `match(AddP reg off);` and finds that `reg` is the `first` one in all components, which is like `regn` is the `first` in `match(AddP (DecodeN regn) off);`. Then it concludes that the **first** operand starting from `oper_input_base()` is the base address input. In the stage of `emit()`, the node structure has been reduced into like:
>> 
>> Load === ctrl mem reg  val 
>> Load === ctrl mem regn val 
>> 
>> `off` is saved on Operand field.
>> 
>> The final JVM code will be shown as:
>> 
>> void loadLNode::emit(C2_MacroAssembler* masm, PhaseRegAlloc* ra_) const {
>>   // Start at oper_input_base() and count operands
>>   unsigned idx0 = 2;
>>   unsigned idx1 = 2; 	// mem
>>   {
>> 
>> #line 2914 "/home/feigao02/chelsea/jdk_src/src/hotspot/cpu/aarch64/aarch64.ad"
>> 
>>     Register dst_reg = as_Register(opnd_array(0)->reg(ra_,this)/* dst */);
>>     loadStore(masm, &MacroAssembler::ldr, dst_reg, opnd_array(1)->opcode(),
>>                as_Register(opnd_array(1)->base(ra_,this,idx1)), opnd_array(1)->index(ra_,this,idx1), opnd_array(1)->scale(), opnd_array(1)->disp(ra_,this,idx1), 8);
>>   
>> #line 999999
>>   }
>> }
>> 
>> 
>> 
>>   virtual int base(PhaseRegAlloc *ra_, const Node *node, int idx) const { 
>>     // Replacement variable: reg
>>     return (int)ra_->get_encode(node->in(idx));
>>   }
>> 
>> 
>> To be honest, `$reg` here is a little confusing but, IMO, it may represent a relative index. WDYT? Thanks.
>
> Yes, this is confusing.  It's unfortunate that we can't optimize the Decode nodes in the IR, but that would require platform-specific C2 code.
> 
> This decode optimization reminds me of the iRegIorL2I optimization.  Can't we use the same trick here?

Something like this:

// Special operand allowing iRegN to be treated as iRegP
operand iRegN2P(iRegN regn) %{
  predicate(CompressedOops::shift() == 0);

  op_cost(0);

  match(DecodeN regn);

  format %{ "decode($regn)" %}

  interface(REG_INTER)
%}

// Pointer Register
operand iRegP()
%{
  constraint(ALLOC_IN_RC(ptr_reg));
[...]
  match(iRegN2P);
[...]


I tried using an opclass like iRegIorL2I, but adlc can't seem to handle an opclass before all operands are defined.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16991#discussion_r1639038426


More information about the hotspot-compiler-dev mailing list