RFR: 8319690: [AArch64] C2 compilation hits offset_ok_for_immed: assert "c2 compiler bug" [v2]

Tue May 28 10:01:08 UTC 2024

On Fri, 24 May 2024 08:54:40 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>> 
>>  - Remove unused immIOffset/immLOffset
>>  - Merge branch 'master' into fg8319690
>>  - 8319690: [AArch64] C2 compilation hits offset_ok_for_immed: assert "c2 compiler bug"
>>    
>>    On LP64 systems, if the heap can be moved into low virtual
>>    address space (below 4GB) and the heap size is smaller than the
>>    interesting threshold of 4 GB, we can use unscaled decoding
>>    pattern for narrow klass decoding. It means that a generic field
>>    reference can be decoded by:
>>    ```
>>    cast<64> (32-bit compressed reference) + field_offset
>>    ```
>>    
>>    When the `field_offset` is an immediate, on aarch64 platform, the
>>    unscaled decoding pattern can match perfectly with a direct
>>    addressing mode, i.e., `base_plus_offset`, supported by LDR/STR
>>    instructions. But for certain data width, not all immediates can
>>    be encoded in the instruction field of LDR/STR[1]. The ranges are
>>    different as data widths vary.
>>    
>>    For example, when we try to load a value of long type at offset of
>>    `1030`, the address expression is `(AddP (DecodeN base) 1030)`.
>>    Before the patch, the expression was matching with
>>    `operand indOffIN()`. But, for 64-bit LDR/STR, signed immediate
>>    byte offset must be in the range -256 to 255 or positive immediate
>>    byte offset must be a multiple of 8 in the range 0 to 32760[2].
>>    `1030` can't be encoded in the instruction field. So, after
>>    matching, when we do checking for instruction encoding, the
>>    assertion would fail.
>>    
>>    In this patch, we're going to filter out invalid immediates
>>    when deciding if current addressing mode can be matched as
>>    `base_plus_offset`. We introduce `indOffIN4/indOffLN4` and
>>    `indOffIN8/indOffLN8` for 32-bit data type and 64-bit data
>>    type separately in the patch. E.g., for `memory4`, we remove
>>    the generic `indOffIN/indOffLN`, which matches wrong unscaled
>>    immediate range, and replace them with `indOffIN4/indOffLN4`
>>    instead.
>>    
>>    Since 8-bit and 16-bit LDR/STR instructions also support the
>>    unscaled decoding pattern, we add the addressing mode in the
>>    lists of `memory1` and `memory2` by introducing
>>    `indOffIN1/indOffLN1` and `indOffIN2/indOffLN2`.
>>    
>> ...
>
> test/hotspot/jtreg/compiler/c2/aarch64/TestUnalignedAccessCompressedOops.java line 35:
> 
>> 33:  * @library /test/lib
>> 34:  * @modules java.base/jdk.internal.misc
>> 35:  * @requires os.arch=="aarch64" & vm.compiler2.enabled
> 
> I would remove these two lines. Because who knows, maybe some other platform has similar issues down the road. Or maybe graalVM has a bug that we could catch with this.

I'll continue processing this patch. I can remove it on next update if you don't mind it.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16991#discussion_r1616943068