RFR: 8319690: [AArch64] C2 compilation hits offset_ok_for_immed: assert "c2 compiler bug"

Wed Dec 6 09:59:50 UTC 2023

On LP64 systems, if the heap can be moved into low virtual address space (below 4GB) and the heap size is smaller than the interesting threshold of 4 GB, we can use unscaled decoding pattern for narrow klass decoding. It means that a generic field reference can be decoded by:

cast<64> (32-bit compressed reference) + field_offset

When the `field_offset` is an immediate, on aarch64 platform, the unscaled decoding pattern can match perfectly with a direct addressing mode, i.e., `base_plus_offset`, supported by `LDR/STR` instructions. But for certain data width, not all immediates can be encoded in the instruction field of `LDR/STR` [[1]](https://github.com/openjdk/jdk/blob/8db7bad992a0f31de9c7e00c2657c18670539102/src/hotspot/cpu/aarch64/assembler_aarch64.inline.hpp#L33). The ranges are different as data widths vary.

For example, when we try to load a value of long type at offset of `1030`, the address expression is `(AddP (DecodeN base) 1030)`. Before the patch, the expression was matching with `operand indOffIN()`. But, for 64-bit `LDR/STR`, signed immediate byte offset must be in the range -256 to 255 or positive immediate byte offset must be a multiple of 8 in the range 0 to 32760 [[2]](https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/LDR--immediate---Load-Register--immediate--?lang=en). `1030` can't be encoded in the instruction field. So, after matching, when we do checking for instruction encoding, the assertion would fail.

In this patch, we're going to filter out invalid immediates when deciding if current addressing mode can be matched as `base_plus_offset`. We introduce `indOffIN4/indOffLN4` and `indOffIN8/indOffLN8` for 32-bit data type and 64-bit data type separately in the patch. E.g., for `memory4`, we remove the generic `indOffIN/indOffLN`, which matches wrong unscaled immediate range, and replace them with `indOffIN4/indOffLN4` instead.

Since 8-bit and 16-bit `LDR/STR` instructions also support the unscaled decoding pattern, we add the addressing mode in the lists of `memory1` and `memory2` by introducing `indOffIN1/indOffLN1` and `indOffIN2/indOffLN2`.

We also remove unused operands `indOffI/indOffl/indOffIN/indOffLN` to avoid misuse.

Tier 1-3 passed on aarch64.

-------------

Commit messages:
 - 8319690: [AArch64] C2 compilation hits offset_ok_for_immed: assert "c2 compiler bug"

Changes: https://git.openjdk.org/jdk/pull/16991/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16991&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8319690
  Stats: 297 lines in 2 files changed: 262 ins; 28 del; 7 mod
  Patch: https://git.openjdk.org/jdk/pull/16991.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16991/head:pull/16991

PR: https://git.openjdk.org/jdk/pull/16991