RFR: 8351627: C2 AArch64 ROR/ROL: assert((1 << ((T>>1)+3)) > shift) failed: Invalid Shift value [v2]
Xiaohong Gong
xgong at openjdk.org
Mon Mar 24 02:09:15 UTC 2025
On Fri, 21 Mar 2025 10:07:48 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
> Hmm so your patch adds in an extra node. It probably does not cost much, but I'd like to be sure that it's needed. Is there any case where we now have wrong results on master? Because I could not find one, only the assert on `aarch64`. But given that you are adding the `AndI` node, it seems there should be wrong results, right? Can you find a test for that?
Actually as I mentioned the variable vector case in the commit message, there would be not wrong results for the rotate operation, if `AndI` is not added besides the assertion. The shift count value that may overflow here is only `0` (other values are safe).
For vector variables as shift counts, the masking can be safely omitted because:
Vector shift instructions that take a vector register as the shift count may not automatically apply modulo arithmetic based on register size. When the shift count is 32 for int type, the result may be either zeros or src. However, this doesn't affect correctness for rotate since the final result is combined with src using a logical OR operation.
Another reason besides the AArch64 assertion that `AndI` is necessary here is: it finally generate the shift IR which has a duplicate `0` as the shift count. And the whole vector shift IR can be optimized out. See the transformation below:
(LShiftVI src 32) -> (LShiftVI src 0) -> src
> Actually, I have a question. Below, there is this section:
>
> if (!is_binary_vector_op) {
> shiftLCnt = phase->transform(new LShiftCntVNode(shiftLCnt, vt));
> shiftRCnt = phase->transform(new RShiftCntVNode(shiftRCnt, vt));
> }
>Can you tell me what this is for? Maybe it is something else.
`LShiftCntVNode` and `RShiftCntVNode` are used to generate a vector shift count IR from the scalar `shiftLCnt`/`shiftRCnt` IR. On AArch64, they are the same with `ReplicateNode`. This is necessary, because the shift vector IR requires two vector inputs.]
> I see that I'm doing the same AndI trick in SuperWord, so maybe it is needed in general:
Yes, this meets the java spec definition for scalar shift operation.
> Generally, I think we need better annotations in vectornode.hpp. So for example it would be nice if you could document the assumptions about the input above ShiftVNode. Do we expect the shift value to be in a specific range? Or is it wrapped like the scalar shift operator << and >>?
I think it depends on the instruction definition. On AArch64, the scalar shift instruction [1] will modulo the element size it self, which meets the java scalar shift operator `<<` and `>>`, while the vector shift instructions [2] will not do it. I think this is also the reason why an explicit `and` is needed in superword. As a summary, for vector shift IR, we do not have the expectation the shift value is in a specific range.
[1] https://developer.arm.com/documentation/ddi0602/2024-12/Base-Instructions/LSL--register---Logical-shift-left--register---an-alias-of-LSLV-?lang=en
[2] https://developer.arm.com/documentation/ddi0602/2024-12/SVE-Instructions/LSL--vectors---Logical-shift-left-by-vector--predicated--?lang=en
Another solution of this issue: maybe we could specially handle the case for shift cnt of constant `0`. Do you think that would be better?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24051#issuecomment-2746686827
More information about the hotspot-compiler-dev
mailing list