RFR: 8282528: AArch64: Incorrect replicate2L_zero rule

Eric Liu eliu at openjdk.java.net
Thu Mar 24 12:16:47 UTC 2022


On Thu, 24 Mar 2022 09:47:06 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> I think we need a `can_encode(imm, arrangement)` function.
>
> And then another function that actually does the arranging, and the generation of instructions calls those functions.

Thanks for your review. I agree that a `can_encode(imm, arrangement)` function would be better. My concern is that this JBS issue is just a bug fix for the replicate2L_imm backend rule. For the other SIMD_Arrangement values, I found that there are alternative choices for code generation, but I didn't touch them in this patch, to keep it clear and small. I list two examples below.

Example 1:

        movi    v16.4s, #0x34
        orr     v16.4s, #0x12, lsl #8

        vs

        mov     w8, #0x1234
        dup     v16.4s, w8


Example 2:

        movi    v16.4s, #0x78
        orr     v16.4s, #0x56, lsl #8
        orr     v16.4s, #0x34, lsl #16
        orr     v16.4s, #0x12, lsl #24

        vs

        mov     w14, #0x5678
        movk    w14, #0x1234, lsl #16
        dup     v16.4s, w14
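
As a rough way to compare the two sequences in the examples above, the sketch below counts the instructions each strategy needs for a 32-bit lane value. The helper names `count_movi_orr` and `count_mov_dup` are made up for illustration and are not HotSpot code. The model is deliberately simplified: it ignores movn, logical-immediate mov, mvni and the MSL forms of movi, so it only approximates what a macro assembler could emit. It assumes C++14 or later.

```cpp
#include <cstdint>

// Hypothetical helper (not part of HotSpot): instructions needed by the
// movi + orr strategy for a 32-bit lane value. One instruction per non-zero
// byte, and at least one (a plain movi) when the value is zero.
constexpr int count_movi_orr(uint32_t imm) {
  int n = 0;
  for (int shift = 0; shift < 32; shift += 8) {
    if (((imm >> shift) & 0xffu) != 0) {
      n++;
    }
  }
  return n == 0 ? 1 : n;
}

// Hypothetical helper (not part of HotSpot): instructions needed by the
// mov/movk + dup strategy. One mov or movk per non-zero 16-bit halfword
// (at least one), plus one dup to broadcast the general register.
constexpr int count_mov_dup(uint32_t imm) {
  int n = 0;
  for (int shift = 0; shift < 32; shift += 16) {
    if (((imm >> shift) & 0xffffu) != 0) {
      n++;
    }
  }
  return (n == 0 ? 1 : n) + 1;
}

// The two immediates from the examples above.
static_assert(count_movi_orr(0x1234u) == 2, "movi + one orr");
static_assert(count_mov_dup(0x1234u) == 2, "mov + dup");
static_assert(count_movi_orr(0x12345678u) == 4, "movi + three orr");
static_assert(count_mov_dup(0x12345678u) == 3, "mov + movk + dup");
```

Under this model the two sequences tie at two instructions for 0x1234, while for 0x12345678 the mov/movk/dup sequence needs three instructions against four. Instruction count alone does not decide which sequence is faster, since moving a value from a general register into a vector register may carry its own latency, so measurement is still needed.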


I'm considering measuring the performance and refining the mov macro assembler if necessary. `can_encode` could also be done as part of that refinement work. What do you think?

-------------

PR: https://git.openjdk.java.net/jdk/pull/7939


More information about the hotspot-dev mailing list