RFR: 8339507: Test generation tool and gtest for testing APX encoding of extended gpr instructions [v4]

Jatin Bhateja jbhateja at openjdk.org
Thu Oct 17 13:44:26 UTC 2024


On Wed, 16 Oct 2024 21:13:57 GMT, hanklo6 <duke at openjdk.org> wrote:

>> Add test generation tool and gtest for testing APX encoding of instructions with extended general-purpose registers.
>> 
>> Intel Advanced Performance Extensions (APX) doubles the number of general-purpose registers, from 16 to 32. For more information about APX, see https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html.
>> 
>> By specification, instruction encoding remains unchanged for instructions using only the lower 16 GPRs. For cases where one or more instruction operands reference extended GPRs (Egprs), encoding targets either REX2, an extension of REX encoding, or an extended version of EVEX encoding. These new encoding schemes extend or modify existing instruction prefixes only when Egprs are used.
>> 
>> ### Generate test instructions
>> With `binutils = 2.43`
>> * `python3 x86-asmtest.py > asmtest.out.h`
>> ### Run test
>> *  `make test TEST="gtest:AssemblerX86"`
>
> hanklo6 has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision:
> 
>  - Merge branch 'master' of https://git.openjdk.java.net/jdk into apx-test-tool
>  - Add comment and defined
>  - Add copyright header
>  - Remove tab
>  - Remove whitespace
>  - Replace whitespace with tab
>  - Add flag before testing
>  - Fix assertion error on MacOS
>  - Add _LP64 flag
>  - Add missing header
>  - ... and 6 more: https://git.openjdk.org/jdk/compare/cbb32376...ca48f240

test/hotspot/gtest/x86/asmtest.out.h line 1:

> 1: // BEGIN  Generated code -- do not edit

All the memory operand instructions being validated are checking for only one kind of memory addressing mode which is 
 `- BASE + INDEX`
We should also check for following flavors for at least some instructions :-

-   BASE  
-   INDEX * SCALE + DISPLACEMENT
-   BASE + INDEX + DISPLACEMENT
-   BASE  + INDEX * SCALE + DISPLACEMENT


Where BASE and INDEX are EGPRs.

test/hotspot/gtest/x86/asmtest.out.h line 1:

> 1: // BEGIN  Generated code -- do not edit

Can you also emit the instruction IDs in the comments against each row in insns_strs and  insns_lens tables, it 
e.g. 


// Generated by x86-asmtest.py
    __ shldl(rcx, rdx);                                //    {load}shld ecx, edx     IID0
    __ shldl(rdx, rbx);                                //    {load}shld edx, ebx    IID1
......
.....
  static const uint8_t insns[] =
  {
    0x0f, 0xa5, 0xd1,               // IID0
    0x0f, 0xa5, 0xda,               // IID1
   ...
   static const unsigned int insns_lens[] =
  {
    3,     // IID0
    3,     // IID1
#ifdef _LP64
   ......
  static const char* insns_strs[] =
  {
    "__ shldl(rcx, rdx);",    // IID0
    "__ shldl(rdx, rbx);",   // IID1
#ifdef _LP64

It will ease correlating and manually inspecting these statically emitted tables.

test/hotspot/gtest/x86/test_assemblerx86.cpp line 44:

> 42:   // Different encoding for GCC and OpenJDK
> 43:   {"shll", {'\xd3', '\xd1'}},
> 44:   {"shlq", {'\xd3', '\xd1'}},

For the record.

// For single register operand salq, C2 assumes shift will be passed through CL register and emits the encoding with opcode set to  'D3".
void Assembler::shlq(Register dst) {
  int encode = prefixq_and_encode(dst->encoding());
  emit_int16((unsigned char)0xD3, (0xE0 | encode));
}

// With immediate shift operand we explicitly handle special case of shift by '1' bit... and emit D1 opcode.
void Assembler::shlq(Register dst, int imm8) {
  assert(isShiftCount(imm8 >> 1), "illegal shift count");
  int encode = prefixq_and_encode(dst->encoding());
  if (imm8 == 1) {
    emit_int16((unsigned char)0xD1, (0xE0 | encode));
  } else {
    emit_int24((unsigned char)0xC1, (0xE0 | encode), imm8);
  }
}

So, GCC toolchain is following a different convention than C2,  but both are emitting correct encodings.
Our test infrastructure is biased toward C2 and hence does not comply with GCC encoding, thus we are
skipping over following cases. Please see below a small inline assembly snippet and its corresponding encoding.

void micro(){
   asm volatile(
      "shlq $1, %%r11          \n\t"                                [InstID 1]
      "shlq %%r11              \n\t"                                [InstID 2]
      "shlq %%cl, %%r11        \n\t"                                [InstID 3]
  :  :  : "%r11", "%rcx"
   );
}
CPROMPT>objdump -D shl.o
Disassembly of section .text:
0000000000000000 <micro>:
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   49 d1 e3                shl    $1,%r11                       [InstID 1]
   7:   49 d1 e3                shl    $1,%r11                       [InstID 2]                
   a:   49 d3 e3                shl    %cl,%r1 1                     [InstID 3]

test/hotspot/gtest/x86/test_assemblerx86.cpp line 99:

> 97:   asm_check((const uint8_t *)entry, (const uint8_t *)insns, insns_lens, insns_strs, sizeof(insns_lens) / sizeof(insns_lens[0]));
> 98:   BufferBlob::free(b);
> 99: }

Following MAP0 and MAP1 instructions are missing :-

bsfl   bsfq  bsrl  bsrq bswapl bswapq
btq  
call
cmpb cmpl cmpq cmpw
cmpxchgb  cmpxchgl cmpxchgq cmpxchgw
cvttsd2siq
incl incq
lea leal  leaq 
mov mov64  movb movl movq
movsbl movsbq movslq movswl movswq
movw 
movzbl movzbq movzwl movzwq
orw
sall salq
testb testl testq
xaddb xaddl xaddq xaddw
xchgb xchgl xchgq xchgw


But, given that all assembly routines share same leaf level prefix emitting routines, we can skip them for the time being or add validate just one from each row  

Please do add following new MAP4 APX instructions since you are already taking care of their two operand counterparts with PPX.

1. popp
2. pushp

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20857#discussion_r1804498306
PR Review Comment: https://git.openjdk.org/jdk/pull/20857#discussion_r1804565059
PR Review Comment: https://git.openjdk.org/jdk/pull/20857#discussion_r1804651875
PR Review Comment: https://git.openjdk.org/jdk/pull/20857#discussion_r1804707138


More information about the hotspot-compiler-dev mailing list