RFR: 8273021: C2: Improve Add and Xor ideal optimizations

Tobias Hartmann thartmann at openjdk.java.net
Thu Sep 2 14:24:18 UTC 2021


On Thu, 26 Aug 2021 09:19:41 GMT, Yi Yang <yyang at openjdk.org> wrote:

> Greetings. This patch adds the following identities to the Xor and Add nodes, respectively, which may enable further optimizations.
> 
> 
> ~(x-1) => -x
> ~x + 1 => -x
> 
> 
> 
> Verified with the generated opto assembly; an IR verification test could be added later.
> 
> ============================= C2-compiled nmethod ==============================
> ----------------------- MetaData before Compile_id = 1 ------------------------
> {method}
>  - this oop:          0x00007fe29f003518
>  - method holder:     'compiler/c2/TestAddXorIdeal'
>  - constants:         0x00007fe29f003248 constant pool [38] {0x00007fe29f003248} for 'compiler/c2/TestAddXorIdeal' cache=0x00007fe29f0038f0
>  - access:            0x81000009  public static 
>  - name:              'test1'
>  - signature:         '(I)I'
>  - max stack:         3
>  - max locals:        1
>  - size of params:    1
>  - method size:       13
>  - vtable index:      -2
>  - i2i entry:         0x00007fe2fd23fc00
>  - adapters:          AHE at 0x00007fe3082160d0: 0xa i2c: 0x00007fe2fd34c660 c2i: 0x00007fe2fd34c719 c2iUV: 0x00007fe2fd34c6e3 c2iNCI: 0x00007fe2fd34c756
>  - compiled entry     0x00007fe2fd34c719
>  - code size:         6
>  - code start:        0x00007fe29f003508
>  - code end (excl):   0x00007fe29f00350e
>  - method data:       0x00007fe29f0070f8
>  - checked ex length: 0
>  - linenumber start:  0x00007fe29f00350e
>  - localvar length:   0
> 
> ------------------------ OptoAssembly for Compile_id = 1 -----------------------
> #
> #  int ( int )
> #
> #r018 rsi   : parm 0: int
> # -- Old rsp -- Framesize: 32 --
> #r591 rsp+28: in_preserve
> #r590 rsp+24: return address
> #r589 rsp+20: in_preserve
> #r588 rsp+16: saved fp register
> #r587 rsp+12: pad2, stack alignment
> #r586 rsp+ 8: pad2, stack alignment
> #r585 rsp+ 4: Fixed slot 1
> #r584 rsp+ 0: Fixed slot 0
> #
> 000     N1: #	out( B1 ) <- in( B1 )  Freq: 1
> 
> 000     B1: #	out( N1 ) <- BLOCK HEAD IS JUNK  Freq: 1
> 000     # stack bang (96 bytes)
> 	pushq   rbp	# Save rbp
> 	subq    rsp, #16	# Create frame
> 
> 00c     movl    RAX, RSI	# spill
> 00e     negl    RAX	# int
> 010     addq    rsp, 16	# Destroy frame
> 	popq    rbp
> 	cmpq     rsp, poll_offset[r15_thread] 
> 	ja       #safepoint_stub	# Safepoint: poll for GC
> 
> 022     ret
> 
> --------------------------------------------------------------------------------
> 
> ============================= C2-compiled nmethod ==============================
> ----------------------- MetaData before Compile_id = 3 ------------------------
> {method}
>  - this oop:          0x00007fe29f003668
>  - method holder:     'compiler/c2/TestAddXorIdeal'
>  - constants:         0x00007fe29f003248 constant pool [38] {0x00007fe29f003248} for 'compiler/c2/TestAddXorIdeal' cache=0x00007fe29f0038f0
>  - access:            0x81000009  public static 
>  - name:              'test3'
>  - signature:         '(J)J'
>  - max stack:         5
>  - max locals:        2
>  - size of params:    2
>  - method size:       13
>  - vtable index:      -2
>  - i2i entry:         0x00007fe2fd23fc00
>  - adapters:          AHE at 0x00007fe30829c8f0: 0xbe i2c: 0x00007fe2fd2d88e0 c2i: 0x00007fe2fd2d89c0 c2iUV: 0x00007fe2fd2d898a c2iNCI: 0x00007fe2fd2d89fd
>  - compiled entry     0x00007fe2fd2d89c0
>  - code size:         8
>  - code start:        0x00007fe29f003658
>  - code end (excl):   0x00007fe29f003660
>  - method data:       0x00007fe29f007408
>  - checked ex length: 0
>  - linenumber start:  0x00007fe29f003660
>  - localvar length:   0
> 
> ------------------------ OptoAssembly for Compile_id = 3 -----------------------
> #
> #  long/half ( long, half )
> #
> #r018 rsi:rsi   : parm 0: long
> # -- Old rsp -- Framesize: 32 --
> #r591 rsp+28: in_preserve
> #r590 rsp+24: return address
> #r589 rsp+20: in_preserve
> #r588 rsp+16: saved fp register
> #r587 rsp+12: pad2, stack alignment
> #r586 rsp+ 8: pad2, stack alignment
> #r585 rsp+ 4: Fixed slot 1
> #r584 rsp+ 0: Fixed slot 0
> #
> 000     N1: #	out( B1 ) <- in( B1 )  Freq: 1
> 
> 000     B1: #	out( N1 ) <- BLOCK HEAD IS JUNK  Freq: 1
> 000     # stack bang (96 bytes)
> 	pushq   rbp	# Save rbp
> 	subq    rsp, #16	# Create frame
> 
> 00c     movq    RAX, RSI	# spill
> 00f     negq    RAX	# long
> 012     addq    rsp, 16	# Destroy frame
> 	popq    rbp
> 	cmpq     rsp, poll_offset[r15_thread] 
> 	ja       #safepoint_stub	# Safepoint: poll for GC
> 
> 024     ret
> 
> --------------------------------------------------------------------------------
> 
> ============================= C2-compiled nmethod ==============================
> ----------------------- MetaData before Compile_id = 2 ------------------------
> {method}
>  - this oop:          0x00007fe29f0035c0
>  - method holder:     'compiler/c2/TestAddXorIdeal'
>  - constants:         0x00007fe29f003248 constant pool [38] {0x00007fe29f003248} for 'compiler/c2/TestAddXorIdeal' cache=0x00007fe29f0038f0
>  - access:            0x81000009  public static 
>  - name:              'test2'
>  - signature:         '(I)I'
>  - max stack:         3
>  - max locals:        1
>  - size of params:    1
>  - method size:       13
>  - vtable index:      -2
>  - i2i entry:         0x00007fe2fd23fc00
>  - adapters:          AHE at 0x00007fe3082160d0: 0xa i2c: 0x00007fe2fd34c660 c2i: 0x00007fe2fd34c719 c2iUV: 0x00007fe2fd34c6e3 c2iNCI: 0x00007fe2fd34c756
>  - compiled entry     0x00007fe2fd34c719
>  - code size:         6
>  - code start:        0x00007fe29f0035b0
>  - code end (excl):   0x00007fe29f0035b6
>  - method data:       0x00007fe29f007280
>  - checked ex length: 0
>  - linenumber start:  0x00007fe29f0035b6
>  - localvar length:   0
> 
> ------------------------ OptoAssembly for Compile_id = 2 -----------------------
> #
> #  int ( int )
> #
> #r018 rsi   : parm 0: int
> # -- Old rsp -- Framesize: 32 --
> #r591 rsp+28: in_preserve
> #r590 rsp+24: return address
> #r589 rsp+20: in_preserve
> #r588 rsp+16: saved fp register
> #r587 rsp+12: pad2, stack alignment
> #r586 rsp+ 8: pad2, stack alignment
> #r585 rsp+ 4: Fixed slot 1
> #r584 rsp+ 0: Fixed slot 0
> #
> 000     N1: #	out( B1 ) <- in( B1 )  Freq: 1
> 
> 000     B1: #	out( N1 ) <- BLOCK HEAD IS JUNK  Freq: 1
> 000     # stack bang (96 bytes)
> 	pushq   rbp	# Save rbp
> 	subq    rsp, #16	# Create frame
> 
> 00c     movl    RAX, RSI	# spill
> 00e     negl    RAX	# int
> 010     addq    rsp, 16	# Destroy frame
> 	popq    rbp
> 	cmpq     rsp, poll_offset[r15_thread] 
> 	ja       #safepoint_stub	# Safepoint: poll for GC
> 
> 022     ret
> 
> --------------------------------------------------------------------------------
> 
> ============================= C2-compiled nmethod ==============================
> ----------------------- MetaData before Compile_id = 4 ------------------------
> {method}
>  - this oop:          0x00007fe29f003710
>  - method holder:     'compiler/c2/TestAddXorIdeal'
>  - constants:         0x00007fe29f003248 constant pool [38] {0x00007fe29f003248} for 'compiler/c2/TestAddXorIdeal' cache=0x00007fe29f0038f0
>  - access:            0x81000009  public static 
>  - name:              'test4'
>  - signature:         '(J)J'
>  - max stack:         5
>  - max locals:        2
>  - size of params:    2
>  - method size:       13
>  - vtable index:      -2
>  - i2i entry:         0x00007fe2fd23fc00
>  - adapters:          AHE at 0x00007fe30829c8f0: 0xbe i2c: 0x00007fe2fd2d88e0 c2i: 0x00007fe2fd2d89c0 c2iUV: 0x00007fe2fd2d898a c2iNCI: 0x00007fe2fd2d89fd
>  - compiled entry     0x00007fe2fd2d89c0
>  - code size:         8
>  - code start:        0x00007fe29f003700
>  - code end (excl):   0x00007fe29f003708
>  - method data:       0x00007fe29f0075a0
>  - checked ex length: 0
>  - linenumber start:  0x00007fe29f003708
>  - localvar length:   0
> 
> ------------------------ OptoAssembly for Compile_id = 4 -----------------------
> #
> #  long/half ( long, half )
> #
> #r018 rsi:rsi   : parm 0: long
> # -- Old rsp -- Framesize: 32 --
> #r591 rsp+28: in_preserve
> #r590 rsp+24: return address
> #r589 rsp+20: in_preserve
> #r588 rsp+16: saved fp register
> #r587 rsp+12: pad2, stack alignment
> #r586 rsp+ 8: pad2, stack alignment
> #r585 rsp+ 4: Fixed slot 1
> #r584 rsp+ 0: Fixed slot 0
> #
> 000     N1: #	out( B1 ) <- in( B1 )  Freq: 1
> 
> 000     B1: #	out( N1 ) <- BLOCK HEAD IS JUNK  Freq: 1
> 000     # stack bang (96 bytes)
> 	pushq   rbp	# Save rbp
> 	subq    rsp, #16	# Create frame
> 
> 00c     movq    RAX, RSI	# spill
> 00f     negq    RAX	# long
> 012     addq    rsp, 16	# Destroy frame
> 	popq    rbp
> 	cmpq     rsp, poll_offset[r15_thread] 
> 	ja       #safepoint_stub	# Safepoint: poll for GC
> 
> 024     ret

src/hotspot/share/opto/addnode.cpp line 1014:

> 1012:       return new SubINode(phase->makecon(TypeInt::ZERO), in1->in(1));
> 1013:     }
> 1014:   } else if (op2 == Op_AddI && phase->type(in1) == TypeInt::MINUS_1) {

Why do you need to check both inputs for constant -1? Shouldn't `AddNode::Ideal` canonicalize the inputs and ensure that constants are moved to the second input? 

https://github.com/openjdk/jdk/blob/599d07c0db9c85e4dae35d1c54a63407d32eaedd/src/hotspot/share/opto/addnode.hpp#L52-L54
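To make the question concrete, here is a minimal sketch of the operand orderings involved. It only shows that the source-level forms are semantically equal to `-x`; whether C2 always moves the constant to the second input before these rules run is exactly the open question, so nothing here should be read as a statement about the compiler. The class and method names are hypothetical and not part of the patch.

```java
public class OperandOrderSketch {
    // ~(x - 1) spelled as an explicit Xor with -1 as the second operand.
    static int xorConstRight(int x) { return (x - 1) ^ -1; }

    // Same value with the -1 constant written as the first operand.
    static int xorConstLeft(int x) { return -1 ^ (x - 1); }

    // ~x + 1 with the constant 1 as the second Add operand.
    static int addConstRight(int x) { return (x ^ -1) + 1; }

    // Same value with the constant 1 written as the first Add operand.
    static int addConstLeft(int x) { return 1 + (x ^ -1); }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 42, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            int expected = -x; // two's complement negation, wraps for MIN_VALUE
            if (xorConstRight(x) != expected || xorConstLeft(x) != expected
                    || addConstRight(x) != expected || addConstLeft(x) != expected) {
                throw new AssertionError("mismatch for x=" + x);
            }
        }
        System.out.println("all operand orderings agree with -x");
    }
}
```

If canonicalization does guarantee that constants end up in the second input, only the `...ConstRight` shapes would need to be matched in `Ideal`.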

test/hotspot/jtreg/compiler/c2/TestAddXorIdeal.java line 30:

> 28:  * @summary C2: Improve Add and Xor ideal optimizations
> 29:  * @library /test/lib
> 30:  * @run main/othervm -XX:-Inline -XX:-TieredCompilation -XX:TieredStopAtLevel=4 -XX:CompileCommand=compileonly,compiler.c2.TestAddXorIdeal::* compiler.c2.TestAddXorIdeal

What about `-XX:CompileCommand=dontinline,compiler.c2.TestAddXorIdeal::test*` instead of disabling all inlining and limiting compilation?

test/hotspot/jtreg/compiler/c2/TestAddXorIdeal.java line 59:

> 57:             Asserts.assertTrue(test2(i - 7) == -(i - 7));
> 58:             Asserts.assertTrue(test3(i + 100) == -(i + 100));
> 59:             Asserts.assertTrue(test4(i - 1024) == -(i - 1024));

What about using random numbers for better coverage?
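A rough sketch of what I have in mind, assuming plain `java.util.Random` with a fixed seed so that failures are reproducible (the actual test may prefer the test library's random source). The class and method names are hypothetical stand-ins for the patterns under test, not the methods in the patch.

```java
import java.util.Random;

public class RandomNegationCheck {
    // The two patterns from the PR description, for int and long.
    static int xorFormInt(int x) { return ~(x - 1); }
    static int addFormInt(int x) { return ~x + 1; }
    static long xorFormLong(long x) { return ~(x - 1); }
    static long addFormLong(long x) { return ~x + 1; }

    // Returns how many random inputs produced a result different from -x.
    static int countMismatches(long seed, int iterations) {
        Random rnd = new Random(seed);
        int mismatches = 0;
        for (int n = 0; n < iterations; n++) {
            int i = rnd.nextInt();   // covers the full int range
            long l = rnd.nextLong(); // pseudo-random long value
            if (xorFormInt(i) != -i) mismatches++;
            if (addFormInt(i) != -i) mismatches++;
            if (xorFormLong(l) != -l) mismatches++;
            if (addFormLong(l) != -l) mismatches++;
        }
        return mismatches;
    }

    public static void main(String[] args) {
        int mismatches = countMismatches(42L, 50_000);
        if (mismatches != 0) {
            throw new AssertionError(mismatches + " mismatches");
        }
        System.out.println("no mismatches");
    }
}
```

Keeping a few fixed edge cases (0, -1, `MIN_VALUE`, `MAX_VALUE`) next to the random values would still be worthwhile, since random sampling is unlikely to hit them.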

-------------

PR: https://git.openjdk.java.net/jdk/pull/5266

