aarch64 AD-file / matching rule

Lindenmaier, Goetz goetz.lindenmaier at sap.com
Wed Apr 29 14:37:46 UTC 2015


Hi,

I use PrintOptoAssembly in such cases.  It shows what the IR looks like after
matching.  Together with PrintAssembly, you can locate the block that
contains the pattern.

With PrintIdeal you can see the graph before matching.  You should find the pattern
you described in the ad rule there.  Hard to read, though.

There is also the PrintIdealGraph flag, which prints a graph you can visualize.
I haven't used it, though.  We have instrumented the opto compiler with
our own graph printer.

I could imagine that the AndI node has more than one use (out edge).
Then it's not a tree-like subgraph, and the matcher cannot apply the rule.
You can check this in the PrintIdeal output or in the last
Ideal graph before matching.
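
To make the single-use versus multi-use distinction concrete, here is a
minimal Java sketch.  The class and method names are hypothetical and not
taken from specjbb2005; only the mask value comes from the quoted disassembly.
In singleUse the masked value feeds only the compare, while in multiUse it is
also returned, so the AndI node would be expected to have a second out edge.
Whether C2 really keeps these shapes after its own optimisations is an
assumption that has to be confirmed in the PrintIdeal output.

```java
class AndUseDemo {
    // Same immediate as in the quoted disassembly.
    static final int MASK = 0x7ffff8;

    // The AND result feeds only the compare against zero.
    static int singleUse(int x) {
        if ((x & MASK) == 0) {
            return 0;
        }
        return 1;
    }

    // The AND result feeds the compare and is also returned,
    // so the masked value has a second consumer.
    static int multiUse(int x) {
        int masked = x & MASK;
        if (masked == 0) {
            return 0;
        }
        return masked;
    }

    public static void main(String[] args) {
        long sum = 0;
        // Call both methods often enough that the JIT is likely to compile them.
        for (int i = 0; i < 200_000; i++) {
            sum += singleUse(i) + multiUse(i);
        }
        // Print the sum so the calls cannot be discarded as dead code.
        System.out.println(sum);
    }
}
```

A small reproducer like this may be easier to inspect than the full
benchmark: a CompileCommand 'print' option in the style of the quoted
invocation, pointed at one of these methods, limits the output to that method.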

Best regards,
  Goetz.

From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Benedikt Wedenik
Sent: Wednesday, April 29, 2015 14:50
To: hotspot-compiler-dev at openjdk.java.net
Cc: Dr. Philipp Tomsich; Benedikt Huber
Subject: aarch64 AD-file / matching rule

Hi!

I'm writing compiler optimisations for the aarch64 port at the moment, and I am using specjbb2005 for benchmarking.
One of the patterns I want to optimise is the following:

  0x0000007f8c2961b4: and w2, w2, #0x7ffff8
  0x0000007f8c2961b8: cmp w2, #0x0
  0x0000007f8c2961bc: b.eq     0x0000007f8c2968f4


Here I see an opportunity to use ands followed by b.eq.

I created a new rule in the cpu/aarch64/vm/aarch64.ad file.
My matching looks like this:

instruct and_cmp_branch(cmpOp cmp, immI0 zero, iRegIorL2I src1, immILog src2, label lbl, rFlagsReg cr) %{
  match(If cmp (CmpI (AndI src1 src2) zero) );

  effect(USE lbl);
  ins_cost(0); // is zero at the moment to be sure the rule is triggered.

  ins_encode %{
    Label* L = $lbl$$label;
    Assembler::Condition cond = (Assembler::Condition)$cmp$$cmpcode;
    __ andsw(as_Register($src1$$reg),
        as_Register($src1$$reg),
        (unsigned long)($src2$$constant));
    __ br ((Assembler::Condition)$cmp$$cmpcode, *L);
  %}

  ins_pipe(pipe_cmp_branch); //TODO but not relevant yet
%}


As I don't know whether my matching rule is wrong or something else stops the rule from being emitted, I wanted to find out which "and" rule is triggered for this pattern.
I inserted some nops to locate the corresponding rule and found that most of the emitted "and"s were surrounded by nops, except for my pattern and a few others like this one:

0x0000007f984bf568: eor   x1, x0, x1
0x0000007f984bf56c: and   x1, x1, #0xffffffffffffff87
0x0000007f984bf570: cbz   x1, 0x0000007f984bf664
0x0000007f984bf574: and   xscratch1, x1, #0x7
0x0000007f984bf578: cbnz  xscratch1, 0x0000007f984bf5f0
0x0000007f984bf57c: and   xscratch1, x1, #0x300
0x0000007f984bf580: cbnz  xscratch1, 0x0000007f984bf5b8
0x0000007f984bf584: mov   xscratch1, #0x37f                   // #895
0x0000007f984bf588: and   x0, x0, xscratch1
0x0000007f984bf58c: orr   x1, x0, xthread
0x0000007f984bf590: ldaxr xscratch1, [x3]
0x0000007f984bf594: cmp   xscratch1, x0
0x0000007f984bf598: b.ne  0x0000007f984bf5a8


Usually I call the program like this:

----
JAVA=/root/bwedenik/jdk8/jdk8/build/linux-aarch64-normal-server-release/jdk/bin/java

$JAVA -fullversion
$JAVA -server -XX:+AggressiveOpts -XX:+UseFastAccessorMethods -XX:+OptimizeStringConcat -XX:+UseBiasedLocking -XX:+UseParallelGC -XX:ParallelGCThreads=10 -XX:+UseParallelOldGC -XX:SurvivorRatio=8 -XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=15  -Xms10g -Xmx10g -Xmn4g -Xss64m -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand='print,*DeliveryTransaction.preprocess' spec.jbb.JBBmain -propfile SPECjbb.props
----


I tried to figure out whether this problem occurs only with C1, only with C2, or in pure interpretation mode. These are the results (calling java as usual, including the given arguments):

* [-Xint] : This gives me neither the inserted nops nor the pattern I am searching for (as expected, since nothing is compiled).
* [-client -Xcomp -XX:-TieredCompilation] : Here the cmp against #0x0 occurs only about 3 times in the whole disassembly, instead of about 200 times without these flags. In addition, none of my inserted nops appear in the disassembly.
* [-server -Xcomp -XX:-TieredCompilation] : Same as -client.


My question now is how to find out why the rule does not match, whether the rule is correct, and how to find the actual rule that emits the code for my desired pattern.

Thanks in advance,
Benedikt Wedenik, Theobroma-Systems.com