[AD file] Optimize emitted code by matching complex IR patterns
Gustavo Serra Scalet
gustavo.scalet at eldorado.org.br
Mon Oct 31 19:35:08 UTC 2016
Hi Goetz,
> -----Original Message-----
> From: Lindenmaier, Goetz [mailto:goetz.lindenmaier at sap.com]
> Sent: Friday, October 28, 2016 11:42
> To: Gustavo Serra Scalet <gustavo.scalet at eldorado.org.br>; hotspot-
> compiler-dev at openjdk.java.net
> Subject: RE: [AD file] Optimize emitted code by matching complex IR
> patterns
>
> But matching LoadB and AddP in your picture will probably fail.
>
> The matcher only matches trees or DAGs. Here the problem is that
>
> the AddP has 3 more outs.
OK, then this optimization looks like a no-go. I see other situations that I'd like to optimize, but in all of them the node I would fold into the match has additional outs.
Thanks for pointing that out.
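
To make the restriction concrete, here is a minimal sketch of the "extra outs" check, assuming a simplified node model. The `Node` struct, `add_use` and `can_fold_into` are hypothetical names for illustration only; they are not HotSpot's classes or the matcher's actual code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for an ideal-graph node.
// This is NOT HotSpot's Node class.
struct Node {
    const char* name;
    std::vector<Node*> outs;  // users of the value this node produces
};

// Record that `user` consumes the value produced by `def`.
void add_use(Node& def, Node& user) {
    def.outs.push_back(&user);
}

// A node can only be swallowed into its user's match rule if that user
// is its sole consumer. Any other out still needs the value in a
// register, so the node has to be matched on its own.
bool can_fold_into(const Node& def, const Node& user) {
    return def.outs.size() == 1 && def.outs[0] == &user;
}
```

With an AddP that feeds only one LoadB, `can_fold_into` returns true; once a second user is attached, as in the graph from my picture, it returns false and the combined rule cannot apply.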
> I think there are some special cases, but
>
> don't remember in detail. We overruled this once for DecodeN, but
>
> it was not a good idea because it increases register pressure (you will
> hold
>
> the narrow oop and the oop in two registers at the same time although
> they
>
> have the exact same bit pattern in 32-bit cOops mode.)
>
>
>
>
>
> You can avoid the nop by just leaving out the size(8) line. It then
> assumes
>
> varying size and checks on every emit.
>
> But maybe you can avoid the 'if' if you already check for constant '0'
> in
>
> the predicate of the operand.
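
As a rough illustration of the size issue, the sketch below models which instructions the encode block emits for a given displacement and how many bytes that takes. It relies only on the fact that every PowerPC instruction is 4 bytes wide; `plan_load_byte` and `encoded_bytes` are hypothetical helpers, not part of the AD file or of HotSpot.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Every PowerPC instruction is 4 bytes wide.
constexpr int kInsnBytes = 4;

// Hypothetical helper: list the instructions the ins_encode block would
// emit. `pad_to_fixed_size` models keeping the size(8) declaration,
// which forces the short path to be padded with a nop.
std::vector<std::string> plan_load_byte(int disp, bool pad_to_fixed_size) {
    std::vector<std::string> insns;
    if (disp != 0) {
        insns = {"add", "lbz"};      // non-zero displacement: two instructions
    } else {
        insns = {"lbzx"};            // zero displacement: one indexed load
        if (pad_to_fixed_size) {
            insns.push_back("nop");  // padding so the size stays at 8 bytes
        }
    }
    return insns;
}

// Total encoded size in bytes for a planned instruction sequence.
int encoded_bytes(const std::vector<std::string>& insns) {
    return static_cast<int>(insns.size()) * kInsnBytes;
}
```

With the fixed size, both paths come out at 8 bytes; without it, the zero-displacement path shrinks to 4 bytes, which is the case the variable-size check on emit has to accept.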
>
>
>
> Best regards,
>
> Goetz.
>
>
>
> From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-
> bounces at openjdk.java.net] On Behalf Of Gustavo Serra Scalet
> Sent: Thursday, October 27, 2016 21:35
> To: hotspot-compiler-dev at openjdk.java.net
> Subject: [AD file] Optimize emitted code by matching complex IR patterns
>
>
>
> Hi,
>
>
>
> I wanted to match this set of selected nodes:
>
>
>
>
>
> Which basically is, for ppc, doing an "add" followed by a "lbz".
> However, if the memory pointer of LoadB has a displacement of zero, it
> can be done with a single "lbzx" instruction.
>
>
>
> As I understood, I could manage doing it with a new instruction just
> like Igor did for cmpldi[1], but I don't see it working:
>
> instruct loadUB_indexed(iRegIdst dst, indirectMemory src1, iRegLsrc src2) %{
>   // match-rule
>   match(Set dst (LoadB (AddP src1 src2)));
>   predicate(n->as_Load()->is_unordered() || followed_by_acquire(n));
>   // Hint that lbzx is cheaper than add + lbz
>   ins_cost(MEMORY_REF_COST_LOW);
>   format %{ "LBZX $dst, $src1, $src2" %}
>   size(8);
>   ins_encode %{
>     int Idisp = $src1$$disp + frame_slots_bias($src1$$base, ra_);
>     if (Idisp) {
>       __ add($dst$$Register, $src1$$base$$Register, $src2$$Register);
>       __ lbz($dst$$Register, Idisp, $dst$$Register);
>     } else {
>       __ lbzx($dst$$Register, $src1$$base$$Register, $src2$$Register);
>       __ nop();
>     }
>   %}
>   ins_pipe(pipe_class_memory);
> %}
>
>
>
>
>
> Notes:
>
> 1) I would probably pack some of the ins_encode instructions to be used
> as expand. I left it there so it's easier to read.
>
> 2) Most of the code came from loadB_indirect_Ex so I expanded it to
> match a previous add.
>
> 3) The nop can probably be avoided somehow. I'd take a look once this
> feature actually works.
>
>
>
> Well, it compiles, but then when I run javac:
>
> o45 LoadB === _ o7 o44 [[o56 o46 8 ]]
> @java/lang/String:exact+20 *, name=coder, idx=4; #byte
>
> mach:
>
> 12 loadConL16 === _ [[ 11 ]] [6600012]
>
> o10 Parm === o3 [[o78 o72 o44 o44 o78 4 11 ]] Parm0:
> java/lang/String:NotNull:exact * Oop:java/lang/String:NotNull:exact *
>
> o7 Parm === o3 [[o161 o175 o79 o72 o119 o168 o98 o45 4 11 ]]
> Memory Memory: @BotPTR *+bot, idx=Bot;
>
> 11 loadUB_indexed === _ o7 o10 12 [[]]
>
> # To suppress the following error report, specify this argument
>
> # after -XX: or in .hotspotrc: SuppressErrorAt=/matcher.cpp:1694
>
> #
>
> # A fatal error has been detected by the Java Runtime Environment:
>
> #
>
> # Internal Error (/home/gut/hs-
> comp/hotspot/src/share/vm/opto/matcher.cpp:1694), pid=1649, tid=1734
>
> # assert(m->adr_type() == mach_at) failed: matcher should not change
> adr type
>
> #
>
> # JRE version: OpenJDK Runtime Environment (9.0) (slowdebug build 9-
> internal+0-2016-10-14-173706.gut.hs-comp)
>
> # Java VM: OpenJDK 64-Bit Server VM (slowdebug 9-internal+0-2016-10-14-
> 173706.gut.hs-comp, mixed mode, tiered, compressed oops, g1 gc, linux-
> ppc64le)
>
> # No core dump will be written. Core dumps have been disabled. To enable
> core dumping, try "ulimit -c unlimited" before starting Java again
>
> #
>
> # An error report file with more information is saved as:
>
> # /home/gut/hs-comp/hs_err_pid1649.log
>
> o34 LoadB === _ o7 o33 [[o45 ]] @java/lang/String:exact+20 *,
> name=coder, idx=4; #byte
>
> mach:
>
> 11 loadConL16 === _ [[ 10 ]] [6900011]
>
> o10 Parm === o3 [[o33 o33 10 ]] Parm0:
> java/lang/String:NotNull:exact * Oop:java/lang/String:NotNull:exact *
>
> o7 Parm === o3 [[o48 o34 10 ]] Memory Memory: @BotPTR *+bot,
> idx=Bot;
>
> 10 loadUB_indexed === _ o7 o10 11 [[]]
>
> [thread 1727 also had an error]
>
> #
>
> # Compiler replay data is saved as:
>
> # /home/gut/hs-comp/replay_pid1649.log
>
> #
>
> # If you would like to submit a bug report, please visit:
>
> # http://bugreport.java.com/bugreport/crash.jsp
>
> #
>
> Current thread is 1734
>
> Dumping core ...
>
> Aborted
>
>
>
>
>
> Even after investigating that assert (which checks for
> m->in(MemNode::Address)->is_DecodeNarrowPtr()), I didn't quite understand
> it. If I add an EncodeP/DecodeN between LoadB and AddP in my match rule
> to satisfy the NarrowPtr check, it simply doesn't match anything, so I
> assume the rule as written above is correct and is matching.
>
>
>
> Could anybody please point out what I missed?
>
>
>
> I also wanted to ask whether the other nodes linked to, e.g., "339 AddP"
> are a concern. As the IdealGraphVisualizer shows, it doesn't have only
> the "357 LoadB" node attached to it, and I wonder what would happen to
> them if this new loadUB_indexed instruction were working.
>
>
>
>
>
> Thanks in advance,
>
> Gustavo Serra Scalet
>
>
>
> References:
>
> [1] http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/2016-October/002713.html
>
>