[aarch64-port-dev ] RFR: 8169697: aarch64: vectorized MLA instruction not generated for some test cases
Roland Westrelin
rwestrel at redhat.com
Thu Nov 24 14:01:15 UTC 2016
>> There are two versions of TestSimdMlaInt.vectSumOfMulAdd1 compiled by C2.
>> Please check the OSR version (with %) which uses vector mla instruction
>> with my patch. Without my patch, vector mul and add instructions are used.
>
> OK, so you have the same problem that I do with this. I do not
> know why vectorized code is not being generated for the non-OSR
> case.
What about with the patch below?
It seems the problem is that C2 fails to recognize the reduction in the
loop because the test below is correct only when the use is a data node
(a Phi in the case of the OSR version of the method) but not when it is
a control node (a Return in the normal compilation case).
Roland.
diff --git a/src/share/vm/opto/loopTransform.cpp b/src/share/vm/opto/loopTransform.cpp
--- a/src/share/vm/opto/loopTransform.cpp
+++ b/src/share/vm/opto/loopTransform.cpp
@@ -1742,7 +1742,7 @@
 // The result of the reduction must not be used in the loop
 for (DUIterator_Fast imax, i = def_node->fast_outs(imax); i < imax && ok; i++) {
   Node* u = def_node->fast_out(i);
-  if (has_ctrl(u) && !loop->is_member(get_loop(get_ctrl(u)))) {
+  if (!loop->is_member(get_loop(ctrl_or_self(u)))) {
     continue;
   }
   if (u == phi) {
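
To illustrate why the old condition misses a control-node use, here is a minimal, self-contained sketch. It is a toy model and not HotSpot code: the Loop, Node and Example structs and the skipped_before/skipped_after helpers are hypothetical, and only the names has_ctrl, get_ctrl, ctrl_or_self, get_loop and is_member are borrowed from the patch, with simplified behaviour assumed for each.

```cpp
#include <cassert>

// Toy stand-in for a loop tree entry (assumption: nesting via a parent link).
struct Loop {
  const Loop* parent;  // enclosing loop, nullptr for the outermost level
  // True if `other` is this loop or is nested inside it.
  bool is_member(const Loop* other) const {
    for (const Loop* l = other; l != nullptr; l = l->parent) {
      if (l == this) return true;
    }
    return false;
  }
};

// Toy stand-in for a node: either a control node or a data node.
struct Node {
  bool is_cfg;       // true for control nodes (e.g. a Return)
  const Node* ctrl;  // data nodes only: the control node they are attached to
  const Loop* loop;  // control nodes only: the loop they belong to
};

// Only data nodes have a separate control in this model.
bool has_ctrl(const Node* n) { return !n->is_cfg; }

// Valid for data nodes only, which is why the old condition guards the call.
const Node* get_ctrl(const Node* n) {
  assert(has_ctrl(n));
  return n->ctrl;
}

// Data node: its control. Control node: the node itself.
const Node* ctrl_or_self(const Node* n) {
  return has_ctrl(n) ? get_ctrl(n) : n;
}

const Loop* get_loop(const Node* cfg) { return cfg->loop; }

// Condition before the patch: a control-node use is never skipped.
bool skipped_before(const Loop* loop, const Node* u) {
  return has_ctrl(u) && !loop->is_member(get_loop(get_ctrl(u)));
}

// Condition after the patch: any use located outside the loop is skipped.
bool skipped_after(const Loop* loop, const Node* u) {
  return !loop->is_member(get_loop(ctrl_or_self(u)));
}

// A small graph: one loop `inner` nested in `root`, with three possible uses
// of the reduction result. Members point at each other, so do not copy it.
struct Example {
  Loop root{nullptr};
  Loop inner{&root};
  Node loop_body{true, nullptr, &inner};         // control inside the loop
  Node exit_ctrl{true, nullptr, &root};          // control after the loop
  Node ret{true, nullptr, &root};                // Return: control node use
  Node phi_outside{false, &exit_ctrl, nullptr};  // Phi: data use after the loop
  Node use_in_loop{false, &loop_body, nullptr};  // data use inside the loop
};
```

In this model both conditions skip phi_outside and both reject use_in_loop, but only the patched condition skips ret. With the old condition the Return falls through to the code that clears ok, which would match the reduction being missed in the non-OSR compilation.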