[foreign-memaccess] RFR 8237082: Workaround C2 limitations when working with long loops
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Tue Jan 14 15:06:30 UTC 2020
Hi,
neither C2 nor Graal copes well with for loops that contain computations
on long values. In C2, the bounds-check-elimination (BCE) code is
extremely sensitive to which opcodes are used when putting together
offsets in a linear expression [1]. Whenever C2 sees a long opcode being
used (e.g. LMUL or LADD), the BCE logic bails out, which means that in
such cases we get no bounds check hoisting.
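To make the difference concrete, here is a minimal sketch of the two loop shapes. It uses a plain java.nio.ByteBuffer and a hand-written bounds check as a stand-in for a segment access; the class and method names (LoopShapes, sumLongOffsets, sumIntOffsets) are hypothetical and are not taken from the patch or from the memory access API. The only difference between the two methods is whether the offset is built with long or int arithmetic.

```java
import java.nio.ByteBuffer;

final class LoopShapes {
    static final int ELEM_SIZE = Integer.BYTES; // 4 bytes per element

    // Offset built from long arithmetic: javac emits LMUL/LADD for it.
    // Per the message above, this is the shape where C2's BCE gives up.
    static long sumLongOffsets(ByteBuffer buf, long base, long count) {
        long sum = 0;
        for (long i = 0; i < count; i++) {
            long offset = base + i * ELEM_SIZE;
            // Hand-written bounds check, standing in for a segment check.
            if (offset < 0 || offset > buf.capacity() - ELEM_SIZE) {
                throw new IndexOutOfBoundsException("offset " + offset);
            }
            sum += buf.getInt((int) offset);
        }
        return sum;
    }

    // Same loop with int arithmetic only: javac emits IMUL/IADD.
    static long sumIntOffsets(ByteBuffer buf, int base, int count) {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            int offset = base + i * ELEM_SIZE;
            if (offset < 0 || offset > buf.capacity() - ELEM_SIZE) {
                throw new IndexOutOfBoundsException("offset " + offset);
            }
            sum += buf.getInt(offset);
        }
        return sum;
    }
}
```

Both methods return the same result for in-range arguments; what differs is only the bytecode the JIT sees when it analyzes the offset expression.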
The medium/long-term solution is, obviously, to fix C2 [2], so that it
handles long loops as well as it handles int loops - after all, this is
where many of the new APIs we are designing are headed.
As a short-term boost, I've put together a patch which classifies
segments as either small or large; if a segment is small, then we can
use some trickery to force int opcodes to be used in offset computations
instead of their long counterparts. Doing so removes most of the
performance bottlenecks associated with indexed var handle access, some
of which were visible in the synthetic benchmarks (see
LoopOverNonConstant [3]).
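The general idea of a small/large split can be sketched as follows. This is an illustration under assumptions, not the code in the webrev: the class SegmentSketch, the rule "small means the length fits in an int", and the use of Math.toIntExact / Math.multiplyExact for overflow safety are all choices made for this example.

```java
final class SegmentSketch {
    private final long length;   // segment size in bytes
    private final boolean small; // true when every valid offset fits in an int

    SegmentSketch(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length");
        }
        this.length = length;
        // Assumed classification rule for this sketch.
        this.small = length <= Integer.MAX_VALUE;
    }

    boolean isSmall() {
        return small;
    }

    // Returns the bounds-checked byte offset of element `index`.
    long offsetOf(long index, int elemSize) {
        long offset;
        if (small) {
            // Small segment: do the multiplication on ints
            // (with overflow detection) instead of on longs.
            int i = Math.toIntExact(index);
            offset = Math.multiplyExact(i, elemSize);
        } else {
            // Large segment: fall back to long arithmetic.
            offset = Math.multiplyExact(index, (long) elemSize);
        }
        if (offset < 0 || offset > length - elemSize) {
            throw new IndexOutOfBoundsException("offset " + offset);
        }
        return offset;
    }
}
```

The flag is computed once at construction, so the per-access cost of the split is a single branch on a final field.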
Webrev:
http://cr.openjdk.java.net/~mcimadamore/panama/8237082_v2/
Cheers
Maurizio
[1] -
http://hg.openjdk.java.net/panama/dev/file/b94889c7e153/src/hotspot/share/opto/loopnode.cpp
[2] - https://bugs.openjdk.java.net/browse/JDK-8223051
[3] -
http://hg.openjdk.java.net/panama/dev/file/fb3c9c52fdff/test/micro/org/openjdk/bench/jdk/incubator/foreign/LoopOverNonConstant.java