[foreign-memaccess] RFR 8237082: Workaround C2 limitations when working with long loops

Paul Sandoz paul.sandoz at oracle.com
Tue Jan 14 17:35:17 UTC 2020


Very nice. 

What Jorn said plus:

AddressVarHandleGenerator
—

- might want to factor out the repeated MethodType construction, either into a static constant or into its own method.
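
A minimal sketch of the static-constant variant, for illustration only. The class name, the constant name and the (long, long) -> long signature are all hypothetical; the actual MethodType used in AddressVarHandleGenerator is not shown in this message.

```java
import java.lang.invoke.MethodType;

public class MethodTypeConstants {
    // Hypothetical signature: built once and reused, instead of calling
    // MethodType.methodType(...) at every site that needs it.
    static final MethodType OFFSET_OP_TYPE =
            MethodType.methodType(long.class, long.class, long.class);

    // Example use site: code that needs the JVM descriptor string
    // reads it from the shared constant.
    public static String descriptor() {
        return OFFSET_OP_TYPE.toMethodDescriptorString();
    }
}
```

MethodType instances are immutable, so sharing one through a static final field is safe.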


MemorySegmentImpl
—

    private void checkBounds(long offset, long length) {
        if (isSmall()) {
            if ((int)length < 0 ||
                    (int)offset < 0 ||
                    (int)offset > (int)this.length - (int)length) { // careful of overflow
                throw new IndexOutOfBoundsException(String.format("Out of bound access on segment %s; new offset = %d; new length = %d",
                        this, offset, length));
            }

For clarity, I suggest pushing each case into its own (force-inlined?) method, e.g.:

  if (isSmall()) {
    checkBoundsSmall((int)offset, (int)length);
  } else { …
    if (…) {
       throw outOfBoundsException(offset, length);
    }
  }
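
Spelled out a bit more, the shape could look roughly like the sketch below. This is not the code from the webrev: the class name, the isSmall() definition and the condition in the large branch are assumptions made so the example is self-contained, and the force-inline annotation is left out because it is JDK-internal.

```java
public class BoundsCheckSketch {
    private final long length;

    public BoundsCheckSketch(long length) {
        this.length = length;
    }

    // Assumption: a segment is "small" when its length fits in an int.
    private boolean isSmall() {
        return length <= Integer.MAX_VALUE;
    }

    public void checkBounds(long offset, long length) {
        if (isSmall()) {
            // Same casts as the quoted snippet; values that do not fit
            // in an int would be truncated here.
            checkBoundsSmall((int) offset, (int) length);
        } else {
            // Assumed long counterpart of the small check.
            if (length < 0 || offset < 0 || offset > this.length - length) {
                throw outOfBoundsException(offset, length);
            }
        }
    }

    // int-only arithmetic; the subtraction cannot overflow because
    // length >= 0 has already been checked when it is evaluated.
    private void checkBoundsSmall(int offset, int length) {
        if (length < 0 || offset < 0 || offset > (int) this.length - length) {
            throw outOfBoundsException(offset, length);
        }
    }

    // Building the exception in a helper keeps the String.format call
    // out of both checking methods.
    private IndexOutOfBoundsException outOfBoundsException(long offset, long length) {
        return new IndexOutOfBoundsException(String.format(
                "Out of bound access on segment %s; new offset = %d; new length = %d",
                this, offset, length));
    }
}
```

With a length of 10, checkBounds(2, 8) passes and checkBounds(3, 8) throws, since 3 > 10 - 8.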

Paul.

> On Jan 14, 2020, at 7:06 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
> 
> Hi,
> both C2 and Graal do not like for loops with long computations inside - in C2, the bound-check-elimination code (BCE) is extremely sensitive to which opcodes are used when putting together offsets in a linear expression [1]. So, whenever C2 sees a long opcode being used (e.g. LMUL, or LADD) the BCE logic bails out, which means that in such cases we get no bounds check hoisting.
> 
> The medium/long term solution is, obviously, to fix C2 [2], so that it works on long loops as well as with int loops - after all, this is where many of the new APIs we are designing are headed.
> 
> As a short term boost, I've put together a patch which classifies segments as either small or large; if a segment is small, then we can use some trickery to force int opcodes to be used in offset computations instead of their long counterparts. Doing so removes most of the performance bottlenecks associated with indexed var handle access, some of which were visible in the synthetic benchmarks (see LoopOverNonConstant [3]).
> 
> Webrev:
> 
> http://cr.openjdk.java.net/~mcimadamore/panama/8237082_v2/
> 
> Cheers
> Maurizio
> 
> [1] - http://hg.openjdk.java.net/panama/dev/file/b94889c7e153/src/hotspot/share/opto/loopnode.cpp
> [2] - https://bugs.openjdk.java.net/browse/JDK-8223051
> [3] - http://hg.openjdk.java.net/panama/dev/file/fb3c9c52fdff/test/micro/org/openjdk/bench/jdk/incubator/foreign/LoopOverNonConstant.java
> 
> 


