scalar replacement of arrays affected by minor changes to surrounding code
Govind Jajoo
gjajoo+java at gmail.com
Fri Sep 20 16:59:51 UTC 2019
Hi Roland,
Thanks for looking into this. Some replies inline.
> it seems we could have it take care of fully unrolling the loop
Great!
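Just to make sure I follow: for the two-element call sites in the
benchmark, a fully unrolled loop() would reduce to something like the
sketch below (hand-unrolled by me purely for illustration, not the
unrolled() helper that's elided from the quote further down), where
every index is a constant and the varargs array never escapes:

private static int sum2(int a, int b) {
    int[] arr = {a, b};    // the varargs array the caller would build
    int sum = 0;
    sum += arr[1];         // constant indices once the loop is gone
    sum += arr[0];
    return sum;
}

If the extra pass of loop optimizations before EA can get the code into
roughly that shape, that sounds ideal.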
> once with null from the initialization and then with the newly created
> array
I was afraid something like this might be happening and I threw in the
final keyword on the array, but apparently that didn't factor into the
analysis.
> but zeroing elimination fails with this code shape
I take this to mean that there's no code shape with a newly allocated
wrapper object+array that C2 can elide allocations for.
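For concreteness, the shape I have in mind looks roughly like this when
spelled out at a single call site (my own sketch, not one of the actual
benchmark methods): two allocations, the varargs int[] and the
ArrayWrapper, with the array only ever reachable through the wrapper's
final field and indexed with constants:

static int wrappedShape() {                             // illustration only
    ArrayWrapper w = new ArrayWrapper(next(), next());  // int[2] + wrapper allocation
    return w.arr[0] + w.arr[1];                         // nothing escapes, constant indices
}

If I read your explanation right, even here the constructor's field
store on top of the zeroing from the object's initialization is enough
to make EA give up.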
>> It’s more like a nice bonus optimization: great if it
>> happens, Ok if not.
> sums it up very well.
Yep, that's a good general summary. However, it's nice to start with the most
idiomatic code possible and work backwards. While I was aware of the corner
cases around partial escape, etc., I was quite surprised to see it fail for
seemingly vanilla code (what else do people do with arrays other than loop
over them? :-)). I hope EA is more of a "first-class" citizen with Graal.
FWIW I've raised this issue with them -
https://github.com/oracle/graal/issues/1697.
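(For the record, by "corner cases around partial escape" I mean shapes
like the one below, where the allocation escapes only on a rare path.
As I understand it, a PEA-capable compiler such as Graal can keep the
array virtual on the common path and only materialize it inside the
branch. This is just a sketch; s_sink is a made-up static field acting
as the escape point.)

private static int[] s_sink;       // hypothetical escape point

private static int mostlyVirtual(boolean rare) {
    int[] arr = {next(), next()};
    if (rare) {
        s_sink = arr;              // the array escapes only on this branch
    }
    return arr[0] + arr[1];        // the common path never lets arr escape
}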
Thanks,
Govind
On Fri, Sep 20, 2019 at 10:12 AM Roland Westrelin <rwestrel at redhat.com>
wrote:
>
> > public class ArrayLoop {
> >     private static Random s_r = new Random();
> >     private static int next() { return s_r.nextInt() % 1000; }
> >
> >     private static int loop(int... arr) {
> >         int sum = 0;
> >         for (int i = arr.length - 1; i >= 0; sum += arr[i--]) { ; }
> >         return sum;
> >     }
> >
> >     @Benchmark
> >     public void loopSum(Blackhole bh) {
> >         bh.consume(loop(next(), next()));
> >     }
> > }
>
> The problem with that one is that EA would need the loop to be fully
> unrolled to eliminate the allocation but that only happens after EA. So
> it's a pass ordering problem. We already run a pass of loop
> optimizations before EA so it seems we could have it take care of fully
> unrolling the loop.
>
> > scalar replacement kicks in as expected. Then to try out a more realistic
> > scenario representing our usage, we added the following wrapper and
> > benchmarks
> >
> > private static class ArrayWrapper {
> >     final int[] arr;
> >     ArrayWrapper(int... many) { arr = many; }
> >     int loopSum() { return loop(arr); }
> >     int unrolledSum() { return unrolled(arr); }
> > }
> >
> > @Benchmark
> > public void wrappedUnrolledSum(Blackhole bh) {
> >     bh.consume(new ArrayWrapper(next(), next()).unrolledSum());
> > }
> >
> > @Benchmark
> > public void wrappedLoopSum(Blackhole bh) {
> >     bh.consume(new ArrayWrapper(next(), next()).loopSum());
> > }
>
> EA doesn't eliminate the array allocation because it's assigned to a
> field and C2 sees two assignments to that field: once with null from the
> initialization and then with the newly created array. That's sufficient to
> make EA bail out. C2 would usually eliminate the zeroing from
> initialization so this would not be a problem but zeroing elimination
> fails with this code shape. I don't see an immediate way to fix this by
> either improving zeroing elimination or EA itself. It could well be that
> EA could proceed in this case and it's simply too conservative but
> making sure of that would require further investigation.
>
> Anyway, I think Vitaly's advice:
>
> > But beyond the intellectual curiosity angle behind your question, you
> > really shouldn’t rely on EA to recover performance (or GC pressure) in
> > critical scenarios. It’s more like a nice bonus optimization: great if it
> > happens, Ok if not.
>
> sums it up very well.
>
> Roland.
>