Optimization question

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Dec 24 02:22:32 UTC 2015


On 12/23/15 6:13 PM, Vitaly Davidovich wrote:
> Hi Vladimir,
>
> On Wednesday, December 23, 2015, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>
>     Unfortunately, full loop unrolling happens after escape analysis is done.
>     As a result, we can't eliminate the allocations since we don't know which elements of the arrays are referenced in the loop:
>
>     JavaObject NoEscape(NoEscape) NSR [ 397F 275F 276F 398F [ 197 202 ]]   185      AllocateArray   ===  127  124  178
>     8  1 ( 111  99  20  98  1  72  1  1  130  1
>
>     NSR - Non Scalar Replaceable.
>
>     After the loop is unrolled, the result is calculated, but the arrays are still allocated.
>
>
> Ah ok, that's what Kris was saying as well.  But why does unrolling matter for this purpose? Even if loop is not
> unrolled is it not known which elements are accessed?

If the loop is not fully unrolled, we can't eliminate the load instructions since we don't know which element is loaded:

         for(int i = 0; i < array.length; i++) {
             sum += array[i] * weights[i];
         }
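
To illustrate, here is a hand-written Java sketch of what full unrolling exposes for a 4-element array. The names
(UnrollSketch, rolled, unrolled4) are made up for this example, and this is not what C2 actually emits. It only shows
that once the loop is unrolled every index is a constant, so each load can be matched to a known array element.

```java
// Hypothetical illustration; these names are not from HotSpot or the original test.
class UnrollSketch {
    // Rolled form: i is a loop variable, so array[i] could be any element.
    static double rolled(double[] array, double[] weights) {
        double sum = 0;
        for (int i = 0; i < array.length; i++) {
            sum += array[i] * weights[i];
        }
        return sum;
    }

    // Fully unrolled form for length 4: every index is a compile-time constant.
    static double unrolled4(double[] array, double[] weights) {
        double sum = 0;
        sum += array[0] * weights[0];
        sum += array[1] * weights[1];
        sum += array[2] * weights[2];
        sum += array[3] * weights[3];
        return sum;
    }

    public static void main(String[] args) {
        double[] d = {1, 2, 3, 4};
        double[] w = {1, 1, 1, 1};
        // Both forms perform the same additions in the same order.
        System.out.println(rolled(d, w) + " " + unrolled4(d, w));
    }
}
```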

>
> Also, what do you mean by "result is calculated"? What result? :)

After the loop is unrolled, mean() is collapsed to return a pre-calculated result if both allocations are inlined:

     static double result;

     static void test() {
         double[] d = {1,2,3,4};
         result = mean(d);
     }

10c     # MachConstantBaseNode (empty encoding)
10c     movsd   XMM0, [constant table base + #0]        # load from constant table: double=#2.500000
114     movq    R10, java/lang/Class:exact *    # ptr
11e     movsd   [R10 + #104 (8-bit)], XMM0      # double ! Field: TestFillArray.result
124     addq    rsp, 16 # Destroy frame
         popq   rbp
         testl  rax, [rip + #offset_to_poll_page]        # Safepoint: poll for GC
12f     ret
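
For anyone who wants to try this, a self-contained version of the test could look like the sketch below. Only mean(),
allOnes(), test() and the TestFillArray.result field come from this thread. The exception type, the main() driver and
the iteration count are assumptions, chosen just to make it compile and to call test() often enough for C2 to compile it.

```java
import java.util.Arrays;

class TestFillArray {
    static double result;

    static double mean(double[] array, double[] weights) {
        // Exception type is an assumption; the original post elides it.
        if (array.length != weights.length) throw new IllegalArgumentException("length mismatch");
        double sum = 0;
        double wsum = 0;
        for (int i = 0; i < array.length; i++) {
            sum += array[i] * weights[i];
            wsum += weights[i];
        }
        return sum / wsum;
    }

    static double mean(double[] array) {
        return mean(array, allOnes(array.length));
    }

    static double[] allOnes(int n) {
        double[] d = new double[n];
        Arrays.fill(d, 1);
        return d;
    }

    static void test() {
        double[] d = {1, 2, 3, 4};
        result = mean(d);
    }

    public static void main(String[] args) {
        // Assumed driver: call test() repeatedly so that it becomes hot.
        for (int i = 0; i < 100_000; i++) {
            test();
        }
        System.out.println(result);
    }
}
```

With all weights equal to 1, sum is 10.0 and wsum is 4.0, so the program prints 2.5, which matches the
double=#2.500000 constant in the listing above.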

Vladimir

>
>
>     We do remove NSR allocations for boxing objects but not regular allocations:
>
>        // Eliminate boxing allocations which are not used
>        // regardless scalar replaceable status.
>        bool boxing_alloc = C->eliminate_boxing() &&
>                            tklass->klass()->is_instance_klass()  &&
>                            tklass->klass()->as_instance_klass()->is_box_klass();
>        if (!alloc->_is_scalar_replaceable && (!boxing_alloc || (res != NULL))) {
>          return false;
>        }
>
>     Only an allocation followed by arraycopy skips zeroing, not one followed by a fill() call. Arrays.fill() is implemented as a loop.
>
>        if (init != NULL && init->is_complete_with_arraycopy() &&
>            k->is_type_array_klass()) {
>          // Don't zero type array during slow allocation in VM since
>          // it will be initialized later by arraycopy in compiled code.
>          slow_call_address = OptoRuntime::new_array_nozero_Java();
>
>
> Hmm, I'm pretty sure fill() following an allocation had the same zeroing elision applied to it.  Kris notes the opto is
> turned off due to implementation issues, which would explain why I still see zeroing.
>
> Thanks
>
>
>     Regards,
>     Vladimir
>
>     On 12/23/15 4:56 PM, Vitaly Davidovich wrote:
>
>         Hi guys,
>
>         Consider code like this:
>
>         static double mean(double[] array, double[] weights) {
>                if (array.length != weights.length) throw ...;
>                double sum = 0;
>                double wsum = 0;
>                for(int i = 0; i < array.length; i++) {
>                     sum += array[i] * weights[i];
>                     wsum += weights[i];
>                 }
>                 return sum / wsum;
>         }
>
>         static double mean(double[] array) {
>                return mean(array, allOnes(array.length));
>         }
>
>         static double[] allOnes(int n) {
>                double[] d = new double[n];
>                Arrays.fill(d, 1);
>                return d;
>         }
>
>         Now suppose I call mean(double[]) overload like this:
>
>         double[] d = {1,2,3,4};
>
>         Using 8u51 with C2 compiler:
>
>         1) it looks like the array allocation from allOnes isn't eliminated.
>         2) moreover it looked like array was zeroed (rep stosd with rax holding zero).  Unless I misread the asm,
>         I thought an allocation followed by Arrays.fill skips the zeroing?
>         3) ideally, this case would reduce to code that just does a plain unweighted mean with no multiplication by
>         the weight and no summation for the weighted sum (weight sum is just array length).  Is this simply too much
>         analysis to ask for?
>
>         Thanks
>
>
>         --
>         Sent from my phone
>
>
>
> --
> Sent from my phone


More information about the hotspot-compiler-dev mailing list