Optimization question

Thu Dec 24 02:30:13 UTC 2015

On Wednesday, December 23, 2015, Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:

> On 12/23/15 6:13 PM, Vitaly Davidovich wrote:
>
>> Hi Vladimir,
>>
>> On Wednesday, December 23, 2015, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>>
>>     Unfortunately, full loop unrolling happens after escape analysis is done.
>>     As a result we can't eliminate the allocations, since we don't know which
>>     element of the arrays is referenced in the loop:
>>
>>     JavaObject NoEscape(NoEscape) NSR [ 397F 275F 276F 398F [ 197 202 ]]
>>  185      AllocateArray   ===  127  124  178
>>     8  1 ( 111  99  20  98  1  72  1  1  130  1
>>
>>     NSR - Non Scalar Replaceable.
>>
>>     After the loop is unrolled the result is calculated, but the arrays are
>>     still allocated.
>>
>>
>> Ah ok, that's what Kris was saying as well.  But why does unrolling matter
>> for this purpose? Even if the loop is not unrolled, is it not known which
>> elements are accessed?
>>
>
> If the loop is not fully unrolled we can't eliminate the load instructions,
> since we don't know which element is loaded:

Ok.  Can't say I understand why it needs to be fully unrolled to determine
which elements will be accessed, but that's fine - thanks.
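
For what it's worth, here is how I currently picture the difference, as a rough sketch only. The class and method names (UnrollSketch, meanLoop, meanUnrolled4) are made up for illustration, and the hand-unrolled method is my assumption about what the compiler effectively sees after full unrolling, not what C2 actually emits. In the loop form the index is a variable, so a load could touch any element; once every index is a constant, each element can be treated as a separate scalar and the whole thing folds.

```java
public class UnrollSketch {
    // Loop form: i is a variable, so a[i] and w[i] could be any element.
    static double meanLoop(double[] a, double[] w) {
        double sum = 0;
        double wsum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * w[i];
            wsum += w[i];
        }
        return sum / wsum;
    }

    // Hand-unrolled form for length 4 (illustrative assumption): every index
    // is a constant, so each element becomes its own local variable and no
    // array is needed at all.
    static double meanUnrolled4() {
        double a0 = 1, a1 = 2, a2 = 3, a3 = 4; // stands in for d = {1,2,3,4}
        double w0 = 1, w1 = 1, w2 = 1, w3 = 1; // stands in for allOnes(4)
        double sum = 0;
        double wsum = 0;
        sum += a0 * w0; wsum += w0;
        sum += a1 * w1; wsum += w1;
        sum += a2 * w2; wsum += w2;
        sum += a3 * w3; wsum += w3;
        return sum / wsum;
    }

    public static void main(String[] args) {
        double[] d = {1, 2, 3, 4};
        double[] ones = {1, 1, 1, 1};
        System.out.println(meanLoop(d, ones));
        System.out.println(meanUnrolled4());
    }
}
```

Both methods compute the same value for this input; the second one just has nothing left that depends on an array index.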

>
>         for(int i = 0; i < array.length; i++) {
>             sum += array[i] * weights[i];
>         }
>
>
>> Also, what do you mean by "result is calculated"? What result? :)
>>
>
> After the loop is unrolled, mean() is collapsed to return a pre-calculated
> result if both allocations are inlined:
>
>     static double result;
>
>     static void test() {
>         double[] d = {1,2,3,4};
>         result = mean(d);
>     }
>
> 10c     # MachConstantBaseNode (empty encoding)
> 10c     movsd   XMM0, [constant table base + #0]        # load from constant table: double=#2.500000
> 114     movq    R10, java/lang/Class:exact *    # ptr
> 11e     movsd   [R10 + #104 (8-bit)], XMM0      # double ! Field: TestFillArray.result
> 124     addq    rsp, 16 # Destroy frame
>         popq   rbp
>         testl  rax, [rip + #offset_to_poll_page]        # Safepoint: poll for GC
> 12f     ret

Interesting, I'm pretty sure I didn't see a precomputed constant returned
but I'll double check again tomorrow.
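
For the record, this is roughly the standalone test I plan to re-run, pieced together from the snippets in this thread. Treat it as a sketch: the class name is taken from the field name in your output, while the exception type and the warm-up iteration count are my own guesses.

```java
import java.util.Arrays;

public class TestFillArray {
    static double result;

    static double mean(double[] array, double[] weights) {
        // Exception type is an assumption; the original snippet elides it.
        if (array.length != weights.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double sum = 0;
        double wsum = 0;
        for (int i = 0; i < array.length; i++) {
            sum += array[i] * weights[i];
            wsum += weights[i];
        }
        return sum / wsum;
    }

    static double mean(double[] array) {
        return mean(array, allOnes(array.length));
    }

    static double[] allOnes(int n) {
        double[] d = new double[n];
        Arrays.fill(d, 1);
        return d;
    }

    static void test() {
        double[] d = {1, 2, 3, 4};
        result = mean(d);
    }

    public static void main(String[] args) {
        // Call test() many times so the JIT has a chance to compile it.
        // The iteration count is a guess, not a tuned value.
        for (int i = 0; i < 200_000; i++) {
            test();
        }
        System.out.println(result);
    }
}
```

On a product build I would look at the generated code with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly, which needs the hsdis disassembler library to be installed.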

What's this pseudo assembly above? Is that available in product builds or
is it some debug build output?

Thanks for the replies.

>
> Vladimir
>
>
>>
>>     We do remove NSR allocations for boxing objects, but not regular allocations:
>>
>>        // Eliminate boxing allocations which are not used
>>        // regardless scalar replaceable status.
>>        bool boxing_alloc = C->eliminate_boxing() &&
>>                            tklass->klass()->is_instance_klass()  &&
>>                            tklass->klass()->as_instance_klass()->is_box_klass();
>>        if (!alloc->_is_scalar_replaceable && (!boxing_alloc || (res != NULL))) {
>>          return false;
>>        }
>>
>>     Only an allocation followed by arraycopy skips zeroing, not one followed
>>     by a fill() call. Arrays.fill() is implemented as a loop.
>>
>>        if (init != NULL && init->is_complete_with_arraycopy() &&
>>            k->is_type_array_klass()) {
>>          // Don't zero type array during slow allocation in VM since
>>          // it will be initialized later by arraycopy in compiled code.
>>          slow_call_address = OptoRuntime::new_array_nozero_Java();
>>
>>
>> Hmm, I'm pretty sure fill() following an allocation had the same zeroing
>> elision applied to it.  Kris notes the opto is
>> turned off due to implementation issues, which would explain why I still
>> see zeroing.
>>
>> Thanks
>>
>>
>>     Regards,
>>     Vladimir
>>
>>     On 12/23/15 4:56 PM, Vitaly Davidovich wrote:
>>
>>         Hi guys,
>>
>>         Consider code like this:
>>
>>         static double mean(double[] array, double[] weights) {
>>             if (array.length != weights.length) throw ...;
>>             double sum = 0;
>>             double wsum = 0;
>>             for(int i = 0; i < array.length; i++) {
>>                 sum += array[i] * weights[i];
>>                 wsum += weights[i];
>>             }
>>             return sum / wsum;
>>         }
>>
>>         static double mean(double[] array) {
>>                return mean(array, allOnes(array.length));
>>         }
>>
>>         static double[] allOnes(int n) {
>>                double[] d = new double[n];
>>                Arrays.fill(d, 1);
>>                return d;
>>         }
>>
>>         Now suppose I call mean(double[]) overload like this:
>>
>>         double[] d = {1,2,3,4};
>>
>>         Using 8u51 with C2 compiler:
>>
>>         1) It looks like the array allocation from allOnes isn't eliminated.
>>         2) Moreover, it looked like the array was zeroed (rep stosd with rax
>>            holding zero).  Unless I misread the asm, I thought an allocation
>>            followed by Arrays.fill skips the zeroing?
>>         3) Ideally, this case would reduce to code that just does a plain
>>            unweighted mean, with no multiplication by the weight and no
>>            summation for the weighted sum (the weight sum is just the array
>>            length).  Is this simply too much analysis to ask for?
>>
>>         Thanks
>>
>>
>>         --
>>         Sent from my phone
>>
>>
>>
>>
>

-- 
Sent from my phone