Optimization question

Vitaly Davidovich vitalyd at gmail.com
Thu Dec 24 02:56:40 UTC 2015


On Wednesday, December 23, 2015, Krystal Mok <rednaxelafx at gmail.com> wrote:

> Resending to the list...
>
> On Wed, Dec 23, 2015 at 6:17 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>
>> Ok so Vladimir mentioned the same thing about any access that EA doesn't
>> know about.  I guess I'm still unclear on why unrolling needs to happen
>> given array length can be deduced, loop stride is constant, and loop body
>> shows which arrays and indices are accessed and in what manner (read,
>> write, both).  It seems like all the info is there even without unrolling.
>> Is this just an implementation detail or am I missing something fundamental?
>>
>>
> It's an implementation detail of HotSpot C2.
>
> The loop structure information that you're talking about is actually not
> available until loop optimization kicks in:
> - Counted loops are discovered (which computes the loop stride and bounds)
> - Loops are then unrolled (and only then the indices become constants)
>
> Unfortunately that happens after EA.
>
> Now, suppose EA tries to handle the "array" in your example, and scalar
> replaces its elements into local variables e0, e1, e2 and e3.
> Then what does "array[i]" translate to? If the index "i" is known to be a
> constant, e.g. 2, then array[2] would be translated to e2.
> Otherwise there is no straightforward way to translate it: since the 4 new
> local variables are "unrelated" (not guaranteed to be packed together
> anymore), you can't even efficiently make an interior pointer that
> dynamically points to them.
>
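
To make the point above concrete, here is a hand-written Java sketch of what
scalar replacement amounts to at the source level. It is only an illustration
under stated assumptions, not what C2 actually emits: the class and method
names are hypothetical, and the switch in loadDynamic is just one way to show
why a non-constant index has no cheap translation once the elements are
separate locals.

```java
public class ScalarReplacementSketch {
    // Original shape: a small, non-escaping array indexed by a loop variable.
    static int sumWithArray(int a, int b, int c, int d) {
        int[] array = {a, b, c, d};
        int sum = 0;
        for (int i = 0; i < array.length; i++) {
            sum += array[i];
        }
        return sum;
    }

    // After full unrolling every index is a constant, so each array[k]
    // can be rewritten as the corresponding local ek.
    static int sumScalarReplaced(int a, int b, int c, int d) {
        int e0 = a, e1 = b, e2 = c, e3 = d;
        int sum = 0;
        sum += e0; // was array[0]
        sum += e1; // was array[1]
        sum += e2; // was array[2]
        sum += e3; // was array[3]
        return sum;
    }

    // With a non-constant index there is no single local to pick, so the
    // load has to be emulated by branching on the index.
    static int loadDynamic(int e0, int e1, int e2, int e3, int i) {
        switch (i) {
            case 0: return e0;
            case 1: return e1;
            case 2: return e2;
            case 3: return e3;
            default: throw new ArrayIndexOutOfBoundsException(i);
        }
    }
}
```

The first two methods compute the same result; the third shows the extra work
a dynamic index would require.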

Yeah, I see how piggybacking on unrolling helps.  My thinking was that you know
the range of indices that will be accessed (assuming the constant array
length is propagated), including the number of items.  If the number of
accesses is within some threshold for scalar replacement, you could then
allocate that many locals and assign them from the array based on the range
and stride.
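
A rough Java sketch of that reasoning, purely as an illustration: given a
constant start, limit and stride, the set of touched indices can be enumerated
up front and compared against a threshold. Everything here is hypothetical
(the class, the method names, and the threshold value of 8); it does not
correspond to any HotSpot code or flag.

```java
import java.util.ArrayList;
import java.util.List;

public class IndexRangeSketch {
    // Hypothetical limit on how many elements we would turn into locals.
    static final int MAX_SCALARIZED_ELEMENTS = 8;

    // Enumerate the indices visited by: for (i = start; i < limit; i += stride).
    static List<Integer> touchedIndices(int start, int limit, int stride) {
        if (stride <= 0) {
            throw new IllegalArgumentException("stride must be positive");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = start; i < limit; i += stride) {
            indices.add(i);
        }
        return indices;
    }

    // True if every touched index is in bounds and the count is small enough
    // that one local per touched element would be plausible.
    static boolean couldScalarize(int arrayLength, int start, int limit, int stride) {
        List<Integer> indices = touchedIndices(start, limit, stride);
        if (indices.size() > MAX_SCALARIZED_ELEMENTS) {
            return false;
        }
        for (int i : indices) {
            if (i < 0 || i >= arrayLength) {
                return false;
            }
        }
        return true;
    }
}
```

For a 4-element array and a loop from 0 to 4 with stride 2, touchedIndices
yields 0 and 2, so two locals would be enough.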

I should also mention that Kris told me about PrintOptoAssembly, which is
what produced Vladimir's output; alas, it's a non-product flag, so I'll be
sticking to reading the raw full assembly :).

At any rate, I think you guys have fully answered my original questions, so
thanks very much for taking the time to do that.


> - Kris
>


-- 
Sent from my phone


More information about the hotspot-compiler-dev mailing list