SIMD auto-vectorization in hotspot
Vitaly Davidovich
vitalyd at gmail.com
Tue Jan 17 10:12:08 PST 2012
Great, thanks for that - I think vector unit math will bring some serious
perf gains on quite a bit of code.
Regards,
Vitaly
On Jan 17, 2012 1:09 PM, "Vladimir Kozlov" <vladimir.kozlov at oracle.com>
wrote:
> Vitaly Davidovich wrote:
>
>> Awesome, that's great to hear. Is there a release tentatively scheduled
>> to have this, or is it still beyond the current horizon?
>>
>
> It targets JDK 8 and one of the future JDK 7 updates. I can't be more precise than that.
>
>
>> If I may I'd like to petition for a flag being available, ideally in
>> product, to trace diagnostics on this part as well; TraceSuperWord or a new
>> companion, whichever you think makes most sense.
>>
>
> We can do that.
>
> Vladimir
>
>
>> Regards,
>> Vitaly
>>
>> On Jan 17, 2012 3:50 AM, "Vladimir Kozlov" <vladimir.kozlov at oracle.com> wrote:
>>
>> Vector arithmetic is not generated yet (we plan to do it).
>> Only array initialization and array copy are vectorized. I recently added
>> AVX support to generate 3-operand instructions, but only scalar ones.
>>
>> Vladimir
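To make the distinction above concrete, here is a minimal sketch of the two loop shapes being contrasted. The class and method names (`LoopShapes`, `fill`, `add`) are hypothetical and only illustrate the shapes; whether a given HotSpot build vectorizes either loop depends on the version and CPU, as described in the message above.

```java
public class LoopShapes {
    // Array initialization: the kind of loop reported above as already vectorized.
    public static void fill(int[] a, int value) {
        for (int i = 0; i < a.length; i++) {
            a[i] = value;
        }
    }

    // Packed arithmetic (element-wise add): the case reported above as
    // not yet generated at the time of this thread.
    public static void add(int[] a, int[] b, int[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        int[] a = new int[1024];
        int[] b = new int[1024];
        int[] out = new int[1024];
        fill(a, 3);
        fill(b, 4);
        add(a, b, out);
        System.out.println(out[0] + " " + out[out.length - 1]); // prints "7 7"
    }
}
```

Both loops are simple counted loops over arrays with unit stride; the only difference is that the second one needs vector arithmetic rather than just vector loads and stores.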
>>
>> On 1/16/12 11:03 PM, Vitaly Davidovich wrote:
>>
>> Thanks Kris, I'll take a look. I should've been clearer in my
>> question though -- I'm mostly interested in packed math
>> (integer and floating point).
>>
>> Cheers
>>
>> On Tue, Jan 17, 2012 at 1:36 AM, Krystal Mok <rednaxelafx at gmail.com> wrote:
>>
>> Hi,
>>
>> Apparently, the source code to read is
>> opto/superword.[hpp|cpp] [1][2]. The flag to trace it is
>> -XX:+TraceSuperWord.
>> There's a paper mentioned in [1], which is the origin of the
>> algorithm used.
>>
>> An example to see superword in action is a simple array copy
>> loop:
>>
>> // byte[] src = ...
>> // byte[] dest = ...
>> // assert(src.length == dest.length)
>> for (int i = 0; i < src.length; i++) {
>>     dest[i] = src[i];
>> }
>>
>> Superword vectorizes this loop and then unrolls it a bit, turning
>> the actual copy into something like the following (on 32-bit x86):
>>
>> movq 0xc(%ecx,%edi,4),%xmm0
>> movq %xmm0,0xc(%edx,%edi,4)
>>
>> HTH,
>> - Kris
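For anyone who wants to try the array copy example, a self-contained harness might look like the sketch below. The class name `CopyLoop`, the array size, and the iteration count are assumptions chosen only so that the JIT compiles the loop; they are not from the thread.

```java
import java.util.Arrays;

public class CopyLoop {
    // Same loop shape as in the message above: a counted loop over byte arrays.
    public static void copy(byte[] src, byte[] dest) {
        for (int i = 0; i < src.length; i++) {
            dest[i] = src[i];
        }
    }

    public static void main(String[] args) {
        byte[] src = new byte[4096];
        byte[] dest = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        // Call the method often enough that the server compiler (C2) compiles it.
        for (int iter = 0; iter < 20000; iter++) {
            copy(src, dest);
        }
        System.out.println(Arrays.equals(src, dest)); // prints "true"
    }
}
```

Running this with `-XX:+TraceSuperWord` should show the superword pass at work, but that flag may only be accepted by debug (non-product) builds. On a product VM, `-XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly` prints the generated code, provided the hsdis disassembler plugin is installed.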
>>
>> [1]: http://hg.openjdk.java.net/hsx/hotspot-main/hotspot/file/tip/src/share/vm/opto/superword.hpp
>> [2]: http://hg.openjdk.java.net/hsx/hotspot-main/hotspot/file/tip/src/share/vm/opto/superword.cpp
>>
>>
>> On Tue, Jan 17, 2012 at 1:45 PM, Vitaly Davidovich <vitalyd at gmail.com> wrote:
>>
>> Hi guys,
>>
>> Does the -XX:+UseSuperWord (I see it's set to true on
>> 6u23) optimization enable auto-vectorization of certain
>> loops? If so, are there any switches that can be turned
>> on (in product VM) to see if vectorization is being
>> applied or not? In general, would someone be able to
>> describe the current state of loop vectorization (e.g.
>> implementation limits, restrictions, etc.)? Which ISAs are
>> supported (e.g. SSE, AVX)? Any other info on this
>> subject would be greatly appreciated ...
>>
>> Thanks very much,
>>
>> Vitaly
>>