Vectorized vs. unvectorized array access

Wed Aug 17 17:22:03 UTC 2016

Hi everyone,

I have a basic kernel with a per-block access pattern. Only one of the
following implementations is vectorized by C2 (Java 1.8.0_51).

Does anyone know why? Is there a rule for avoiding these
vectorization barriers?

The code loops linearly over the primitive arrays a and b in packets of size
PACKET_SIZE (the per-block access is unnecessary here, but it simplifies the code).

Loop constants are in capital letters.

*Vectorized*
for (int ibeg=0; ibeg<(NB_PACKETS*PACKET_SIZE); ibeg+=PACKET_SIZE)
    for (int i=ibeg; i<(ibeg+PACKET_SIZE); ++i)
        a[i] += b[i];

*Unvectorized*
for (int ibeg=0; ibeg<(NB_PACKETS*PACKET_SIZE); ibeg+=PACKET_SIZE)
    for (int off=0; off<PACKET_SIZE; ++off)
        a[ibeg+off] += b[ibeg+off];

*Unvectorized*

for (int packet=0; packet<NB_PACKETS; ++packet)
    for (int off=0; off<PACKET_SIZE; ++off)
        a[(packet*PACKET_SIZE)+off] += b[(packet*PACKET_SIZE)+off];
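
For anyone who wants to reproduce this, here is a minimal, self-contained
sketch of the three loops above. The class name, the method names, the float
element type, the constant values and the iteration count are all assumptions
made for illustration; the original kernel may differ. The sketch only checks
that the three variants compute the same result. It does not show by itself
which ones C2 vectorizes.

```java
import java.util.Arrays;

public class PacketKernels {
    // Loop constants. The values are assumptions; the post does not give them.
    static final int NB_PACKETS = 256;
    static final int PACKET_SIZE = 64;
    static final int N = NB_PACKETS * PACKET_SIZE;

    // Variant 1: the inner loop variable is the absolute array index.
    static void absoluteIndex(float[] a, float[] b) {
        for (int ibeg = 0; ibeg < N; ibeg += PACKET_SIZE)
            for (int i = ibeg; i < ibeg + PACKET_SIZE; ++i)
                a[i] += b[i];
    }

    // Variant 2: the array index is packet base plus inner offset.
    static void baseOffset(float[] a, float[] b) {
        for (int ibeg = 0; ibeg < N; ibeg += PACKET_SIZE)
            for (int off = 0; off < PACKET_SIZE; ++off)
                a[ibeg + off] += b[ibeg + off];
    }

    // Variant 3: the packet base is recomputed from the packet number.
    static void packetOffset(float[] a, float[] b) {
        for (int packet = 0; packet < NB_PACKETS; ++packet)
            for (int off = 0; off < PACKET_SIZE; ++off)
                a[packet * PACKET_SIZE + off] += b[packet * PACKET_SIZE + off];
    }

    // Small integer test data, exactly representable as float.
    static float[] ramp() {
        float[] x = new float[N];
        for (int i = 0; i < N; ++i)
            x[i] = i % 17;
        return x;
    }

    public static void main(String[] args) {
        float[] b = ramp();
        float[] a1 = ramp(), a2 = ramp(), a3 = ramp();
        // Call each kernel many times so the JIT has a chance to compile it.
        for (int iter = 0; iter < 5_000; ++iter) {
            absoluteIndex(a1, b);
            baseOffset(a2, b);
            packetOffset(a3, b);
        }
        System.out.println("variants agree: "
                + (Arrays.equals(a1, a2) && Arrays.equals(a1, a3)));
    }
}
```

To see whether a given loop was actually vectorized, one option is to inspect
the generated code with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly
(this needs the hsdis disassembler plugin) and look for packed SIMD
instructions in each kernel's inner loop.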

Thanks a lot,
Nassim.