[aarch64-port-dev ] RFR: 8079565: aarch64: Add vectorization support for aarch64

Edward Nevill edward.nevill at linaro.org
Tue May 26 16:33:37 UTC 2015


Hi,

The following webrev

http://cr.openjdk.java.net/~enevill/8079565/webrev.00/

adds support for vectorization on aarch64.

This is an initial pass at adding vectorization. There are a number of limitations.

- Only 128 bit vectors are supported.

The current implementation only supports vectors which are exactly 128 bits in length. Support needs to be added for shorter vectors (64 / 32??).

- The pack/unpack vectorizations are missing

- The Replicate opcode is suboptimal.

Currently it just uses a sequence of MOVI/ORRI or MVNI/BICI instructions to replicate an immediate value across a vector. This can take up to 4 instructions to replicate a 32-bit value across the vector (1 MOVI and 3 ORRIs, or 1 MVNI and 3 BICIs).

- The cost model needs tuning.

At the moment most vectorizations are just costed at 1 X instruction cost.

- It needs benchmarking and tuning across different partners' hardware.

For example, on some partners' hardware it may not be worthwhile performing the Long or Double vectorizations.
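
To make the instruction count quoted for the Replicate limitation concrete, here is a rough Java model. It is only a sketch and not code from the webrev. It assumes each MOVI/ORRI/MVNI/BICI carries one 8-bit immediate placed at a byte position within the 32-bit lane, and it ignores any other immediate encodings. The class and method names are made up for illustration.

```java
// Hypothetical model, not part of the webrev: counts how many instructions
// a MOVI/ORRI or MVNI/BICI sequence needs to build a 32-bit immediate,
// assuming one 8-bit immediate (at a byte shift) per instruction.
final class ReplicateCostSketch {

    // MOVI sets one byte, each ORRI adds one more non-zero byte.
    // Zero still needs a single MOVI #0.
    static int moviOrrCount(int imm) {
        int count = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            if (((imm >>> shift) & 0xff) != 0) {
                count++;
            }
        }
        return Math.max(count, 1);
    }

    // MVNI/BICI build the value from its complement, so the count is the
    // number of non-zero bytes in ~imm.
    static int mvniBicCount(int imm) {
        return moviOrrCount(~imm);
    }

    // Pick whichever sequence is shorter.
    static int replicateCount(int imm) {
        return Math.min(moviOrrCount(imm), mvniBicCount(imm));
    }

    public static void main(String[] args) {
        int[] samples = {0x00000000, 0x000000ff, 0xffffff00, 0x00ff00ff, 0x12345678};
        for (int imm : samples) {
            System.out.printf("0x%08x -> %d instruction(s)%n", imm, replicateCount(imm));
        }
    }
}
```

Under this model a value such as 0x12345678 hits the worst case of 4 instructions, while values with mostly 0x00 or mostly 0xff bytes need only 1 or 2.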

I have done some performance testing on one partner's hardware using the hotspot vector tests with the following results:-

Byte Vectors: approx 3-4 X improvement
Short Vectors: approx 2.5-3.25 X improvement
Int Vectors: approx 1.5-2.5 X improvement
Long Vectors: approx 1.0-1.33 X improvement
Float Vectors: approx 1.4-1.8 X improvement
Double Vectors: approx 0.85-1.25 X improvement
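
For readers unfamiliar with the kind of loop involved, the sketch below shows simple element-wise loops of the shape an auto-vectorizer typically targets. These are illustrative loops, not the actual hotspot vector tests, and all names are hypothetical. The lanes() helper just divides the 128-bit vector width by the element width, which may help explain why the gains shrink as the element type gets wider.

```java
// Hypothetical illustration, not the hotspot vector tests themselves.
final class VectorLoopSketch {
    static final int VECTOR_BITS = 128; // the only width supported by this webrev

    // Number of elements that fit in one 128-bit vector.
    static int lanes(int elementBits) {
        return VECTOR_BITS / elementBits;
    }

    // Element-wise byte add: 16 elements fit in one 128-bit vector.
    static void addBytes(byte[] a, byte[] b, byte[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (a[i] + b[i]);
        }
    }

    // Element-wise long add: only 2 elements fit in one 128-bit vector.
    static void addLongs(long[] a, long[] b, long[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        System.out.println("byte lanes: " + lanes(Byte.SIZE));
        System.out.println("long lanes: " + lanes(Long.SIZE));
    }
}
```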

I have also implemented the scalar reduction optimization with the following results:-

Scalar Sum Reduction Int: ~4.6 X improvement
Scalar Sum Reduction Float: ~2.3 X improvement
Scalar Sum Reduction Double: ~1.9 X improvement
Scalar Product Reduction Int: ~1.2 X improvement
Scalar Product Reduction Float: ~1.1 X improvement
Scalar Product Reduction Double: ~0.8 X improvement
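
As an illustration of what "scalar reduction" means here, the loops below fold an array into a single scalar. They are a hypothetical sketch, not the benchmark code behind the numbers above. Note that Java floating-point semantics fix the order of operations, which may limit how much a float or double reduction can gain compared with an int reduction.

```java
// Hypothetical illustration of reduction loops, not the measured benchmarks.
final class ReductionSketch {

    // Scalar sum reduction over ints: every iteration updates one accumulator.
    static int sumInt(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    // Scalar product reduction over doubles, starting from the identity 1.0.
    static double productDouble(double[] a) {
        double product = 1.0;
        for (int i = 0; i < a.length; i++) {
            product *= a[i];
        }
        return product;
    }

    public static void main(String[] args) {
        System.out.println(sumInt(new int[] {1, 2, 3, 4}));
        System.out.println(productDouble(new double[] {1.5, 2.0, 4.0}));
    }
}
```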

Tested with JTreg hotspot

Original: Test results: passed: 814; failed: 32; error: 3
Revised : Test results: passed: 814; failed: 32; error: 3
Langtools: Test results: passed: 3,222; error: 11

Please review,

Thanks,
Ed.



