[aarch64-port-dev ] RFR: 8079565: aarch64: Add vectorization support for aarch64
Edward Nevill
edward.nevill at linaro.org
Tue May 26 16:33:37 UTC 2015
Hi,
The following webrev
http://cr.openjdk.java.net/~enevill/8079565/webrev.00/
adds support for vectorization on aarch64.
This is an initial pass at adding vectorization. There are a number of limitations.
- Only 128 bit vectors are supported.
The current implementation only supports vectors which are exactly 128 bits in length. Support needs to be added for shorter vectors (64 / 32??).
- The pack/unpack vectorizations are missing
- The Replicate opcode is suboptimal.
Currently it just uses a sequence of MOVI/ORRI or MVNI/BICI instructions to replicate an immediate value across a vector. This can take up to 4 instructions to replicate a 32-bit value across the vector (1 MOVI and 3 ORRIs, or 1 MVNI and 3 BICIs).
- The cost model needs tuning.
At the moment most vectorizations are just costed at 1 X instruction cost.
- It needs benchmarking and tuning across different partners' hardware.
For example, on some partners' hardware it may not be worthwhile performing the Long or Double vectorizations.
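To make the Replicate cost above concrete, here is a rough sketch that counts the instructions needed for a 32-bit immediate under the scheme described: one MOVI plus one ORRI per additional non-zero byte, or one MVNI plus one BICI per additional non-zero byte of the inverted value. The class and method names are made up for illustration; this is not the code in the webrev, and it ignores any other immediate encodings the hardware offers.

```java
// Illustrative only: models the MOVI/ORRI and MVNI/BICI sequences described
// in this mail. All names here are hypothetical, not taken from the webrev.
final class ReplicateCost {
    // Number of non-zero bytes in a 32-bit value.
    static int nonZeroBytes(int v) {
        int n = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            if (((v >>> shift) & 0xFF) != 0) {
                n++;
            }
        }
        return n;
    }

    // 1 MOVI for the first byte, then 1 ORRI per remaining non-zero byte.
    static int movOrrCount(int imm) {
        return Math.max(1, nonZeroBytes(imm));
    }

    // 1 MVNI for the first byte of ~imm, then 1 BICI per remaining non-zero byte of ~imm.
    static int mvnBicCount(int imm) {
        return Math.max(1, nonZeroBytes(~imm));
    }

    // Assumes the shorter of the two sequences is chosen.
    static int instructionCount(int imm) {
        return Math.min(movOrrCount(imm), mvnBicCount(imm));
    }

    public static void main(String[] args) {
        int[] samples = {0x00000000, 0x000000FF, 0xFFFFFF00, 0x12345678};
        for (int imm : samples) {
            System.out.printf("0x%08X -> %d instruction(s)%n", imm, instructionCount(imm));
        }
    }
}
```

Under this model a value such as 0x12345678 hits the 4 instruction worst case, while 0xFFFFFF00 needs only a single MVNI.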
I have done some performance testing on one partner's hardware using the hotspot vector tests with the following results:-
Byte Vectors: approx 3-4 X improvement
Short Vectors: approx 2.5-3.25 X improvement
Int Vectors: approx 1.5-2.5 X improvement
Long Vectors: approx 1.0-1.33 X improvement
Float Vectors: approx 1.4-1.8 X improvement
Double Vectors: approx 0.85-1.25 X improvement
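For readers unfamiliar with what gets vectorized, the loops in question are roughly of the following shape. This is a minimal sketch, not the source of the hotspot vector tests; the class and method names are invented. The lanes() helper just shows how many elements fit in a 128-bit vector, which is at least consistent with the narrower types showing the larger improvements above.

```java
// Illustrative only: simple element-wise loops of the kind an auto-vectorizer
// can turn into vector instructions. Names are hypothetical.
final class VectorLoops {
    // Element-wise int add; arrays are assumed to have the same length.
    static void addInts(int[] a, int[] b, int[] c) {
        for (int i = 0; i < a.length; i++) {
            a[i] = b[i] + c[i];
        }
    }

    // Element-wise byte add; the cast wraps the result to 8 bits.
    static void addBytes(byte[] a, byte[] b, byte[] c) {
        for (int i = 0; i < a.length; i++) {
            a[i] = (byte) (b[i] + c[i]);
        }
    }

    // Number of elements of the given width that fit in one 128-bit vector.
    static int lanes(int elementBits) {
        return 128 / elementBits;
    }

    public static void main(String[] args) {
        System.out.println("byte lanes:   " + lanes(8));
        System.out.println("short lanes:  " + lanes(16));
        System.out.println("int lanes:    " + lanes(32));
        System.out.println("long lanes:   " + lanes(64));
    }
}
```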
I have also implemented the scalar reduction optimization with the following results:-
Scalar Sum Reduction Int: ~4.6 X improvement
Scalar Sum Reduction Float: ~2.3 X improvement
Scalar Sum Reduction Double: ~1.9 X improvement
Scalar Product Reduction Int: ~1.2 X improvement
Scalar Product Reduction Float: ~1.1 X improvement
Scalar Product Reduction Double: ~0.8 X improvement
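The scalar reductions above refer to loops that fold an array into a single value. A minimal sketch of the sum and product forms follows; the names are invented and the actual test loops may well be more involved. Note that for float and double the result has to match the sequential order of operations, since floating-point addition and multiplication are not associative.

```java
// Illustrative only: scalar reduction loops. Names are hypothetical.
final class Reductions {
    // Sum reduction over ints.
    static int sumInt(int[] a) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }

    // Product reduction over ints.
    static int productInt(int[] a) {
        int total = 1;
        for (int i = 0; i < a.length; i++) {
            total *= a[i];
        }
        return total;
    }

    // Sum reduction over doubles; evaluated strictly left to right.
    static double sumDouble(double[] a) {
        double total = 0.0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4};
        System.out.println("sum     = " + sumInt(data));
        System.out.println("product = " + productInt(data));
    }
}
```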
Tested with JTreg hotspot
Original: Test results: passed: 814; failed: 32; error: 3
Revised : Test results: passed: 814; failed: 32; error: 3
Langtools: Test results: passed: 3,222; error: 11
Please review,
Thanks,
Ed.