RFR: 8256058: Improve vector register handling in RegisterMap::pd_location() on x86
Vladimir Ivanov
vlivanov at openjdk.java.net
Mon Nov 9 19:01:01 UTC 2020
`RegisterMap` handles registers on a per-slot basis: every register is split into slot-sized (32-bit) parts that are tracked independently. On x86, vector registers are up to 512 bits wide and occupy up to 16 slots. To keep map construction cheap, `RegisterMap`s are sparsely populated: only the location of the first slot is recorded, and `RegisterMap::pd_location()` computes the address of any other slot that is missing from the map.
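To illustrate the sparse-population idea, here is a minimal toy model, not HotSpot code. `SparseRegisterMap`, `set_location()`, `location()` and the integer register/slot indices are all hypothetical names, and the sketch assumes the register is stored contiguously starting at its base slot:

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical toy model of a sparsely populated register map (not HotSpot code).
constexpr int kSlotSizeBytes = 4;      // one slot is 32 bits
constexpr int kSlotsPerRegister = 16;  // 512 bits / 32 bits

using Address = std::uintptr_t;        // 0 stands for "no location known"

struct SparseRegisterMap {
  // Key: reg * kSlotsPerRegister + slot. Only some slots are ever recorded.
  std::map<int, Address> recorded;

  void set_location(int reg, int slot, Address addr) {
    recorded[reg * kSlotsPerRegister + slot] = addr;
  }

  // Returns the recorded address if present; otherwise derives it from
  // slot 0 of the same register (assumes contiguous storage from the base).
  Address location(int reg, int slot) const {
    assert(slot >= 0 && slot < kSlotsPerRegister);
    auto it = recorded.find(reg * kSlotsPerRegister + slot);
    if (it != recorded.end()) return it->second;
    auto base = recorded.find(reg * kSlotsPerRegister);
    if (base == recorded.end()) return 0;
    return base->second + static_cast<Address>(slot) * kSlotSizeBytes;
  }
};
```

In this model, recording only slot 0 of a register is enough for `location()` to answer queries for the other 15 slots, which is the saving the sparse scheme is after.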
As I mentioned in #1131, the frame layout for vector registers is quite complicated: when saved, ZMM0-15 are split into 3 parts (2 x 128-bit + 1 x 256-bit), while ZMM16-31 are stored in full.
The proposed patch reifies those details in the `RegisterMap::pd_location()` logic. With it, initializing just 3 slot locations for ZMM0-15 is enough to recover every slot location inside the register, while for ZMM16-31 initializing a single (base) slot suffices.
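The slot-to-part mapping described above can be sketched as follows. This is an illustration rather than the code from the patch: `SlotOrigin` and `slot_origin()` are hypothetical names, and the sketch assumes the three parts of ZMM0-15 are the low 128 bits (slots 0-3), the next 128 bits (slots 4-7) and the upper 256 bits (slots 8-15), in that order.

```cpp
#include <cassert>

constexpr int kBytesPerSlot = 4;   // one slot is 32 bits
constexpr int kSlotsPerZmm = 16;   // 512 bits / 32 bits

// Hypothetical helper type: which recorded slot a query is derived from.
struct SlotOrigin {
  int base_slot;        // slot whose location must be present in the map
  int offset_in_bytes;  // distance of the requested slot from that base
};

// Assumed layout: ZMM0-15 are saved as 128 + 128 + 256 bits, so they have
// three base slots (0, 4, 8); ZMM16-31 are saved whole, so only slot 0.
SlotOrigin slot_origin(int zmm_encoding, int slot) {
  assert(zmm_encoding >= 0 && zmm_encoding < 32);
  assert(slot >= 0 && slot < kSlotsPerZmm);
  int base_slot = 0;
  if (zmm_encoding < 16) {
    if (slot >= 8) {
      base_slot = 8;   // upper 256-bit part
    } else if (slot >= 4) {
      base_slot = 4;   // second 128-bit part
    }
  }
  return SlotOrigin{base_slot, (slot - base_slot) * kBytesPerSlot};
}
```

Under these assumptions, a query for slot 5 of ZMM3 resolves to base slot 4 plus 4 bytes, whereas slot 15 of ZMM20 resolves to base slot 0 plus 60 bytes.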
Testing (along with some other relevant patches):
- [x] jdk/incubator/vector w/ -XX:+DeoptimizeALot and -XX:UseAVX=3 on AVX512-capable hardware
- [x] hs-precheckin-comp, hs-tier1, hs-tier2
-------------
Commit messages:
- Improve vector register handling in RegisterMap::pd_location()
Changes: https://git.openjdk.java.net/jdk/pull/1132/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1132&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8256058
Stats: 35 lines in 1 file changed: 19 ins; 7 del; 9 mod
Patch: https://git.openjdk.java.net/jdk/pull/1132.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/1132/head:pull/1132
PR: https://git.openjdk.java.net/jdk/pull/1132