Question about RegMask::is_aligned_sets()
Corey Ashford
cjashfor at linux.ibm.com
Thu Mar 11 03:13:51 UTC 2021
Hello Vladimir,
Currently I'm looking at the register mask defined for the Power64-LE
machine. There are 64 128-bit registers defined via reg_def, e.g.:
reg_def VSR25 ( SOC, SOC, Op_VecX, 25, NULL);
But only the 20 of those vector registers that are declared as part of
"reg_class vs_reg( ... )" end up with a mask in the generated
ad_ppc_expand.cpp source file, and further, each of those registers is
allocated just a single bit in the register mask:
const RegMask _VS_REG_mask( 0x0, 0x0, 0x0, 0x0, 0x0, 0xfffff00, 0x0,
0x0, 0x0, 0x0 );
I would have expected that since Op_VecX is a 128-bit type, it would
have received four bits per register in the mask.
On x86, each of the 512-bit vector registers is declared using 16
Op_RegF register declarations (which makes sense - 16 x 32 = 512), but
on aarch64, which can have up to a 1024-bit vector register, vector
registers are declared using just 8 x Op_RegF (8 x 32 = 256 bits).
There is an extensive comment in aarch64.ad about this, but it seems
to imply that the 32-bits-per-slot rule is not rigid (just as on PPC64).
---
As to your comment about not needing to use a vector register for the
boolean vectors, that's quite interesting. So for all vector types
except for tiny integers, I should be able to use a 64-bit general
purpose register. I'm so new to all of this that I'm not clear how
easy it will be to mix and match register types like this, but I will
start experimenting with the idea.
If you have any further thoughts, I'd appreciate hearing them.
Kind Regards,
- Corey
On 3/5/21 3:19 PM, Vladimir Ivanov wrote:
> Hi Corey,
>
>> I'd like to understand the concept of "aligned sets" in RegMask. I
>> believe I understand the RegMask idea overall, but I don't understand
>> the idea of alignment of sets (actually the concept of sets in this
>> context is also fuzzy). I've looked at the code that implements
>> is_aligned_sets, and I just can't yet seem to grok what requirement it
>> is trying to verify. I read RegMask.hpp's comments on the method
>> prototype, and it didn't help me much, I'm afraid. If someone could
>> give a paragraph or two of explanation, I'd really appreciate it.
>
> A register in RegMask is comprised of packed bits each representing a
> 32-bit slot. So, a VecX register occupies 4 bits (128 = 4 x 32) while
> VecZ needs 16 (512 = 16 x 32).
>
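
The slot arithmetic described above can be sketched as a small check. This is a simplified illustration, not the HotSpot implementation: a single 64-bit word stands in for the whole RegMask, and the name is_aligned_sets_sketch is hypothetical. Each bit is treated as one 32-bit slot, and every aligned group of slots_per_reg bits must be either fully set or fully clear.

```cpp
#include <cstdint>

// Simplified model (assumption): one 64-bit word represents the whole mask,
// one bit per 32-bit slot. `slots_per_reg` is expected to be a power of two,
// e.g. 4 for a 128-bit register or 16 for a 512-bit register.
constexpr bool is_aligned_sets_sketch(std::uint64_t mask, unsigned slots_per_reg) {
    if (slots_per_reg == 0 || slots_per_reg > 64) {
        return false;  // not a meaningful group size in this model
    }
    // Bit pattern with the low `slots_per_reg` bits set.
    const std::uint64_t group =
        (slots_per_reg == 64) ? ~0ULL : ((1ULL << slots_per_reg) - 1);
    // Walk the word one aligned group at a time.
    for (unsigned base = 0; base < 64; base += slots_per_reg) {
        const std::uint64_t bits = (mask >> base) & group;
        // A group must be entirely absent or entirely present.
        if (bits != 0 && bits != group) {
            return false;
        }
    }
    return true;
}
```

Under this model, a word such as 0xfffff00 (20 adjacent bits starting at bit 8) passes for a group size of 4, while a lone bit such as 0x100 passes only for a group size of 1.
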
> Some code relies on the alignment when recovering base register from VMReg:
>
> https://github.com/openjdk/jdk/blob/e1cad97049642ab201d53ff608937f7e7ef3ff3e/src/hotspot/cpu/x86/registerMap_x86.cpp#L29
>
>
> src/hotspot/cpu/x86/registerMap_x86.cpp
>
> address RegisterMap::pd_location(VMReg reg) const {
>   if (reg->is_XMMRegister()) {
>     int reg_base = reg->value() - ConcreteRegisterImpl::max_fpr;
>     int base_reg_enc = (reg_base / XMMRegisterImpl::max_slots_per_register);
>     assert(base_reg_enc >= 0 && base_reg_enc < XMMRegisterImpl::number_of_registers, "invalid XMMRegister: %d", base_reg_enc);
>     VMReg base_reg = as_XMMRegister(base_reg_enc)->as_VMReg();
>
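
The division in the excerpt above only recovers the right register if each register's slots start at a multiple of the slots-per-register count. A minimal sketch of that arithmetic follows. The constant and function names are hypothetical, and the value 16 is an assumption chosen to match the sixteen 32-bit slots of a 512-bit register.

```cpp
// Assumption: 16 slots of 32 bits each per register (512 bits).
constexpr int kSlotsPerRegister = 16;

// Hypothetical helper: which register a slot index belongs to,
// counting slots from the first vector register.
constexpr int base_register_encoding(int slot_index) {
    return slot_index / kSlotsPerRegister;
}

// Hypothetical helper: position of the slot inside its register.
constexpr int slot_within_register(int slot_index) {
    return slot_index % kSlotsPerRegister;
}
```

For example, slot index 35 maps to register 2, slot 3 under these assumptions. If a register's slots began at an unaligned index, the same division would attribute some of its slots to a neighbouring register.
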
>> We have started working on adding support to the PPC64-LE hotspot code
>> for the Vector API. In order to support Vector Masks, it seems we
>> need to change our current support for fixed-length, 128-bit vectors
>> to something that can be as short as two booleans. To do that we have
>> changed the function min_vector_size in hotspot/cpu/ppc.ad to return 2
>> when the type is T_BOOLEAN, otherwise it still returns 16.
>>
>> My first task was to add support for vector masks, and so I added a
>> new instruct to cpu/ppc/ppc.ad to match VectorLoadMask, which then
>> necessitated adding some instructs for LoadVector and StoreVector of
>> the appropriate lengths.
>
> I don't know much about PPC64-LE, but you don't have to use boolean
> vectors. FTR masks have the same type as the vectors they are applied
> to. Until recently (when work on predicated registers started), it was
> the only mask representation in Ideal IR.
>
> Best regards,
> Vladimir Ivanov
>
>> I have a test case that loads a vector mask for a vector of shorts:
>>
>> import jdk.incubator.vector.ShortVector;
>> import jdk.incubator.vector.VectorSpecies;
>> import jdk.incubator.vector.VectorMask;
>> import java.util.Random;
>>
>>
>> class TestVectorMaskShort {
>> private static final VectorSpecies<Short> SPECIES =
>> ShortVector.SPECIES_128;
>>
>> public static VectorMask<Short> test(boolean[] bary) {
>> VectorMask<Short> vmask = VectorMask.fromArray(SPECIES, bary, 0);
>> return vmask;
>> }
>>
>> public static void main(String args[]) {
>> Random ran = new Random(100);
>> int counter = 0;
>> boolean[] bary = new boolean[8];
>> for (int i = 0; i < 20_000; i++) {
>> for (int j = 0; j < bary.length; j++) {
>> bary[j] = ran.nextBoolean();
>> }
>> VectorMask<Short> vmask = test(bary);
>> if (vmask.allTrue()) {
>> counter++;
>> }
>> }
>> System.out.printf("counter = %d\n", counter);
>> }
>> }
>>
>>
>> When I run this test case, I get a runtime error:
>>
>> # Internal Error
>> (/home/cjashfor/git-trees/jdk/src/hotspot/share/opto/chaitin.cpp:951),
>> pid=1341588, tid=1341601
>> # assert(lrgmask.is_aligned_sets(RegMask::SlotsPerVecX)) failed:
>> vector should be aligned
>>
>>
>> - Corey
>>
>> Corey Ashford
>> Software Engineer
>> IBM Systems, LTC OpenJDK team
>>
>> IBM