Vector API Get Implementation patch
Kandu, Rahul
rahul.kandu at intel.com
Wed Jun 20 23:17:03 UTC 2018
I left those out initially because extract1D/1L is going to be scalar, I can include them.
I think extract1D may look something like this:
instruct extract1d(regD dst, vecX src, immI idx) %{
predicate(UseAVX > 0 && n->as_Vector()->length() == 1 && n->bottom_type()->basic_type() == T_DOUBLE);
match(Set dst (ExtractD src idx));
ins_encode %{
int vector_len = 0;
int midx = 0x1 & $idx$$constant;
__ vmovdqu($dst$$XMMRegister, $src$$XMMRegister);
%}
ins_pipe( pipe_slow );
%}
-----Original Message-----
From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
Sent: Wednesday, June 20, 2018 3:38 PM
To: Kandu, Rahul <rahul.kandu at intel.com>; panama-dev at openjdk.java.net
Subject: Re: Vector API Get Implementation patch
> I'll add the extract2f. Thanks for the suggestion. Let me send another webrev shortly with the extract2f.
What about extract1d & extract1l? Is there any reason why they aren't needed?
> I'm not quite sure why missing intrinsic(s) are not caught by tests- even so earlier when ExtractI rules were missing. I can find out.
Yes, please.
Best regards,
Vladimir Ivanov
> -----Original Message-----
> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
> Sent: Wednesday, June 20, 2018 2:54 PM
> To: Kandu, Rahul <rahul.kandu at intel.com>; panama-dev at openjdk.java.net
> Subject: Re: Vector API Get Implementation patch
>
> Rahul,
>
>> Please find the updated patch for the Get Implementation
>> http://cr.openjdk.java.net/~vdeshpande/webrevGetImplF620/webrev/
>>
>> in this patch, I made the following modifications.
>>
>> 1. fixed AVX-512 rules with predicate "UseAVX >2"
>> 2. added ExtractI rules
>> 3. fixed tabulation issues in the instructs in the x86.ad file.
>> 4. Includes all test types- all tests passed, including Int128/256/512 etc.
>> 5. removed extract4b rule
>
> Thanks. I noticed that some 64-bit variants are missing as well while JDK implementation still calls the intrinsic [1]:
>
> extract1d, extract2f, extract1l
>
> Do you have an idea why the tests don't catch that? I believe intinsics don't kick in, but I don't see why (just by staring at the code).
>
> Otherwise, looks good.
>
>> This patch covers for AVX rules.
>>
>> If its fine, I can add the Non-AVX rules (UseAVX==0 && UseSSE>3) in a patch for all the types and submit it separately.
>
> Sounds good.
>
> Best regards,
> Vladimir Ivanov
>
> [1]
> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Double64Vector.java:
>
> public double get(int i) {
> if (i < 0 || i >= LENGTH) {
> throw new IllegalArgumentException("Index " + i + " must be zero or positive, and less than " + LENGTH);
> }
> long bits = (long) VectorIntrinsics.extract(
> Double64Vector.class, double.class, LENGTH,
> this, i,
> (vec, ix) -> {
> double[] vecarr = vec.getElements();
> return (long)Double.doubleToLongBits(vecarr[ix]);
> });
> return Double.longBitsToDouble(bits);
> }
>
>
>
>> -----Original Message-----
>> From: panama-dev [mailto:panama-dev-bounces at openjdk.java.net] On
>> Behalf Of Kandu, Rahul
>> Sent: Monday, June 18, 2018 8:58 AM
>> To: Vladimir Ivanov <vladimir.x.ivanov at oracle.com>;
>> panama-dev at openjdk.java.net
>> Subject: RE: Vector API Get Implementation patch
>>
>> Vladimir,
>>
>> Thanks for your input. I'll upload the updated patch shortly.
>>
>> Rahul
>>
>> -----Original Message-----
>> From: Vladimir Ivanov [mailto:vladimir.x.ivanov at oracle.com]
>> Sent: Thursday, June 14, 2018 9:44 AM
>> To: Kandu, Rahul <rahul.kandu at intel.com>; panama-dev at openjdk.java.net
>> Subject: Re: Vector API Get Implementation patch
>>
>> Rahul,
>>
>> Some comments/questions/ suggestions:
>>
>> (1) I don't see any non-AVX rules. Do you plan to add them later?
>>
>>
>> (2) No matcher rules for ExtractI as well. Same question: left for later?
>>
>>
>> (3) AVX-512 rules should be guarded by "UseAVX > 2" predicate:
>>
>> +instruct extract32s(rRegI dst, vecZ src, vecZ tmp, immI idx) %{
>> + predicate(UseAVX > 0 && n->as_Vector()->length() == 32 &&
>> n->bottom_type()->basic_type() == T_SHORT);
>>
>>
>> (4) Is extract4b ever used? I don't see any 32-bit vectors present.
>>
>> +instruct extract4b(rRegI dst, vecX src, immI idx) %{
>> + predicate(UseAVX > 0 && n->as_Vector()->length() == 4 &&
>> n->bottom_type()->basic_type() == T_BYTE);
>>
>>
>> (5) I suggest to migrate to immU? (recently introduced) and get
>> rid of index masking (see [1] for examples):
>> + int midx = 0x3F & $idx$$constant;
>>
>> Applying similar refactorings to rvinsert64B, the following:
>>
>> +instruct extract64b(rRegI dst, vecZ src, vecZ tmp, immI idx) %{
>> ...
>> + int midx = 0x3F & $idx$$constant;
>> + if (midx == 0) {
>> + __ movdl($dst$$Register, $src$$XMMRegister);
>> + __ movsbl($dst$$Register, $dst$$Register);
>> + }
>> + else if (midx >= 1 && midx <= 15) {
>> + __ pextrb($dst$$Register, $src$$XMMRegister, midx);
>> + __ movsbl($dst$$Register, $dst$$Register);
>> + }
>> + else {
>> + int extr_idx1 = midx / 16;
>> + int extr_idx2 = midx % 16;
>> + __ vextracti32x4($tmp$$XMMRegister, $src$$XMMRegister, extr_idx1);
>> + __ pextrb($dst$$Register, $tmp$$XMMRegister, extr_idx2);
>> + __ movsbl($dst$$Register, $dst$$Register);
>> + }
>>
>> can be converted to into:
>>
>> +instruct extract64b(rRegI dst, vecZ src, vecZ tmp, immU6 idx) %{
>> ...
>> + uint x_idx = $idx$$constant & right_n_bits(4);
>> + uint z_idx = ($idx$$constant >> 4) & 2;
>> + if (z_idx == 0) {
>> + if (x_idx == 0) {
>> + __ movdl($dst$$Register, $src$$XMMRegister);
>> + } else {
>> + __ pextrb($dst$$Register, $src$$XMMRegister, x_idx);
>> + }
>> + } else { // z_idx > 0
>> + __ vextracti32x4($tmp$$XMMRegister, $src$$XMMRegister, z_idx);
>> + __ pextrb($dst$$Register, $tmp$$XMMRegister, x_idx);
>> + }
>> + __ movsbl($dst$$Register, $dst$$Register);
>>
>> or even:
>>
>> + uint x_idx = $idx$$constant & right_n_bits(4);
>> + uint z_idx = ($idx$$constant >> 4) & 2;
>> + Register r_i32 = $src$$XMMRegister;
>> + if (z_idx > 0) {
>> + r_i32 = $tmp$$XMMRegister;
>> + __ vextracti32x4(r_i32, $src$$XMMRegister, z_idx);
>> + }
>> + if (x_idx == 0) {
>> + __ movdl($dst$$Register, r_i32);
>> + } else {
>> + __ pextrb($dst$$Register, r_i32, x_idx);
>> + }
>> + __ movsbl($dst$$Register, $dst$$Register);
>>
>>
>> Also, please, fix code formatting. Many lines have wrong indentation.
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] http://hg.openjdk.java.net/panama/dev/rev/e00750ed585a
>>
>> On 14/06/2018 03:10, Kandu, Rahul wrote:
>>> Hi All,
>>>
>>> Please find the Get Implementation patch for the Vector API. The following patch also includes Tests for Get operations.
>>>
>>> http://cr.openjdk.java.net/~rlupusoru/panama/webrevrGetImplF/
>>>
>>> regards,
>>> Rahul
>>>
More information about the panama-dev
mailing list