Rough prototype of a Vector API
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Thu Mar 3 21:38:12 UTC 2016
Ian,
> This is pretty cool! I've been working on an int[] view over Long2 that I can put up once it's a little less rough. It looks fairly similar to your approach. I will try to harmonize it with your naming style. Additionally, I've tweaked my Vector interface to include a few more arithmetic and logical methods, but I can pare that back for the time being (or permanently!). What would be a good/preferred method for me to share what I've got with this list when I am ready?
It would be great if you could just push it into the panama repository.
There are no strict integration rules in the project. Sharing patches
as webrevs also works fine.
Meanwhile, until you become a Panama Committer, I (or anybody with enough
privileges) can assist you by sponsoring your changes or by putting
webrevs on the cr.openjdk.java.net server until you get an account. Just
send patches my way.
> One quick thought to the note about imm's being hard-coded into snippets.
>
>> - Instructions requiring immediate values have to be hard coded in a code snippet. For insertion or extraction of a byte value to/from a Vector<Byte, Shapes.S128Bit> i defer to the “pinsrb” and “pextrb” instructions respectively. I create 16 MHs for each kind :-) where the index is encoded as an immediate. This obviously does not scale, nor is it visible to the JIT.
>
> Perhaps the Patchable CodeSnippet approach is robust enough to hoist whole values into an imm position? I don't want to put too much weight on what is already a pretty clean interface, but it does seem like it could work.
I haven't thought it through, but there are some complications. Though
it looks simple for patchable snippets, the hard part is the stand-alone
versions for the interpreter and C1. You have to dispatch on the parameter
value and generate all necessary specializations.
I'd prefer to see that logic in a Java library rather than in machine code
or the JVM.
So far, I explicitly instantiated all specializations [1]:
    static final MethodHandle MHm128_insert_epi32_0 = make_m128_insert_epi32(0);
    static final MethodHandle MHm128_insert_epi32_1 = make_m128_insert_epi32(1);
    static final MethodHandle MHm128_insert_epi32_2 = make_m128_insert_epi32(2);
    static final MethodHandle MHm128_insert_epi32_3 = make_m128_insert_epi32(3);
and dispatched between them in the library function:
    public static Long2 insert_epi32(Long2 v, int x, int off) {
        try {
            // Pick the specialization whose immediate matches off.
            switch(off) {
                case 0: return (Long2) MHm128_insert_epi32_0.invokeExact(v, x);
                case 1: return (Long2) MHm128_insert_epi32_1.invokeExact(v, x);
                case 2: return (Long2) MHm128_insert_epi32_2.invokeExact(v, x);
                case 3: return (Long2) MHm128_insert_epi32_3.invokeExact(v, x);
                default: throw new IllegalArgumentException(""+off);
            }
        } catch (Throwable e) {
            throw new Error(e);
        }
    }
It is easily optimizable by JITs.
A cleaner way would be to put them all in a @Stable array:
    static @Stable final MethodHandle[] MHm128_insert_epi32 = new MethodHandle[] {
        make_m128_insert_epi32(0),
        make_m128_insert_epi32(1),
        make_m128_insert_epi32(2),
        make_m128_insert_epi32(3)};
    public static Long2 insert_epi32_1(Long2 v, int x, int off) {
        try {
            return (Long2) MHm128_insert_epi32[off].invokeExact(v, x);
        } catch (Throwable e) {
            throw new Error(e);
        }
    }
It also allows specializations to be instantiated lazily, and the JIT can
easily constant-fold and inline snippets through the @Stable array when the
offset is a compile-time constant.
But @Stable is part of JDK internals, so I decided to keep away from it
for now.
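For illustration only, here is a minimal sketch of lazy instantiation through a plain (non-@Stable) array. It is not the Panama code: int[] stands in for Long2, an ordinary static method stands in for the machine code snippet, and the names LazyInsertTable, insertLane, makeInsert and insertHandle are hypothetical. Without @Stable the JIT should not be expected to constant-fold through the array, so this shows only the shape of the lazy table, not the performance behaviour.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class LazyInsertTable {
    // Stand-in for the real snippet: copy v and write x into lane off.
    static int[] insertLane(int off, int[] v, int x) {
        int[] r = v.clone();
        r[off] = x;
        return r;
    }

    // Hypothetical counterpart of make_m128_insert_epi32: binds the lane
    // index as a fixed leading argument, leaving a handle of type
    // (int[], int)int[].
    static MethodHandle makeInsert(int off) {
        try {
            MethodHandle mh = MethodHandles.lookup().findStatic(
                    LazyInsertTable.class, "insertLane",
                    MethodType.methodType(int[].class, int.class, int[].class, int.class));
            return MethodHandles.insertArguments(mh, 0, off);
        } catch (ReflectiveOperationException e) {
            throw new Error(e);
        }
    }

    // One slot per lane, filled on first use. Concurrent callers may create
    // duplicate handles for the same slot; that is harmless here because
    // the handles are interchangeable.
    static final MethodHandle[] INSERT = new MethodHandle[4];

    static MethodHandle insertHandle(int off) {
        MethodHandle mh = INSERT[off];
        if (mh == null) {
            mh = makeInsert(off);
            INSERT[off] = mh;
        }
        return mh;
    }

    static int[] insert(int[] v, int x, int off) {
        try {
            return (int[]) insertHandle(off).invokeExact(v, x);
        } catch (Throwable e) {
            throw new Error(e);
        }
    }
}
```

Calling LazyInsertTable.insert(new int[] {1, 2, 3, 4}, 9, 2) creates only the handle for lane 2 and leaves the other three slots empty until they are needed.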
Best regards,
Vladimir Ivanov
[1]
http://hg.openjdk.java.net/panama/panama/jdk/file/2dddc1a1b55d/test/panama/snippets/VectorUtils.java#l260