On generalizing vector intrinsics

Wed Nov 15 19:14:03 UTC 2017

Thanks for the feedback, John!

>> The downside is that it requires additional non-trivial steps to find
>> the exact vector box class (see get_exact_klass_for_vector_box() and
>> ctx->find_klass(vector_klass_name) there): the fact that vector
>> classes aren't part of java.base, but of jdk.incubator.vector,
>> complicates class lookup a bit.
> 
> I don't like (3) because it seems to extract information from context,
> not from the actual call.  I'd rather be able to document the call's
> semantics in terms of arguments, not contextual types.
> 
> The trade-off between (1) and (2) might be between how easy
> it is to extract the implicit data from the explicit data.  Does the
> intrinsic need access to all three of (a) vector size, (b) lane type,
> and (c) vector class or shape?  In that case, both options leave
> some info implicit and other info explicit.
> 
> If so, consider (4) passing all needed data directly and explicitly
> as explicit arguments:

I agree: (4) is better, and I like how it shapes up in the code:

http://cr.openjdk.java.net/~vlivanov/panama/vector.generalized_intrinsics.01/ 

> P.S. I wonder if "saturating" is a modifier bit to add to operation codes.
> Maybe "unsigned" vs. "signed" vs. "float" vs. various "fixN" are also.

Since "saturating" and "signed"/"unsigned" are specific to
arithmetic operations, I thought about just adding separate opcodes:
   VECTOR_OP_ADD
   VECTOR_OP_ADD_SATURATED
   VECTOR_OP_ADD_UNSIGNED
   ...

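Adding a dedicated "axis" for operation flags may be more than is
strictly needed, but it should definitely reduce the number of opcodes:
each flag then combines with every basic op instead of multiplying the
opcode list. Worth a try.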
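
To make the flags idea concrete, here is a minimal sketch of a small set of base opcodes combined with orthogonal flag bits. All names and values below (VECTOR_FLAG_SATURATED, VECTOR_FLAG_UNSIGNED, addLane) are hypothetical and are not taken from the webrev; addLane is only a scalar model of a single byte lane, not an intrinsic.

```java
// Hypothetical sketch: base opcodes plus a separate flags "axis".
// None of these constants come from the actual patch.
final class VectorOpFlags {
    // Base opcodes (illustrative values).
    static final int VECTOR_OP_ADD = 0;
    static final int VECTOR_OP_SUB = 1;
    static final int VECTOR_OP_MUL = 2;

    // Flag bits that modify any arithmetic opcode.
    static final int VECTOR_FLAG_SATURATED = 1 << 0;
    static final int VECTOR_FLAG_UNSIGNED  = 1 << 1;

    // Scalar model of VECTOR_OP_ADD on one byte lane under the given flags.
    // The result is returned as an int: 0..255 for unsigned lanes,
    // -128..127 for signed lanes.
    static int addLane(int flags, byte a, byte b) {
        boolean unsigned  = (flags & VECTOR_FLAG_UNSIGNED) != 0;
        boolean saturated = (flags & VECTOR_FLAG_SATURATED) != 0;

        // Interpret the lane bits according to signedness.
        int x = unsigned ? (a & 0xFF) : a;
        int y = unsigned ? (b & 0xFF) : b;
        int sum = x + y;

        if (saturated) {
            // Clamp to the representable range instead of wrapping.
            int lo = unsigned ? 0 : Byte.MIN_VALUE;
            int hi = unsigned ? 0xFF : Byte.MAX_VALUE;
            sum = Math.max(lo, Math.min(hi, sum));
        }
        // Wrap to 8 bits; a no-op when the value was already clamped.
        return unsigned ? (sum & 0xFF) : (byte) sum;
    }
}
```

With this shape, one base opcode covers four variants (plain, saturated, unsigned, unsigned saturated), where separate opcodes would need four entries per operation.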

> I also wonder if lane type factors into lane size and arithmetic format,
> so that the interesting information about the vector op breaks apart
> into five parts:  1. vector size (256), 2. lane size (32), 3. lane type (int),
> 4. basic op (+), 5. op flags (unsigned, sat).  In that case, there is
> probably an omnibus encoding that fits in one sparse 64-bit descriptor,
> or even (with log-scales) a 32-bit one.  Just a thought.

That's definitely an option. 32-bit should be enough to encode all 
necessary info.
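
As a rough check that 32 bits suffice, here is one possible packing of the five parts using log-scales for the two sizes. The field widths, field order, and every name in it (VectorOpDescriptor, encode, LANE_TYPE_INT, and so on) are assumptions made for illustration; nothing here reflects an agreed layout.

```java
// Hypothetical 32-bit descriptor layout, low bits first:
//   [0..3]   log2(vector size in bits)   e.g. 256 -> 8
//   [4..6]   log2(lane size in bits)     e.g. 32  -> 5
//   [7..10]  lane type
//   [11..18] basic op
//   [19..26] op flags
// 27 bits are used; 5 remain spare.
final class VectorOpDescriptor {
    static final int VSIZE_SHIFT = 0,  VSIZE_BITS = 4;
    static final int LSIZE_SHIFT = 4,  LSIZE_BITS = 3;
    static final int LTYPE_SHIFT = 7,  LTYPE_BITS = 4;
    static final int OP_SHIFT    = 11, OP_BITS    = 8;
    static final int FLAGS_SHIFT = 19, FLAGS_BITS = 8;

    // Illustrative values only.
    static final int LANE_TYPE_INT = 0, LANE_TYPE_FLOAT = 1;
    static final int OP_ADD = 0, OP_SUB = 1;
    static final int FLAG_SATURATED = 1 << 0, FLAG_UNSIGNED = 1 << 1;

    // Places a value in its field, rejecting values that do not fit.
    private static int pack(int value, int shift, int bits) {
        if (value < 0 || value >= (1 << bits)) {
            throw new IllegalArgumentException("value " + value + " does not fit in " + bits + " bits");
        }
        return value << shift;
    }

    private static int unpack(int desc, int shift, int bits) {
        return (desc >>> shift) & ((1 << bits) - 1);
    }

    // Returns log2(n) for a positive power of two, otherwise throws.
    private static int log2Exact(int n) {
        if (n <= 0 || Integer.bitCount(n) != 1) {
            throw new IllegalArgumentException(n + " is not a positive power of two");
        }
        return Integer.numberOfTrailingZeros(n);
    }

    static int encode(int vectorBits, int laneBits, int laneType, int op, int flags) {
        return pack(log2Exact(vectorBits), VSIZE_SHIFT, VSIZE_BITS)
             | pack(log2Exact(laneBits),   LSIZE_SHIFT, LSIZE_BITS)
             | pack(laneType,              LTYPE_SHIFT, LTYPE_BITS)
             | pack(op,                    OP_SHIFT,    OP_BITS)
             | pack(flags,                 FLAGS_SHIFT, FLAGS_BITS);
    }

    static int vectorBits(int desc) { return 1 << unpack(desc, VSIZE_SHIFT, VSIZE_BITS); }
    static int laneBits(int desc)   { return 1 << unpack(desc, LSIZE_SHIFT, LSIZE_BITS); }
    static int laneType(int desc)   { return unpack(desc, LTYPE_SHIFT, LTYPE_BITS); }
    static int op(int desc)         { return unpack(desc, OP_SHIFT, OP_BITS); }
    static int flags(int desc)      { return unpack(desc, FLAGS_SHIFT, FLAGS_BITS); }

    // Lane count is derived rather than stored.
    static int laneCount(int desc)  { return vectorBits(desc) / laneBits(desc); }
}
```

For the example from the quoted text, encode(256, 32, LANE_TYPE_INT, OP_ADD, FLAG_UNSIGNED | FLAG_SATURATED) round-trips through the accessors, and laneCount gives 8 lanes without needing a field of its own.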

I'll experiment with that once the

Best regards,
Vladimir Ivanov