[lworld+fp16] RFR: 8341414: Add support for FP16 conversion routines [v2]

Fri Nov 8 07:59:26 UTC 2024

On Thu, 7 Nov 2024 07:24:42 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Bhavana Kilambi has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Remove intrinsification of conversion methods in Float16
>
> My bad, I meant the other way round, i.e. the integral to float16 conversion case, which currently takes a slow path. Consider the following micro kernel:
> 
> 
> public class float16_allocation {
>    public static float micro(int value) {
>        Float16 val = Float16.valueOf(value); // [a]
>        return val.floatValue();              // [b]
>    }
> 
>    public static void main(String [] args) {
>        float res = 0.0f;
>        for (int i = 0; i < 100000; i++) {
>            res += micro(i);
>        }
>        System.out.println("[res]" + res);
>    }
> }
> 
> 
> Here, the integer parameter is first converted to a float16 value [a]. The valueOf routine first casts the integer value to double and then passes it to the Float16.valueOf(double) routine, resulting in a bulky JIT sequence.
> 
> We can outline the following code [c] into a new leaf routine returning a short value, and directly pass it to the Float16 constructor similar to https://github.com/openjdk/valhalla/blob/lworld%2Bfp16/src/java.base/share/classes/java/lang/Float16.java#L411
> 
> The new routine can then be intrinsified to yield ConvI2HF IR, whose result then gets boxed as a value object. Since Float16 is a value type, its field accesses will be scalarized, thus directly forwarding the HF ('short') value to the subsequent ConvHF2F [b]. On mainline, where Float16 is a value-based class, we can bank on escape analysis to eliminate redundant boxing allocations.
>  
>  
> 
>     public static Float16 valueOf(int value) {
>         // int -> double conversion is exact
>         return valueOf((double)value);     // [c] 
>     }
> 
> 
> 
> We can spill this over to another patch if you prefer; kindly let me know your views.
> 
> Best Regards,
> Jatin

> Hi @jatin-bhateja , Thanks for the reminder. I remember asking you in a previous email about the reverse conversions and I forgot about that myself. I am thinking if we have to intrinsify, can we not directly intrinsify Float16 valueOf() routines in Float16 

The idea here is to keep the scalar intrinsic simple: if boxing were delegated to the intrinsic expander, we would also have to pass an additional box type argument. Instead, we can rely on explicit boxing on the Java side and bank on escape analysis to eliminate it, thus directly exposing ConvI2HF to its user.

> new routine in Integer.java and then calling it in the Float16.valueOf() method and intrinsifying the one in Integer.java?

No, I am not suggesting that we add a <Primitive_Box_Type>.float16Value() API to the existing primitive classes for the time being; let Joe decide that. If you intrinsify a leaf-level wrapper routine, then we would just need to plug it into Integer.float16Value() should that API be added; we will lose this flexibility if we intrinsify Float16.valueOf(int).
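
To make the shape of the proposal concrete, here is a minimal sketch of a leaf routine plus explicit Java-side boxing. It is not the actual patch: the class Float16LeafSketch, the stand-in type Half and the routine name intToFloat16Bits are all hypothetical, and the sketch uses Float.floatToFloat16 / Float.float16ToFloat (JDK 20 or later) in place of the Valhalla Float16 class so that it runs on mainline. It also assumes that going through float instead of double gives the same result for int inputs, since ints up to 2^24 in magnitude convert to float exactly and anything larger overflows binary16 anyway.

```java
// Hypothetical sketch; none of these names are part of the Valhalla Float16 API.
final class Float16LeafSketch {

    // Stand-in for Float16: a simple wrapper around the raw binary16 bits.
    static final class Half {
        final short bits;

        Half(short bits) {
            this.bits = bits;
        }

        // Corresponds to [b]: half -> float (ConvHF2F when intrinsified).
        float floatValue() {
            return Float.float16ToFloat(bits);
        }
    }

    // Leaf routine returning the raw 'short' bits. This is the piece that
    // would be the intrinsic candidate (ConvI2HF); it knows nothing about boxing.
    // Assumption: int -> float is exact for |value| <= 2^24, and any larger
    // magnitude rounds to infinity in binary16 either way.
    public static short intToFloat16Bits(int value) {
        return Float.floatToFloat16((float) value);
    }

    // Explicit boxing stays on the Java side, corresponding to [a].
    public static Half valueOf(int value) {
        return new Half(intToFloat16Bits(value));
    }

    // Same shape as the micro kernel quoted above.
    public static float micro(int value) {
        return valueOf(value).floatValue();
    }

    public static void main(String[] args) {
        System.out.println(micro(1));      // prints 1.0
        System.out.println(micro(2049));   // prints 2048.0 (tie rounds to even)
        System.out.println(micro(70000));  // prints Infinity (above binary16 range)
    }
}
```

Because the leaf routine only deals in int and short, a hypothetical Integer.float16Value() could call the same routine without any change to the intrinsic.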

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/1283#issuecomment-2463986341