x86 Intrinsics for fma in Math Library
Deshpande, Vivek R
vivek.r.deshpande at intel.com
Wed Jul 20 20:18:05 UTC 2016
Hi Vladimir, Joe
Thanks for the review and comments. I will make these changes and add a test.
Regards,
Vivek
-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
Sent: Wednesday, July 20, 2016 1:09 PM
To: Joseph D. Darcy; Deshpande, Vivek R; dmitrij pochepko; hotspot compiler
Subject: Re: x86 Intrinsics for fma in Math Library
Hi Joe,
Yes, based on the changes, the intrinsics (vfmadd231sd/ss asm instructions) are used in all cases: the interpreter and C1- and C2-compiled code.
The only thing worrying me is the constant folding (when all arguments are constants) in C2, which uses the libm fma() function (in subnode.cpp):
return TypeD::make(fma(d1, d2, d3));
It may produce a different result than the vfmadd231sd instruction, so I would like to remove this optimization (leaving only the TOP checks).
Also, FmaFNode::Value() has a copy-paste error: it uses the Double method and result.
Vivek, you also don't need the specialized match_edge() methods, since match_edge() is called after the nodes are restructured into a binary tree.
Otherwise it looks good, except that, as was pointed out, it needs more tests (for constants too).
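
A minimal sketch of what a constant-argument test could look like, under stated assumptions: the class name FmaConstantCheck, the chosen constants, and the BigDecimal reference are illustrative only and are not taken from the patch or from the existing tests. The inputs are picked so that the exact product 1 - 2^-60 rounds to 1.0 when computed separately, so the fused and unfused results differ. The loop only gives a JIT compiler a chance to compile the method; whether the call is actually constant-folded depends on the VM.

```java
import java.math.BigDecimal;

public class FmaConstantCheck {
    // Compile-time constants (hypothetical choice), so a JIT compiler may
    // see all three fma arguments as constants.
    static final double A = 1.0 + 0x1.0p-30;
    static final double B = 1.0 - 0x1.0p-30;
    static final double C = -1.0;

    // Fused: a*b+c computed exactly, then rounded once.
    static double fused() {
        return Math.fma(A, B, C);
    }

    // Unfused: the product is rounded before the addition.
    static double unfused() {
        return A * B + C;
    }

    // Reference: exact arithmetic in BigDecimal, rounded once to double.
    // Adequate here because the result is finite and nonzero.
    static double reference() {
        return new BigDecimal(A)
                .multiply(new BigDecimal(B))
                .add(new BigDecimal(C))
                .doubleValue();
    }

    public static void main(String[] args) {
        long expected = Double.doubleToRawLongBits(reference());
        // Repeat the call so the method is likely to be compiled.
        for (int i = 0; i < 20_000; i++) {
            if (Double.doubleToRawLongBits(fused()) != expected) {
                throw new AssertionError("fma mismatch at iteration " + i);
            }
        }
        System.out.println("fused   = " + Double.toHexString(fused()));
        System.out.println("unfused = " + Double.toHexString(unfused()));
    }
}
```

With these inputs the fused result is -2^-60 and the unfused result is 0.0, so a folding path that did not match the fused instruction bit for bit would show up as a mismatch.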
Thanks,
Vladimir
On 7/19/16 7:04 PM, Joseph D. Darcy wrote:
> Hello,
>
> The existing fma tests are far from exhaustive and would benefit from
> being augmented by additional test cases, in particular some test
> cases which exercise the harder rounding cases for a hardware implementation.
>
> The changes to java.lang.Math look fine. I'm not a HotSpot developer
> so can't comment on the particulars of the HotSpot code, but I have a
> few general comments. For an intrinsic like this to be "performance correct"
> the intrinsic should be used vs not-used consistently independent of
> whether or not code is running in the interpreter, C1, or C2. From
> what I can gather, this seems to be the case with your patch. (We've
> had other math library functions which initially were not intrinsified
> under the interpreter and were intrinsified, say, under an OSR in C2
> and using different implementations of a method could cause consistency issues.
> That is less of a case here since the results of fma are exactly
> specified, but it is still better to have a consistent performance
> model for using fma.)
>
> Thanks,
>
> -Joe
>
> On 7/19/2016 9:38 AM, Deshpande, Vivek R wrote:
>>
>> Hi Dmitrij
>>
>>
>>
>> I used jdk/test/java/lang/Math/FusedMacTests.java for correctness
>> testing; it was added with
>> https://bugs.openjdk.java.net/browse/JDK-4851642
>>
>>
>>
>> Regards,
>>
>> Vivek
>>
>>
>>
>> *From:*dmitrij pochepko [mailto:dmitrij.pochepko at oracle.com]
>> *Sent:* Saturday, July 16, 2016 1:14 PM
>> *To:* Deshpande, Vivek R <vivek.r.deshpande at intel.com>; hotspot
>> compiler <hotspot-compiler-dev at openjdk.java.net>
>> *Cc:* vladimir.kozlov at oracle.com; Joseph D. Darcy
>> <joe.darcy at oracle.com>
>> *Subject:* Re: x86 Intrinsics for fma in Math Library
>>
>>
>>
>> Hi,
>>
>> What about including at least the simplest tests, like the tests for
>> other intrinsics at hotspot/test/compiler/intrinsics/* ?
>>
>> Thanks,
>> Dmitrij
>>
>>
>>
>> On 14.07.2016 20:42, Deshpande, Vivek R wrote:
>>
>> Hi
>>
>>
>>
>> I would like to contribute a patch for scalar x86 intrinsic
>> support for fma operations in jdk9.1
>>
>> Could you please review the patch?
>>
>> We see a significant performance gain over the library implementation.
>>
>>
>>
>> Bug ID:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8154122
>> hotspot changes:
>> http://cr.openjdk.java.net/~vdeshpande/FMA/8154122/hotspot/webrev.00/
>> jdk changes:
>> http://cr.openjdk.java.net/~vdeshpande/FMA/8154122/jdk/webrev.00/
>>
>>
>>
>> Regards,
>>
>> Vivek
>>
>>
>>
>