[10] RFR (S): C2: missing strength reduction of Math.pow(x, 2.0D) to x*x

Wed Dec 13 16:28:51 UTC 2017

Thanks, Vladimir!

Best regards,
Vladimir Ivanov

On 12/13/17 7:25 PM, Vladimir Kozlov wrote:
> On 12/13/17 6:58 AM, Vladimir Ivanov wrote:
>> Thanks for review, Vladimir.
>>
>>> I agree with suggested fix. What check we have before 9 and where was 
>>> it?
>>
>> It was quite complicated. There were LibraryCallKit::inline_pow() [1] 
>> & LibraryCallKit::finish_pow_exp() [2] with a runtime check for 2.0 
>> exponent, multiple PowDs on fast paths, and a call into 
>> SharedRuntime::dpow on slow path. PowD had matcher rules [3] which 
>> produced x87-based implementation.
> 
> Okay.
> 
>>
>>> Why do you need to move value to tmp1?
>>>
>>> +  movdq(tmp1, xmm1);
>>> +  cmp64(tmp1, ExternalAddress(DOUBLE2));
>>
>> We want to compare 64-bit values and move the exponent from XMM to GP 
>> register.
> 
> Of cause, you need to use GP register to compare. I thought tmp1 also xmm.
> 
> Looks good.
> 
> Thanks,
> Vladimir
> 
>>
>> void Assembler::movdq(Register dst, XMMRegister src) {
>> void MacroAssembler::cmp64(Register src1, AddressLiteral src2) {
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/94849fb8ce93#l22.72
>> [2] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/94849fb8ce93#l22.16
>> [3] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/94849fb8ce93#l13.7
>>
>>> On 12/12/17 11:16 AM, Vladimir Ivanov wrote:
>>>> http://cr.openjdk.java.net/~vlivanov/8190869/webrev.00/
>>>> https://bugs.openjdk.java.net/browse/JDK-8190869
>>>>
>>>> Math.pow() was completely rewritten on 9 and important special case
>>>> (exponent == 2.0) was missed. It caused severe performance regression
>>>> (up to 10x) on some numeric code.
>>>>
>>>> The fix is two-fold:
>>>>    * add special case in C2 to transform pow(x,2.0) into x*x;
>>>>    * for consistency of results between different execution modes
>>>> (interpreter, C1, and C2), add special case into generated stub at
>>>> MacroAssembler::fast_pow().
>>>>
>>>> Some clarifications follow.
>>>>
>>>> Testing:
>>>>    * microbenchmark
>>>>    * hs-precheckin-comp, hs-tier1, hs-tier2 (in progress)
>>>>    * test case from 8189172 on consistency of results
>>>>
>>>> The fix doesn't cover all possible cases (e.g., when exponent is not a
>>>> constant during parsing, but becomes such later). I experimented with
>>>> more elaborate fix, but wasn't satisfied with the results and chose
>>>> simpler route.
>>>>
>>>> I see 2 alternatives:
>>>>
>>>> (1) Introduce macro node PowD and either transform it into MulD if
>>>> exponent == 2.0D or expand it into CallLeafNode. The downside is that
>>>> PowD should be CallNode and replacing it with a MulD requires some IR
>>>> tweaking around it.
>>>>
>>>> (2) Introduce a runtime check before calling the stub (prototype [1]).
>>>> If exponent becomes 2.0 later, the check is folded and stub call is
>>>> removed. The downside is that when exponent doesn't turn into 2.0, the
>>>> check stays and duplicates the check in the stub. Unfortunately, 
>>>> there's
>>>> no way to remove it from the stub, because interpreter & C1 relies on
>>>> it. Otherwise, inconsistent computation results become possible (like
>>>> JDK-8189172 [2]).
>>>>
>>>> So, I decided to go with a simple fix and cover most common case
>>>> (explicit usage of 2.0 with Math.pow) for now.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> [1] http://cr.openjdk.java.net/~vlivanov/8190869/webrev.runtime_check
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8189172