[10] RFR (S): C2: missing strength reduction of Math.pow(x, 2.0D) to x*x
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Wed Dec 13 16:28:51 UTC 2017
Thanks, Vladimir!
Best regards,
Vladimir Ivanov
On 12/13/17 7:25 PM, Vladimir Kozlov wrote:
> On 12/13/17 6:58 AM, Vladimir Ivanov wrote:
>> Thanks for review, Vladimir.
>>
>>> I agree with suggested fix. What check we have before 9 and where was
>>> it?
>>
>> It was quite complicated. There were LibraryCallKit::inline_pow() [1]
>> & LibraryCallKit::finish_pow_exp() [2] with a runtime check for 2.0
>> exponent, multiple PowDs on fast paths, and a call into
>> SharedRuntime::dpow on slow path. PowD had matcher rules [3] which
>> produced x87-based implementation.
>
> Okay.
>
>>
>>> Why do you need to move value to tmp1?
>>>
>>> + movdq(tmp1, xmm1);
>>> + cmp64(tmp1, ExternalAddress(DOUBLE2));
>>
>> We want to compare 64-bit values and move the exponent from XMM to GP
>> register.
>
> Of cause, you need to use GP register to compare. I thought tmp1 also xmm.
>
> Looks good.
>
> Thanks,
> Vladimir
>
>>
>> void Assembler::movdq(Register dst, XMMRegister src) {
>> void MacroAssembler::cmp64(Register src1, AddressLiteral src2) {
>>
>> Best regards,
>> Vladimir Ivanov
>>
>> [1] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/94849fb8ce93#l22.72
>> [2] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/94849fb8ce93#l22.16
>> [3] http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/94849fb8ce93#l13.7
>>
>>> On 12/12/17 11:16 AM, Vladimir Ivanov wrote:
>>>> http://cr.openjdk.java.net/~vlivanov/8190869/webrev.00/
>>>> https://bugs.openjdk.java.net/browse/JDK-8190869
>>>>
>>>> Math.pow() was completely rewritten on 9 and important special case
>>>> (exponent == 2.0) was missed. It caused severe performance regression
>>>> (up to 10x) on some numeric code.
>>>>
>>>> The fix is two-fold:
>>>> * add special case in C2 to transform pow(x,2.0) into x*x;
>>>> * for consistency of results between different execution modes
>>>> (interpreter, C1, and C2), add special case into generated stub at
>>>> MacroAssembler::fast_pow().
>>>>
>>>> Some clarifications follow.
>>>>
>>>> Testing:
>>>> * microbenchmark
>>>> * hs-precheckin-comp, hs-tier1, hs-tier2 (in progress)
>>>> * test case from 8189172 on consistency of results
>>>>
>>>> The fix doesn't cover all possible cases (e.g., when exponent is not a
>>>> constant during parsing, but becomes such later). I experimented with
>>>> more elaborate fix, but wasn't satisfied with the results and chose
>>>> simpler route.
>>>>
>>>> I see 2 alternatives:
>>>>
>>>> (1) Introduce macro node PowD and either transform it into MulD if
>>>> exponent == 2.0D or expand it into CallLeafNode. The downside is that
>>>> PowD should be CallNode and replacing it with a MulD requires some IR
>>>> tweaking around it.
>>>>
>>>> (2) Introduce a runtime check before calling the stub (prototype [1]).
>>>> If exponent becomes 2.0 later, the check is folded and stub call is
>>>> removed. The downside is that when exponent doesn't turn into 2.0, the
>>>> check stays and duplicates the check in the stub. Unfortunately,
>>>> there's
>>>> no way to remove it from the stub, because interpreter & C1 relies on
>>>> it. Otherwise, inconsistent computation results become possible (like
>>>> JDK-8189172 [2]).
>>>>
>>>> So, I decided to go with a simple fix and cover most common case
>>>> (explicit usage of 2.0 with Math.pow) for now.
>>>>
>>>> Best regards,
>>>> Vladimir Ivanov
>>>>
>>>> [1] http://cr.openjdk.java.net/~vlivanov/8190869/webrev.runtime_check
>>>> [2] https://bugs.openjdk.java.net/browse/JDK-8189172
More information about the hotspot-compiler-dev
mailing list