Request for reviews (M): 6987135: Performance regression on Intel platform with 32-bits edition between 6u13 and 6u14.
Vladimir Kozlov
vladimir.kozlov at oracle.com
Fri Oct 29 17:14:19 PDT 2010
Tom suggested to use second TEMP register instead of push/pop. Here is new webrev:
http://cr.openjdk.java.net/~kvn/6987135/webrev.02
Vladimir
Vladimir Kozlov wrote:
> Thank you, Tom
>
> I updated webrev:
>
> http://cr.openjdk.java.net/~kvn/6987135/webrev.01
>
> Tom Rodriguez wrote:
>> On Oct 22, 2010, at 6:25 PM, Vladimir Kozlov wrote:
>>
>>> http://cr.openjdk.java.net/~kvn/6987135/webrev.00
>>>
>>> Fixed 6987135: Performance regression on Intel platform with 32-bits
>>> edition between 6u13 and 6u14.
>>>
>>> Changes for 6603011 added the conversion of long
>>> division by constant to the code with multiply.
>>> But some modern cpus improved DIV instruction
>>> performance. Use it for long division by constant
>>> when it is faster than code with multiply.
>>
>> The formats in x86_32.ad don't match the code.
>
> Fixed.
>
>> In modL_eReg_imm32, why can't the value be 0 or -1?
>
> There are Ideal transformations for such divisor values,
> DivL and ModL will never go to matcher with such divisor.
> Asserts verify it.
>
>> Why don't you use an immL definition that ensures that? If imm is
>> MININT then the pcon calculation will go wrong.
>
> Matcher::use_asm_for_ldiv_by_con() has check for MININT.
> I added verification check into asserts.
>
>>
>> I believe you could do the register declarations like this:
>
> I updated both ModL and DivL code to use only dst.
>
> Thanks,
> Vladimir
>
>>
>> + instruct modL_eReg_imm32( eADXRegL dst, eRegL src, immL32 imm, eRegI
>> tmp, eFlagsReg cr ) %{
>> + match(Set dst (ModL src imm));
>> + effect(TEMP dst, TEMP tmp, KILL cr );
>>
>> to leave the src and tmp unbound which would give the RA a little more
>> freedom. Actually wouldn't connecting src and dst directly result in
>> fewer moves in the normal case? You might need a new temp but it
>> seems like there are quite a few moves of src into dst for the idivl.
>>
>> + instruct modL_eReg_imm32( eADXRegL dst, immL32 imm, eSIRegI tmp,
>> eFlagsReg cr ) %{
>> + match(Set dst (ModL dst imm));
>> + effect( KILL cr );
>>
>> tom
>>
>>> Tested on US3, T1, T2, Sparc64, AMD and Intel latest cpus.
>>>
>>
More information about the hotspot-compiler-dev
mailing list