RFR: 8290917: x86: Memory-operand arithmetic instructions have too low costs [v2]
Sandhya Viswanathan
sviswanathan at openjdk.org
Wed Sep 14 16:35:53 UTC 2022
On Wed, 14 Sep 2022 02:48:57 GMT, Quan Anh Mai <duke at openjdk.org> wrote:
>> I don't think we should replace the inc/dec by add.
>>
>> On my desktop, I see the following:
>> Before:
>> Benchmark               Mode  Cnt    Score   Error  Units
>> BasicRules.add_mem_con  avgt    3  132.268 ± 0.599  ns/op
>> BasicRules.inc_mem      avgt    3  169.980 ± 0.617  ns/op
>>
>> After:
>> Benchmark               Mode  Cnt    Score   Error  Units
>> BasicRules.add_mem_con  avgt    3  117.426 ± 0.128  ns/op
>> BasicRules.inc_mem      avgt    3  182.907 ± 0.277  ns/op
>>
>> The inc_mem JMH score is worse after the patch (169.980 -> 182.907 ns/op).
>>
>> There is already a UseIncDec option, which is set appropriately to select whether to generate the inc/dec or the add/sub instructions.
>
> @sviswa7 Thanks a lot for your review, I have reverted that change. I don't understand why it is slower, though; the bottleneck does not seem to be in the predecoder.
@merykitty Thanks for reverting those changes. Could you please also add jmh tests for the following:
1) AndL with 255
2) AndL with 65535
3) DivL by 10
For 1) and 2) we are changing the instruction from the q version to the l version, so we want to make sure the performance is at least on par.
For 3) it would be good to check that the compiler now optimizes divide by 10 for the long data type as well.
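
For illustration only, here is a minimal plain-Java sketch of the three kernels such tests could exercise. The class and method names are hypothetical and not taken from the PR, and the multiply-high form of the division is the generic textbook strength reduction, not necessarily the exact sequence C2 emits. In an actual JMH test, each kernel would be wrapped in a @Benchmark method.

```java
public class MaskAndDivSketch {
    // 1) AndL with 255: keeps only the low 8 bits of the long.
    static long and255(long x) {
        return x & 0xFFL;
    }

    // Same value expressed as a zero-extended byte, which is why a
    // narrower (l-form) instruction can replace the q-form AND.
    static long zeroExtendByte(long x) {
        return Byte.toUnsignedLong((byte) x);
    }

    // 2) AndL with 65535: keeps only the low 16 bits of the long.
    static long and65535(long x) {
        return x & 0xFFFFL;
    }

    // Same value expressed as a zero-extended 16-bit quantity.
    static long zeroExtendShort(long x) {
        return Short.toUnsignedLong((short) x);
    }

    // 3) DivL by 10: the plain division a benchmark would measure.
    static long div10(long x) {
        return x / 10;
    }

    // Assumption: generic signed magic-number division by 10
    // (magic = ceil(2^66 / 10), shift = 2), shown only to illustrate
    // what "optimizing divide by 10" can look like.
    static long div10ByMultiply(long x) {
        long hi = Math.multiplyHigh(x, 0x6666666666666667L);
        // Arithmetic shift, then add 1 for negative inputs so the
        // result truncates toward zero like Java's long division.
        return (hi >> 2) + (x >>> 63);
    }

    public static void main(String[] args) {
        long[] samples = {0L, 9L, 10L, -1L, -10L, 12345L, -12345L,
                          Long.MAX_VALUE, Long.MIN_VALUE};
        for (long x : samples) {
            boolean ok = and255(x) == zeroExtendByte(x)
                      && and65535(x) == zeroExtendShort(x)
                      && div10(x) == div10ByMultiply(x);
            System.out.println(x + " -> " + (ok ? "consistent" : "MISMATCH"));
        }
    }
}
```

Comparing each kernel against its rewritten form checks correctness only; the at-least-on-par question still needs the JMH numbers.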
-------------
PR: https://git.openjdk.org/jdk/pull/9791