[aarch64-port-dev ] RFR (S) 8131682: C1 should use multibyte nops everywhere

Fri Jul 24 10:38:10 UTC 2015

Hi Aleksey,

thanks for pointing us to that change!
Looks good, but does not compile. Default arg should only be in the header.
See below.

Ppc part reviewed and I don’t need a new webrev.

Best regards,
  Goetz.

--- a/src/cpu/ppc/vm/assembler_ppc.inline.
+++ b/src/cpu/ppc/vm/assembler_ppc.inline.hpp
@@ -210,7 +210,7 @@
 inline void Assembler::extsw(   Register a, Register s)                { emit_int32(EXTSW_OPCODE   | rta(a) | rs(s) | rc(0)); }

 // extended mnemonics
-inline void Assembler::nop()                              { Assembler::ori(R0, R0, 0); }
+inline void Assembler::nop(int count)                     { for (int i = 0; i < count; i++) { Assembler::ori(R0, R0, 0); } }
 // NOP for FP and BR units (different versions to allow them to be in one group)
 inline void Assembler::fpnop0()                           { Assembler::fmr(F30, F30); }
 inline void Assembler::fpnop1()                           { Assembler::fmr(F31, F31); }


g++ 4.8.3:
In file included from /sapmnt/home1/d045726/oJ/8131682-hs-comp/src/share/vm/asm/assembler.inline.hpp:43:0,                                                                                   
                 from /sapmnt/home1/d045726/oJ/8131682-hs-comp/src/cpu/ppc/vm/macroAssembler_ppc.inline.hpp:29,                                                                              
                 from /sapmnt/home1/d045726/oJ/8131682-hs-comp/src/share/vm/asm/macroAssembler.inline.hpp:43,                                                                                
                 from ../generated/adfiles/ad_ppc_64.cpp:56:                                                                                                                                 
/sapmnt/home1/d045726/oJ/8131682-hs-comp/src/cpu/ppc/vm/assembler_ppc.inline.hpp:213:41: error: default argument given for parameter 1 of void Assembler::nop(int) [-fpermissive]          
 inline void Assembler::nop(int count = 1)                 { for(int i = 0; i < count; i++)                                                                                                  
                                         ^                                                                                                                                                   
In file included from /sapmnt/home1/d045726/oJ/8131682-hs-comp/src/share/vm/asm/assembler.hpp:434:0,                                                                                         
                 from /sapmnt/home1/d045726/oJ/8131682-hs-comp/src/cpu/ppc/vm/nativeInst_ppc.hpp:29,                                                                                         
                 from /sapmnt/home1/d045726/oJ/8131682-hs-comp/src/share/vm/code/nativeInst.hpp:41,                                                                                          
                 from ../generated/adfiles/ad_ppc_64.hpp:57,                                                                                                                                 
                 from ../generated/adfiles/ad_ppc_64.cpp:54:                                                                                                                                 
/sapmnt/home1/d045726/oJ/8131682-hs-comp/src/cpu/ppc/vm/assembler_ppc.hpp:1383:15: error: after previous specification in void Assembler::nop(int) [-fpermissive]                          
   inline void nop(int count = 1);                                                                                                                                                           
               ^     


-----Original Message-----
From: hotspot-compiler-dev [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Aleksey Shipilev
Sent: Friday, July 24, 2015 11:49 AM
To: Dean Long; hotspot compiler
Cc: ppc-aix-port-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Subject: Re: RFR (S) 8131682: C1 should use multibyte nops everywhere

* PGP Signed by an unknown key

(explicitly cc'ing AArch64 and PPC folks)

Thanks,
-Aleksey

On 22.07.2015 11:10, Aleksey Shipilev wrote:
> Thanks for review, Dean!
> 
> I'd like to hear the opinions of AArch64 and Power folks, since we
> contaminate their assemblers a bit to gain access to x86 fat nops.
> 
> -Aleksey
> 
> On 21.07.2015 23:28, Dean Long wrote:
>> This version looks good.
>>
>> dl
>>
>> On 7/20/2015 7:51 AM, Aleksey Shipilev wrote:
>>> Hi Dean,
>>>
>>> Thanks for taking a look!
>>>
>>> Silly me, I should have left the call patching cases intact, because
>>> you're right, we should be able to patch the nops partially while still
>>> producing the correct instruction stream. Therefore, I reverted the
>>> cases where we do nop-ing for *instruction* patching, and added the
>>> comment there.
>>>
>>> Other places seem to use the nop sequences to provide the alignment, not
>>> for the general patching. Especially interesting for us is the case of
>>> aligning the patcheable immediate in the existing call. C2 does the nops
>>> in these cases.
>>>
>>> New webrev:
>>>    http://cr.openjdk.java.net/~shade/8131682/webrev.01/
>>>
>>> Testing:
>>>    * JPRT -testset hotspot on open platforms;
>>>    * Targeted benchmarks, plus eyeballing the assembly;
>>>
>>> Thanks,
>>> -Aleksey
>>>
>>> On 18.07.2015 10:51, Dean Long wrote:
>>>> I think we should distinguish the different uses and treat them
>>>> accordingly:
>>>>
>>>> 1) padding nops for patching, executed
>>>>
>>>> We need to be careful about inserting a fat nop here, if later patching
>>>> overwrites only part of the fat nop, resulting in an illegal intruction.
>>>>
>>>> 2) padding nops for patching, never executed
>>>>
>>>> It should be safe insert a fat nop here, but there's no point if the
>>>> nops are not reachable and never executed.
>>>>
>>>>
>>>> 3) alignment nops, never patched, executed
>>>>
>>>> Fat nops are fine, but on some CPUs branching may be even better, so I
>>>> suggest using align() for this, and letting align() decide what to
>>>> generate.  The change in check_icache() could use a version of align
>>>> that takes the target  offset as an argument:
>>>>
>>>> 348 align(CodeEntryAlignment,__ offset() + ic_cmp_size);
>>>>
>>>> 4) alignment nops, never patched, never executed
>>>>
>>>> Doesn't matter what we emit here, but we might as well make it
>>>> understandable by humans using a debugger.
>>>>
>>>>
>>>> I believe the patching nops in c1_CodeStubs_x86.cpp and
>>>> c1_LIRAssembler.cpp are patched concurrently while the code is running,
>>>> not at a safepoint, so it's not clear to me if it's safe to use fat nops
>>>> on x86.  I would consider those changes unsafe on x86 without further
>>>> analysis of what happens during patching.
>>>>
>>>> dl
>>>>
>>>> On 7/17/2015 6:29 AM, Aleksey Shipilev wrote:
>>>>> Hi there,
>>>>>
>>>>> C1 is not very good at inlining and intrisifying methods, and hence the
>>>>> call performance is important there. One nit that we can see in the
>>>>> generated code on x86 is that C1 uses the single-byte nops, even for
>>>>> long nop strides.
>>>>>
>>>>> This improvement fixes that:
>>>>>     https://bugs.openjdk.java.net/browse/JDK-8131682
>>>>>     http://cr.openjdk.java.net/~shade/8131682/webrev.00/
>>>>>
>>>>> Testing:
>>>>>     - JPRT -testset hotspot on open platforms
>>>>>     - eyeballing the generated assembly with -XX:TieredStopAtLevel=1
>>>>>
>>>>> (I understand the symmetric change is going to be needed in closed
>>>>> parts, but let's polish the open part first).
>>>>>
>>>>> Thanks,
>>>>> -Aleksey
>>>>>
>>>
>>
> 
> 



* Unknown Key
* 0x62A119A7