RFR (S) 8131682: C1 should use multibyte nops everywhere

Wed Jul 22 08:10:51 UTC 2015

Thanks for review, Dean!

I'd like to hear the opinions of AArch64 and Power folks, since we
contaminate their assemblers a bit to gain access to x86 fat nops.

-Aleksey

On 21.07.2015 23:28, Dean Long wrote:
> This version looks good.
> 
> dl
> 
> On 7/20/2015 7:51 AM, Aleksey Shipilev wrote:
>> Hi Dean,
>>
>> Thanks for taking a look!
>>
>> Silly me, I should have left the call patching cases intact, because
>> you're right, we should be able to patch the nops partially while still
>> producing the correct instruction stream. Therefore, I reverted the
>> cases where we do nop-ing for *instruction* patching, and added the
>> comment there.
>>
>> Other places seem to use the nop sequences to provide the alignment, not
>> for the general patching. Especially interesting for us is the case of
>> aligning the patcheable immediate in the existing call. C2 does the nops
>> in these cases.
>>
>> New webrev:
>>    http://cr.openjdk.java.net/~shade/8131682/webrev.01/
>>
>> Testing:
>>    * JPRT -testset hotspot on open platforms;
>>    * Targeted benchmarks, plus eyeballing the assembly;
>>
>> Thanks,
>> -Aleksey
>>
>> On 18.07.2015 10:51, Dean Long wrote:
>>> I think we should distinguish the different uses and treat them
>>> accordingly:
>>>
>>> 1) padding nops for patching, executed
>>>
>>> We need to be careful about inserting a fat nop here, if later patching
>>> overwrites only part of the fat nop, resulting in an illegal intruction.
>>>
>>> 2) padding nops for patching, never executed
>>>
>>> It should be safe insert a fat nop here, but there's no point if the
>>> nops are not reachable and never executed.
>>>
>>>
>>> 3) alignment nops, never patched, executed
>>>
>>> Fat nops are fine, but on some CPUs branching may be even better, so I
>>> suggest using align() for this, and letting align() decide what to
>>> generate.  The change in check_icache() could use a version of align
>>> that takes the target  offset as an argument:
>>>
>>> 348 align(CodeEntryAlignment,__ offset() + ic_cmp_size);
>>>
>>> 4) alignment nops, never patched, never executed
>>>
>>> Doesn't matter what we emit here, but we might as well make it
>>> understandable by humans using a debugger.
>>>
>>>
>>> I believe the patching nops in c1_CodeStubs_x86.cpp and
>>> c1_LIRAssembler.cpp are patched concurrently while the code is running,
>>> not at a safepoint, so it's not clear to me if it's safe to use fat nops
>>> on x86.  I would consider those changes unsafe on x86 without further
>>> analysis of what happens during patching.
>>>
>>> dl
>>>
>>> On 7/17/2015 6:29 AM, Aleksey Shipilev wrote:
>>>> Hi there,
>>>>
>>>> C1 is not very good at inlining and intrisifying methods, and hence the
>>>> call performance is important there. One nit that we can see in the
>>>> generated code on x86 is that C1 uses the single-byte nops, even for
>>>> long nop strides.
>>>>
>>>> This improvement fixes that:
>>>>     https://bugs.openjdk.java.net/browse/JDK-8131682
>>>>     http://cr.openjdk.java.net/~shade/8131682/webrev.00/
>>>>
>>>> Testing:
>>>>     - JPRT -testset hotspot on open platforms
>>>>     - eyeballing the generated assembly with -XX:TieredStopAtLevel=1
>>>>
>>>> (I understand the symmetric change is going to be needed in closed
>>>> parts, but let's polish the open part first).
>>>>
>>>> Thanks,
>>>> -Aleksey
>>>>
>>
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: <http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/attachments/20150722/97fd7fc4/signature-0001.asc>