RFR (S) 8131682: C1 should use multibyte nops everywhere

Tue Jul 21 20:28:12 UTC 2015

This version looks good.

dl

On 7/20/2015 7:51 AM, Aleksey Shipilev wrote:
> Hi Dean,
>
> Thanks for taking a look!
>
> Silly me, I should have left the call patching cases intact, because
> you're right, we should be able to patch the nops partially while still
> producing the correct instruction stream. Therefore, I reverted the
> cases where we do nop-ing for *instruction* patching, and added the
> comment there.
>
> Other places seem to use the nop sequences to provide the alignment, not
> for the general patching. Especially interesting for us is the case of
> aligning the patcheable immediate in the existing call. C2 does the nops
> in these cases.
>
> New webrev:
>    http://cr.openjdk.java.net/~shade/8131682/webrev.01/
>
> Testing:
>    * JPRT -testset hotspot on open platforms;
>    * Targeted benchmarks, plus eyeballing the assembly;
>
> Thanks,
> -Aleksey
>
> On 18.07.2015 10:51, Dean Long wrote:
>> I think we should distinguish the different uses and treat them
>> accordingly:
>>
>> 1) padding nops for patching, executed
>>
>> We need to be careful about inserting a fat nop here, if later patching
>> overwrites only part of the fat nop, resulting in an illegal intruction.
>>
>> 2) padding nops for patching, never executed
>>
>> It should be safe insert a fat nop here, but there's no point if the
>> nops are not reachable and never executed.
>>
>>
>> 3) alignment nops, never patched, executed
>>
>> Fat nops are fine, but on some CPUs branching may be even better, so I
>> suggest using align() for this, and letting align() decide what to
>> generate.  The change in check_icache() could use a version of align
>> that takes the target  offset as an argument:
>>
>> 348 align(CodeEntryAlignment,__ offset() + ic_cmp_size);
>>
>> 4) alignment nops, never patched, never executed
>>
>> Doesn't matter what we emit here, but we might as well make it
>> understandable by humans using a debugger.
>>
>>
>> I believe the patching nops in c1_CodeStubs_x86.cpp and
>> c1_LIRAssembler.cpp are patched concurrently while the code is running,
>> not at a safepoint, so it's not clear to me if it's safe to use fat nops
>> on x86.  I would consider those changes unsafe on x86 without further
>> analysis of what happens during patching.
>>
>> dl
>>
>> On 7/17/2015 6:29 AM, Aleksey Shipilev wrote:
>>> Hi there,
>>>
>>> C1 is not very good at inlining and intrisifying methods, and hence the
>>> call performance is important there. One nit that we can see in the
>>> generated code on x86 is that C1 uses the single-byte nops, even for
>>> long nop strides.
>>>
>>> This improvement fixes that:
>>>     https://bugs.openjdk.java.net/browse/JDK-8131682
>>>     http://cr.openjdk.java.net/~shade/8131682/webrev.00/
>>>
>>> Testing:
>>>     - JPRT -testset hotspot on open platforms
>>>     - eyeballing the generated assembly with -XX:TieredStopAtLevel=1
>>>
>>> (I understand the symmetric change is going to be needed in closed
>>> parts, but let's polish the open part first).
>>>
>>> Thanks,
>>> -Aleksey
>>>
>