PING: RFR: 8164652: c1 port, code patching
Andrew Haley
aph at redhat.com
Mon Oct 3 16:13:36 UTC 2016
On 29/09/16 11:51, Andrew Haley wrote:
> There has still not been a reply to this that I can find.
>
> It is a serious issue that should be addressed.
>
> Andrew.
>
>
> On 31/08/16 17:58, Andrew Haley wrote:
>> It all looks pretty good.
>>
>> I now understand how the C1 patching code works, and it's not as bad
>> as I'd feared. :-)
>>
>> I'm think that it'll work on existing ARM processors, given that none
>> of the copying of code buffers which x86 does is used at all.
>>
>> For the benefit of onlookers: when we have to patch, say, an
>> instruction of the form
>>
>> mov r2, #nn
>>
>> but nn is unknown at compilation time, we lay down a stub of the form
>>
>> stub0:
>> movw r2, #0
>> movt r2, #0
>> nop
>> b back
>>
>> and at the "call site" we do
>>
>> b <fixup>
>> nop
>> nop
>> back:
>>
>> Patching code first fixes up the stub:
>>
>> stub0:
>> movw r2, #088
>> movt r2, #0
>> nop
>> b back
>>
>> and then the call site to jump to the stub:
>>
>> b stub0
>> nop
>> nop
>> back:
>>
>> With all of the icache flushing done in the correct order I think this
>> is OK.
>>
>> Having said that, I want to sound a note of caution.
>>
>> There is a compatibility problem in that the architecture
>> specification says
>>
>> ...once the modified instructions are observable, each PE that is
>> executing the modified instructions *must* (my emphasis) issue the
>> following instruction to ensure execution of the modified
>> instructions:
>>
>> ISB; Synchronize fetched instruction stream
>>
>> But we are not doing that on all PEs. For strict correctness we
>> should execute an ISB on every PE after a stub has been written and
>> before it is executed. In practice I don't think that it matters
>> because the instruction cache on the other PEs probably won't already
>> contain a stale copy of the stub. But it is possible: with a long
>> instruction cache it may do if the line has already been fetched for
>> an adjacent stub. Without an ISB the PE may not re-fetch its icache
>> line and may then execute the stale stub code.
>>
>> The patching doesn't need to be done in this way. It could have been
>> done as
>>
>> b <fixup>
>> ldr r2, <constant pool + offset>
>>
>> and the fixup could have initialized the constant pool entry and then
>> overwritten the b <fixup> to a NOP .
>>
>> [ As an aside: I don't think that the trampolines on AArch64 are
>> vulnerable in the same way because the stubs are of the form
>>
>> ldr r2, data0
>> b r2
>> data0:
>> xword 0
>>
>> the patching code does not modify an instruction except for the branch
>> at the call site, so it is legal. Having said that, perhaps we need
>> an acquiring load in this AArch64 stub instead of a simple LDR. ]
>>
>> Andrew.
>>
>
More information about the aarch32-port-dev
mailing list