RFR: 8164652: c1 port, code patching
Andrew Haley
aph at redhat.com
Thu Sep 29 10:51:45 UTC 2016
There has still not been a reply to this that I can find.
It is a serious issue that should be addressed.
Andrew.
On 31/08/16 17:58, Andrew Haley wrote:
> It all looks pretty good.
>
> I now understand how the C1 patching code works, and it's not as bad
> as I'd feared. :-)
>
> I'm think that it'll work on existing ARM processors, given that none
> of the copying of code buffers which x86 does is used at all.
>
> For the benefit of onlookers: when we have to patch, say, an
> instruction of the form
>
> mov r2, #nn
>
> but nn is unknown at compilation time, we lay down a stub of the form
>
> stub0:
> movw r2, #0
> movt r2, #0
> nop
> b back
>
> and at the "call site" we do
>
> b <fixup>
> nop
> nop
> back:
>
> Patching code first fixes up the stub:
>
> stub0:
> movw r2, #088
> movt r2, #0
> nop
> b back
>
> and then the call site to jump to the stub:
>
> b stub0
> nop
> nop
> back:
>
> With all of the icache flushing done in the correct order I think this
> is OK.
>
> Having said that, I want to sound a note of caution.
>
> There is a compatibility problem in that the architecture
> specification says
>
> ...once the modified instructions are observable, each PE that is
> executing the modified instructions *must* (my emphasis) issue the
> following instruction to ensure execution of the modified
> instructions:
>
> ISB; Synchronize fetched instruction stream
>
> But we are not doing that on all PEs. For strict correctness we
> should execute an ISB on every PE after a stub has been written and
> before it is executed. In practice I don't think that it matters
> because the instruction cache on the other PEs probably won't already
> contain a stale copy of the stub. But it is possible: with a long
> instruction cache it may do if the line has already been fetched for
> an adjacent stub. Without an ISB the PE may not re-fetch its icache
> line and may then execute the stale stub code.
>
> The patching doesn't need to be done in this way. It could have been
> done as
>
> b <fixup>
> ldr r2, <constant pool + offset>
>
> and the fixup could have initialized the constant pool entry and then
> overwritten the b <fixup> to a NOP .
>
> [ As an aside: I don't think that the trampolines on AArch64 are
> vulnerable in the same way because the stubs are of the form
>
> ldr r2, data0
> b r2
> data0:
> xword 0
>
> the patching code does not modify an instruction except for the branch
> at the call site, so it is legal. Having said that, perhaps we need
> an acquiring load in this AArch64 stub instead of a simple LDR. ]
>
> Andrew.
>
More information about the aarch32-port-dev
mailing list