[8u] RFR: nativeInstruction atomicity and correctness of patching

Anton Kozlov akozlov at azul.com
Fri Apr 1 18:12:42 UTC 2016


Hi,

Life is simpler if we allow stale code to be observed. Then we really
don't have to care about the Multiprocessing Extensions, their
presence, and their effects on cache maintenance operations. Thanks
for the responses!

PS. I can't agree with some statements below, so I've added a couple
of notes inline. Sorry if it's become off-topic.

Thanks,
Anton

On Fri, 2016-04-01 at 13:51 +0100, Andrew Haley wrote:
> Hi,
> 
> On 04/01/2016 11:41 AM, Anton Kozlov wrote:
> > ICIMVAU and DCCMVAU are also used in the synchronization sequence
> > in "Concurrent modification and execution of instructions
> > (A3.5.4)". It's
> > 
> >         DCCMVAU [instruction location] ; Clean data cache by MVA to point of unification
> >         DSB                            ; Ensure visibility of the data cleaned from the cache
> >         ICIMVAU [instruction location] ; Invalidate instruction cache by MVA to PoU
> >         BPIMVAU [instruction location] ; Invalidate branch predictor by MVA to PoU
> >         DSB                            ; Ensure completion of the invalidations
> > 
> > for the modifying site, and
> > 
> >         ISB ; Synchronize fetched instruction stream
> > 
> > for the executing site.
> > 
> > This explicitly names 2 concurrently executing threads. If
> > DCCMVAU/ICIMVAU do not affect the other CPU's caches, then the
> > sequence works for 1 thread only. Did I misunderstand the manual?
> 
> I think so.  When you modify instruction memory you must clean the
> local dcache and then broadcast the icache clean message to all other
> cores.  But they also must do some kind of memory barrier and ISB to
> update their local caches.
As far as I understand, the icache clean from the modifying site will
be applied eventually, regardless of whether a barrier like dmb or dsb
is present on the executing site, and the new instructions will be
loaded from memory. The barrier only stalls execution until the clean
has finished. It's required in general if we have a strict dependency
on the order of instruction modifications, of course.
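
To make the ordering concrete, here's a rough sketch of the two sides
of that sequence as GCC inline assembly (my own illustration, not code
from the port; the CP15 operations are PL1-only on ARMv7, so the
modifying-site half could only run in a kernel, which is why userspace
has to go through __clear_cache instead):

    // Modifying site: clean the dcache line holding the new
    // instruction, then invalidate the icache line and branch
    // predictor entry for it, per A3.5.4.
    static inline void sync_icache_line(void *va) {
        asm volatile(
            "mcr p15, 0, %0, c7, c11, 1\n\t" // DCCMVAU: clean dcache by MVA to PoU
            "dsb\n\t"                        // ensure visibility of the cleaned data
            "mcr p15, 0, %0, c7, c5, 1\n\t"  // ICIMVAU: invalidate icache by MVA to PoU
            "mcr p15, 0, %0, c7, c5, 7\n\t"  // BPIMVAU: invalidate branch predictor by MVA
            "dsb\n\t"                        // ensure completion of the invalidations
            : : "r"(va) : "memory");
    }

    // Executing site: discard the already-fetched instruction stream.
    static inline void sync_fetch(void) {
        asm volatile("isb" : : : "memory");
    }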

> > And as far as I understand, without this behavior of
> > DCCMVAU/ICIMVAU, the executing site would have to flush its own
> > icache every time it executes instructions that may be patched,
> > and we clearly don't do this.
> 
> I think we do enough.  We evacuate the code cache at a safepoint,
> which is a synchronization event.  Newly-generated code shouldn't be
> in anyone's icache, so isn't a problem.
> 
> (By the way, I have recently discussed this with an ARM engineer,
> and we think it's OK.  However, he's going to discuss it with the
> architecture people to be sure.)
> 
> > However, in my kernel sources there is no data cache invalidation,
> > but I expected there would be.
> 
> Data cache invalidation syncs the local cache with shared memory.  It
> does not affect any other core's data cache.  Other cores may still
> read stale data until they do some kind of memory barrier.
This is true in general, but I believe that we could in theory
invalidate another CPU's dcache if we have the Multiprocessing
Extensions implemented. Quoting from the "Effect of the
Multiprocessing Extensions on operations to the point of coherency"
section:
    The Multiprocessing Extensions add requirements for the scope of
    the following operations, that affect data and unified caches to
    the point of coherency:
    * invalidate data, or unified, cache line by MVA to the point of
      coherency, DCIMVAC
    For Normal memory that is not Inner Non-cacheable, Outer
    Non-cacheable, these instructions must affect the caches of other
    processors in the shareability domain described by the
    shareability attributes of the MVA supplied with the operation.
It's unavailable in userspace on Linux, however.
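
What is available from userspace is the __clear_cache path, which ends
up in the cacheflush syscall and performs the clean-to-PoU plus icache
invalidate on our behalf. A minimal sketch (illustrative only; the
word 0xe320f000 is the A32 NOP encoding):

    #include <cstdint>

    // Patch one instruction word and make it visible to instruction
    // fetch via the kernel's cache-maintenance path.
    static void patch_insn(uint32_t *instr_pc, uint32_t new_insn) {
        *instr_pc = new_insn;                  // store the new encoding
        char *p = reinterpret_cast<char *>(instr_pc);
        __builtin___clear_cache(p, p + sizeof(uint32_t));
    }

    // e.g. patch_insn(site, 0xe320f000);  // overwrite with a NOP

Note that nothing in this path gives us a cross-CPU dcache invalidate
of the DCIMVAC kind; the other core still has to reach a barrier
before its data-side reads are guaranteed fresh.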

> > Without data cache invalidation, the trampoline could load the old
> > call target regardless of whether __clear_cache was called, so
> > it's really useless here; you're right.
> 
> That's right.  And it doesn't matter.
> 
> > I care more about code modifiers, like Runtime1::patch_code. After
> > taking the Patching_lock, we check instr_pc. If we are the 2nd
> > thread, arriving after the patching occurred, but our dcache has a
> > stale value for instr_pc, then we'll perform the code patching
> > again, and this is not good.
> 
> It's harmless IMO.
> 
> Andrew.
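
For the record, the shape of the race I had in mind is roughly this (a
hypothetical sketch with made-up names, not the actual
Runtime1::patch_code):

    #include <mutex>
    #include <cstdint>

    std::mutex Patching_lock;  // stands in for HotSpot's Patching_lock

    void maybe_patch(uint32_t *instr_pc, uint32_t patched_insn) {
        std::lock_guard<std::mutex> g(Patching_lock);
        if (*instr_pc != patched_insn) {  // 2nd thread may read a stale word here
            *instr_pc = patched_insn;     // ...and patch again
            char *p = reinterpret_cast<char *>(instr_pc);
            __builtin___clear_cache(p, p + sizeof(uint32_t));
        }
    }

Since the second thread writes exactly the same bytes and flushes
again, the worst case is redundant cache maintenance, which I take to
be the "harmless" above.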

