[cacao] RFC: CACAO PR157 ARM memory barrier patch

Andrew Haley aph at redhat.com
Fri Mar 11 08:52:25 PST 2011


On 03/11/2011 04:38 PM, Robert Lougher wrote:
> On 11 March 2011 16:31, Andrew Haley <aph at redhat.com> wrote:
>> On 03/11/2011 04:27 PM, Robert Lougher wrote:
>>> On 11 March 2011 15:55, Andrew Haley <aph at redhat.com> wrote:
>>>> On 03/11/2011 03:48 PM, Robert Lougher wrote:
>>>>>
>>>>> This is how I define MBARRIER for ARM in JamVM:
>>>>>
>>>>> #ifdef __ARM_ARCH_7A__
>>>>> #define MBARRIER() __asm__ __volatile__ ("dmb" ::: "memory")
>>>>> #else
>>>>> #define MBARRIER() __asm__ __volatile__ ("" ::: "memory")
>>>>> #endif
>>>>
>>>> But that's wrong for GNU/Linux binaries, surely.
>>>
>>> Ubuntu defaults to ARMv7 and Thumb2, so this is OK for Ubuntu.  But
>>> yes, I believe Debian builds binaries for ARMv4t.  I can understand
>>> this in the past, as older binaries would run with no problem on later
>>> chips.  But ARMv7 needs proper memory barriers, so I don't think the
>>> Debian policy makes any sense here.  Using the kernel helper like this
>>> just seems nasty to me (defining a function pointer via __kernel_dmb
>>> (*(__kernel_dmb_t *) 0xffff0fa0)).
>>
>> In what way is it nasty?  You get the correct behaviour with very little
>> overhead.
> 
> These barriers are used in the thin-locking code.  The fast path in
> the lock should only be a few instructions.  This will add significant
> overhead.

Do you actually know that?  It's impossible to tell how long DMB
stalls the pipeline for because it depends on the state of the memory
system.  However, a correctly-predicted BL takes a cycle on new ARMs,
as does a correctly-predicted return.  In other words, I suspect you
wouldn't even be able to measure the difference, assuming that the
kernel simply placed a DMB; BX pair at 0xffff0fa0.  I suppose it would
be an extra instruction cache line, which might make a difference in
some pathological cases.
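
For concreteness, here is a rough sketch of the two variants being compared. It assumes the helper is declared the way Robert describes it (a function pointer at 0xffff0fa0) and that the running kernel actually provides that helper. thin_unlock() is a made-up name standing in for the real thin-locking code, and the __sync_synchronize() fallback is only there so the sketch builds on non-ARM hosts.

```c
#include <assert.h>

/* Kernel-provided barrier helper on ARM GNU/Linux, declared as
 * described in the quoted text.  Assumes the kernel exports it. */
typedef void (__kernel_dmb_t)(void);
#define __kernel_dmb (*(__kernel_dmb_t *) 0xffff0fa0)

#if defined(__ARM_ARCH_7A__)
/* Built for ARMv7-A only: emit the instruction inline. */
#define MBARRIER() __asm__ __volatile__ ("dmb" ::: "memory")
#elif defined(__arm__) && defined(__linux__)
/* Built for an older baseline (e.g. ARMv4t): call the helper, so the
 * kernel supplies whatever barrier the CPU we run on needs. */
#define MBARRIER() __kernel_dmb()
#else
/* Not ARM: compiler builtin, so the sketch compiles elsewhere. */
#define MBARRIER() __sync_synchronize()
#endif

/* Hypothetical stand-in for the unlock fast path: make the writes
 * done under the lock visible before the lock word is cleared. */
static void thin_unlock(volatile int *lock_word)
{
    assert(*lock_word != 0);   /* releasing an unheld lock is a bug */
    MBARRIER();
    *lock_word = 0;
}
```

The second branch costs one call and one return on top of whatever the barrier itself costs, which is the overhead in question.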

Andrew.


