[cacao] RFC: CACAO PR157 ARM memory barrier patch
Andrew Haley
aph at redhat.com
Fri Mar 11 08:52:25 PST 2011
On 03/11/2011 04:38 PM, Robert Lougher wrote:
> On 11 March 2011 16:31, Andrew Haley <aph at redhat.com> wrote:
>> On 03/11/2011 04:27 PM, Robert Lougher wrote:
>>> On 11 March 2011 15:55, Andrew Haley <aph at redhat.com> wrote:
>>>> On 03/11/2011 03:48 PM, Robert Lougher wrote:
>>>>>
>>>>> This is how I define MBARRIER for ARM in JamVM:
>>>>>
>>>>> #ifdef __ARM_ARCH_7A__
>>>>> #define MBARRIER() __asm__ __volatile__ ("dmb" ::: "memory")
>>>>> #else
>>>>> #define MBARRIER() __asm__ __volatile__ ("" ::: "memory")
>>>>> #endif
>>>>
>>>> But that's wrong for GNU/Linux binaries, surely.
>>>
>>> Ubuntu defaults to ARMv7 and Thumb2, so this is OK for Ubuntu. But
>>> yes, I believe Debian builds binaries for ARMv4t. I can understand
>>> this in the past, as older binaries would run with no problem on later
>>> chips. But ARMv7 needs proper memory barriers, so I don't think the
>>> Debian policy makes any sense here. Using the kernel helper like this
>>> just seems nasty to me (defining a function pointer via __kernel_dmb
>>> (*(__kernel_dmb_t *) 0xffff0fa0)).
>>
>> In what way is it nasty? You get the correct behaviour with very little
>> overhead.
>
> These barriers are used in the thin-locking code. The fast path in
> the lock should only be a few instructions. This will add significant
> overhead.
Do you actually know that? It's impossible to tell how long DMB
stalls the pipeline for because it depends on the state of the memory
system. However, a correctly-predicted BL takes a cycle on new ARMs,
as does a correctly-predicted return. In other words, I suspect you
wouldn't even be able to measure the difference, assuming that the
kernel simply placed a DMB; BX pair at 0xffff0fa0. I suppose it would
be an extra instruction cache line, which might make a difference in
some pathological cases.
Andrew.
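
For reference, a minimal sketch of the two approaches discussed in this
thread, combined into one MBARRIER definition. It is an illustration under
stated assumptions, not code from JamVM or CACAO. The __kernel_dmb typedef
and macro follow the form quoted above; the fallback to GCC's
__sync_synchronize() on non-ARM targets and the publish() function are
hypothetical additions so that the fragment compiles anywhere.

```c
/* Sketch only: select a memory barrier for the build target. */
#if defined(__ARM_ARCH_7A__)
/* ARMv7-A build: emit the DMB instruction inline. */
#define MBARRIER() __asm__ __volatile__ ("dmb" ::: "memory")
#elif defined(__linux__) && defined(__arm__)
/* Older ARM build on Linux: call the kernel-provided helper at the fixed
 * address 0xffff0fa0, so the kernel decides what barrier this CPU needs. */
typedef void (__kernel_dmb_t)(void);
#define __kernel_dmb (*(__kernel_dmb_t *) 0xffff0fa0)
#define MBARRIER() __kernel_dmb()
#else
/* Assumption: on any other target, use the GCC full-barrier builtin. */
#define MBARRIER() __sync_synchronize()
#endif

/* Hypothetical example: store the data, then the flag, with a barrier in
 * between so another CPU that sees the flag also sees the data. */
static int publish(volatile int *data, volatile int *flag, int value)
{
    *data = value;
    MBARRIER();
    *flag = 1;
    return *data;
}
```

With the helper variant, the cost on the fast path is one call and one
return around whatever the kernel placed at that address, which is the
overhead being debated above.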