[aarch64-port-dev ] RFR: 8080293: AARCH64: Remove unnecessary dmbs from generated CAS code

Andrew Dinn adinn at redhat.com
Tue Aug 25 08:20:47 UTC 2015


Hi Vladimir,

Thank you very much for your review.

On 25/08/15 01:24, Vladimir Kozlov wrote:
> I did not look deep on code logic.
> Few comments only:
> 
> Use {} for all conditional code (it cause a lot of pain in the past):
> 
>       if (is_cas)
>         return NULL;

Ok, I'll correct all of those.

> You don't need #ifndef PRODUCT:
> 
> +#ifndef PRODUCT
> +#ifdef ASSERT

Ah, that's good to know. Thanks.

> New mach instructs missing predicate:
> 
>  predicate(needs_acquiring_load_exclusive(n));
> 
> You use higher ins_cost to avoid their generation when predicate is
> false.  So why not explicit predicate?

I had two separate reasons for not repeating the predicates:

  1 They do quite a lot of work crawling the graph. So, calling the
predicate in the lower cost case and omitting it in the higher cost case
attempts to avoid the expense of executing it twice in some cases.

  2 The ins_cost for the lower and higher cost cases is meant to reflect
a difference in the expected execution cost associated with the instruction.

[n.b. I adopted the same strategy for the new membar generation rules
which were added as part of the volatile put optimization patch --
compare membar_release and unnecessary_membar_release]

However, looking again at the code I believe I have the costs (and hence
the predicates) attached to the wrong rules in each pair. For example,
currently the rules include the following details

  compareAndSwapIAcq -- does not emit dmb instructions
    no predicate
    cost  (2 * VOLATILE_REF_COST )

  compareAndSwapI -- emits dmb instructions
    predicate(!needs_acquiring_load_exclusive(n))
    cost VOLATILE_REF_COST

The patch presumes that the first rule will have a lower (or at least no
higher) cost than the second. So a correct version would be either,
calling the predicate once:

A:

  compareAndSwapIAcq -- does not emit dmb instructions
    predicate(needs_acquiring_load_exclusive(n))
    cost VOLATILE_REF_COST

  compareAndSwapI-- emits dmb instructions
    no predicate
    cost (2 * VOLATILE_REF_COST)

or, calling the predicate twice:

B:

  compareAndSwapIAcq -- does not emit dmb instructions
    predicate(needs_acquiring_load_exclusive(n))
    cost VOLATILE_REF_COST

  compareAndSwapI-- emits dmb instructions
    predicate(!needs_acquiring_load_exclusive(n))
    cost (2 * VOLATILE_REF_COST)

I would prefer to retain A over B. A is guaranteed to give the same
result as B with potentially less execution overhead. However, if you
feel B is required I will adopt that format for all rules.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in UK and Wales under Company Registration No. 3798903
Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
(USA), Michael O'Neill (Ireland)



More information about the aarch64-port-dev mailing list