[aarch64-port-dev ] RFR: 8080293: AARCH64: Remove unnecessary dmbs from generated CAS code

Vladimir Kozlov vladimir.kozlov at oracle.com
Tue Aug 25 17:14:19 UTC 2015


Okay, I agree to have only one predicate. So I am fine with version A).

Thanks,
Vladimir

PS: "first rule will have a lower" - should compareAndSwapI be first then?

 > However, looking again at the code I believe I have the costs (and hence
 > the predicates) attached to the wrong rules in each pair. For example,
 > currently the rules include the following details
 >
 >    compareAndSwapIAcq -- does not emit dmb instructions
 >      no predicate
 >      cost  (2 * VOLATILE_REF_COST )
 >
 >    compareAndSwapI -- emits dmb instructions
 >      predicate(!needs_acquiring_load_exclusive(n))
 >      cost VOLATILE_REF_COST
 >
 > The patch presumes that the first rule will have a lower (or at least no
 > higher) cost than the second. So a correct version would be either,
 > calling the predicate once:



On 8/25/15 1:20 AM, Andrew Dinn wrote:
> Hi Vladimir,
>
> Thank you very much for your review.
>
> On 25/08/15 01:24, Vladimir Kozlov wrote:
>> I did not look deep on code logic.
>> Few comments only:
>>
>> Use {} for all conditional code (it cause a lot of pain in the past):
>>
>>        if (is_cas)
>>          return NULL;
>
> Ok, I'll correct all of those.
>
>> You don't need #ifndef PRODUCT:
>>
>> +#ifndef PRODUCT
>> +#ifdef ASSERT
>
> Ah, that's good to know. Thanks.
>
>> New mach instructs missing predicate:
>>
>>   predicate(needs_acquiring_load_exclusive(n));
>>
>> You use higher ins_cost to avoid their generation when predicate is
>> false.  So why not explicit predicate?
>
> I had two separate reasons for not repeating the predicates:
>
>    1 They do quite a lot of work crawling the graph. So, calling the
> predicate in the lower cost case and omitting it in the higher cost case
> attempts to avoid the expense of executing it twice in some cases.
>
>    2 The ins_cost for the lower and higher cost cases is meant to reflect
> a difference in the expected execution cost associated with the instruction.
>
> [n.b. I adopted the same strategy for the new membar generation rules
> which were added as part of the volatile put optimization patch --
> compare membar_release and unnecessary_membar_release]
>
> However, looking again at the code I believe I have the costs (and hence
> the predicates) attached to the wrong rules in each pair. For example,
> currently the rules include the following details
>
>    compareAndSwapIAcq -- does not emit dmb instructions
>      no predicate
>      cost  (2 * VOLATILE_REF_COST )
>
>    compareAndSwapI -- emits dmb instructions
>      predicate(!needs_acquiring_load_exclusive(n))
>      cost VOLATILE_REF_COST
>
> The patch presumes that the first rule will have a lower (or at least no
> higher) cost than the second. So a correct version would be either,
> calling the predicate once:
>
> A:
>
>    compareAndSwapIAcq -- does not emit dmb instructions
>      predicate(needs_acquiring_load_exclusive(n))
>      cost VOLATILE_REF_COST
>
>    compareAndSwapI-- emits dmb instructions
>      no predicate
>      cost (2 * VOLATILE_REF_COST)
>
> or, calling the predicate twice:
>
> B:
>
>    compareAndSwapIAcq -- does not emit dmb instructions
>      predicate(needs_acquiring_load_exclusive(n))
>      cost VOLATILE_REF_COST
>
>    compareAndSwapI-- emits dmb instructions
>      predicate(!needs_acquiring_load_exclusive(n))
>      cost (2 * VOLATILE_REF_COST)
>
> I would prefer to retain A over B. A is guaranteed to give the same
> result as B with potentially less execution overhead. However, if you
> feel B is required I will adopt that format for all rules.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in UK and Wales under Company Registration No. 3798903
> Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
> (USA), Michael O'Neill (Ireland)
>


More information about the aarch64-port-dev mailing list