[aarch64-port-dev ] RFR: 8080293: AARCH64: Remove unnecessary dmbs from generated CAS code
Vladimir Kozlov
vladimir.kozlov at oracle.com
Tue Aug 25 17:14:19 UTC 2015
Okay, I agree to have only one predicate. So I am fine with version A).
Thanks,
Vladimir
PS: "first rule will have a lower" - should compareAndSwapI be first then?
> However, looking again at the code I believe I have the costs (and hence
> the predicates) attached to the wrong rules in each pair. For example,
> currently the rules include the following details
>
> compareAndSwapIAcq -- does not emit dmb instructions
> no predicate
> cost (2 * VOLATILE_REF_COST )
>
> compareAndSwapI -- emits dmb instructions
> predicate(!needs_acquiring_load_exclusive(n))
> cost VOLATILE_REF_COST
>
> The patch presumes that the first rule will have a lower (or at least no
> higher) cost than the second. So a correct version would be either,
> calling the predicate once:
On 8/25/15 1:20 AM, Andrew Dinn wrote:
> Hi Vladimir,
>
> Thank you very much for your review.
>
> On 25/08/15 01:24, Vladimir Kozlov wrote:
>> I did not look deep on code logic.
>> Few comments only:
>>
>> Use {} for all conditional code (it cause a lot of pain in the past):
>>
>> if (is_cas)
>> return NULL;
>
> Ok, I'll correct all of those.
>
>> You don't need #ifndef PRODUCT:
>>
>> +#ifndef PRODUCT
>> +#ifdef ASSERT
>
> Ah, that's good to know. Thanks.
>
>> New mach instructs missing predicate:
>>
>> predicate(needs_acquiring_load_exclusive(n));
>>
>> You use higher ins_cost to avoid their generation when predicate is
>> false. So why not explicit predicate?
>
> I had two separate reasons for not repeating the predicates:
>
> 1 They do quite a lot of work crawling the graph. So, calling the
> predicate in the lower cost case and omitting it in the higher cost case
> attempts to avoid the expense of executing it twice in some cases.
>
> 2 The ins_cost for the lower and higher cost cases is meant to reflect
> a difference in the expected execution cost associated with the instruction.
>
> [n.b. I adopted the same strategy for the new membar generation rules
> which were added as part of the volatile put optimization patch --
> compare membar_release and unnecessary_membar_release]
>
> However, looking again at the code I believe I have the costs (and hence
> the predicates) attached to the wrong rules in each pair. For example,
> currently the rules include the following details
>
> compareAndSwapIAcq -- does not emit dmb instructions
> no predicate
> cost (2 * VOLATILE_REF_COST )
>
> compareAndSwapI -- emits dmb instructions
> predicate(!needs_acquiring_load_exclusive(n))
> cost VOLATILE_REF_COST
>
> The patch presumes that the first rule will have a lower (or at least no
> higher) cost than the second. So a correct version would be either,
> calling the predicate once:
>
> A:
>
> compareAndSwapIAcq -- does not emit dmb instructions
> predicate(needs_acquiring_load_exclusive(n))
> cost VOLATILE_REF_COST
>
> compareAndSwapI-- emits dmb instructions
> no predicate
> cost (2 * VOLATILE_REF_COST)
>
> or, calling the predicate twice:
>
> B:
>
> compareAndSwapIAcq -- does not emit dmb instructions
> predicate(needs_acquiring_load_exclusive(n))
> cost VOLATILE_REF_COST
>
> compareAndSwapI-- emits dmb instructions
> predicate(!needs_acquiring_load_exclusive(n))
> cost (2 * VOLATILE_REF_COST)
>
> I would prefer to retain A over B. A is guaranteed to give the same
> result as B with potentially less execution overhead. However, if you
> feel B is required I will adopt that format for all rules.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in UK and Wales under Company Registration No. 3798903
> Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
> (USA), Michael O'Neill (Ireland)
>
More information about the aarch64-port-dev
mailing list