[aarch64-port-dev ] RFR: 8080293: AARCH64: Remove unnecessary dmbs from generated CAS code

Vladimir Kozlov vladimir.kozlov at oracle.com
Tue Aug 25 00:24:29 UTC 2015


I did not look deeply at the code logic.
Just a few comments:

Use {} for all conditional code (omitting them has caused a lot of pain in the past):

       if (is_cas)
         return NULL;

You don't need #ifndef PRODUCT:

+#ifndef PRODUCT
+#ifdef ASSERT

The new mach instructs are missing a predicate:

  predicate(needs_acquiring_load_exclusive(n));

You use a higher ins_cost to avoid generating them when the predicate is
false.  So why not use an explicit predicate?

Thanks,
Vladimir

On 8/24/15 7:31 AM, Andrew Dinn wrote:
> The following webrev against hs-comp head fixes 8080293
>
>    http://cr.openjdk.java.net/~adinn/8080293/webrev.00/
>
> It is a follow on to the prior volatile object patch
>
>    8078743: AARCH64: Extend use of stlr to cater for volatile object stores
>    http://cr.openjdk.java.net/~adinn/8078743/webrev.04/
>
> and requires that previous patch to be applied first.
>
> Testing
> -------
>
> The patch is sensitive to GC configuration so it was tested against 5
> relevant configs
>
>    G1
>    CMS+UseCondCardMark
>    CMS-UseCondCardMark
>    Par+UseCondCardMark
>    Par-UseCondCardMark
>
> The validity of the transformation was verified by:
>
>    generating and eyeballing compiled code for simple test programs
>    successfully running a fairly large program (netbeans)
>    generating and eyeballing HashMap code compiled on a fairly large
> program run
>
> The fix was performance tested on 2 implementations of the AArch64
> architecture (more details below). On an O-O-O CPU it gave no noticeable
> benefit. On a simple pipeline CPU it gave a very significant benefit in
> specific cases.
>
> regards,
>
>
> Andrew Dinn
> -----------
> Senior Principal Software Engineer
> Red Hat UK Ltd
> Registered in UK and Wales under Company Registration No. 3798903
> Directors: Michael Cunningham (USA), Matt Parson (USA), Charlie Peters
> (USA), Michael O'Neill (Ireland)
>
> The Test
> --------
>
> As with the prior patch I tested the original vs new code generation
> strategy by running a jmh test first with -XX:+UseBarriersForVolatile
> and then with -XX:-UseBarriersForVolatile. Four different test programs
> were run in all 5 GC configs. Each test executed repeated CAS
> operations on an object field in a single thread with a Blackhole
> backoff between CASes varying from 0 to 64.
>
> Test one performed a CAS guaranteed to fail; test two performed a
> successful CAS from a fixed object to null and then back; test three
> performed a successful CAS from a fixed object to another fixed object
> and then back; test four performed a successful CAS from a fixed
> object to a newly allocated object and then back. The average time per
> CAS operation (ns/op) -- actually per 2 CAS operations for the latter 3
> tests -- was used as a score.
>
> The Results
> -----------
>
> On an O-O-O CPU there was no significant difference in the time taken.
>
> On a simple pipeline CPU the optimization gave a very significant
> benefit for the Fail tests on all GC configurations except
> CMS+UseCondCardMark. In all other cases there was no significant
> measurable benefit.
>
> Example Test
> ------------
>
> package org.openjdk;
>
> import org.openjdk.jmh.annotations.*;
> import org.openjdk.jmh.infra.Blackhole;
>
> import java.util.concurrent.TimeUnit;
> import java.util.concurrent.atomic.AtomicReference;
>
> @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
> @Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
> @Fork(3)
> @BenchmarkMode(Mode.AverageTime)
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> @State(Scope.Benchmark)
> public class CasNull {
>
>      Object tombstone;
>
>      AtomicReference<Object> ref;
>
>      @Param({"0", "1", "2", "4", "8", "16", "32", "64"})
>      int backoff;
>
>      @Setup
>      public void setup() {
>          tombstone = new Object();
>
>          ref = new AtomicReference<>();
>          ref.set(tombstone);
>      }
>
>      @Benchmark
>      public boolean test() {
>          Blackhole.consumeCPU(backoff);
>          ref.compareAndSet(tombstone, null);
>          ref.compareAndSet(null, tombstone);
>          return true;
>      }
> }
>
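For readers who want to see the four test variants side by side, here is a minimal sketch in plain Java, without the JMH harness or the Blackhole backoff. It is not taken from the webrev or from the actual benchmarks: the class name CasVariants, the method names, and the FIXED_A/FIXED_B objects are all hypothetical, and only the CasNull example quoted above is the original test.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of the four CAS patterns described in the
// quoted message. Names are assumptions, not the original test classes.
public class CasVariants {
    // Stand-ins for the "fixed object" values used by the tests.
    static final Object FIXED_A = new Object();
    static final Object FIXED_B = new Object();

    // Test one: the expected value never matches, so the CAS always fails.
    static boolean failingCas() {
        AtomicReference<Object> ref = new AtomicReference<>(FIXED_A);
        return ref.compareAndSet(FIXED_B, null);
    }

    // Test two: fixed object -> null, then null -> fixed object.
    static boolean toNullAndBack() {
        AtomicReference<Object> ref = new AtomicReference<>(FIXED_A);
        return ref.compareAndSet(FIXED_A, null)
            && ref.compareAndSet(null, FIXED_A);
    }

    // Test three: fixed object -> another fixed object, then back.
    static boolean toFixedAndBack() {
        AtomicReference<Object> ref = new AtomicReference<>(FIXED_A);
        return ref.compareAndSet(FIXED_A, FIXED_B)
            && ref.compareAndSet(FIXED_B, FIXED_A);
    }

    // Test four: fixed object -> newly allocated object, then back.
    static boolean toFreshAndBack() {
        AtomicReference<Object> ref = new AtomicReference<>(FIXED_A);
        Object fresh = new Object();
        return ref.compareAndSet(FIXED_A, fresh)
            && ref.compareAndSet(fresh, FIXED_A);
    }

    public static void main(String[] args) {
        System.out.println("fail:  " + failingCas());
        System.out.println("null:  " + toNullAndBack());
        System.out.println("fixed: " + toFixedAndBack());
        System.out.println("fresh: " + toFreshAndBack());
    }
}
```

Each method returns the outcome of its CAS operations, so the first reports a failed CAS and the other three report two successful ones. The variants differ in what gets stored (null, an existing object, or a new allocation), which presumably is what makes them exercise the GC-dependent code paths differently.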

