RFR: 8151502: aarch64: optimize pd_disjoint_words and pd_conjoint_words

9 Mar 2016

Hi,

Please review the following webrev:

http://cr.openjdk.java.net/~enevill/8151502/webrev/

This optimizes Copy::pd_disjoint_words and Copy::pd_conjoint_words using inline assembler.
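For readers unfamiliar with the two routines, the difference between them is the overlap contract: the disjoint variant may assume the source and destination ranges do not overlap, while the conjoint variant must copy correctly even when they do. The following is only a portable C++ sketch of that contract, not the inline assembler in the webrev. The names `Word`, `sketch_disjoint_words` and `sketch_conjoint_words` are hypothetical, and the (from, to, count) argument order is assumed to mirror the HotSpot routines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a heap word; this is not HotSpot's HeapWord type.
using Word = std::uintptr_t;

// Disjoint copy: the caller guarantees the ranges do not overlap,
// so a plain ascending loop is always safe.
inline void sketch_disjoint_words(const Word* from, Word* to, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        to[i] = from[i];
    }
}

// Conjoint copy: the ranges may overlap, so choose the direction that
// never overwrites a source word before it has been read.
inline void sketch_conjoint_words(const Word* from, Word* to, std::size_t count) {
    const std::uintptr_t f = reinterpret_cast<std::uintptr_t>(from);
    const std::uintptr_t t = reinterpret_cast<std::uintptr_t>(to);
    // Unsigned subtraction wraps when the destination is below the source,
    // so this single test is true for "to below from" and for
    // "to at or past the end of the source range".
    if (t - f >= count * sizeof(Word)) {
        for (std::size_t i = 0; i < count; ++i) {
            to[i] = from[i];
        }
    } else {
        // The destination starts inside the source range: copy from the top down.
        for (std::size_t i = count; i-- > 0; ) {
            to[i] = from[i];
        }
    }
}
```

For example, copying four words from `a` to `a + 2` within the same six-word array takes the descending path, whereas copying from `a + 2` to `a` takes the ascending one.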

These routines are heavily used in GC and the aim is to improve the overall performance of GC.

Tested in JMH using the following GCStress program.

http://cr.openjdk.java.net/~enevill/8151502/JMHSample_97_GCStress.java

JMH jar file: http://cr.openjdk.java.net/~enevill/8151502/benchmarks.jar

These are the results I get:

Original:

/home/ed/images/jdk9-orig/bin/java -jar target/benchmarks.jar -i 5 -wi 5 -f 5

Result "gcstress":
  24636979.087 ±(99.9%) 267838.773 us/op [Average]
  (min, avg, max) = (24102797.710, 24636979.087, 25372022.370), stdev = 357557.099
  CI (99.9%): [24369140.314, 24904817.860] (assumes normal distribution)

# Run complete. Total time: 00:20:55

Benchmark                       Mode  Cnt         Score        Error  Units
JMHSample_97_GCStress.gcstress  avgt   25  24636979.087 ± 267838.773  us/op

---------------------------------------------------------------------------

Optimized:

/home/ed/images/jdk9-test/bin/java -jar target/benchmarks.jar -i 5 -wi 5 -f 5

Result "gcstress":
  20164420.762 ±(99.9%) 280305.425 us/op [Average]
  (min, avg, max) = (19738992.960, 20164420.762, 21137460.090), stdev = 374199.723
  CI (99.9%): [19884115.337, 20444726.188] (assumes normal distribution)

# Run complete. Total time: 00:17:06

Benchmark                       Mode  Cnt         Score        Error  Units
JMHSample_97_GCStress.gcstress  avgt   25  20164420.762 ± 280305.425  us/op

This shows an approximately 22% performance improvement on this benchmark.
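The percentage can be read in two ways from the average times above, so here is a small sketch of the arithmetic. The helper names are hypothetical and the inputs are the two reported us/op averages. Measured as a speedup (old time divided by new time) the figure comes out at about 22.2%, and measured as a reduction in time per operation it comes out at about 18.2%.

```cpp
#include <cassert>
#include <cmath>

// Speedup in percent: how much more work per unit time the new build does.
inline double speedup_percent(double before_us, double after_us) {
    return (before_us / after_us - 1.0) * 100.0;
}

// Reduction in percent: how much less time one operation takes.
inline double time_reduction_percent(double before_us, double after_us) {
    return (1.0 - after_us / before_us) * 100.0;
}
```

Calling `speedup_percent(24636979.087, 20164420.762)` gives roughly 22.2, and `time_reduction_percent` with the same arguments gives roughly 18.2.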

I have also included a small bug fix to Array copy when using -XX:+UseSIMDForMemoryOps. I had fixed this previously, but somehow it fell out.

All the best,
Ed

Edward Nevill
