performance degradation in Array::newInstance on -XX:TieredStopAtLevel=1

Сергей Цыпанов sergei.tsypanov at yandex.ru
Wed Jan 2 21:56:46 UTC 2019


Hello,

The -XX:TieredStopAtLevel=1 flag is often used in applications (e.g. Spring Boot based ones) to reduce start-up time.

With this flag I've spotted a huge performance degradation of Array::newInstance compared to a plain array constructor call.

I've used this benchmark:

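import java.lang.reflect.Array;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)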
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class ArrayInstantiationBenchmark {

  @Param({"10", "100", "1000"})
  private int length;

  @Benchmark
  public Object newInstance() {
    return Array.newInstance(Object.class, length);
  }

  @Benchmark
  public Object constructor() {
    return new Object[length];
  }

}

On C2 (JDK 11) both methods perform the same:

Benchmark                                (length)  Mode  Cnt    Score    Error  Units
ArrayInstantiationBenchmark.constructor        10  avgt   50   11,557 ±  0,316  ns/op
ArrayInstantiationBenchmark.constructor       100  avgt   50   86,944 ±  4,945  ns/op
ArrayInstantiationBenchmark.constructor      1000  avgt   50  520,722 ± 28,068  ns/op

ArrayInstantiationBenchmark.newInstance        10  avgt   50   11,899 ±  0,569  ns/op
ArrayInstantiationBenchmark.newInstance       100  avgt   50   86,805 ±  5,103  ns/op
ArrayInstantiationBenchmark.newInstance      1000  avgt   50  488,647 ± 20,829  ns/op

On C1, however, there's a huge difference (approximately 8 times!) for length = 10:

Benchmark                                (length)  Mode  Cnt    Score    Error  Units
ArrayInstantiationBenchmark.constructor        10  avgt   50   11,183 ±  0,168  ns/op
ArrayInstantiationBenchmark.constructor       100  avgt   50   92,215 ±  4,425  ns/op
ArrayInstantiationBenchmark.constructor      1000  avgt   50  838,303 ± 33,161  ns/op

ArrayInstantiationBenchmark.newInstance        10  avgt   50   86,696 ±  1,297  ns/op
ArrayInstantiationBenchmark.newInstance       100  avgt   50  106,751 ±  2,796  ns/op
ArrayInstantiationBenchmark.newInstance      1000  avgt   50  840,582 ± 24,745  ns/op

Note that for length = {100, 1000} the performance of the two methods is almost the same.

I suppose it's a bug somewhere in the VM, because both methods just allocate memory and do zeroing elimination, so there shouldn't be such a huge difference between them.
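
As a sanity check of the expectation above, here is a minimal sketch showing that both calls produce the same kind of object: an Object[] of the requested length filled with nulls. The class name ArrayEquivalenceCheck and its two helper methods are made up for illustration and are not part of the benchmark. It only checks functional equivalence and measures nothing.

```java
import java.lang.reflect.Array;

// Hypothetical helper class, not part of the benchmark above.
public class ArrayEquivalenceCheck {

  // Allocates via reflection, as the newInstance() benchmark method does.
  public static Object viaReflection(int length) {
    return Array.newInstance(Object.class, length);
  }

  // Allocates via the plain array constructor, as the constructor() benchmark method does.
  public static Object viaConstructor(int length) {
    return new Object[length];
  }

  public static void main(String[] args) {
    for (int length : new int[] {10, 100, 1000}) {
      Object a = viaReflection(length);
      Object b = viaConstructor(length);
      // Same runtime class (Object[]) and same length in both cases.
      boolean same = a.getClass() == b.getClass()
          && Array.getLength(a) == Array.getLength(b);
      System.out.println(length + ": same class and length = " + same);
    }
  }
}
```

Since the results are indistinguishable to the caller, any cost difference would have to come from how the VM compiles the two call paths.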

Sergey Tsypanov




More information about the core-libs-dev mailing list