Bimodal compilation

Remi Forax forax at univ-mlv.fr
Wed Apr 5 20:56:04 UTC 2023


Hi all,
for Devoxx France, José Paumard and I are giving a talk about Valhalla and Amber with several benchmarks mixing the two.

One problem we have is that the way pattern matching is compiled by javac, and later compiled by the JIT, leads to bimodal performance.
Depending on the day (more precisely, depending on how the JIT threads are scheduled), either the method containing the switch is compiled as a whole (good day), or the method handle corresponding to the invokedynamic is compiled first on its own; then, when the whole method is compiled, the assembly code for that method handle is considered too big and is therefore not inlined (bad day).

The example uses arrays of non-null instances of value classes with a default instance (the kind of value classes that are flattened in memory) and a sealed interface. If the method is fully inlined (good day), performance is very good; if not, performance is terrible (bad day): about 38 us/op versus about 2750 us/op in the runs below.

A cascade of instanceof, while slightly slower, does not exhibit that issue.
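
For illustration, a cascade of instanceof over the same hierarchy could look like the sketch below. It is a reconstruction under assumptions, not the code used in the benchmark: the records are plain identity records (the @ZeroDefault and @Value annotations and RT.newNonNullArray from the prototype are left out so that it compiles on a stock JDK 17 or later), and the class name InstanceofCascade and the sample data in main are hypothetical.

```java
public class InstanceofCascade {
  // Plain records standing in for the value records of the original example (assumption).
  public record Population(int amount) {
    public static Population zero() {
      return new Population(0);
    }
    public Population add(Population other) {
      return new Population(this.amount + other.amount);
    }
  }
  public sealed interface Populated permits City, Department, Region {}
  public record City(String name, Population population) implements Populated {}
  public record Department(String name, City[] cities) implements Populated {}
  public record Region(String name, Department[] departments) implements Populated {}

  // Same logic as the switch version, written as a cascade of instanceof tests.
  // Plain type tests like these need no invokedynamic call site.
  public static Population sumPopulationOf(Populated populated) {
    if (populated instanceof City city) {
      return city.population();
    }
    if (populated instanceof Department department) {
      var sum = Population.zero();
      for (var city : department.cities()) {
        sum = sum.add(sumPopulationOf(city));
      }
      return sum;
    }
    if (populated instanceof Region region) {
      var sum = Population.zero();
      for (var department : region.departments()) {
        sum = sum.add(sumPopulationOf(department));
      }
      return sum;
    }
    // The interface is sealed, so this is only reachable for null.
    throw new AssertionError("unexpected value: " + populated);
  }

  public static void main(String[] args) {
    // Hypothetical sample data: two departments, three cities.
    var d1 = new Department("D1", new City[] {
        new City("A", new Population(10)),
        new City("B", new Population(20)) });
    var d2 = new Department("D2", new City[] {
        new City("C", new Population(30)) });
    var region = new Region("R", new Department[] { d1, d2 });
    System.out.println(sumPopulationOf(region)); // prints Population[amount=60]
  }
}
```

Unlike the switch, the cascade has no exhaustiveness check from the compiler, hence the explicit throw at the end.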

In our example, this bimodal performance issue is hidden when using identity classes, because cache misses become the bottleneck.

I'm a little worried here because I do not see how to fix this bug without changing the way switches on types are compiled by javac, so this problem has to be tackled before JDK 21 is released; otherwise, people will have to recompile their applications to fix it.

regards,
Rémi

---

  public @ZeroDefault @Value record Population(int amount) {
    public static Population zero() {
      return new Population(0);
    }
    public Population add(Population other) {
      return new Population(this.amount + other.amount);
    }
  }
  public sealed interface Populated permits City, Department, Region {}
  public @ZeroDefault @Value record City(String name, @NonNull Population population) implements Populated {}
  public @ZeroDefault @Value record Department(String name, City[] cities) implements Populated {
    public Department {
      cities = Arrays.stream(cities).toArray(size -> RT.newNonNullArray(City.class, size));
    }
  }
  public @ZeroDefault @Value record Region(String name, Department[] departments) implements Populated {
    public Region {
      departments = Arrays.stream(departments).toArray(size -> RT.newNonNullArray(Department.class, size));
    }
  }

  public static Population sumPopulationOf(Populated populated) {
    return switch (populated) {
      case City(var name, var population) -> population;
      case Department(var name, var cities) -> {
        var sum = Population.zero();
        for(var city: cities) {
          sum = sum.add(sumPopulationOf(city));
        }
        yield sum;
      }
      case Region(var name, var departments) -> {
        var sum = Population.zero();
        for(var department: departments) {
          sum = sum.add(sumPopulationOf(department));
        }
        yield sum;
      }
    };
  }

  public static Population sumPopulationOf(Populated[] populateds) {
    var sum = Population.zero();
    for(var populated: populateds) {
      sum = sum.add(sumPopulationOf(populated));
    }
    return sum;
  }

---

public class BenchDOP {
    private Region[] regions;

    @Setup
    public void init() {
        var data = Data.readCities();
        regions = data.regions().toArray(size -> RT.newNonNullArray(Region.class, size));
    }

    @Benchmark
    public Population sumPopulations() {
        return Data.sumPopulationOf(regions);
    }
}


---



# Benchmark: org.paumard.amber.model.cityvaluenonnullablearraydrecords.BenchDOP.sumPopulations

# Run progress: 0.00% complete, ETA 00:02:06
# Fork: 1 of 3
# Warmup Iteration   1: 47.845 us/op
# Warmup Iteration   2: 38.199 us/op
# Warmup Iteration   3: 38.130 us/op
# Warmup Iteration   4: 37.909 us/op
# Warmup Iteration   5: 38.345 us/op
Iteration   1: 38.581 us/op
Iteration   2: 37.946 us/op
Iteration   3: 37.837 us/op
Iteration   4: 38.013 us/op
Iteration   5: 37.885 us/op
Iteration   6: 37.853 us/op
Iteration   7: 37.931 us/op
Iteration   8: 37.874 us/op
Iteration   9: 37.828 us/op
Iteration  10: 37.925 us/op   <--- good day

# Run progress: 8.33% complete, ETA 00:02:00
# Fork: 2 of 3
# Warmup Iteration   1: 2871.011 us/op
# Warmup Iteration   2: 2761.856 us/op
# Warmup Iteration   3: 2759.977 us/op
# Warmup Iteration   4: 2761.045 us/op
# Warmup Iteration   5: 2756.167 us/op
Iteration   1: 2755.180 us/op
Iteration   2: 2781.178 us/op
Iteration   3: 2759.068 us/op
Iteration   4: 2755.737 us/op
Iteration   5: 2755.112 us/op
Iteration   6: 2754.553 us/op
Iteration   7: 2761.759 us/op
Iteration   8: 2750.829 us/op
Iteration   9: 2751.265 us/op
Iteration  10: 2749.668 us/op   <--- bad day

# Run progress: 16.67% complete, ETA 00:01:48
# Fork: 3 of 3
# Warmup Iteration   1: 42.359 us/op
# Warmup Iteration   2: 38.322 us/op
# Warmup Iteration   3: 38.311 us/op
# Warmup Iteration   4: 37.990 us/op
# Warmup Iteration   5: 37.988 us/op
Iteration   1: 38.139 us/op
Iteration   2: 38.052 us/op
Iteration   3: 37.959 us/op
Iteration   4: 38.037 us/op
Iteration   5: 37.997 us/op
Iteration   6: 37.957 us/op
Iteration   7: 37.977 us/op
Iteration   8: 37.905 us/op
Iteration   9: 37.913 us/op
Iteration  10: 37.976 us/op   <--- good day


More information about the amber-dev mailing list