Bimodal compilation
Remi Forax
forax at univ-mlv.fr
Wed Apr 5 21:05:54 UTC 2023
Oops, please disregard this message!
I did not read the LogCompilation file correctly: there is a bimodal issue, but it is due to "sumPopulationOf(Populated[] populateds)" being inlined or not, not to how the switch is compiled.
regards,
Rémi
----- Original Message -----
> From: "Remi Forax" <forax at univ-mlv.fr>
> To: "amber-dev" <amber-dev at openjdk.org>
> Cc: "jan lahoda" <jan.lahoda at oracle.com>
> Sent: Wednesday, April 5, 2023 10:56:04 PM
> Subject: Bimodal compilation
> Hi all,
> for Devoxx France, José Paumard and I are giving a talk about Valhalla and
> Amber with several benchmarks mixing the two.
>
> One problem we have is that the way pattern matching is compiled by javac and
> later JIT compiled leads to bimodal performance.
> Depending on the day (more precisely, depending on the JIT thread scheduling),
> either the method containing the switch is compiled as a whole method (good
> day), or only the method handle (corresponding to the invokedynamic) is
> compiled first and, when the whole method is compiled later, the assembly code
> corresponding to the method handle is considered too big and thus not inlined
> (bad day).
>
> The example uses arrays of non-null instances of value classes with a default
> instance (the kind of value classes that are flattened in memory) and a sealed
> interface. If the method is fully inlined (good day), performance is very
> good; if not, performance is terrible (bad day).
>
> A cascade of instanceof, while slightly slower, does not exhibit that issue.
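
For readers who want to see what such a cascade looks like, here is a minimal sketch of the same traversal written with instanceof instead of a pattern switch. It is an illustration only, under stated assumptions: the class name InstanceofCascade is hypothetical, and plain identity records replace the value classes of the original example (the @ZeroDefault/@Value annotations and RT.newNonNullArray belong to the prototype used in the benchmark and are not reproduced here). It therefore shows the shape of the code, not the performance behavior discussed in this thread.

```java
public class InstanceofCascade {
    // Plain identity records stand in for the value classes of the original example.
    public record Population(int amount) {
        public static Population zero() {
            return new Population(0);
        }
        public Population add(Population other) {
            return new Population(this.amount + other.amount);
        }
    }

    public sealed interface Populated permits City, Department, Region {}
    public record City(String name, Population population) implements Populated {}
    public record Department(String name, City[] cities) implements Populated {}
    public record Region(String name, Department[] departments) implements Populated {}

    // Same logic as the switch version, expressed as a cascade of instanceof tests.
    public static Population sumPopulationOf(Populated populated) {
        if (populated instanceof City city) {
            return city.population();
        }
        if (populated instanceof Department department) {
            var sum = Population.zero();
            for (var city : department.cities()) {
                sum = sum.add(sumPopulationOf(city));
            }
            return sum;
        }
        if (populated instanceof Region region) {
            var sum = Population.zero();
            for (var department : region.departments()) {
                sum = sum.add(sumPopulationOf(department));
            }
            return sum;
        }
        // Reached only for null: the compiler does not check an if-cascade for
        // exhaustiveness, so the fall-through case has to be handled by hand.
        throw new AssertionError("unexpected value: " + populated);
    }

    public static void main(String[] args) {
        var d1 = new Department("d1", new City[] {
            new City("a", new Population(10)),
            new City("b", new Population(20))
        });
        var d2 = new Department("d2", new City[] {
            new City("c", new Population(12))
        });
        var region = new Region("r", new Department[] { d1, d2 });
        System.out.println(sumPopulationOf(region).amount()); // prints 42
    }
}
```

Unlike the switch, the cascade needs an explicit fall-through branch, and a null argument ends up there instead of raising a NullPointerException.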
>
> In our example, this bimodal performance issue is hidden when using identity
> classes, because the cache misses become the bottleneck.
>
> I'm a little worried here because I do not see how to fix this bug without
> changing the way switches on types are compiled by javac, so this problem has
> to be tackled before JDK 21 is released; otherwise, people will have to
> recompile their applications to fix that bug.
>
> regards,
> Rémi
>
> ---
>
> public @ZeroDefault @Value record Population(int amount) {
>   public static Population zero() {
>     return new Population(0);
>   }
>   public Population add(Population other) {
>     return new Population(this.amount + other.amount);
>   }
> }
> public sealed interface Populated permits City, Department, Region {}
> public @ZeroDefault @Value record City(String name, @NonNull Population population) implements Populated {}
> public @ZeroDefault @Value record Department(String name, City[] cities) implements Populated {
>   public Department {
>     cities = Arrays.stream(cities).toArray(size -> RT.newNonNullArray(City.class, size));
>   }
> }
> public @ZeroDefault @Value record Region(String name, Department[] departments) implements Populated {
>   public Region {
>     departments = Arrays.stream(departments).toArray(size -> RT.newNonNullArray(Department.class, size));
>   }
> }
>
> public static Population sumPopulationOf(Populated populated) {
>   return switch (populated) {
>     case City(var name, var population) -> population;
>     case Department(var name, var cities) -> {
>       var sum = Population.zero();
>       for(var city: cities) {
>         sum = sum.add(sumPopulationOf(city));
>       }
>       yield sum;
>     }
>     case Region(var name, var departments) -> {
>       var sum = Population.zero();
>       for(var department: departments) {
>         sum = sum.add(sumPopulationOf(department));
>       }
>       yield sum;
>     }
>   };
> }
>
> public static Population sumPopulationOf(Populated[] populateds) {
>   var sum = Population.zero();
>   for(var populated: populateds) {
>     sum = sum.add(sumPopulationOf(populated));
>   }
>   return sum;
> }
>
> ---
>
> public class BenchDOP {
>   private Region[] regions;
>
>   @Setup
>   public void init() {
>     var data = Data.readCities();
>     regions = data.regions().toArray(size -> RT.newNonNullArray(Region.class, size));
>   }
>
>   @Benchmark
>   public Population sumPopulations() {
>     return Data.sumPopulationOf(regions);
>   }
> }
>
>
> ---
>
>
>
> # Benchmark: org.paumard.amber.model.cityvaluenonnullablearraydrecords.BenchDOP.sumPopulations
>
> # Run progress: 0.00% complete, ETA 00:02:06
> # Fork: 1 of 3
> # Warmup Iteration 1: 47.845 us/op
> # Warmup Iteration 2: 38.199 us/op
> # Warmup Iteration 3: 38.130 us/op
> # Warmup Iteration 4: 37.909 us/op
> # Warmup Iteration 5: 38.345 us/op
> Iteration 1: 38.581 us/op
> Iteration 2: 37.946 us/op
> Iteration 3: 37.837 us/op
> Iteration 4: 38.013 us/op
> Iteration 5: 37.885 us/op
> Iteration 6: 37.853 us/op
> Iteration 7: 37.931 us/op
> Iteration 8: 37.874 us/op
> Iteration 9: 37.828 us/op
> Iteration 10: 37.925 us/op <--- good day
>
> # Run progress: 8.33% complete, ETA 00:02:00
> # Fork: 2 of 3
> # Warmup Iteration 1: 2871.011 us/op
> # Warmup Iteration 2: 2761.856 us/op
> # Warmup Iteration 3: 2759.977 us/op
> # Warmup Iteration 4: 2761.045 us/op
> # Warmup Iteration 5: 2756.167 us/op
> Iteration 1: 2755.180 us/op
> Iteration 2: 2781.178 us/op
> Iteration 3: 2759.068 us/op
> Iteration 4: 2755.737 us/op
> Iteration 5: 2755.112 us/op
> Iteration 6: 2754.553 us/op
> Iteration 7: 2761.759 us/op
> Iteration 8: 2750.829 us/op
> Iteration 9: 2751.265 us/op
> Iteration 10: 2749.668 us/op <--- bad day
>
> # Run progress: 16.67% complete, ETA 00:01:48
> # Fork: 3 of 3
> # Warmup Iteration 1: 42.359 us/op
> # Warmup Iteration 2: 38.322 us/op
> # Warmup Iteration 3: 38.311 us/op
> # Warmup Iteration 4: 37.990 us/op
> # Warmup Iteration 5: 37.988 us/op
> Iteration 1: 38.139 us/op
> Iteration 2: 38.052 us/op
> Iteration 3: 37.959 us/op
> Iteration 4: 38.037 us/op
> Iteration 5: 37.997 us/op
> Iteration 6: 37.957 us/op
> Iteration 7: 37.977 us/op
> Iteration 8: 37.905 us/op
> Iteration 9: 37.913 us/op
> Iteration 10: 37.976 us/op <--- good day