Optimization (?) of HashSet(Collection)
Сергей Цыпанов
sergei.tsypanov at yandex.ru
Mon Oct 4 11:57:17 UTC 2021
Hello,
in the code of HashSet(Collection) we have an optimization opportunity:
public HashSet(Collection<? extends E> c) {
map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
addAll(c);
}
instead of using addAll() inherited from j.u.Collection we can use c.forEach(this::add):
public HashSet(Collection<? extends E> c) {
map = new HashMap<>(Math.max((int) (c.size()/.75f) + 1, 16));
c.forEach(this::add);
}
This allows to rid Iterator and use counting loops for e.g. ArrayList instead of hasNext()/next().
To me the most common use cases of HashSet(Collection) are those:
- ArrayList/Arrays.asList() is passed to convert List->Set or/and excluded duplicates
- HashSet is passed to have a copy of original Set
We are also likely to benefit from this change in case the argument collection is synchronized as it's going to be locked only once.
I've used the benchmark [1] to analyze the impact and it demonstrates measurable improvement [2].
What puzzles we a bit is the result for j.u.ArrayList. At warm-up time it demonstrates better results than at measurement time:
length = 10
# Run progress: 4,44% complete, ETA 00:21:41
# Fork: 3 of 5
# Warmup Iteration 1: 134,699 ns/op
# Warmup Iteration 2: 115,391 ns/op
# Warmup Iteration 3: 130,110 ns/op
# Warmup Iteration 4: 114,465 ns/op <----
# Warmup Iteration 5: 114,849 ns/op
# Warmup Iteration 6: 115,073 ns/op
# Warmup Iteration 7: 113,988 ns/op
# Warmup Iteration 8: 115,301 ns/op
# Warmup Iteration 9: 115,452 ns/op
# Warmup Iteration 10: 124,493 ns/op <----
Iteration 1: 123,776 ns/op
Iteration 2: 123,719 ns/op
Iteration 3: 123,725 ns/op
Iteration 4: 126,892 ns/op
Iteration 5: 125,493 ns/op
Iteration 6: 124,661 ns/op
Iteration 7: 123,783 ns/op
Iteration 8: 123,975 ns/op
Iteration 9: 124,047 ns/op
Iteration 10: 123,899 ns/op
length = 100
# Run progress: 11,11% complete, ETA 00:20:10
# Fork: 1 of 5
# Warmup Iteration 1: 1236,695 ns/op
# Warmup Iteration 2: 1069,909 ns/op
# Warmup Iteration 3: 1243,852 ns/op
# Warmup Iteration 4: 1059,038 ns/op <----
# Warmup Iteration 5: 1057,670 ns/op
# Warmup Iteration 6: 1057,117 ns/op
# Warmup Iteration 7: 1056,932 ns/op
# Warmup Iteration 8: 1054,909 ns/op
# Warmup Iteration 9: 1058,106 ns/op
# Warmup Iteration 10: 1145,699 ns/op <----
Iteration 1: 1139,155 ns/op
Iteration 2: 1141,662 ns/op
Iteration 3: 1139,061 ns/op
Iteration 4: 1139,306 ns/op
Iteration 5: 1138,475 ns/op
Iteration 6: 1140,491 ns/op
Iteration 7: 1144,017 ns/op
Iteration 8: 1139,411 ns/op
Iteration 9: 1142,729 ns/op
Iteration 10: 1144,042 ns/op
Here I use 1 s warm-up iteration on 2s measurement iteration. It looks like at iteration 4 the code is top-optimized,
and later at the end of warm-up a kind of deoptimization happens. There's still visible improvement however.
The benchmark is run with 5 forks, and this deoptimization is visible for length = {10, 100}.
So my two question are:
- What is the reason of the deoptimization and can we do something about it?
- Whether suggested changes is worth implementations?
Regards,
Sergey Tsypanov
1.
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(jvmArgsAppend = {"-Xms2g", "-Xmx2g"})
public class HashSetBenchmark {
@Benchmark
public HashSet<Integer> fromArrayList(Data data) {
return new HashSet<>(data.arrayList);
}
@Benchmark
public HashSet<Integer> fromHashSet(Data data) {
return new HashSet<>(data.hashSet);
}
@Benchmark
public HashSet<Integer> fromArraysAsList(Data data) {
return new HashSet<>(data.arraysAsList);
}
@State(Scope.Thread)
public static class Data {
private ArrayList<Integer> arrayList;
private HashSet<Integer> hashSet;
private List<Integer> arraysAsList;
@Param({"10", "100", "1000"})
private int length;
@Setup
public void setup() throws IOException {
var array = new Integer[length];
for (int i = 0; i < length; i++) {
array[i] = i;
}
arraysAsList = Arrays.asList(array);
arrayList = new ArrayList<>(arraysAsList);
hashSet = new HashSet<>(arraysAsList);
}
}
}
2.
baseline
Benchmark (length) Mode Cnt Score Error Units
HashSetBenchmark.fromArrayList 10 avgt 50 127,838 ± 1,822 ns/op
HashSetBenchmark.fromArrayList 100 avgt 50 1209,543 ± 8,435 ns/op
HashSetBenchmark.fromArrayList 1000 avgt 50 11915,334 ± 155,908 ns/op
HashSetBenchmark.fromArraysAsList 10 avgt 50 130,049 ± 1,661 ns/op
HashSetBenchmark.fromArraysAsList 100 avgt 50 1138,407 ± 19,998 ns/op
HashSetBenchmark.fromArraysAsList 1000 avgt 50 11345,099 ± 83,086 ns/op
HashSetBenchmark.fromHashSet 10 avgt 50 184,559 ± 1,387 ns/op
HashSetBenchmark.fromHashSet 100 avgt 50 2001,577 ± 42,321 ns/op
HashSetBenchmark.fromHashSet 1000 avgt 50 17535,017 ± 199,892 ns/op
patched
Benchmark (length) Mode Cnt Score Error Units
HashSetBenchmark.fromArrayList 10 avgt 50 125,075 ± 0,907 ns/op
HashSetBenchmark.fromArrayList 100 avgt 50 1127,378 ± 19,368 ns/op
HashSetBenchmark.fromArrayList 1000 avgt 50 11455,172 ± 61,543 ns/op
HashSetBenchmark.fromArraysAsList 10 avgt 50 117,634 ± 3,641 ns/op
HashSetBenchmark.fromArraysAsList 100 avgt 50 1034,538 ± 11,958 ns/op
HashSetBenchmark.fromArraysAsList 1000 avgt 50 10786,914 ± 137,330 ns/op
HashSetBenchmark.fromHashSet 10 avgt 50 173,727 ± 2,177 ns/op
HashSetBenchmark.fromHashSet 100 avgt 50 1748,858 ± 12,535 ns/op
HashSetBenchmark.fromHashSet 1000 avgt 50 16474,208 ± 97,578 ns/op
More information about the core-libs-dev
mailing list