<i18n dev> RFR: 8263561: Re-examine uses of LinkedList [v2]
Сергей Цыпанов
github.com+10835776+stsypanov at openjdk.java.net
Tue May 25 11:42:41 UTC 2021
On Fri, 21 May 2021 23:16:57 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
>> I don't remember all the comments you have received in this thread but have you verified that arbitrarily changing `LinkedList` into `ArrayList` in these classes is not going to significantly increase the footprint in the case where lists are empty or contain only one or two elements?
>>
>> I am not convinced that a global replacement of `LinkedList` by `ArrayList` is necessarily good - even though I agree that `ArrayList` is generally more efficient.
>
> I second the footprint concerns from @dfuch. I've seen with first hand cases where widespread uses of array lists for holding 1-2-3 elements (e.g. think of a graph where each node might only have a very small number of outgoing edges) lead to massive memory over-utilization. If the average size is known, at the very least I'd argue to replace with an array list which is sized correctly.
>
> And, all this said, the general assumption implied in this PR which linked lists are just to be avoided doesn't match my experience. If you want a "pay only as much memory as you use" data structure, you don't care about random access (e.g. all accesses are linear walks), a linked list is a perfectly fine choice. If there are use cases in the JDK where a LinkedList is used in places where it shouldn't be, then obviously _those cases_ should be replaced.
Hi @mcimadamore, @dfuch after your review comments I've decided to do a deeper investigation for this. First, I've decided to check whether empty `LinkedList` is going to have smaller footprint (this could be fruitful for cases when the list is likely to remain empty in vast majority of cases, see e.g. `URLClassPath.closeLoaders()`), and apparently it isn't:
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class EmptyListBenchmark {
@Benchmark
public Object emptyArrayList() { return new ArrayList<>(); }
@Benchmark
public Object emptyLinkedList() { return new LinkedList<>(); }
}
This benchmark with my ad-hoc JDK build yields the following results:
Benchmark Mode Cnt Score Error Units
EmptyListBenchmark.emptyArrayList avgt 20 4.605 ± 0.463 ns/op
EmptyListBenchmark.emptyArrayList:·gc.alloc.rate.norm avgt 20 24.002 ± 0.001 B/op
EmptyListBenchmark.emptyLinkedList avgt 20 4.324 ± 0.081 ns/op
EmptyListBenchmark.emptyLinkedList:·gc.alloc.rate.norm avgt 20 32.002 ± 0.001 B/op
After JDK-8011200 `ArrayList` instantiated with default constructor doesn't allocate underlying array any more.
However the things get more complicated when the list gets populated:
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class NonEmptyListBenchmark {
@Param({"1", "2", "3", "4", "5"})
private int size;
@Benchmark
public Object arrayList() {
var list = new ArrayList<>();
for (int i = 0; i < size; i++) {
list.add(i);
}
return list;
}
@Benchmark
public Object linkedList() {
var list = new LinkedList<Object>();
for (int i = 0; i < size; i++) {
list.add(i);
}
return list;
}
}
Here indeed `ArrayList` looses in memory comapring to `LinkedList` in single-item case:
arrayList:·gc.alloc.rate.norm 1 avgt 40 80.005 ± 0.001 B/op
arrayList:·gc.alloc.rate.norm 2 avgt 40 80.006 ± 0.001 B/op
arrayList:·gc.alloc.rate.norm 3 avgt 40 80.007 ± 0.001 B/op
arrayList:·gc.alloc.rate.norm 4 avgt 40 80.008 ± 0.001 B/op
arrayList:·gc.alloc.rate.norm 5 avgt 40 80.008 ± 0.001 B/op
linkedList:·gc.alloc.rate.norm 1 avgt 40 56.004 ± 0.001 B/op
linkedList:·gc.alloc.rate.norm 2 avgt 40 80.005 ± 0.001 B/op
linkedList:·gc.alloc.rate.norm 3 avgt 40 104.007 ± 0.001 B/op
linkedList:·gc.alloc.rate.norm 4 avgt 40 128.009 ± 0.001 B/op
linkedList:·gc.alloc.rate.norm 5 avgt 40 152.010 ± 0.001 B/op
And indeed there's at least one usecase in affected files where this is real-life scenario - `JarIndex`. Below on screenshot I run Spring Boot tests with Gradle:

However, for the same scenario one node represented with `LinkedList` can hold dozens of items:

To fix this I propose to instantiate `ArrayList` with initial size = 1:
@Benchmark
public Object arrayList_sized() {
var list = new ArrayList<>(1);
list.add(new Object());
return list;
}
This reduces the footprint of a list with 1 element down to 48 bytes:
arrayList_sized:·gc.alloc.rate.norm 1 avgt 40 48.004 ± 0.001 B/op
-------------
PR: https://git.openjdk.java.net/jdk/pull/2744
More information about the i18n-dev
mailing list