RFR: 8263561: Re-examine uses of LinkedList [v2]

Сергей Цыпанов github.com+10835776+stsypanov at openjdk.java.net
Tue May 25 11:42:41 UTC 2021


On Fri, 21 May 2021 23:16:57 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> I don't remember all the comments you have received in this thread but have you verified that arbitrarily changing `LinkedList` into `ArrayList` in these classes is not going to significantly increase the footprint in the case where lists are empty or contain only one or two elements?
>> 
>> I am not convinced that a global replacement of  `LinkedList` by `ArrayList` is necessarily good - even though I agree that `ArrayList` is generally more efficient.
>
> I second the footprint concerns from @dfuch. I've seen with first hand cases where widespread uses of array lists for holding 1-2-3 elements (e.g. think of a graph where each node might only have a very small number of outgoing edges) lead to massive memory over-utilization. If the average size is known, at the very least I'd argue to replace with an array list which is sized correctly.
> 
> And, all this said, the general assumption implied in this PR which linked lists are just to be avoided doesn't match my experience. If you want a "pay only as much memory as you use" data structure, you don't care about random access (e.g. all accesses are linear walks), a linked list is a perfectly fine choice. If there are use cases in the JDK where a LinkedList is used in places where it shouldn't be, then obviously _those cases_ should be replaced.

Hi @mcimadamore, @dfuch after your review comments I've decided to do a deeper investigation for this. First, I've decided to check whether empty `LinkedList` is going to have smaller footprint (this could be fruitful for cases when the list is likely to remain empty in vast majority of cases, see e.g. `URLClassPath.closeLoaders()`), and apparently it isn't:

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class EmptyListBenchmark {
    @Benchmark
    public Object emptyArrayList() { return new ArrayList<>(); }

    @Benchmark
    public Object emptyLinkedList() { return new LinkedList<>(); }
}

This benchmark with my ad-hoc JDK build yields the following results:

Benchmark                                                            Mode  Cnt     Score     Error   Units
EmptyListBenchmark.emptyArrayList                                    avgt   20     4.605 ±   0.463   ns/op
EmptyListBenchmark.emptyArrayList:·gc.alloc.rate.norm                avgt   20    24.002 ±   0.001    B/op
EmptyListBenchmark.emptyLinkedList                                   avgt   20     4.324 ±   0.081   ns/op
EmptyListBenchmark.emptyLinkedList:·gc.alloc.rate.norm               avgt   20    32.002 ±   0.001    B/op

After JDK-8011200 `ArrayList` instantiated with default constructor doesn't allocate underlying array any more.

However the things get more complicated when the list gets populated:

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class NonEmptyListBenchmark {
    @Param({"1", "2", "3", "4", "5"})
    private int size;

    @Benchmark
    public Object arrayList() {
        var list = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            list.add(i);
        }
        return list;
    }

    @Benchmark
    public Object linkedList() {
        var list = new LinkedList<Object>();
        for (int i = 0; i < size; i++) {
            list.add(i);
        }
        return list;
    }
}

Here indeed `ArrayList` looses in memory comapring to `LinkedList` in single-item case:

arrayList:·gc.alloc.rate.norm                       1  avgt   40    80.005 ±   0.001    B/op
arrayList:·gc.alloc.rate.norm                       2  avgt   40    80.006 ±   0.001    B/op
arrayList:·gc.alloc.rate.norm                       3  avgt   40    80.007 ±   0.001    B/op
arrayList:·gc.alloc.rate.norm                       4  avgt   40    80.008 ±   0.001    B/op
arrayList:·gc.alloc.rate.norm                       5  avgt   40    80.008 ±   0.001    B/op

linkedList:·gc.alloc.rate.norm                      1  avgt   40    56.004 ±   0.001    B/op
linkedList:·gc.alloc.rate.norm                      2  avgt   40    80.005 ±   0.001    B/op
linkedList:·gc.alloc.rate.norm                      3  avgt   40   104.007 ±   0.001    B/op
linkedList:·gc.alloc.rate.norm                      4  avgt   40   128.009 ±   0.001    B/op
linkedList:·gc.alloc.rate.norm                      5  avgt   40   152.010 ±   0.001    B/op

And indeed there's at least one usecase in affected files where this is real-life scenario - `JarIndex`. Below on screenshot I run Spring Boot tests with Gradle:

![image](https://d.radikal.ru/d26/2105/d0/a5d588f282e6.png)

However, for the same scenario one node represented with `LinkedList` can hold dozens of items:

![image](https://c.radikal.ru/c38/2105/da/80e6a78cec08.png)

To fix this I propose to instantiate `ArrayList` with initial size = 1:

@Benchmark
public Object arrayList_sized() {
  var list = new ArrayList<>(1);
  list.add(new Object());
  return list;
}

This reduces the footprint of a list with 1 element down to 48 bytes:

arrayList_sized:·gc.alloc.rate.norm                      1  avgt   40    48.004 ±   0.001    B/op

-------------

PR: https://git.openjdk.java.net/jdk/pull/2744


More information about the net-dev mailing list