Introduction Link to heading

This article is a follow-up to the Java Specialists’ Newsletter Issue 334 and to an email exchange with Heinz Kabutz. The issue is about a performance comparison between LinkedList and ArrayList and asks why ArrayList is less performant than LinkedList. The article presents a series of JMH benchmarks that progressively isolate and identify the actual cause of ArrayList’s apparent underperformance in the original demo.

The central question is “why in the benchmark of the javaspecialists issue ArrayList is not more performant than LinkedList for the add operation as expected?”


1. Setting Up the JMH Benchmark Link to heading

The benchmark targets List.add (appending to the end of the list), the same operation used in the original demo. To make JMH invocation overhead negligible, each benchmark invocation performs 10,648 add calls. The SingleShotTime mode combined with a large number of iterations (10,000) provides enough data for a statistically reliable analysis.

/******* Initial benchmark **********/

@BenchmarkMode(Mode.SingleShotTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 100, batchSize = 1)
@Measurement(iterations = 10_000, batchSize = 1)
@Fork(value = 1)
@State(Scope.Benchmark)
public class ListBenchmark {

    private static final String ARRAY_LIST = "arrayList";
    private static final String LINKED_LIST = "linkedList";
    static final int OPERATIONS_PER_INVOCATIONS = 10648;

    @Param({LINKED_LIST, ARRAY_LIST})
    private String type;

    private List<Integer> list;

    @Setup
    public void setup(){
        switch (type) {
            case ARRAY_LIST -> list = new ArrayList<>();
            case LINKED_LIST -> list = new LinkedList<>();
            default -> throw new IllegalStateException("Unexpected value: " + type);
        }
    }

    @Benchmark
    @OperationsPerInvocation(OPERATIONS_PER_INVOCATIONS)
    public int benchmarkList() {
        for(int i=0; i<OPERATIONS_PER_INVOCATIONS;i++ ){
        	list.add(42);
        }
        return list.size();
    }
}

2. Two Confounding Factors to Disentangle Link to heading

From observing the original demo, we know that the choice of GC can significantly affect the measured times. But before revisiting the GC question, there is a more fundamental issue to investigate first.

ArrayList’s backing array has a finite capacity. When that capacity is exceeded, ArrayList must allocate a new, larger array and copy all existing elements into it (see ArrayList#grow in the JDK source). In the original benchmark, the same list instance is reused across all iterations: the list grows throughout the warmup phase and keeps growing during measurement. Could these periodic array resizing operations be responsible for ArrayList’s poor showing?

To answer this question cleanly, we first eliminate GC pauses entirely by using EpsilonGC (the no-op GC) with a heap large enough to hold everything. This way, any measured performance difference between ArrayList and LinkedList is purely about their data structure operations, with no GC interference.

@Fork(value = 1, jvmArgsAppend = {"-XX:+UnlockExperimentalVMOptions", "-Xms13g", "-Xmx13g", "-XX:+UseEpsilonGC"})

3. First Results: ArrayList’s Variance is the Problem Link to heading

In this first benchmark, the same ArrayList and the same LinkedList are reused across all 10,000 measurement iterations, which forces ArrayList to repeatedly grow its backing array as elements accumulate. Here are the results:

Benchmark                        (type)  Mode    Cnt  Score   Error  Units
ListBenchmark.benchmarkList   arrayList    ss  10000  6,536 ± 6,109  ns/op
ListBenchmark.benchmarkList  linkedList    ss  10000  3,837 ± 0,097  ns/op

The most striking feature of these results is not the difference between the two implementations — it is that no conclusion can be drawn at all. ArrayList’s error margin is nearly as large as its mean score, meaning the measurements are wildly inconsistent. LinkedList, by contrast, is remarkably stable.

To understand why, we need to look beyond the mean and examine the full distribution of iteration times using the percentile output that JMH can produce:

ArrayList

  Percentiles, ns/op:
      p(0,0000) =      1,700 ns/op
     p(50,0000) =      2,172 ns/op
     p(90,0000) =      3,737 ns/op
     p(95,0000) =      3,992 ns/op
     p(99,0000) =      5,298 ns/op
     p(99,9000) =     33,640 ns/op
     p(99,9900) =  12041,024 ns/op
     p(99,9990) =  12041,297 ns/op
     p(99,9999) =  12041,297 ns/op
    p(100,0000) =  12041,297 ns/op
LinkedList

  Percentiles, ns/op:
      p(0,0000) =      2,712 ns/op
     p(50,0000) =      2,809 ns/op
     p(90,0000) =      8,771 ns/op
     p(95,0000) =      9,139 ns/op
     p(99,0000) =     11,343 ns/op
     p(99,9000) =     43,988 ns/op
     p(99,9900) =     47,424 ns/op
     p(99,9990) =     47,424 ns/op
     p(99,9999) =     47,424 ns/op
    p(100,0000) =     47,424 ns/op

ArrayList’s median (p50 = 2.17 ns/op) is actually better than LinkedList’s (p50 = 2.81 ns/op). However, beyond the 99.9th percentile, ArrayList’s times explode to over 12,000 ns/op — between 900 and 6,000 times the median. These extreme outliers are what drag the mean upward and inflate the error interval. LinkedList’s worst case, by comparison, is only about 17 times its median.

The six worst ArrayList iterations are:

Iteration 8874	12041,297
Iteration 3389	9306,72
Iteration 5583	9241,456
Iteration 1926	4142,329
Iteration 951	2670,946
Iteration 301	1824,781

If we recalculate the mean excluding these extreme values (using a 99.9% confidence interval, assuming a Gaussian distribution), ArrayList comes in at approximately 2.6 ± 0.1 ns/op — a meaningful, interpretable result that is comparable to LinkedList. This strongly suggests that the outliers belong to a different population of events.


4. Identifying the Root Cause: ArrayList.grow() Link to heading

To confirm the hypothesis, the benchmark was rerun using a patched ArrayList that logs a message to stdout each time grow() is called. The correlation between the outlier iterations and grow() calls is unambiguous:

Iteration 301: [ArrayList.grow] branch Arrays.copyOf, newCapacity=20767725
1556,732 ns/op

Iteration 951: [ArrayList.grow] branch Arrays.copyOf, newCapacity=31151587
2053,717 ns/op

Iteration 1926: [ArrayList.grow] branch Arrays.copyOf, newCapacity=46727380
3193,363 ns/op

Iteration 3389: [ArrayList.grow] branch Arrays.copyOf, newCapacity=70091070
4563,568 ns/op

Iteration 5583: [ArrayList.grow] branch Arrays.copyOf, newCapacity=105136605
6091,146 ns/op

Iteration 8874: [ArrayList.grow] branch Arrays.copyOf, newCapacity=157704907
10513,685 ns/op

(Note: the exact timings differ slightly from those above because this is a separate run using the patched implementation.)

Every outlier iteration is one in which grow() is triggered and calls Arrays.copyOf to allocate a new backing array 1.5× the size of the current one. By the time measurement begins, the list has already grown very large during warmup — so each copyOf is copying tens to hundreds of millions of elements. No wonder it takes thousands of nanoseconds.

Conclusion: in this benchmark, it is the periodic resizing of ArrayList’s backing array that causes its apparent underperformance relative to LinkedList.


5. A Fixed Benchmark: Controlling for Array Growth Link to heading

To properly compare the two implementations for List.add, we need to ensure that ArrayList never resizes during the measurement phase. The strategy is to reset the list before each iteration to a carefully chosen size: large enough that a single add call before the start of the iteration triggers one final grow() — after which the backing array is large enough to absorb all subsequent additions without further resizing during one iteration.

For LinkedList, naively rebuilding the list from scratch before each iteration would introduce its own noise (and consume an enormous amount of memory without GC). Instead, a lightweight “shallow copy” technique using VarHandle is used: a new LinkedList header is created that points to the same node chain as the pre-built reference list. This gives each iteration a fresh LinkedList view over the same underlying nodes, at minimal cost.


@BenchmarkMode(Mode.SingleShotTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 100, batchSize = 1)
@Measurement(iterations = 1000, batchSize = 1)
@Fork(value = 1, jvmArgsAppend = {"-XX:+UnlockExperimentalVMOptions", "--add-opens=java.base/java.util=ALL-UNNAMED", "--enable-preview", "-Xms5G", "-Xmx5G", "-XX:+UseEpsilonGC"})
@State(Scope.Benchmark)
public class ListBenchmark {

    private static final String ARRAY_LIST = "arrayList";
    private static final String LINKED_LIST = "linkedList";
    /*
     * Special size for pre-initial list which triggers an array grow when list.add is called
     * After the call, the length of array is 1_823_230 so `grow()` won't be called any more during benchmark
     *
     * Furthermore, the size is small enough to avoid an OOM without GC
     */
    private static final int BIG_LIST_INITIAL_SIZE = 1_215_487;

    private static final LazyConstant<ArrayList<Integer>> BIG_INITIAL_ARRAYLIST = LazyConstant.of(
            () -> generateListOfTypeAndSize(ArrayList::new, BIG_LIST_INITIAL_SIZE)
    );

    private static final LazyConstant<LinkedList<Integer>> BIG_INITIAL_LINKEDLIST = LazyConstant.of(
            () -> generateListOfTypeAndSize(LinkedList::new, BIG_LIST_INITIAL_SIZE + 1)
    );

    static final int OPERATIONS_PER_INVOCATIONS = 10648;

    @Param({LINKED_LIST, ARRAY_LIST})
    private String type;

    private List<Integer> list;
    private final VarHandle firstNodeVarHandle, lastNodeVarHandle, modCountVarHandle, sizeVarHandle, nextNodeVarHandle;

    public ListBenchmark() {
        try {
            MethodHandles.Lookup parentLookup = MethodHandles.lookup();
            MethodHandles.Lookup lookup = MethodHandles.privateLookupIn(LinkedList.class, parentLookup);
            firstNodeVarHandle = findVarHandleInLinkedList("first", lookup);
            lastNodeVarHandle = findVarHandleInLinkedList("last", lookup);
            modCountVarHandle = findVarHandle(AbstractList.class, "modCount", MethodHandles.privateLookupIn(AbstractList.class, parentLookup));
            sizeVarHandle = findVarHandleInLinkedList("size", lookup);
            Class<?> nodeClass = Class.forName("java.util.LinkedList$Node");
            nextNodeVarHandle = findVarHandle(nodeClass, "next", MethodHandles.privateLookupIn(nodeClass, parentLookup));
        } catch (NoSuchFieldException | IllegalAccessException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    @Setup
    public void setup() {
        resetList();
    }

    @TearDown(Level.Iteration)
    public void tearDownAfterEachIteration() {
        resetList();
    }

    private void resetList() {
        switch (type) {
            case ARRAY_LIST -> {
                list = (List<Integer>) BIG_INITIAL_ARRAYLIST.get().clone();
                /* Trigger a call to ArrayList#grow which is going to increase the length of the backing array to 1_823_230 (with 607743 free cells for next adds)*/
                list.add(42);
            }
            case LINKED_LIST -> list = copyWithSameNodes(BIG_INITIAL_LINKEDLIST.get());
            default -> throw new IllegalStateException("Unexpected value: " + type);
        }
        /* To prevent GC pauses during iterations, we suggest that GC reclaims space from unused objects from the previous iteration before the next iteration starts*/
        System.gc();
    }

    @Benchmark
    @OperationsPerInvocation(OPERATIONS_PER_INVOCATIONS)
    public int benchmarkList() {
        for (int i = 0; i < OPERATIONS_PER_INVOCATIONS; i++) {
            list.add(42);
        }
        return list.size();
    }

    private static <T extends List<Integer>> T generateListOfTypeAndSize(Supplier<T> listTypeSupplier, int size) {
        return IntStream.range(0, size)
                .map(_ -> 42)
                .boxed()
                .collect(Collectors.toCollection(listTypeSupplier));
    }

    private static VarHandle findVarHandle(Class<?> clazz, String fieldName, MethodHandles.Lookup lookup) throws NoSuchFieldException, IllegalAccessException {
        return lookup.unreflectVarHandle(clazz.getDeclaredField(fieldName));
    }

    private static VarHandle findVarHandleInLinkedList(String fieldName, MethodHandles.Lookup lookup) throws NoSuchFieldException, IllegalAccessException {
        return findVarHandle(LinkedList.class, fieldName, lookup);
    }

    private LinkedList<Integer> copyWithSameNodes(LinkedList<Integer> linkedList) {
        LinkedList<Integer> clone = new LinkedList<>();
        setFirstNode(clone, linkedList);
        setLastNode(clone, linkedList);
        setModCount(clone, linkedList);
        setSize(clone, linkedList.size());
        return clone;
    }

    private void setSize(LinkedList<Integer> valueReceiver, int size) {
        this.sizeVarHandle.set(valueReceiver, size);
    }

    private void setModCount(LinkedList<Integer> valueReceiver, LinkedList<Integer> valueProvider) {
        setWithGivenVarHandle(valueReceiver, valueProvider, this.modCountVarHandle);
    }

    private void setLastNode(LinkedList<Integer> valueReceiver, LinkedList<Integer> valueProvider) {
        Object lastNode = this.lastNodeVarHandle.get(valueProvider);
        this.nextNodeVarHandle.set(lastNode, null);
        this.lastNodeVarHandle.set(valueReceiver, lastNode);
    }

    private void setFirstNode(LinkedList<Integer> valueReceiver, LinkedList<Integer> valueProvider) {
        setWithGivenVarHandle(valueReceiver, valueProvider, this.firstNodeVarHandle);
    }

    private static void setWithGivenVarHandle(LinkedList<Integer> valueReceiver, LinkedList<Integer> valueProvider, VarHandle varHandle) {
        varHandle.set(valueReceiver, varHandle.get(valueProvider));
    }

}

With array growth controlled for, the results are now both stable and interpretable:

Benchmark                        (type)  Mode   Cnt  Score   Error  Units
ListBenchmark.benchmarkList  linkedList    ss  1000  5,030 ± 0,381  ns/op
ListBenchmark.benchmarkList   arrayList    ss  1000  2,009 ± 0,066  ns/op

ArrayList is now clearly and consistently faster than LinkedList for appending elements, which is the expected result for List.add at the tail. ArrayList’s contiguous memory layout which is more cache-friendly might give it a structural advantage.


6. The Effect of the Garbage Collector Link to heading

During the investigation, the fixed benchmark was also run under several different GCs to observe how each affects the results. Since the benchmark uses a real heap and real object allocations, write barriers and GC overhead can influence the measured times even when no GC pause occurs.

GCArrayList score (ns/op)LinkedList score (ns/op)JVM options
Epsilon2,009 ± 0,0665,030 ± 0,381{"-XX:+UnlockExperimentalVMOptions", “–add-opens=java.base/java.util=ALL-UNNAMED”, “–enable-preview”, “-Xms5G”, “-Xmx5G”, “-XX:+UseEpsilonGC”}
ZGC13,010 ± 0,1924,521 ± 0,053{"–add-opens=java.base/java.util=ALL-UNNAMED", “–enable-preview”, “-Xms5G”, “-Xmx5G”, “-XX:+UseZGC”}
Shenandoah2,957 ± 0,0723,691 ± 0,091{"–add-opens=java.base/java.util=ALL-UNNAMED", “–enable-preview”, “-Xms5G”, “-Xmx5G”, “-XX:+UseShenandoahGC”}
Serial2,790 ± 0,0603,763 ± 0,084{"–add-opens=java.base/java.util=ALL-UNNAMED", “–enable-preview”, “-Xms5G”, “-Xmx5G”, “-XX:+UseSerialGC”}
Parallel2,398 ± 0,0763,637 ± 0,025{"–add-opens=java.base/java.util=ALL-UNNAMED", “–enable-preview”, “-Xms5G”, “-Xmx5G”, “-XX:+UseParallelGC”}
G13,409 ± 0,0764,183 ± 0,092{"–add-opens=java.base/java.util=ALL-UNNAMED", “–enable-preview”, “-Xms5G”, “-Xmx5G”, “-XX:+UseG1GC”}

Two results stand out. First, EpsilonGC without GC overhead of any kind gives the cleanest baseline, as expected for ArrayList but not for LinkedList. Second, and more surprisingly, ZGC reverses the result: under ZGC, ArrayList (13.0 ns/op) is more than twice as slow as LinkedList (4.5 ns/op). This suggests that ZGC’s concurrent write barriers add significant overhead specifically for ArrayList’s array writes, while LinkedList’s pointer-linked structure is less affected.

I did not investigate further into the precise reasons for the differences between GCs or attempt to tune them — that alone would be another article.


7. Does the Result Hold at Larger List Sizes? Link to heading

Using ParallelGC — the best performer among the real GCs in these benchmarks — the fixed benchmark was rerun with progressively larger initial list sizes to verify that the conclusion holds at scale:

Initial sizeArrayList score (ns/op)LinkedList score (ns/op)
1,215,4872,398 ± 0,0763,637 ± 0,025
6,153,4002,286 ± 0,0573,737 ± 0,045
31,151,5872,601 ± 0,0964,164 ± 0,071
105,136,6052,422 ± 0,0494,413 ± 0,125

The conclusion is consistent across all tested sizes: as long as ArrayList does not need to perform an Arrays.copyOf, it is reliably faster than LinkedList for appending elements to the end of the list. The ratio stays roughly 2:1 in ArrayList’s favor regardless of list size.


Summary Link to heading

The original newsletter demo produced a misleading result because it benchmarked ArrayList under a condition where it was repeatedly copying arrays of tens to hundreds of millions of elements — an artifact of reusing the same growing list instance across thousands of iterations. These resize events are rare (only six in 10,000 iterations) but catastrophically expensive with arrays larger than 10_000_000, and they dominate the mean while hiding the true per-operation cost.

Once this artifact is controlled for, ArrayList is consistently faster than LinkedList for List.add, as cache-friendliness theory would predict.

The deeper lesson here is about benchmark methodology: the mean alone is not enough. A handful of extreme outliers — infrequent enough to seem negligible, but large enough to skew the average by 3× — can completely invert the apparent conclusion. Percentile analysis is essential, and understanding why those outliers exist is what turns a confusing result into a clear one.

As Heinz said: a surprising benchmark result is not necessarily a surprising truth. It might just be a benchmark doing what benchmarks do — showing us our own assumptions reflected back at us.