Discussion on: Linked Lists — BaseCS Video Series

View post

Excellent explanation as always. I use linked lists all the time since most FP lists are implemented as linked lists.

A word of caution though, you have to be careful of using linked lists in hot code. Even though they have O(n) characteristics, they can be significantly slower than arrays. Because of the way CPUs work, when it reads something from memory into CPU cache for processing, it reads surrounding data as well (cache line). Since array elements are stored next to each other, the CPU may be able to process multiple array elements with a single memory fetch. Whereas with linked lists, there is a high chance that every element will require a fetch from memory, greatly increasing processing time when iterating the list.

This is why in many languages, variable-size lists are actually backed with arrays. For instance in C#, the generic list List<T> allocates an array, and keeps track of how much free space is on the end. When you add an element and there is no free space, it allocates a new array of twice the size and copies the elements to it. The copy is expensive but, amortized over the life the object, is still much faster than using a linked list for many kinds of workflows.

I mention this because I have actually run into this issue when using an LRU cache in hot code. I tried both a linked-list implementation and an array-backed circular buffer, and there was a world of difference in the performance, as in the linked list became the bottleneck instead of the disk IO.