Several articles classify data structures into linear data structures and non-linear data structures like so:
- linear data structures: array, linked list, stack, queue
- non-linear data structures: tree, graph, hash table
That is wrong. Array is not a linear data structure.
It is not even clear how these articles define linear data structure versus non-linear data structure. I will try proposing several definitions, and show that none of them support categorizing array into linear data structure together with linked list.
Definition 1: The shape of an array in the memory looks like a line.
The obvious problem with this definition is that the shape of linked list in the memory can be really messy. If we go with this definition, linked list would become non-linear, while hash table with probing would become linear. A less obvious problem is that for stack, queue, tree, and graph, some implementations would be linear, while others would be non-linear.
Definition 2: We iterate through an array with length L like so:
0, 1, 2, 3, …, L - 1.
This definition fails to capture all the ways we iterate through an array other than in index order. Imagine we have an array and a sorted order of indexes to the array, we can iterate through the array in the sorted order of the indexes, maybe like so:
9, 2, 7, 5, 0, 1, 6, 3, 8, 4
Not to mention that iteration is only one operation, which in no way represents the full capability of array. We binary search through a sorted array like so:
L/2, L/4, L/4 + L/8, …
We traverse through a tree backed by an array like so:
0, 1, 3, 8, 17, …
In contrast, we cannot use a linked list like above. We cannot even use a tree like above: notice that the binary search and the tree traversal above would translate to two different tree structures, yet an array can switch in between simply by a different access pattern. Even though usually it is meaningless to switch in such a way on the same array of data, but nothing stops you.
Definition 3: We most often use an array like a sequential list.
This definition is a cultural one instead of a technical one. But do we?
By and large we are using hash tables backed by arrays more than anything nowadays due to its convenience, especially in scripting languages. Arrays also make tensors. In neural networks we can connect tensors in various ways thanks to the power of array. The more an array is trusted with heavy lifting, the less often it is used simply as a sequential container.
Why is it so? Because array is memory, and memory is array. The power of random access of array comes straight from the memory hardware. To use array is to use memory, and there are so many more ways we can use it other than linear.
I sincerely hope we all can realize the full potential of array in aid of our projects. That is, the full potential of memory.
References:
- https://medium.com/basecs/whats-a-linked-list-anyway-part-1-d8b7e6508b9d
- https://link.medium.com/jgBT49b4s1
- https://www.careerride.com/Data-structure-linear-and-non-linear.aspx
- https://codeandwork.github.io/courses/java/linearDataStructures.html
- https://techdifferences.com/difference-between-linear-and-non-linear-data-structure.html
Top comments (4)
A linear data structure has a head and a tail and the elements in between have exactly one predecessor and ond successor. This applies to arrays, right? The quality arrays excel in is that you can access them in a non-linear way even though they are linear data structures.
If you can access the elements in random order, why do you think they have exactly one predecessor and successor? Isn't any element the predecessor and successor of any other element?
..because arrays are indexed data structures and the index says so.
We need some inevitable structure in the index to describe what we want. But as I demonstrated under "Definition 2", we are free to choose what that structure is. We've agreed on using integer exactly because it is much more powerful than linear, thus convenient for indexing random access.