DEV Community

AlexMaryW
AlexMaryW

Posted on • Edited on

Indexable Containers in Haskell

While learning Haskell, you will come across List, the most popular indexable container available in this programming language. However, as you dive deeper into Haskell, List's limitations, e.g. slow indexing or immutability, will become more and more pervasive. Although we cannot fix List's flaws, we can use alternative modules to achieve better performance and/or functionality. In this post, we will discuss List and two modules you can use instead, Data.Array and Data.Vector, aiming to collect essential knowledge on them in one place and differentiate between them.

List

List is one of the most basic data structures in every programming language and, if Haskell isn't your first programming language, you might already be familiar with it. Yet, Haskell's built-in List is rather special for three reasons:

  • Firstly, it can contain exclusively same-type elements (contrary to, e.g., lists in Python).
  • Secondly, it has performance characteristics of linked lists.
  • Lastly, as everything in Haskell, it is immutable.
> let longList = [1..50000] -- A List of 50,000 elements
> let elemAt50000 = longList !! 50000 -- Indexing into the last, 50,000th element (the operation is slow!)
> elemAt50000
50000
Enter fullscreen mode Exit fullscreen mode

Listing 1: Creating a List containing 50,000 elements and indexing into the last element

Indexing into List can be slow because the program iterates from the first element to find the desired element. Interestingly, List can generate values as you index into it. This is caused by Haskell's being lazy and not evaluating elements until you try to read the value.

> let infiniteList = [1..] -- Creating an infinite List (you can do this because Haskell is a lazy programming language)
> let thisEvaluatesMillionElements = longList !! 1000000
> thisEvaluatesMillionElements
1000000
Enter fullscreen mode Exit fullscreen mode

Listing 2: Creating an infinite List containing 50,000 elements and indexing into the last element

List is convenient to use because of its simplicity but performing certain operations on it is much more complicated than it would be for an Array or a Vector, e.g. copying. Hence, compared to the other two, List is slow and best suited for simpler purposes. You can use List if you don't need immediate access to each element, want to iterate from the first element, to avoid any additional dependencies, or to iterate on an infinite number of elements.

Array

In contrast to List, Array is a mutable data structure (except immutable IArray) that allows immediate access to its elements. Its main advantage is the amount of control it gives you over indexing. For example, you can index into Array by Int, Int32, Int64, and even Char. Furthermore, it supports lower and upper bounds, meaning you can assign any position to its first element. Hence, the first element can hold the 5th position and to access it you would have to index into 5. Another advantage Array has over List is StorableArray, a type that can be used with C functions and is available in the Array module.

> import Data.Array (Array, listArray, array, (!!))
> let arr = listArray (5, 10) "abcde" :: Array Int Char -- Creating an Array of Chars, indexed by Ints
> arr ! 6
'b'
> let charArr = array ('a', 'c') [('a', "I'm at index a"), ('b', "I'm at index b")] -- Creating an Array of Strings, indexed by Chars
> charArr ! 'b'
"I'm at index b"
Enter fullscreen mode Exit fullscreen mode

Listing 3: Creating and reading Arrays with various index ranges and types

However, Array's advantages also lead to its disadvantages. Because Array grants you control over indexing, you cannot simply create an Array, you have to define its indexing as well. It is reasonable, then, that if you're working with an already existing Array, you need to always keep in mind which indexing types it implements.

> arr ! (4 :: Int)
*** Exception: Ix{Integer}.index: Index (4) out of range ((5,10))

> charArr ! (1 :: Int)
<interactive>: error:
     Couldn't match expected type Char with actual type Int
Enter fullscreen mode Exit fullscreen mode

Listing 4: Examples of incorrect indexing into an Array

Combining mutability and O(1) indexability, Array is a powerful container and can serve complex purposes. It is also built into Haskell and recommended if you consider additional dependencies less than desirable.

If you want to learn more about Array, this tutorial on MMHaskell might be helpful.

Vector

As a data structure, Vector combines the best qualities of List and Array: a simple API and O(1) access to elements. Similarly to List, Vector is immutable (except for MVector which you can read on here) and can be indexed only by Int.

> import Data.Vector (fromList)
> let vec = fromList [1, 2, 3]
> vec
[1,2,3]
Enter fullscreen mode Exit fullscreen mode

Listing 5: Creating a Vector from a List

However, Vector has features that distinguish it from List and Array:

> import Data.Vector.Mutable (new, write, read, grow)
> mvec <- new 5 :: IO (MVector RealWorld Char) -- Creating a Vector of Chars
> write mvec 6 'x'
*** Exception: index out of bounds (6,5)
> grown <- grow mvec 30                        -- Growing the Vector by 30 elements
> Data.Vector.Mutable.read grown 6
'x'
Enter fullscreen mode Exit fullscreen mode

Listing 6: Mutating and growing a Vector

Please note that Vector is not built into Haskell. The package is available on Hackage. Just like Array, the Vector module offers a Data.Vector.Storable that is compatible with C functions.

Using Vector is recommended if you need Array's O(1) access, don't mind extra dependencies and lack of generic indexing, and want to perform actions too complex for lists or are looking for a mutable container (only when using MVector interface).

Conclusion

Familiarizing yourself with the containers available in a given programming language is a necessary aspect of the learning process. This post gathers the basic information on all three indexable Haskell containers, each with a different purpose, and possibly useful for your project. The table below summarizes the most important features of containers discussed in this post:

Feature List Array Vector
Difficulty low high medium
Built-In yes yes no
Mutable no yes
(except IArray)
no
(except MVector)
Indexable
by
Int e.g. Int, Word32,
Bool, or Char
Int

Happy coding!

Top comments (0)