ivan.gavlik

Posted on Feb 7

Clojure Vectors: A Deep Dive into the Data Structure

#clojure #beginners #tutorial #functional

Why Vectors Matter in Clojure

Vectors are one of the most commonly used data structures in Clojure. They provide fast lookup, efficient updates, and structural sharing for immutability, making them a crucial tool in functional programming. In this article, we'll explore how vectors work, when to use them, and best practices for leveraging their full potential.

How Vectors Differ from Lists

Clojure provides multiple sequential collections, but vectors and lists have different performance characteristics.

Vector

Access by index: O(log32 n), effectively O(1)
Adding elements: O(1) (append)
Immutability strategy: Structural sharing (wide, shallow tree)
Best suited for: Random access, iteration

List

Access by index: O(n)
Adding elements: O(1) (prepend)
Immutability strategy: Structural sharing of tails (singly linked list)
Best suited for: Recursion, sequential processing

Vectors are optimized for fast random access and updates, while lists excel in sequential processing and recursion.
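A short REPL sketch makes the difference concrete: conj adds where each collection can grow cheaply, which is the end of a vector and the front of a list. The names v and l are illustrative only.

```clojure
(def v [1 2 3])
(def l '(1 2 3))

(conj v 4)   ; => [1 2 3 4], appended at the end
(conj l 4)   ; => (4 1 2 3), prepended at the front

(nth v 2)    ; => 3, direct indexed lookup
(nth l 2)    ; => 3, but found by walking the list from its head
```

Both calls to nth return the same value; only the cost of getting there differs.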

Creating and Using Vectors

Vectors in Clojure are created using literal syntax ([]) or the vector function.

(def my-vec [1 2 3 4])
(def my-vec-alt (vector 1 2 3 4))

Retrieving an element by index takes effectively constant time:

(nth my-vec 2)  ; => 3
(get my-vec 2)  ; => 3
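The two functions differ once the index is out of range, which is worth knowing before picking one. The sketch below uses an illustrative vector v and also shows that a vector can be called as a function of its index.

```clojure
(def v [10 20 30])

(get v 1)           ; => 20
(get v 5)           ; => nil, no exception
(get v 5 :missing)  ; => :missing
(nth v 5 :missing)  ; => :missing
(v 1)               ; => 20, a vector is a function of its index
(first v)           ; => 10
(peek v)            ; => 30, the last element of a vector

;; (nth v 5) and (v 5) both throw IndexOutOfBoundsException
```

As a rule of thumb, get suits lookups where a missing index is expected, while nth without a default surfaces an out-of-range index as an error.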

Modifying Vectors Efficiently

Although vectors are immutable, Clojure provides efficient ways to "modify" them by creating new versions.

Adding Elements

Appending elements to a vector is fast (O(1) amortized):

(conj my-vec 5)  ; => [1 2 3 4 5]
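To see that conj returns a new vector rather than changing the old one, here is a small sketch with illustrative names base and extended:

```clojure
(def base [1 2 3 4])
(def extended (conj base 5 6))  ; conj accepts several items at once

base               ; => [1 2 3 4], unchanged
extended           ; => [1 2 3 4 5 6]
(into base [7 8])  ; => [1 2 3 4 7 8], appends a whole collection
```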

Updating Elements

Clojure's assoc function allows updating elements efficiently:

(assoc my-vec 1 99)  ; => [1 99 3 4]
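A few related calls are sketched below, using an illustrative vector named scores: update applies a function to the element at an index, assoc-in reaches into nested vectors, and assoc at an index equal to the vector's length appends.

```clojure
(def scores [10 20 30])

(assoc scores 0 11)                ; => [11 20 30]
(assoc scores 3 40)                ; => [10 20 30 40], index = count appends
(update scores 1 + 5)              ; => [10 25 30]
(assoc-in [[1 2] [3 4]] [1 0] :x)  ; => [[1 2] [:x 4]]
scores                             ; => [10 20 30], unchanged

;; (assoc scores 5 60) throws IndexOutOfBoundsException
```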

Removing Elements

Vectors have no built-in function for removing an element by index, but you can build one from subvec. Note that this copies the remaining elements, so it takes time linear in the size of the vector:

(vec (concat (subvec my-vec 0 2) (subvec my-vec 3)))
; => [1 2 4]
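If you need this in more than one place, it can be wrapped in a small function. The sketch below is one possible version; remove-at is a hypothetical helper name, not part of clojure.core, and it uses into instead of concat so that no lazy sequence is created along the way.

```clojure
;; remove-at is a hypothetical helper, not a clojure.core function.
(defn remove-at
  "Returns a new vector without the element at index i. Linear time."
  [v i]
  (-> []
      (into (subvec v 0 i))          ; elements before index i
      (into (subvec v (inc i)))))    ; elements after index i

(remove-at [1 2 3 4] 2)  ; => [1 2 4]
(remove-at [1 2 3 4] 0)  ; => [2 3 4]
(pop [1 2 3 4])          ; => [1 2 3], removing the last element is cheap
```

Removing from the end is the one case that has direct support: pop returns the vector without its last element.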

Performance Considerations: When to Use Vectors

Vectors are ideal when:

You need fast index-based lookups
Appending elements at the end is frequent
Immutability with efficient memory usage is important

However, if you frequently remove elements from the front, consider other data structures such as clojure.lang.PersistentQueue or a list.
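For the remove-from-the-front case, a minimal sketch with a persistent queue looks like this. Clojure has no literal syntax for queues, so the example starts from the empty instance; q is an illustrative name, and seq is used only to make the contents readable at the REPL.

```clojure
(def q (conj clojure.lang.PersistentQueue/EMPTY :a :b :c))

(peek q)           ; => :a, the oldest element
(seq (pop q))      ; => (:b :c), front removed
(seq (conj q :d))  ; => (:a :b :c :d), new items still go to the back
(count q)          ; => 3, q itself is unchanged
```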

Best Practices for Working with Vectors

Use mapv for Efficient Vector Transformations

Clojure's map function returns a lazy sequence, even when its input is a vector. If you need indexed access afterwards, you have to convert the result back to a vector.

Example: The Issue with map

(map inc [1 2 3 4])
; => (2 3 4 5)  ; Returns a lazy sequence, not a vector

Since the result is a lazy sequence, vector operations no longer apply to it: assoc, for example, throws a ClassCastException on a lazy sequence, and nth has to walk it in linear time. Using mapv instead ensures the result remains a vector:

Example: Using mapv

(mapv inc [1 2 3 4])
; => [2 3 4 5]  ; Maintains vector type

Using mapv eliminates the need for explicit conversion (vec (map ...)), making the code cleaner and avoiding the intermediate lazy sequence. Keep in mind that mapv is eager: it processes the whole input immediately.
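The same idea extends beyond map. The sketch below, using an illustrative vector nums, checks the result types and shows filterv and the transducer form of into, both of which also return vectors.

```clojure
(def nums [1 2 3 4])

(vector? (map inc nums))     ; => false
(vector? (mapv inc nums))    ; => true
(filterv even? nums)         ; => [2 4]
(mapv + nums [10 20 30 40])  ; => [11 22 33 44], several inputs at once

;; Several steps in one pass, with no intermediate sequences:
(into [] (comp (filter even?) (map inc)) nums)  ; => [3 5]
```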

Prefer conj for adding elements; use assoc only when you need to replace the value at a specific index.

Use subvec for Efficient Slicing

When working with large vectors, extracting a portion using subvec is more efficient than converting to sequences. The subvec function provides a constant-time way to create a view of the original vector without copying data.

Example: Using subvec

(def my-vec [1 2 3 4 5 6 7 8 9])
(subvec my-vec 2 5)  ; => [3 4 5]

Unlike converting to sequences and filtering, subvec does not create an entirely new vector but instead references the existing structure efficiently.
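The sketch below illustrates how such a view behaves, using illustrative names big, window, and detached. One caveat to keep in mind: because the view is backed by the original vector, it keeps that vector reachable in memory, so copying a small slice out of a very large vector can be worthwhile.

```clojure
(def big (vec (range 10)))
(def window (subvec big 2 5))

window            ; => [2 3 4]
(nth window 0)    ; => 2, indices restart at 0 in the view
(count window)    ; => 3
(conj window :x)  ; => [2 3 4 :x]
big               ; => [0 1 2 3 4 5 6 7 8 9], unchanged

;; Copy the slice into an independent vector so it no longer
;; depends on big:
(def detached (into [] window))
```

Here into [] is used for the copy rather than vec, on the understanding that vec may hand back an existing vector instead of building a fresh one.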

Avoid frequent use of vec on sequences; favor transients if performance is critical.

Using vec on sequences repeatedly can be inefficient because it creates a new vector from a sequence every time, leading to unnecessary allocations. Instead, when building large vectors incrementally, consider using transient vectors for improved performance.

Example: Using Transients for Efficient Construction

(reduce conj! (transient []) (range 1000000))
; Returns a transient vector efficiently

You can finalize a transient vector using persistent! to make it immutable again:

(persistent! (reduce conj! (transient []) (range 1000000)))
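The same pattern can be hidden inside a function, so callers only ever see persistent vectors. The sketch below is one way to do it; squares-below is a hypothetical helper name chosen for illustration. In many cases you do not need to write this by hand, since into already uses a transient internally when its target is a vector.

```clojure
;; squares-below is a hypothetical helper used only for illustration.
(defn squares-below
  "Builds a vector of the squares of 0..n-1 using a transient internally."
  [n]
  (loop [i 0
         acc (transient [])]
    (if (< i n)
      (recur (inc i) (conj! acc (* i i)))  ; always use conj!'s return value
      (persistent! acc))))                 ; hand back an immutable vector

(squares-below 5)                    ; => [0 1 4 9 16]
(into [] (map #(* % %)) (range 5))   ; => [0 1 4 9 16]
```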

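Because the transient is updated in place, this approach avoids allocating a new persistent vector at every step, which can noticeably speed up building large vectors. A transient must not be used again after persistent! has been called on it.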

Summary

Vectors are the default choice for ordered, indexed data in Clojure. They offer fast lookup, cheap appends, and immutability through structural sharing.

Key Takeaways:
✅ Use vectors for random access and iteration
✅ Prefer conj for appending and assoc for updating
✅ Optimize performance with mapv, subvec and transient