lazyseq in Clojure
Namc ・5 min read
Basic idea :
Evaluation of an expression is delayed until the value is needed.
There are two ways to achieve the above :
 use lambda functions at runtime
 use macros/special forms at compile time
With lazy eval techniques, it is possible to construct infinite data structures that are evaluated as consumed, using lambdas, closures and recursion. In clojure, they are generated using lazyseq
and cons
forms.
Let’s take an example from the Clojure docs for lazyseq
user> (def fibseq
(lazycat [0 1] (map + (rest fibseq) fibseq)))
;; => #'user/fibseq
lazycat
is another macro to concatenate lazy sequences. We shall write the above expression to an alternate translation which uses lazyseq
user> (def fibseq (concat (lazyseq [0 1]) (lazyseq (map + (rest fibseq) fibseq))))
;; => #'user/fibseq
user> (doall (take 2000 fibseq))
This lazyseq
evaluates for very large indexes without giving an OutOfMemoryException/StackOverFlowError. (Might throw ArithmeticException for extremely large integer, though)
How this works internally :

lazyseq
executes the body once the first time it is accessed.  the result is cached and used whenever the function is called in future.
Going back to the example;
[0 1]
is the base case for the problem. In Fibonacci’s sequence, the first two values are 0 and 1. The consecutive computation of each value requires summation of previous value and the value before that one. Hence, we need at least two values to start the process.
(map + (rest fibseq) fibseq)
Here, rest will return the part of fibseq
without it’s head. map
will combine two sequences using +
and produce a next sequence. Let’s take the fibonacci sequence fibseqtemp
user> (def fibseqtemp [0 1 1 2 3])
user> (map + (rest fibseqtemp) fibseqtemp)
;; => (1 2 3 5)
The sequences fed to map are part of same basic sequence . The first sequence (rest [seq])
is the base seq without the head. The second sequence is the base seq, including the first value.
What is the base sequence here?
According to fibonacci number’s definition, A(n) = A(n1) + A(n2)
. We need to be able to access previously computed values in order to extend the sequence. So, we use the recursive reference to fibseq
to access our previous results. The base seq is typically [0 1]
here to begin with.
Let’s follow the steps mentioned above and run through the process :
Take the first element of fibseq
(base case [0 1] ) — 0
Take the second element of fibseq
— 1
Take the third element of fibseq
— Stuck, aren’t we?
Here, we use map to generate a sequence which is used as remaining values.
The sum of rest [0 1] which is 1, and first item of [0 1] , which is zero is 1 ; which is our result below.
user> (map + (rest [0 1]) [0])
;; => (1)
Similarly, let’s repeat the process for generating fourth element.
We use map
again for generating base seq.
Compute the sum of second item of rest [0 1]
, whose value is (1) as computed above; and second item of [0 1]
user> (map + (rest [0 1]) [1])
;; => (2)
Easy so far?
Now generate the fourth element.
Sum the third element of rest [0 1]
(which we computed as 1) and fourth item generated from [0 1]
(which was 2 ). And we get 3 .
This will keep building up to match the desired definition for series. To understand this better, let’s add a print statement to our original definition.
user> (def fibseq
(lazycat [0 1]
(map
(fn [a b]
(println (format "%d + %d" a b)
(+ a b))
(rest fibseq) fibseq)))
;; => #'user/fibseq
user> (doall (take 10 fibseq))
1 + 0
1 + 1
2 + 1
3 + 2
5 + 3
8 + 5
13 + 8
21 + 13
;; => (0 1 1 2 3 5 8 13 21 34)
—
Working with different datatypes

map
,reduce
,take
etc work in terms offirst
,rest
andmore
. 
first
(as well as others) ensure if the collection implementsISeq
.  If yes, then these functions are implemented directly. Else, a
seq
view of the collection is created to implementfirst
/more
/rest
.
lazyseq
macro
 The macro uses thunk to store the sequencegenerating calculations.
 When an element/chunk of sequence is requested, next thunk is called to retrieve the values.
 The thunk creates next subroutine to represent the tail of the sequence (lose your head).
 These thunks implement sequence interface, each thunk is called just once, and then cached. So the realized portion is just a sequence of values.
Holding onto the head
Take these two methods to generate infinite lazy seqs of randoms.
user> (defn infinite[] (lazyseq (cons (rand) (infinite))))
;; => #'user/infinite

user> (defn infinite1[] (lazyseq (cons (rand) (infinite1))))
;; => #'user/infinite1
user> (def zyx (infinite1))
If you try to run (last (infinite))
; it won’t crash, but will neither terminate. However, (last zyx)
will crash quickly with an OutOfMemoryError
.
This is because of “holding onto the head”.
In (last (infinite))
, Clojure disposes the earlier members of sequence because it is able to determine that they are not of any use. Hence, memory is not lost. Whereas, in (last zyx)
, the program is holding on to the definition , and hence, it is not able to clear the memory. This causes it to throw OutOfMemoryError .
So what’s the point of lazyseq
?
lazyseq
is just one of many possible ways to create lazy sequences. And there are several other ways to do it in clojure.
If a sequence is not lazy, it often holds onto it’s head, which consumes a lot of heap space. If it is lazy, it is computed, then discarded as it is not used for further computations. This is particularly useful when you are dealing with huge sequences, and only using a small part of them for sequential computation.
—
More references :
https://purelyfunctional.tv/lesson/whatarelazysequences/
http://theatticlight.net/posts/LazySequencesinClojure/
http://www.thesoftwaresimpleton.com/blog/2014/09/08/lazyseq/