DEV Community

DrBearhands

Lazy, eager and greedy evaluation

If you're getting into (purely) functional programming, you've probably come across the term "lazy evaluation", or perhaps "eager evaluation".

What they are

In short, in a language that is evaluated lazily, a value is only computed when it is needed.

An example:

foo :: [Int]
foo = replicate 5 1

bar :: Int
bar = head foo

main = putStrLn $ show bar -- print the value of bar

In this program, foo is not evaluated directly. Instead, the runtime 'remembers' that it is equal to replicate 5 1 in a thunk, and it similarly stores thunks for bar and main.

Once main tries to get the value of bar, we know that bar equals head foo, and that head matches on the pattern value : _.

With this in mind, we expand foo and get bar = head (replicate 5 1). Since replicate 5 1 does not match the desired pattern value : _, we expand it. replicate is recursively defined as something like:

replicate times value =
  if times == 0
    then []
    else value : replicate (times - 1) value

By evaluating one iteration of replicate, we get bar = head (1 : replicate (5 - 1) 1). This argument to head matches value : _. We can now conclude bar = 1.

In this case, lazy evaluation saves us from having to evaluate the remainder of the list foo.
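To see this saving taken to its extreme, here is a minimal sketch (the names are my own) that only works because of laziness: we take the head of an infinite list, and only the first element is ever computed.

```haskell
-- Laziness lets us define a conceptually infinite list;
-- it is stored as a thunk, not computed up front.
naturals :: [Int]
naturals = [1 ..]

-- head only forces the first element, so mapping over an
-- infinite list is fine: the rest is never built.
firstSquare :: Int
firstSquare = head (map (^ 2) naturals)

main :: IO ()
main = print firstSquare -- prints 1
```

Under eager evaluation, defining naturals would already diverge.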

In contrast, eager evaluation fully computes values before they are required.

I can't seem to find the reference for greedy evaluation anymore, so don't take this as gospel, but greedy evaluation is when a value is computed as soon as its arguments are known. Imperative languages are generally greedily evaluated.

Eager and greedy evaluation look a lot alike, but eager evaluation allows you to, e.g., first create a large workload of values to be evaluated, whereas greedy evaluation would evaluate them immediately. The former can be useful for creating concurrent tasks.

I will ignore greedy evaluation in the remainder of this post.

Strict vs non-strict

Rather than talking about lazy or eager languages, it is better to talk about strict and non-strict languages.
Strict means a value may only be either evaluated or not evaluated (at rest), whereas non-strict languages also allow values to be partially evaluated. As you might expect, strict languages correspond to eager evaluation and non-strict languages to lazy evaluation.
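Partial evaluation can be observed directly in Haskell: forcing a value only evaluates it to its outermost constructor (weak head normal form), so a list can be evaluated "one constructor deep" while its tail remains an unevaluated thunk. A minimal sketch, with names of my own choosing:

```haskell
-- A list whose tail would crash if it were ever forced.
partial :: [Int]
partial = 1 : error "tail was forced"

-- head only needs the outermost (:) constructor, so the
-- list is partially evaluated and the erroring tail is
-- never touched.
safeHead :: Int
safeHead = head partial

main :: IO ()
main = print safeHead -- prints 1
```

In a strict language, binding partial would already evaluate the whole list and crash.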

But why are strict and non-strict better properties of a language than lazy and eager? Simply put, how a program is evaluated is fundamentally a property of the compiler, which is responsible for turning a program into something that runs, whereas a language only defines the properties of its constructs.

I am being rather pedantic here though.

Relation to purity

Lazy evaluation requires purity. With side-effects, changing the order of evaluation might change a program's result.

However, purity does not require lazy evaluation. I've heard Haskell folk say that laziness is great because it enforces purity, so any temptations to introduce side-effects are quelled. To me, this sounds like you don't really want purity, but I digress.

Advantages of laziness

I'm not a big believer in non-strict languages, so I will instead cite a comment by @jvanbruegge to provide a more neutral perspective.

It makes a lot of algorithms easier to express and allows you to treat your algorithm as a data structure you traverse (which is pretty cool imo). Also, without laziness, some normally trivial identities do not hold.

For example this: head . fmap f = f . head. If f bottoms on the second element in the list, the left will bottom and the right will not. In a lazy language, neither will.
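This identity can be checked in GHC with a small sketch (f is a made-up function that bottoms on the input 2). Under strict evaluation, fmap would force f on every element, so the left-hand side would crash; under lazy evaluation, both sides yield the first result:

```haskell
-- f bottoms on input 2, but laziness means map never
-- reaches it when we only ask for the head.
f :: Int -> Int
f 2 = error "bottom"
f n = n * 10

leftSide, rightSide :: Int
leftSide  = (head . fmap f) [1, 2] -- fmap on lists is map
rightSide = (f . head) [1, 2]

main :: IO ()
main = print (leftSide == rightSide) -- True: both are 10
```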

Advantages of strictness

Because strict languages have simpler value states, they are easier to reason about. In particular, memory management and concurrency can become more complicated in non-strict languages. For instance, if we have

let 
  x = ...
  y = ...
in
  f(x, y)

Assuming y does not contain x, in a pure language we could compute x and y concurrently (the lack of side effects makes pure functions unordered, save for data dependencies).

With non-strictness, however, the expansions of x or y may not terminate even when f(x, y) would, or they may be much longer than necessary.
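As a concrete sketch of that last point (names are my own): under lazy evaluation, a function that ignores an argument terminates even when that argument would not, so speculatively evaluating arguments ahead of time is unsafe.

```haskell
-- An expression that never terminates if evaluated.
loop :: Int
loop = loop

-- f ignores its second argument entirely.
f :: Int -> Int -> Int
f x _ = x + 1

main :: IO ()
main = print (f 41 loop) -- prints 42 under lazy evaluation;
                         -- a strict language would diverge
                         -- computing loop first
```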

Latest comments (9)

Yufan Lou

With non-strictness, however, the expansions of x or y may not terminate even when f(x, y) would, or they may be much longer than necessary.

I do not understand this point. For strict evaluation, x and y are both evaluated before f(x, y), and thus if any of x or y does not terminate, we wouldn't reach f(x, y), concurrent or not.

Because both concurrency and lazy evaluation give you nondeterministic evaluation order without explicit ordering, I do not understand how concurrency is easier to reason about in a strict language than in a lazy language. If anything, this comparison between Future and Promise in JavaScript shows how confusing it can be to mix concurrency with strict evaluation.

Otherwise, I agree lazy evaluation makes reasoning about memory usage a lot harder due to the thunks created.

DrBearhands

For strict evaluation, x and y are both evaluated before f(x, y), and thus if any of x or y does not terminate, we wouldn't reach f(x, y), concurrent or not.

Exactly this! And therefore programmers wouldn't write non-terminating x and y in a strict language and expect termination of f(x, y). For the compiler, that means it can safely assume x and y will terminate so it can reduce them ahead of time / concurrently.

[...] lazy evaluation give you nondeterministic evaluation order without explicit ordering [...]

Unless I'm missing something, lazy evaluation is outermost reduction first + memoing. So it is ordered, but the order is hard to reason about.
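The memoization (sharing) half of that claim can be observed in GHC with Debug.Trace; this is a sketch of my own, and the single-trace behavior is GHC-specific (it can change under optimization):

```haskell
import Debug.Trace (trace)

-- A value that announces when it is actually computed.
expensive :: Int
expensive = trace "computing!" (2 + 2)

main :: IO ()
main = print (expensive + expensive)
-- "computing!" appears only once: the thunk is evaluated
-- the first time it is demanded, then its result is shared.
```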

Of course, in practice compilers sometimes turn lazy functions into eager ones if they can.

Yufan Lou

And therefore programmers wouldn't write non-terminating x and y in a strict language and expect termination of f(x, y). For the compiler, that means it can safely assume x and y will terminate so it can reduce them ahead of time / concurrently.

I am still confused. Maybe I am just pedantic, but a strict evaluator chooses to strictly reduce parameters as an axiom, not as a theorem derived from programmer behavior.

lazy evaluation is outermost reduction first + memoing. So it is ordered, but the order is hard to reason about.

Ah, yeah, my bad, nondeterminism is orthogonal to that. It's just hard to reason about because the order is determined by the whole program, not locally.

DrBearhands

Perhaps this will clarify:
lazy evaluation evaluates the outermost redex first.
eager evaluation evaluates the innermost redex first.
There is only 1 outermost redex, but there may be multiple innermost redexes.

In a (b c) (d e), the outermost redex is a (b c) and the innermost are b c and d e. Of course, that does not mean evaluating them concurrently is actually useful, due to communication overhead.

Yufan Lou

I see, you think that because there is always only 1 outermost redex, lazy evaluation cannot be done concurrently. I am not used to thinking about this. Is this more about making the interpretation / compilation concurrent?

For a lazy evaluation of a (b c) (d e), the outermost redex is indeed a (b c), but the concurrency is not in the evaluation engine, but in the semantics of a. In Haskell this may be concurrently (pure c) (pure e). Concurrency comes after the whole expression is evaluated, in the form of a concurrent IO monad.

The abstract example helps me to see the possibility, but I don't know of a real example of a strict and pure language which does automatic concurrency transformation.

DrBearhands • Edited

Strict, pure languages have explicit data dependencies (EDIT: between instructions). There are multiple other issues preventing automatic concurrency though. The list is non-exhaustive; these are just issues I know about:

  • Most eager languages are not pure.
  • If the granularity of tasks is too low, it makes no sense to run them in parallel due to communication overhead. The halting problem tells us we cannot reliably use static analysis to predict a task's runtime either.
  • Data parallelism is a bigger factor than instruction parallelism. As it is, functional languages tend to use lists rather than arrays, which are not well-suited for data parallelism.
  • Many concurrent problems use shared state in some way in practice.

I tried tackling the second issue using profiling during my master's thesis, but was too unfamiliar with FP theory. I've got half a mind to try again using a total functional language (no bottoms, no mandated evaluation order).

flamd

Interesting point of view over strict vs non-strict languages. I would like to further read about this. May you suggest some books/sites?

Yufan Lou

Why Functional Programming Matters by John Hughes, the creator of QuickCheck, the first property-based testing tool, for Haskell.

DrBearhands

Beyond wikipedia, not really. I don't really learn from books.

I should also note that 'lazy' can be used to mean different things. I've seen it used to indicate just memoization, which is entirely different, but often associated with non-strictness. I believe this to be doctrine more than anything else.