The distinction between pure and impure functions is an important one in functional programming. Most of my university and early work experience was in OOP languages (Java and C#), and I was never aware of it.
Today I want to introduce this concept and show how it can help you write more testable and composable software.
Any function is either pure or impure. Wikipedia states that a pure function has the following two properties:
- Its return value is the same for the same arguments.
- Its evaluation has no side effects.
So what exactly does that mean? Let’s take the following function:
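A minimal sketch of such a function, assuming the name AddPure used later in this post and a hypothetical PureExample class to hold it:

```csharp
public static class PureExample
{
    // Pure: the result depends only on x and y, and nothing
    // outside the method is read or modified.
    public static int AddPure(int x, int y)
    {
        return x + y;
    }
}
```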
Given the same x and y, this function will always return the sum of those two ints. If x = 1 and y = 1, it will always return 2. If x = 100 and y = 100, it will always return 200.
But this function:
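As a rough sketch, assuming z is a mutable field on a hypothetical ImpureExample class (the post only tells us that z is not one of the arguments):

```csharp
public class ImpureExample
{
    // Assumption: z lives on the class as mutable state.
    public int z;

    // Impure: the same x and y can produce different results,
    // because the output also depends on the current value of z.
    public int AddImpure(int x, int y)
    {
        return x + y + z;
    }
}
```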
The output of this function now depends on a third variable, z, which is not one of the arguments. Since the function could return a different result for the same inputs, it’s now an impure function.
Or this next function:
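One way this might look, with a hypothetical in-memory Database class standing in for a real database client (a real client’s Save could throw, for example on a lost connection):

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a real database client.
public class Database
{
    public List<int> Rows { get; } = new List<int>();

    public void Save(int value)
    {
        Rows.Add(value);
    }
}

public class DbExample
{
    public Database Db { get; } = new Database();

    // Impure: the returned sum depends only on x and y,
    // but calling the method also writes to the database.
    public int AddDb(int x, int y)
    {
        int sum = x + y;
        Db.Save(sum); // side effect
        return sum;
    }
}
```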
The sum depends only on its inputs, which is good. But there’s now a database call, which is a side effect. There’s also the possibility that the database operation throws an exception. The database call has introduced complexity and side effects to our function, which means it’s also an impure function.
Why is it important to know the distinction between types of functions?
One big reason is that pure functions are inherently testable.
Say I’m your manager, and I’ve tasked you with verifying that the logic of two methods, AddPure and AddImpure, is correct.
The AddPure method is almost trivial to test and verify that it’s working correctly.
On the other hand, the AddImpure method presents some challenges.
- Where is z defined?
- How can I initialize z?
- Do I have to create a class or classes to verify this method?
Testing the AddImpure method is doable, but we now have to jump through some hoops because of its impurity.
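To make the difference concrete, here is a hedged sketch of what the two tests could look like. The Calculator class, the z field, and the test method names are assumptions for illustration, and plain exceptions are used instead of a test framework to keep the example self-contained:

```csharp
using System;

public class Calculator
{
    // Assumption: the hidden state that AddImpure reads.
    public int z;

    public static int AddPure(int x, int y) => x + y;

    public int AddImpure(int x, int y) => x + y + z;
}

public static class CalculatorTests
{
    // No setup needed: call the method and compare the result.
    public static void AddPure_ReturnsSum()
    {
        if (Calculator.AddPure(1, 1) != 2)
            throw new Exception("AddPure(1, 1) should be 2");
    }

    // The test first has to create an object and force z into a
    // known state, and it should cover more than one value of z.
    public static void AddImpure_DependsOnZ()
    {
        var calculator = new Calculator();

        calculator.z = 0;
        if (calculator.AddImpure(1, 1) != 2)
            throw new Exception("expected 2 when z is 0");

        calculator.z = 10;
        if (calculator.AddImpure(1, 1) != 12)
            throw new Exception("expected 12 when z is 10");
    }
}
```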
Or suppose we wanted to test the AddDb method.
We’d now have to deal with a database call in the middle of our function. In an object-oriented language, dependency injection is usually what we rely on to remove dependencies when we test. Dependency injection allows us to change which database our method is using: if we’re testing, here’s a mock database; if we’re running our app, here’s the real one.
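A minimal sketch of that approach, assuming a hypothetical IDatabase interface and constructor injection (the type names here are illustrative, not taken from a real library):

```csharp
using System.Collections.Generic;

// Hypothetical abstraction over the database.
public interface IDatabase
{
    void Save(int value);
}

// Test double: records what was saved instead of touching a real database.
public class FakeDatabase : IDatabase
{
    public List<int> Saved { get; } = new List<int>();

    public void Save(int value) => Saved.Add(value);
}

public class InjectedCalculator
{
    private readonly IDatabase db;

    // The caller decides which IDatabase implementation is used.
    public InjectedCalculator(IDatabase db)
    {
        this.db = db;
    }

    public int AddDb(int x, int y)
    {
        int sum = x + y;
        db.Save(sum);
        return sum;
    }
}
```

In a test we would pass in a FakeDatabase; the running application would pass in an implementation that talks to the real database.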
In an object-oriented language, that’s the recommended way to make functions testable. But that doesn’t mean it’s the only way to solve this problem.
Let’s take a step back and break down our AddDb function. What are its responsibilities?
- Summing two numbers and returning the result
- Saving the result to a database
Our function does two distinct things. When we are unit testing, what are we attempting to verify? Our unit test is only verifying that responsibility #1 is correct. We use dependency injection to avoid having to deal with responsibility #2.
That raises the question: if we’re only testing half of what our function is doing, and introducing tools in order to avoid dealing with the other half, can we refactor this function?
In languages that have functions as first-class citizens, this is pretty easy to do. C# lets us define functions and pass them to methods as arguments, using delegate types such as Func. We could refactor our function into the following:
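One plausible shape for this refactoring is sketched below. The exact signature is an assumption, as is the in-memory SavedResults list that stands in for the database:

```csharp
using System;
using System.Collections.Generic;

public static class Functional
{
    // Hypothetical in-memory stand-in for the database.
    public static readonly List<int> SavedResults = new List<int>();

    // Pure: can be tested on its own, with no database in sight.
    public static int Sum(int x, int y) => x + y;

    // Still impure (it saves the result), but the calculation
    // itself is now supplied by the caller.
    public static int AddFunctional(int x, int y, Func<int, int, int> operation)
    {
        int result = operation(x, y);
        SavedResults.Add(result);
        return result;
    }
}
```

A call such as `Functional.AddFunctional(1, 1, Functional.Sum)` passes the pure Sum method in as the operation.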
Whatever operation we decide to perform is now defined outside of our function.
We’ve broken our single function into two functions. There are several great benefits from doing this.
- The Sum function is now a pure function.
- I can now test my Sum logic without worrying about dealing with the database.
- I can also change the behavior of AddFunctional without modifying it, which means we should rename the function to something more generic.
Because AddFunctional takes a function as a parameter, any function with the same signature can now be passed to it.
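As an illustration, here is a sketch with a hypothetical Multiply function next to Sum. To keep it short, this version of AddFunctional leaves out the database call and only applies the function it is given:

```csharp
using System;

public static class Operations
{
    public static int Sum(int x, int y) => x + y;

    // Hypothetical second operation with the same signature as Sum.
    public static int Multiply(int x, int y) => x * y;

    // Accepts any function that takes two ints and returns an int.
    public static int AddFunctional(int x, int y, Func<int, int, int> operation)
    {
        return operation(x, y);
    }
}
```

`Operations.AddFunctional(2, 3, Operations.Sum)` returns 5, `Operations.AddFunctional(2, 3, Operations.Multiply)` returns 6, and a lambda works too: `Operations.AddFunctional(5, 3, (a, b) => a - b)` returns 2.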
Suddenly my single method supports anything that conforms to the Func signature, which helps the composability of my application.
If this information is new to you, you might be thinking to yourself, “let’s make everything pure!” Unfortunately, most of the useful things that applications do require impure functions.
Things like saving information to a database, retrieving it again, or making HTTP requests are all impure, because they violate one or both of the conditions of pure functions.
When writing your application, you should strive to maximize the number of pure methods used. If it’s possible, implement something using a pure method rather than an impure method. Or refactor a single impure method into smaller methods where some are pure, like our database example.
Pure methods make your code easier to reason about and easier to test. Impure methods are almost always required, but knowing the distinction can lead to better software design.
The distinction between pure and impure methods wasn’t something I was aware of until I started learning a functional programming language (F#). Even though I use C# in my day job, learning functional programming techniques has helped me write better software.
In my next post I’ll expand a bit more on this topic. I’ll explain some of the implications and talk about what happens when we start chaining function calls together.
Originally published at theshaperdev.com