Christian Vasquez

Posted on Jan 9, 2018

Explain Pure Functions Like I'm Five

#explainlikeimfive

Top comments (11)

Dad just spilled chocolate ice cream on the sheets. And the worst part is: He's supposed to be on a diet. It's only January 9th and he's already cheating. Mom can't find out. Thinking quickly, he runs into your room and gives you the scenario: He'll distract mom by bringing her out to a movie while you go get the sheets dry cleaned. There's a place around the corner and they know you so they'll be able to rush it. And mom won't ever know the difference.

Let's present two ways this plays out:

Cash

Dad gives you $40. You take the sheets down to the dry cleaners, hand them the money and the sheets. 45 minutes later they give you your change and the clean sheets. You get the sheets back on the bed just as your parents are getting home and mom is never the wiser.

Credit

Dad gives you his credit card. You go to the cleaners, pay with the card and get the sheets back in 45 minutes. Mom is never the wiser.

Until... she sees the credit card statement a week later and starts asking questions. Dad might be sneaky but he can't lie to her face and confesses. The jig is up.

Scenario one was a "pure function": you provided an input, [cash, dirtySheets], and got back an output, [change, cleanSheets]. Nothing else meaningful happened between the input and the output as far as you are concerned.

Scenario two was an "impure function" because there was a side effect: the outside world was affected by how the transaction went down. Within the transaction, a call went out to the credit card company, asking them to add a charge to the statement.

A pure function is one where the same input always produces the exact same output and there are no side effects.
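Here is a rough sketch of the two scenarios in JavaScript. The price, the object shapes and the function names are all made up for illustration:

```javascript
// Hypothetical price, for illustration only.
const CLEANING_PRICE = 25;

// Pure: everything it needs comes in as arguments, and everything
// it produces goes out in the return value.
function cleanWithCash(cash, sheets) {
  return {
    change: cash - CLEANING_PRICE,
    sheets: { ...sheets, clean: true }, // a new object; the input is not mutated
  };
}

// Stand-in for the outside world: the credit card statement.
const statement = [];

// Impure: returns clean sheets, but also writes to the statement,
// which lives outside the function.
function cleanWithCredit(sheets) {
  statement.push({ merchant: "Dry Cleaners", amount: CLEANING_PRICE }); // side effect
  return { sheets: { ...sheets, clean: true } };
}

const dirtySheets = { clean: false };
console.log(cleanWithCash(40, dirtySheets)); // { change: 15, sheets: { clean: true } }
cleanWithCredit(dirtySheets);
console.log(statement.length); // 1: the evidence mom finds a week later
```

Both functions hand back clean sheets. Only the second one leaves a trace somewhere the caller never asked about.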

Let's say I have a function called assignNickname(name). A "pure" version of this might take the letters in your name, consult an internally consistent dictionary and give you an output. "Hannah" becomes "Hannah Bobanna" and "Ben" becomes "Benny Bobenny". Every time. The output is a "pure function of" the input. An alternative function, with the same name, could instead consult an external service which provides different nicknames based on the name and the current moon cycle. On a full moon assignNickname(name) might turn Hannah's name into "WereHannah". It's a play on werewolf. It's not very clever. But the important thing is that the output is not always the same.
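A minimal sketch of both versions, assuming a tiny lookup table as the "dictionary" and a plain object (moonService, a hypothetical name) standing in for the external service:

```javascript
// The "internally consistent dictionary": fixed data that never changes.
const NICKNAMES = new Map([
  ["Hannah", "Hannah Bobanna"],
  ["Ben", "Benny Bobenny"],
]);

// Pure: same name in, same nickname out, every time.
function assignNickname(name) {
  return NICKNAMES.get(name) ?? name;
}

// Hypothetical stand-in for the external service. Its state can change
// between calls even though the argument stays the same.
const moonService = { phase: "new" };

// Impure: the result depends on something that is not a parameter.
function assignNicknameByMoon(name) {
  return moonService.phase === "full" ? `Were${name}` : assignNickname(name);
}

console.log(assignNicknameByMoon("Hannah")); // Hannah Bobanna
moonService.phase = "full";
console.log(assignNicknameByMoon("Hannah")); // WereHannah
```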

Perhaps assignNickname(name) doesn't even "return" a name, but instead "assigns" the name to a record in the database, which is a perfectly reasonable thing it might do. It might take the name, compute the nickname, and then write it to the database record for the name passed in. This is impure because the write is a "side effect". Even if that is totally the right behavior, it's still called a side effect because other parts of the system are affected by the function.
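Sketched in JavaScript, with an in-memory Map standing in for the database table and a made-up nickname rule (both are assumptions for the example):

```javascript
// Stand-in for the database table (an assumption for this sketch).
const nicknameTable = new Map();

// Pure part: compute the nickname. The rule itself is hypothetical.
function makeNickname(name) {
  return name + "ster";
}

// Impure: returns nothing. Its whole job is the write.
function assignNickname(name) {
  nicknameTable.set(name, makeNickname(name)); // side effect on outside state
}

assignNickname("Ben");
console.log(nicknameTable.get("Ben")); // Benster
```

Keeping the computation in makeNickname and the write in assignNickname means the pure half can still be tested without a database.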

Even console.log in JavaScript, or a print statement otherwise, causes a function to lose its purity because it's writing to another system. A pure function, for all intents and purposes, only returns an output. Sticking to this principle is powerful because it allows you to create very reliable systems and debug with a lot of confidence. Compilers can accurately describe pretty much the entire system because very little can go wrong. You can also do some neat tricks like "time travel debugging" where a program can be rewound and fast-forwarded to any point in its lifecycle because the system is always an accurate reflection of the state you put it in. An impure system does not work this way because once you've called out to a separate area in the code, you cannot "undo that" by rewinding.
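A toy version of the time travel idea, assuming the whole program state is produced by a pure update function (the event shapes and names here are invented for the sketch):

```javascript
// Pure update function: the next state depends only on (state, event).
function update(state, event) {
  switch (event.type) {
    case "increment":
      return { count: state.count + 1 };
    case "add":
      return { count: state.count + event.amount };
    default:
      return state;
  }
}

const initialState = { count: 0 };
const events = [
  { type: "increment" },
  { type: "add", amount: 5 },
  { type: "increment" },
];

// "Rewinding" to step n is just replaying the first n events from the start.
function stateAt(step) {
  return events.slice(0, step).reduce(update, initialState);
}

console.log(stateAt(3)); // { count: 7 }
console.log(stateAt(1)); // { count: 1 }
console.log(stateAt(3)); // { count: 7 }, same answer as before
```

Because update never touches anything outside its arguments, replaying the same events always lands on the same state. If update also wrote to a database, replaying would repeat those writes and the trick would fall apart.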

I haven't done an extensive amount of purely functional programming, so some of this might be a bit off-base, but this is the gist.

Elm, a purely functional language that compiles to JavaScript (and can therefore be used in browsers), boasts that it is difficult to produce a runtime error even if you try. The compiler catches defects before they ever happen because the possible outputs can be checked at compile time. Runtime errors in JavaScript can be very difficult to prevent otherwise. Redux, initially inspired by Elm I believe, brings similar functionality to the React world, which was already pretty functional. It expects purity and takes advantage of the powers that come with it. It's still a lot easier to "escape" the pure functional nature of Redux because it's still JavaScript and it's up to you to follow the rules. Often these are called "escape hatches", and they're technically possible, at your own risk, in most programming environments.

So why not write everything purely functional if it's so great? Well it's still pretty complicated and in some cases not worth the burden. Some functional die-hards would probably argue that everything should be done in this way, but take those arguments with some grains of salt. OOP has a lot of logical benefits. Functional programming is great for a lot of reasons. I mostly focused on "correctness" but there are also big performance benefits. But there's no such thing as a silver bullet. Functional programming can make some simple problems overly complicated, even if it can be a great help with the very complex ones. Don't go way out of your way to solve problems you do not have.

Kostas Bariotis • Jan 9 '18 • Edited

Just out of curiosity, how would you implement a pure function of console.log? I have no experience in functional languages.

Max Goldstein • Jan 10 '18

You can't, so you hide it. In Elm's case, it's called Debug.log to emphasize that it's not something to be used commonly. Essentially it looks like the identity function (takes an argument and gives it back) except that it logs it to the console.
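A JavaScript analogue of that shape might look like this. It is only a sketch of the idea, not how Elm implements it, and debugLog is a made-up name:

```javascript
// Behaves like the identity function, but logs on the way through.
function debugLog(label, value) {
  console.log(`${label}:`, value); // the hidden side effect
  return value;                    // hand the argument back unchanged
}

// Because the value comes straight back, the call can be dropped
// into the middle of an expression without changing the result.
const total = debugLog("sum", 2 + 3) * 10;
console.log(total); // 50
```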

More: package.elm-lang.org/packages/elm-...

Diogo Castro • Jan 10 '18 • Edited

Instead of executing the side-effects (e.g. calling a function log that returns a string), you create objects that represent those actions (e.g. calling a function log that returns an IO<string>).

You compose these "objects that represent actions" with other objects (usually with map/flatMap).

Once you're done composing all your effects, you bubble them up to main and use an escape hatch to trigger the execution of all side-effects at once (sometimes called unsafePerformIO). In Haskell, you don't use the escape hatch at all - you bubble up all IO<A> actions to main, and the runtime will go through your tree of effects and execute them. This is when things actually get "impure" - but you don't care anymore because it's now beyond the boundary of your application.

Here's a sample implementation of what IO<A> would look like in Scala: scalafiddle.io/sf/AnXRGhf/8 (should be fairly easy to read regardless of your background, let me know if it's not and you'd like to see this in another language)
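For anyone who would rather stay in JavaScript, here is a rough sketch of the same idea. It is not a port of the linked Scala code: the IO class below is a hypothetical minimal version, and only the name unsafePerformIO is borrowed from the description above.

```javascript
// An IO value wraps a function that performs the effect when asked to.
class IO {
  constructor(effect) {
    this.effect = effect;
  }
  // Transform the eventual result without running anything yet.
  map(f) {
    return new IO(() => f(this.effect()));
  }
  // Chain another IO-producing step, still without running anything.
  flatMap(f) {
    return new IO(() => f(this.effect()).unsafePerformIO());
  }
  // The escape hatch: actually run the effects.
  unsafePerformIO() {
    return this.effect();
  }
}

// Calling log does not print. It returns a description of printing.
const log = (message) =>
  new IO(() => {
    console.log(message);
    return message;
  });

const program = log("hello")
  .map((s) => s.toUpperCase())
  .flatMap((s) => log(s));

// Nothing has been printed so far. One call at the edge runs everything:
const result = program.unsafePerformIO(); // prints hello, then HELLO
```

Building program is pure: it only assembles objects. The printing happens in a single place, which is the only line that has to be treated as impure.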

Theofanis Despoudis • Jan 9 '18

Answer: You cheat by accepting that there are side-effects in order to run the program.

See wiki.haskell.org/Introduction_to_IO

Diogo Castro • Jan 10 '18 • Edited

Eh, IO String in Haskell is just as pure as String. Haskell only has a handful of escape hatches to break out of purity/referential transparency, like unsafePerformIO and unsafeInterleaveIO - but you don't need to (and shouldn't) use them at all. They're there for the runtime system to use and execute your code. See my other reply in this thread.

Theofanis Despoudis • Jan 10 '18

True. That's why Haskell is often considered the "most functional" of the bunch, though not purely functional, because the type system allows any side effects to be encapsulated within the context of a type.

Idan Arye • Jan 10 '18

That claim is as meaningful as the claim that no computer is Turing complete because they all have finite memory.

Theofanis Despoudis • Jan 10 '18

They are if you ignore any resource limitations, and that's the point really. A Turing machine is more of a mathematical model than a real-world implementation, as you cannot possibly express the notion of infinity in computer systems. There has to be an upper limit somewhere.

ctrlshiftbryan • Jan 10 '18 • Edited

A pure function is just a function that has no hidden inputs or outputs.

In case you just want to read code...
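A minimal sketch of what hidden inputs and outputs can look like in JavaScript. All names and numbers are made up for illustration:

```javascript
let discount = 0.25; // read by the impure version: a hidden input
let callCount = 0;   // written by the impure version: a hidden output

// Impure: the signature says "price in, price out", but that is not
// the whole story.
function finalPriceImpure(price) {
  callCount += 1;                // hidden output
  return price * (1 - discount); // hidden input
}

// Pure: every input is a parameter, and the return value is the only output.
function finalPrice(price, discountRate) {
  return price * (1 - discountRate);
}

console.log(finalPriceImpure(100)); // 75
discount = 0.5;
console.log(finalPriceImpure(100)); // 50: same argument, different answer
console.log(finalPrice(100, 0.25)); // 75, no matter what happened before
```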

Kasey Speakman • Jan 10 '18

My two general rules to ensure a pure function.

1. Output depends only on input parameters
2. No mutation of external state (including passed-in parameters)

Notes for impure languages (languages which allow mutation and other side effects from user code).

I'm a little lenient with #1 in cases where it is obvious the external data will never change as long as the program is running. E.g. constants, or readonly data read from a config on startup. The latter is still better as an input parameter.
One common way new devs violate #1 is to get the current date/time or generate a random number. You want to do this outside your pure function and pass the result in so the function remains pure (and testable).
Rule #2 still allows a function to mutate a local variable and remain pure. As long as the variable is only created/used inside the function, the function still has no side effects that ripple to the outside.
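The notes above could be sketched like this in JavaScript. The function names and data shapes are invented for the example:

```javascript
// Breaks rule 1: reads the clock, so the output depends on more than the input.
function isExpiredImpure(expiresAt) {
  return Date.now() > expiresAt.getTime();
}

// Pure: the caller reads the clock and passes the result in.
function isExpired(expiresAt, now) {
  return now.getTime() > expiresAt.getTime();
}

// Breaks rule 2: mutates a passed-in parameter.
function addItemImpure(cart, item) {
  cart.push(item);
  return cart;
}

// Pure: builds a new array and leaves the caller's array alone.
function addItem(cart, item) {
  return [...cart, item];
}

// Still pure: total is mutated, but it is local and never escapes.
function sumPrices(items) {
  let total = 0;
  for (const item of items) {
    total += item.price;
  }
  return total;
}

const cart = [{ price: 2 }];
const bigger = addItem(cart, { price: 3 });
console.log(cart.length, bigger.length); // 1 2
console.log(sumPrices(bigger)); // 5
console.log(isExpired(new Date(1000), new Date(2000))); // true
```

Passing now in as a parameter is also what makes isExpired easy to test: the test picks the date instead of waiting for the clock.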
