loading...
Cover image for Purely Functional Python With Static Types

Purely Functional Python With Static Types

suned profile image Sune Debel Updated on ・11 min read

Programming in functional style has never been popular among Python programmers. Even though libraries such as functools or toolz exist, imperative and object-oriented style is still the norm. Being a functional programming convert myself, I'm not sure why that is. I suspect that the lack of a static type system has been a big reason that Python programmers generally haven't recognized the advantages of functional programming, since the marriage between the two is what brings many of the benefits.

With PEP 484 and friends this has changed. The ecosystem around the typing module has matured to the point where a full fledged functional library with static types is practical. As a true functional programming believer this motivated me to author a library called pfun which brings functional programming with a strong emphasis on static types to Python. pfun distinguishes itself from other libraries such as toolz in that

  • It's designed to provide the most accurate typing possible within the limitations of the type system
  • It provides a unified functional system for managing side-effects in a module called pfun.effect

This article will explain the pfun effect system in enough detail that you should be able to use it, and maybe even contribute to it. we'll start with an introduction to functional programming and its benefits. If you're already a true believer and functional programming aficionado, you can probably skip ahead to A Functional Effect System for Python.

What is Functional Programming And Should You Care?

In functional programming, we build our programs by combining functions. 

But we only use special functions: We only allow functions that compute new results based on their arguments. Functions that do other things besides computing results, such as mutating program state or writing to a file, are not allowed. 

We call these "other things" side effects. The reason we don't like functions that perform side effects is that they are harder to re-use since the programmer needs to keep track of which parts of the program (or external state that the program depends on) are affected by the side effect when invoking the function. We prefer functions without side effects to the point that we have named them pure functions, and dislike functions that perform side effects to the extent that we have named them impure functions.

Look at these two functions that both add an element to a list for example:

from typing import List


def add_with_side_effect(l: List[int], e: int) -> List[int]:
    l.append(e)
    return l


def add_without_side_effect(l: List[int], e: int) -> List[int]:
    return l + [e]

Calling add_with_side_effect mutates the list passed as an argument. You need to consider which other parts of the program has a reference to l when calling it. In other words you need to think about your program globally in order to figure out if calling add_with_side_effect is safe.

Calling add_without_side_effect returns a new list. The list the caller has and passes as the argument l is left intact. It doesn't matter if any other part of your program has a reference to l since it's left unchanged. In other words you can reason about your program locally, which is much, much easier most of the time.

Functions that are side-effect-free have a number of significant advantages:

Limiting ourselves to using pure functions also presents a challenge, since most programs in the real world need to read and write files, raise exceptions and use the network. Given that all of these things are inherently impure, it would seem that functional programming is close to useless.

In the next section we'll see how to do all of those things in a purely functional context.

A Functional Effect System for Python

How do we write functions that can handle side-effects and are still pure? To achieve this seemingly impossible feat we're going to use a clever trick. The trick is to change our impure functions such that instead of performing a side effect, they return a description of a side effect that can be executed at a later time. The most straight-forward way of doing that is to write functions that return other functions that performs the side effects. To that end we'll use the following Effect type alias:

from typing import TypeVar, Callable


A = TypeVar('A')
Effect = Callable[[], A]

Lets define a function read_file_pure that returns a description of the side effect of reading a file:

def read_file_pure(path: str) -> Effect[str]:
   def effect() -> str:
       with open(path) as f:
           return f.read()
   return effect

Notice that read_file_pure really is pure: for the same argument path it always returns the same function regardless of the actual contents of the file. The effect function returned by read_file_pure is not pure of course.

So far so good, but how do we actually do anything with the contents of a file? We could of course run the function returned by read_file_pure immediately after calling it to get the content and then perform whatever transformation is necessary - but then our program is no longer pure!

What we'll do instead is to write a new function that takes an Effect and applies a function to its result in a new effect. Lets call this function map (because it maps a function over the result of executing an impure function):

from typing import TypeVar


A = TypeVar('A')
B = TypeVar('B')


def map(effect: Effect[A],
        f: Callable[[A], B]) -> Effect[B]:
    def new_effect() -> B:
        return f(effect())
    return new_effect

Again, notice that map is pure because it is lazy: No side-effects are carried out when calling map. It's only when the function returned by map is called that the side-effect described by the effect parameter is carried out.

This means that we can manipulate values returned by read_file_pure in a pure fashion, for example in:

def lower_case_file(path: str) -> Effect[str]:
    return map(impure_read_file(path),
               lambda contents: contents.lower())

Most functional programs take the shape:

1) Create an Effect to get input for the computation
2) Transform the input using map (or a similar function)
3) Run the Effect returned by map

Or in code:

def transform(input: str) -> str: ...


if __name__ == '__main__'
    program = map(impure_read_file(path), transform)
    _ = program()

In other words, program is an impure function, built by cleverly combining lazy, pure functions. Calling impure_read_file and map doesn't actually do any real computation, except whats required to put together the program function. The real computation, including side-effects, are only executed when program is called at the edge of our python script (in the __main__ section).

That's great, but what if we want to combine several effects? For example, what if we had a function for writing strings to files without side-effects:

def write_file_pure(path: str, contents: str) -> Effect[None]:
    def effect() -> None:
        with open(path, 'w') as f:
            f.write(contents)
    return effect

If we try to use write_file_pure with map, we end up with an Effect[Effect[str]] because map only takes care of "unwrapping" the effect it takes as a parameter, not the effect returned by the function that is mapped over the result of the effect parameter. This means that our program now looks like this:

if __name__ == '__main__':
    program = map(
        read_file_pure('foo.txt'), 
        lambda content: write_file_pure('bar.txt', content)
    )
    _ = program()()  # program() returns an Effect

Notice that we have to call the Effect returned by program() to actually run the effect returned by write_file_pure. Also realize that it's possible to produce an Effect that wraps an arbitrary number of nested effects, for example if map is called in a loop. We might end up in a situation where it's not easy to figure out precisely how many functions we need to call at the edge of our script!

To fix this, we introduce another function that knows how to combine an Effect with a function that returns an Effect. Lets call it and_then since it's chaining together two effects sequentially:

def and_then(effect: Effect[A],
             f: Callable[[A], Effect[B]]) -> Effect[B]:
    def new_effect() -> B:
        return f(effect())()
    return new_effect

Now we can rewrite our program from before as:

if __name__ == '__main__':
    program = and_then(
        read_file_pure('foo.txt'), 
        lambda content: write_file_pure('bar.txt', content)
    )
    _ = program()

Notice that we only have to call program to run all the effects involved in the computation since and_then knows how to combine two effects. Also realize that if we use and_then appropriately, we can combine an arbitrary number of effects, even in a loop. In other words, we have a completely general framework for working with side-effects in a pure fashion:

  • Functions that need to perform side-effects return Effect instances
  • We transform the result of effects using map
  • We combine effects with functions that return effects using and_then

Since functional programs are generally composed of chained calls to map and and_then, lets refactor the functions as instance methods so that we can use a nicer dot notation:

from dataclasses import dataclass
from typing import Callable, TypeVar, Generic


A = TypeVar('A', covariant=True)
B = TypeVar('B')


@dataclass(frozen=True)
class Effect(Generic[A]):
    effect: Callable[[], A]

    def __call__(self):
        return self.effect()

    def map(self, f: Callable[[A], B]) -> Effect[B]:
        return Effect(lambda: f(self()))

    def and_then(self, f: Callable[[A], Effect[B]]) -> IO[B]:
        return Effect(lambda: f(self())())

(We've decorated the class with dataclass to remove some boilerplate and avoid mutation of the effect field). Now our program from before becomes:

def read_file_pure(path: str) -> Effect[str]:
    def effect() -> str:
        with open(path) as f:
            return f.read()
    return Effect(effect)


def write_file_pure(path: str, content: str) -> Effect[None]
    def effect() -> None:
        with open(path, 'w') as f:
            f.write(content)
    return Effect(effect)


if __name__ == '__main__':
    program = read_file_pure(
        'foo.txt'
    ).and_then(
        lambda content: write_file_pure('bar.txt', content)
    )
    program()

Which arguably reads a little nicer.

Adding Error Handling To Our Effect System

So far we have only been concerned with one particular side-effect: IO. Lets now turn our attention to error handling that is also normally modeled as a side-effect in imperative programming.

Our requirement for pure functions is that they can only compute values based on their arguments. To comply with this dogma, lets define wrapper types that can be used to 'tag' values returned from functions.

from dataclasses import dataclass
from typing import Union, TypeVar, Generic

R = TypeVar('R', covariant=True)
L = TypeVar('L', covariant=True)


@dataclass(frozen=True)
class Right(Generic[R]):
    get: R


@dataclass(frozen=True)
class Left(Generic[L]):
    get: L


Either = Union[Left[L], Right[R]]

For convenience, we've also defined a type-alias Either that we can use to more easily write type signatures for functions that can fail. Since it's a union type, it gives us the added benefit that our type checker can help us figure out if we are handling errors correctly.

With this in hand we can do error handling in a purely functional way:

def divide(nume: float, denom: float) -> Either[str, float]:
    try:
        return Right(nume / denom)
    except ZeroDivisionError:
        return Left('Division by zero')

That's great, but what if we need to write a function that needs to model both the IO side-effect and the error handling side-effect? We can of course wrap one in the other, e.g:

def read_file_pure(path: str) -> Effect[Either[IOError, str]]:
    def effect() -> Either[IOError, str]:
        try:        
            with open(path) as f:
                return Right(f.read())
        except IOError as e:
                return Left(e)
    return Effect(action)

Except now we have to deal with error handling every time we need to use map or and_then, because neither of them know how to unpack the value wrapped by an Either:

def lower_case_file(path: str) -> Effect[Either[IOError, str]]:
    read_file_pure(path).map(
        lambda either: Right(either.get.lower()) 
                       if isinstance(either, Right) 
                       else either
    )

Yuck. Thankfully, there are a number of ways to avoid having this error handling code everywhere in our functional program. The solution we use in pfun.effect is to "bake error handling into our Effect type", in order that map and and_then can take care of the wrapping/unwrapping for us:

from dataclasses import dataclass
from typing import Union, TypeVar, Generic


A = TypeVar('A', covariant=True)
E = TypeVar('E', covariant=True)
B = TypeVar('B')
E2 = TypeVar('E2')


@dataclass(frozen=True)
class Effect(Generic[E, A]):
    effect: Callable[[], Either[E, A]]

    def __call__(self) -> Either[E, A]:
        return self.effect()

    def map(self, f: Callable[[A], B]) -> Effect[E, B]:
        def effect():
            either = self()
            if isinstance(either, Left):
                return either
            else:
                return Right(f(either.get))
        return Effect(effect)

    def and_then(self, 
                 f: Callable[[A], Effect[E2, B]]) -> Effect[Union[E, E2], B]:
        def effect():
            either = self()
            if isinstance(either, Left):
                return either
            else:
                return Right(f(either.get)())
        return Effect(effect)

We have added a type parameter E to represent the error type. The implementations of map and and_then are more less identical to our previous versions, except that they expect an Either returned from the wrapped effect function and handle them appropriately. Notice also that the return type of and_then is Effect[Union[E, E2], B], because the resulting Effect can fail if either self() returns a Left value or if f(...)() returns a Left value.

Now the mess from before simply becomes

def read_file_pure(path: str) -> Effect[IOError, str]:
    def effect() -> Either[IOError, str]:
        try:        
            with open(path) as f:
                return Right(f.read())
        except IOError as e:
                return Left(e)
    return Effect(effect)


def lower_case_file(path: str) -> Effect[IOError, str]]:
    read_file_pure(path).map(lambda content: content.lower())

Neato!

Adding The Reader Effect

The Effect type we've cooked up so far is pretty close to what you'll find in pfun.effect. The main difference is that the wrapped effect function in pfun.effect takes a generic argument R:

from dataclasses import dataclass
from typing import TypeVar, Generic, Union


R = TypeVar('R', contravariant=True)
E = TypeVar('E', covariant=True)
E1 = TypeVar('E1') 
A = TypeVar('A', covariant=True)
B = TypeVar('B')


@dataclass(frozen=True)
class Effect(Generic[R, E, A]):
    effect: Callable[[R], Either[E, A]]

    def __call__(self, r: R) -> Either[E, A]:
        return self.effect(r)

    def map(self, f: Callable[[A], B]) -> Effect[R, E, B]:
        def effect(r: R) -> Either[E, B]
            either = self(r)
            if isinstance(either, Left):
                return either
            return Right(f(either.get))
        return Effect(effect)

    def and_then(self, 
                 f: Callable[A, Effect[R, E1, B]]
                ) -> Effect[R, Union[E, E1], B]:
        def effect(r: R) -> Either[E, B]:
            either = self(r)
            if isinstance(either, Left):
                return either
            return Right(f(either.get)(r))
        return Effect(effect)

Allowing the wrapped effect function to take a parameter enables dependency injection for increased code re-use and improved test-ability

For historical reasons, this is known as the reader effect. We will go into what pfun.effect can do with this argument in a lot of detail in future posts, but for now lets just exemplify the dependency injection feature.

Say for example that you want to implement an effect that reads from a database. Using the R parameter you can pass the connection string to the action, making it possible to re-use the same Effect instance for reading from different databases:

def read_from_database(query: str) -> Effect[str, IOError, dict]:
    def effect(connection_string: str) -> Either[IOError, dict]:
        try:
            database = connect(connection_string)
            return Right(database.execute(query))
        except IOError as e:
            return Left(e)
    return Effect(effect)

This allows us to combine read_from_database with other functions through map and and_then, and then re-use the entire function call chain against different databases in a completely type-safe manner.

if __name__ == '__main__':
    program = read_from_database('select * from user').map(lambda users: users['user1'])
    test_results = program('test_user@test_database')
    production_results = program('prod_user@prod_database')

Take a look at the documentation for more information on how to use the R parameter to its full potential.

Whats Next?

There is still a significant problem with the Effect type we have built in this post: long chains of effects combined with map and and_then causes RecursionError because one effect needs to be executed before the next one can be executed.

pfun.effect solves this problem and more:

  • pfun.effect can build sequences of effects of arbitrary length without causing a RecursionError when executed, because effects are run using a mechanism called trampolines
  • pfun.effect is compatible with asyncio for high performance asynchronous IO
  • pfun.effect provides a number of useful helper functions to help you write code at a very high level of abstraction
  • pfun.effect provides a plugin for mypy that enables fine grained type inference of helper functions

In conclusion, pfun.effect provides a full fledged purely functional, statically typed effect system for Python. Head over to the documentation to go more in depth, or to the github repo for issues, questions and contributions.

GitHub logo suned / pfun

Functional, composable, asynchronous, type-safe Python.

.

Discussion

pic
Editor guide
Collapse
iquardt profile image
Iven Marquardt

You baked in error handling into your IO type. This is ligitmate, but it would be better if we could compose effects. With map effect composition is easy. But with and_then it becomes hard. Why? Because now and_then can create an effect depending on a previous value. Think about an Option type with two cases Some(x)/None where None embodies a computation without a result. With map you can have a Some([1,2,3]) or a None. But with and_then you can have Some[0,2,3] and for the first element and_then produces a None, because you use it in a devision, for instance. What now? What is the combined semantics of this effect? The result could be Some[float, float] or it could be None. Picking one semantics isn't hard but guaranteeing that all possible effects can be composed is.

Collapse
suned profile image
Sune Debel Author

I think there are pros and cons to both solutions. The reason I chose to bake error handling into our IO type here is because the python type system doesn't support higher-kinded types, and in general, in order to compose effects "horizontally" as you suggest with type safety (e.g with monad transformers) you need higher-kinded types. For that reason I think this "vertical" effect composition is more well suited for Python.

I'm not sure I understand your question completely, but I take it to be: How does error handling work when you compose Effect and list (i.e Effect[List[int]]). My library doesn't provide any special helper methods for that case (I'm not sure why you would need it), but if you swap the order of composition to List[Effect[int]], you can use many different provided helper functions to produce new effects that aggregate on such a list, e.g sequence_async(List[Effect[int]]) -> Effect[List[int]] (semantically equivalent to sequence in haskell). In that case if one effect in the list fails, the resulting effect also fails (again, same as in haskell). Apologies if I misunderstood your comment :)

Collapse
vonheikemen profile image
Heiker

I don't use python but I like this very much.